No, I think he's suggesting that there is no catastrophe. The Maldives are 1 metre above sea level because they are coral atolls. When sea levels rise (as they have done in the past), the coral simply grows upwards; when the sea level falls, the coral erodes, leaving the islands constantly about a metre above sea level.
Well shit, maybe you should tell the leaders of the Maldives about that! I mean, they've spent, like, millions of dollars trying to find a solution. I guess if they only thought to ask you they could have saved a lot of money!
And they actually acted like it was extremely urgent on the phone. It was after hours so they told me to go to urgent care instead of waiting a day.
I would just assume that the person who answered the phone didn't know what she was talking about, but she actually put me on hold while she went and asked someone else what I should do. You'd think it would be pretty hard to find one, let alone two people in a family medicine department who don't know the correct response to patients who think they have swine flu, so I'd have to assume that what she told me was actually their policy.
Can't wait to find out how much they billed my insurance for me to needlessly talk to a doctor for three minutes.
I came back from PAX with a fever. When I heard about the H1N1 outbreak I thought "Eh, I'll give my doctor's office a call to see if they want to track me as a statistic or something." So I did, and they said "If you think you might have swine flu then you should come to urgent care right away wearing a mask!" I was confused by this but did as told. After an hour of waiting around with an uncomfortable mask on, the doctor told me that maybe I had H1N1 but it wasn't worth testing specifically since it's no different from regular flu. To which I said "That's what I thought. So why did you make me come in?" He didn't have a good answer.
Just doing my part to contribute to skyrocketing health care costs...
You say 80% of deaths have been people under 65, and then you multiply 0.8 into the final probability. This doesn't follow. 80% is the probability that a person who died was under 65. This is not the same as the probability that someone who is under 65 will die.
For example, say there were 10,000 cases, 45 of which died (thus a 0.45% death rate). Of those 45, 9 would be older than 65 and 36 younger, for an 80-20 split. But let's imagine that by some freak chance, those 9 elders were the *only* elders of the 10,000 cases, meaning 100% of people over 65 who got swine flu died. This is the case which minimizes the probability of death for everyone else. In this case the death rate for younger-than-65s would be 36/9991 = 0.3603%. Yet your calculation of 0.45% * 80% yields 0.36%, which is even less than that minimum. /me is wasting time while recovering from flu caught at PAX.
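Here's that bound as a quick Python sketch, in case anyone wants to poke at the numbers. It uses 10,000 cases, which is the count that matches a 0.45% death rate and the 9991 denominator; the function name and parameters are made up for illustration.

```python
from fractions import Fraction

def min_under65_death_rate(cases, deaths, share_under_65):
    """Lowest possible death rate among under-65 cases.

    The rate is smallest when the under-65 group is as large as possible,
    i.e. when every over-65 case died and no other over-65 cases exist.
    """
    deaths_under = deaths * share_under_65
    deaths_over = deaths - deaths_under
    cases_under = cases - deaths_over  # everyone who is not a dead elder
    return deaths_under / cases_under

cases, deaths = 10_000, 45
share = Fraction(4, 5)  # 80% of deaths were under 65

floor_rate = min_under65_death_rate(cases, deaths, share)
naive = Fraction(deaths, cases) * share  # the "0.45% * 80%" shortcut

print(f"best case for under-65s: {float(floor_rate):.4%}")
print(f"naive multiplication:    {float(naive):.4%}")
print("naive is below the floor:", naive < floor_rate)
```

Exact fractions are used so the comparison isn't muddied by floating-point rounding.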
I agree. The author says that the colors of the Chrome logo were inspired by the Windows logo. That's ridiculous -- the colors obviously came from the Google logo. Google uses those four colors in almost all its logos. Obviously the author did not actually do any research.
Red, green, and blue are the primary colors of light. They are the primary colors because they correspond to the three color receptors in our eyes.
Cyan, magenta, and yellow are the primary colors of ink. They are the *opposites* of red, green, and blue, respectively. Ink works subtractively -- you start from white and remove color -- while light works additively -- you start from black and add. This is why their primary colors are opposites.
The primary colors of ink are often simplified to blue, red, and yellow instead of cyan, magenta, and yellow since children don't usually recognize colors like cyan and magenta.
There's nothing physically special about the primary colors; it's the receptors in our eyes that make them primary. Interestingly, some people have a genetic mutation that gives them an additional color receptor -- amber -- which allows them to distinguish colors better than the rest of us. To them, there are actually four primary colors, and colors on TV screens and most printed images look wrong.
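To make the "opposites" point concrete, here's a rough Python sketch of an idealized model (real inks and phosphors are messier than this). Channels are fractions from 0 to 1, and the function names are just for illustration.

```python
def rgb_to_cmy(rgb):
    """Each ink primary is the complement of one light primary."""
    r, g, b = rgb
    return (1.0 - r, 1.0 - g, 1.0 - b)

def mix_light(*colors):
    """Additive mixing: start from black and add light, clamped at full brightness."""
    return tuple(min(1.0, sum(channel)) for channel in zip(*colors))

def mix_ink(*inks):
    """Subtractive mixing: start from white paper and let each ink remove light.

    Each ink is a (cyan, magenta, yellow) amount; the result is the RGB
    light that is still reflected.
    """
    r = g = b = 1.0
    for c, m, y in inks:
        r *= 1.0 - c  # cyan ink absorbs red
        g *= 1.0 - m  # magenta ink absorbs green
        b *= 1.0 - y  # yellow ink absorbs blue
    return (r, g, b)

RED, GREEN, BLUE = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
CYAN, MAGENTA, YELLOW = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)  # as CMY amounts

print(mix_light(RED, GREEN, BLUE))     # all three lights together
print(mix_ink(CYAN, MAGENTA, YELLOW))  # all three inks together
print(mix_ink(MAGENTA, YELLOW))        # two inks leave one light primary
```

In this model, all three lights give white, all three inks give black, and any two inks leave exactly one light primary behind.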
As indicated in the comment at the top of that file, that code was generated by the Protocol Buffers compiler, protoc. You aren't supposed to edit that -- edit the .proto file instead and regenerate. I'm not really sure why they checked the generated code into VCS -- normally only the .proto would be checked in and protoc would be invoked at build time.
Also you fail to note that those "daily occurrences" are only there because of other, much smaller, geothermal plants next door.
Do you have a link to back that claim? Earthquakes in the 1-4 range really are a natural daily occurrence all across CA and many other places in the world. Check out the USGS real-time map:
http://earthquake.usgs.gov/eqcenter/recenteqsus/
Re:Can we please just get the US out of the UN? (on "UN Attacks Free Speech", Score: 5, Insightful)
The UN helps keep the world stable. A stable world is good for business. What's good for business is good for the US. Most of what the UN does is not headline-grabbing stuff, but it's incredibly important.
Besides, how ridiculous would it be for the UN to be hosted by the only broadly-recognized nation in the world that wasn't a member (which is what the US would be if it pulled out)?
That said, no one takes the UN "Human Rights Council" seriously, because it's currently stacked with nations that have pitiful human rights records. This particular vote has been anticipated for some time now.
If you want to understand better how the world works, I highly recommend reading The Economist.
Sounds like you tested _DOM_ (tree-building) xml processing, to in-place binary data extraction
I compared DOM parsing with protocol buffer parsing. They are equivalent. Protobuf parsing constructs a single message object representing the entire parsed message, and it currently does not do "in-place binary data extraction", although I've been thinking of adding that in a future version.
SAX would be comparable to using a protobuf CodedInputStream and reading fields from it manually. It would be faster but a lot less convenient to use in most cases.
Note that for a fair comparison with SAX, you would have to actually construct an object representing the document based on the SAX parsing events, not just use noops for all the callbacks.
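To illustrate what "construct an object from the SAX events" means, here's a minimal sketch using Python's standard xml.sax module. This is not the benchmark code; the TreeBuilder class and the dict layout are made up for the example.

```python
import xml.sax

class TreeBuilder(xml.sax.ContentHandler):
    """Builds a nested dict per element instead of using no-op callbacks."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # chain of currently open elements

    def startElement(self, name, attrs):
        node = {"tag": name, "attrs": dict(attrs.items()), "children": []}
        if self._stack:
            self._stack[-1]["children"].append(node)
        else:
            self.root = node
        self._stack.append(node)

    def endElement(self, name):
        self._stack.pop()

def parse_to_tree(xml_bytes):
    handler = TreeBuilder()
    xml.sax.parseString(xml_bytes, handler)
    return handler.root

doc = b'<people><person name="John Doe" email="jdoe@example.com"/></people>'
tree = parse_to_tree(doc)
print(tree["children"][0]["attrs"])
```

A SAX benchmark that skips this bookkeeping measures tokenizing only, which is not what you get back from a DOM parser or from a parsed protocol message.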
Protocol Buffers essentially is a binary format for JSON. This wasn't an actual design goal, but I don't know that anything would be done significantly differently if it were. Note that you can trivially write a JSON encoder/decoder that works with arbitrary protocol message classes, using protobuf reflection.
To me what this demonstrates is premature optimization. Instead, first use a simple text format like JSON then if that is too large compress it. Then if that is too slow send it in binary.
The optimization was not "premature". We actually do need the speed and space.
It may very well be that most users don't need the speed, but switching formats down the road is pretty hard. It's not exactly like optimizing implementation details -- this is the format you use to communicate with other entities that you may or may not control.
Note that protocol buffers give you the equivalent of a DOM -- an object representing the parsed message. This is usually much more convenient to use than SAX parsing (depending on your use case, of course). So, I'm not sure if comparing against SAX is necessarily fair. Though I think protocol buffers would still win just because there is less to parse and parsing length-delimited chunks is faster than character-delimited.
This is 49 bytes: <person name="John Doe" email="jdoe@example.com">
The equivalent Protocol Buffer is 28 bytes. In addition to the 24 bytes of text, each field has a 1-byte tag and a 1-byte length. The example you quoted is protocol buffer *text* format, which is used mostly for debugging, not for actual interchange.
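If you want to check the byte counts yourself, here's a small hand-rolled Python sketch of the wire encoding for two string fields. It doesn't use the protobuf library, and the field numbers 1 and 2 are an assumption for illustration (any field number below 16 gives a 1-byte tag).

```python
def encode_varint(value):
    """Base-128 varint: 7 bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_string_field(field_number, text):
    """Tag, then length, then the raw UTF-8 bytes."""
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload

# Field numbers 1 and 2 are assumed here, not taken from a real .proto file.
person = encode_string_field(1, "John Doe") + encode_string_field(2, "jdoe@example.com")
xml = '<person name="John Doe" email="jdoe@example.com">'

print(len(xml.encode("utf-8")), "bytes of XML")
print(len(person), "bytes of protocol buffer")
print(person)
```

The strings themselves are stored as-is, so the only overhead is the two bytes in front of each one.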
The example they give is for a small set of data, and percentages vary more dramatically as sample sizes decrease.
We wanted to give an idea of the speed without trying to boast too much or look like we were directly challenging anyone. Of course every news outlet has chosen to highlight the speed comment -- including the numbers which were intended to be ballpark figures -- more than was intended, but I guess that isn't surprising.
I agree that the tiny "person" example is not a good benchmark case. It was intended as a usage example, not a speed example, but I stuck the speed numbers in there just meaning to give people a vague idea of the difference. The "20-100 times faster" comment is based on testing a variety of formats -- both unrealistic ones and real-life formats used in our search pipeline -- against programmatically generated XML equivalents (which may or may not themselves be realistic, though they contain the same data with the same structure). libxml2 was used for parsing XML. I don't really know how libxml2's speed compares to other XML parsers, but I didn't have a lot of time to investigate. The 20x faster number comes from the largest data set (~100k-ish) while the 100x number comes from a very small message. The most realistic case was about 50x. Sorry that I cannot provide exact details of the benchmark setup since many of the test cases were proprietary internal formats.
In any case, I'm hoping that some independent source conducts some tests because I think anything we produced would probably have unintentional biases in it. Of course, I'll update the numbers in the docs if they turn out to be wildly off-base.
XML and this protocol differ in only one way: one is plain text, the other is binary.
They also differ in that XML has a *lot* more features. For example, protocol buffers have no concept of entities, or even interleaved text. Those can be useful when your data is a text document with markup -- e.g. HTML -- but they tend to get in the way when you just want to pass around something like a struct.
It's worth noting that writing alternative encoders and decoders for protocol buffers is really easy (since protocol message objects have a reflection interface, even in C++), so you can use the friendly generated code without being tied to the format.
What happens to your binary serialization if you add a new field to your class? Can you still read serialized objects created by older versions of your software? (Honest question; I don't know how C# serialization works.) Also, can you read your data in other programming languages?
Structurally Protocol Buffers are similar to JSON, yes. In fact, you could use the classes generated by the Protocol Buffer compiler together with some code that encodes and decodes them in JSON. This is something some Google projects do internally since it's useful for communicating with AJAX apps. Writing a custom encoding that operates on arbitrary protocol buffer classes is actually pretty easy since all protocol message objects have a reflection interface (even in C++).
The advantage of using the protocol buffer format instead of JSON is that it's smaller and faster, but you sacrifice human-readability.
YAML and JSON are text-based formats intended for human readability. Protocol Buffers are binary, and therefore smaller and faster, but not human-readable.
Also, the protocol buffer compiler provides friendly data access objects. You could actually use these with JSON or YAML, by just writing a new encoder and decoder (which is easy to do).
Wow! They've invented fixed position data files. What will they invent next, a cool new programming language called RPG?
The article is actually completely wrong there. The protocol buffer binary format uses tag/value pairs, not fixed positions. Parsers simply ignore any tag they don't recognize and move on to the next.
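Here's a rough Python sketch of how a reader can walk tag/value pairs and skip the ones it doesn't know about. It's a toy, not the real library's parser; the field numbers and names are made up, and it only handles the varint, fixed-width, and length-delimited wire types.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def parse_known_fields(buf, known):
    """known maps field number -> name. Unrecognized fields are skipped."""
    message, pos = {}, 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if field_number in known:
            message[known[field_number]] = value
        # Otherwise the value was already stepped over; just move on.
    return message

# A message with hypothetical fields 1 (name), 15 (something newer), 2 (email).
data = b"\x0a\x08John Doe" + b"\x78\x2a" + b"\x12\x10jdoe@example.com"

print(parse_known_fields(data, {1: "name", 2: "email"}))
print(parse_known_fields(data, {1: "name"}))  # an older reader
```

Because every value says how long it is (or has a fixed width for its wire type), the reader never needs to know field positions in advance.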
Google sets an internal threshold on search volume, and this threshold could be set for reasons that range anywhere from Google Trends is still experimental to Google not wanting to provide data on how it builds its massive search index for emerging search terms.
Or maybe for privacy reasons? Some search queries implicitly reveal the identity of the person making them. Such queries are naturally low-volume, so refusing to show low-volume queries is an effective way to protect the privacy of the searchers.
Your source sucks.
Here's a better one.
That's not even a remotely fair comparison:
* Airlines now generally require you to check your bags 45 minutes in advance. I was once told that I could not check my bag because I arrived 43 minutes in advance. So your idea of spending 30 minutes between arriving at the airport and takeoff is bogus -- it's more like an hour. Meanwhile, trains do not require the same level of security (you can't hijack a train and drive it into the World Trade Center) nor the same ridiculous bag-checking rules. I'd guess the time between arriving at the train station and departure would be more like ten minutes.
* SFO -> LAX is 1:20 flight time (source: southwest.com).
* You are comparing the time to drive to the airport with the time to take local mass transit to the train station. It takes me 30 minutes to drive to SFO, but it takes an hour and a half (with two transfers) to get there via Caltrain/BART. Meanwhile it would take me 15 minutes to get to the proposed Palo Alto / Redwood City bullet train station via Caltrain.
Of course, I'm only one data point, but since there will be many more train stations than there are airports, presumably it should always be easier to get to the train station than the airport.
Given the above points, my guess is that taking the train will actually be *faster* for most people. It will probably also be cheaper, provide more legroom, and/or have better service. Therefore, I don't see why you would *not* want to take the train.
If you were using gmail, you wouldn't have to go through all that effort -- Google would be doing it for you. :) And yeah, the chance of an earthquake leveling your house -- and thus destroying your current e-mail archives -- is much, much greater than the chance that a clueless judge will order your gmail account disabled. (And you can easily create a local backup of your gmail if you want, etc.)
You aren't subject to court orders?