Google Snaps Up Stats Tool from Swedish Charity
paulraps writes "A stats program that began as a teaching aid for a university lecture has just been bought by Google for an undisclosed sum. The statistics tool, Trendalyzer, was developed by a professor and his son at Stockholm's Karolinska Institute. Unfortunately for the developers, the project has been run under the auspices of a charity, Gapminder, and financed over the last seven years by public money. Maybe that seemed smart at the time, but the professor, admitting that he won't see a dime of Google's cash, now seems regretful. As for what Google has purchased: 'Public organizations around the world invest 20 billion dollars a year producing different kinds of statistics. Until now, nobody has thought of collecting all the information in the same place. That should be possible with Trendalyzer, which will be able to present that quantity of data in a clear way as well as giving the user the ability to compare many different kinds of information.'"
No one seems to care about this enough to even be the first troll.
Does this sig remind you of Agatha Christie?
Maybe that seemed smart at the time, but the professor, admitting that he won't see a dime of Google's cash, now seems regretful.
Major bummer.
The higher the technology, the sharper that two-edged sword.
Google, I dig you for now, but I'm not really sure that I care for the idea of having google own nearly all of the search data for every search done by every individual around the planet in the history of google and beyond combined with all of the world-wide traffic analysis data.
And as someone who would be targeted for this service -- why would I bother? There are plenty of free open source utilities out there that provide every ounce of data you could ever want and they're incredibly easy to configure and deploy.
No, the benefit here seems to be less for the end-users deploying the service and more for whoever google then turns around and sells the massive amounts of correlated information to. For instance, let's see every bit of data about a specific user so we can see everything from each search he does to his entire browsing trail. Bet we could sell that for a lot of money!
Hopefully you will still have a simple way as a user to prevent google from collecting this information just like you can do with their stupid Urchin service (by blocking it). And, sadly, people will still continue to use this new service because they'll sell out their mother's medical history and offer up a sample of their own blood and cholesterol ratings if it means getting something "for free".
.. targeted Google Ads! Hurrah! Oh, wait, I already use Opera to block them anyway... still, I guess this'll prove useful to actually putting ads up that vaguely interest people.
Neither the article nor the summary explains what Trendalyzer actually does. The animated mapping of stats at http://tools.google.com/gapminder is a little more illustrative.
From the Gapminder site:
To me, this seems to imply that the professor and his son were the original developers, not the maintainers. Or perhaps just his son is going to Google?
The professor will probably get money from them later on when Google wants to upgrade it, or is having problems with it. When Google wasn't interested in putting out the Google Toolbar for Firefox, there were some guys that made their own, and then all of a sudden Google saw the light switch on and developed their own version. I wrote to them about making sure that these guys were compensated in some way for their hard work, and I also wrote to the developers about what I did. They stated that they really weren't interested in making sure that Google compensated them, but apparently Google did get in touch with them, and for the longest time there was a link to the alternative toolbar on the Google site. I'm not sure if they did in fact compensate them for all of their hard work or not, but if Google wants to live up to its motto of "Don't be Evil," then perhaps they could at least put the professor on a retainer to help with further development... I would... why not, eh? He's the guy that actually wrote it, so he'd be the go-to guy for any further developments, or bugs :-)
MeTheGeek
...why isn't it already in the public domain?
Unfortunately for the developers, the project has been run under the auspices of a charity, Gapminder, and financed over the last seven years by public money. Maybe that seemed smart at the time, but the professor, admitting that he won't see a dime of Google's cash, now seems regretful.
So what about the 7 years they got funded? He developed it using public money, why does he complain?
Someone engages in work for a charity and then doesn't get a big payoff. What's the problem again?
If the code were GPL'd, you wouldn't have work that was done for charity disappearing into the proprietary maw. That kind of thing makes developers feel ripped off.
OSS is the future, programmers won't get compensated beyond their paycheck.
So he's got nothing to complain about.
Lots of people have great ideas that never reach fruition for reasons that have nothing to do with them. And sometimes, those ideas can take off and be promoted for reasons that have nothing to do with them. Often these things offend our sense of fairness.
Yet life is not fair and often people have regrets and indulge in "what if" fantasies.
For something like this, even if the fellow gets no money, he can get publicity and recognition and might be able to leverage that into something to get him more money if that's what he wants.
The past is past and the price for obtaining "justice" and "fairness" can be quite high and more than one should have to pay; you can lose your future doing it.
Learn from the past and develop a plan to move forward and leverage on the lessons learned; the best revenge is always living well.
"I believe in Karma. That means I can do bad things to people all day long and I assume they deserve it." : Dogbert
"Do no evil" is not the same as "always do good."
Shortly thereafter, a site called Nation Master cropped up, with a bit flashier and simpler user interface, but focused on CIA World Fact Book data, rather than the States of the US. (The same folks later did State Master using similar UI technology.)
Finally, Google tested Gapminder with an even spiffier and simpler UI -- again focusing on by-nation correlations.
Aside from the usual complaints about "The Ecological Fallacy" (a fallacy that cuts both ways BTW) there are two big pitfalls for this stuff:
What I did about missing data was simply eliminate any data points where data was missing from one or both of the variables being correlated. This reduces the sample size, hence statistical significance, but it bypasses arguments over what sort of missing data should be used. The Netflix Prize is coming up with really good algorithms to compute missing data efficiently and accurately so maybe there is hope for something more effective here.
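A minimal sketch of the pairwise-deletion approach described above, assuming missing values are stored as None. The variable names and numbers are made up for illustration and are not taken from Laboratory of the States.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

def drop_missing_pairs(xs, ys):
    """Keep only positions where BOTH variables have a value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [x for x, _ in pairs], [y for _, y in pairs]

# Illustrative data only: None marks a missing observation.
income = [10.0, None, 30.0, 40.0, 50.0]
life_expectancy = [50.0, 55.0, None, 65.0, 70.0]

xs, ys = drop_missing_pairs(income, life_expectancy)
r = pearson(xs, ys)
print(f"kept {len(xs)} of {len(income)} points, r = {r:.3f}")
```

Two of the five points are dropped here, which is the loss of sample size (and so of significance) mentioned above.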
Statistical significance is more difficult to deal with. Usually one must look at tables for statistical significance of correlations under the assumption that the variables each follow a normal distribution. Unfortunately, many variables follow polynomial (like squared) or exponential distributions, so you have to do things like take the sqrt or log of one or both of the variables to try to normalize them. However, when you are looking for correlations, sometimes it is the relationship that is polynomial or exponential -- in which case you can apply sqrt or log to get the maximum correlation coefficient at the sacrifice of normality of one or both of the variables. Unfortunately, there is no simple arithmetic formula for calculating the significance level of a correlation given a non-normal distribution -- you can't just plug in the skewness, kurtosis, etc. as well as sample size and correlation coefficient, and get out a valid statistical significance. Therefore it is hard to make good statements about many very important correlations without watering them down to meaninglessness.
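To illustrate the point about transforming a variable when the relationship itself is exponential, here is a small sketch with synthetic data (y doubles for each step in x). It only shows the effect on the correlation coefficient and says nothing about significance.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Synthetic data: an exactly exponential relationship between x and y.
xs = list(range(1, 11))
ys = [2.0 ** x for x in xs]

r_raw = pearson(xs, ys)                         # linear fit to a curved relationship
r_log = pearson(xs, [math.log(y) for y in ys])  # log(y) is linear in x

print(f"raw r = {r_raw:.3f}, log-transformed r = {r_log:.3f}")
```

The log transform straightens the relationship, so the coefficient rises, but the transformed variable's distribution changes too, which is the trade-off described above.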
Also, a complaint about the "simple" user interfaces:
Some of the worst reporting from news media comes when they refuse to report statistics in terms remotely related to anything meaningful -- for example you will frequently hear statements to the effect that "California has the most orange trees in the nation." or some such. Such statistics are nonsense for the purposes of correlation studies since the size of the ecology (California state) is all you are really measuring with such statements. You have to divide by the population or divide by the total GDP or something to rationalize the ecology against other ecologies.
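A quick sketch of why dividing by population matters. The regions and figures below are invented purely for illustration.

```python
# Hypothetical regions: (raw count of something, population).
regions = {
    "Region A": (1000, 100_000),
    "Region B": (400, 10_000),
    "Region C": (50, 500),
}

def per_capita(data):
    """Divide each raw count by the region's population."""
    return {name: count / population for name, (count, population) in data.items()}

def ranking(values):
    """Names sorted from largest to smallest value."""
    return sorted(values, key=values.get, reverse=True)

raw_counts = {name: count for name, (count, _) in regions.items()}
rates = per_capita(regions)

print("by raw count: ", ranking(raw_counts))
print("by per capita:", ranking(rates))
```

The raw ranking just follows the size of each region, while the per-capita ranking reverses it in this made-up case.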
In Laboratory of the States, I did this with all my variables but I also left the raw variables around and allowed people to do arithmetic on them -- like dividing them -- to get their own rational comparisons if for some reason my choices were not adequate. This problem isn't as bad with Gapminder as it is with Nation Master and State Master -- but Gapm
Seastead this.
Guess he just got beheaded by the other edge of giving your software away, huh?
At least he can be content to know that Google will be the bestest, most very perfect company ever, since they come right out and say, at every opportunity, that their policy is "don't be evil".
And since they say they won't be evil, we know they can't be lying! (Please ignore how they help totalitarian right-wing regimes to identify people who speak out against them, and empower governments to clamp down on free speech)
Ah...so you remember when the Internet was an educational and military tool, back before it took off?
Back before Google, or even Yahoo. Back when a T1 cost $1500/mo or more, making entry into the ISP business difficult. Back before multimedia content (shareware games) pushed your average home user's bandwidth above 2400 baud.
Yeah, commercialization of the Internet really destroyed its value.
Nice pictures. Lets me look at data they have cooked. Lots of nifty chartjunk http://en.wikipedia.org/wiki/Chartjunk. I seem to have missed the link that lets me enter my own data. Does anybody have a pointer to that?
Bork, bork, bork!
Nerd rage is the funniest rage.
Actually, Google could hire the guy! Why not?
No sig for now.
I think one has to see Rosling work with Trendalyzer to appreciate what that piece of software can do. He got standing ovations for his presentation at the TED conference in 2006. Very cool.
memomo: free web based language trainer DE-EN-ES-FR-IT
Rosling gave a 20-minute presentation of Trendalyzer at the TED 2006 conference, using it to debunk some of the prejudices we have about the world. Turns out chimpanzees beat Swedish professors when making claims about the world. Worth watching, as are many of the presentations at TEDtalks.
memomo: free web based language trainer DE-EN-ES-FR-IT
Most people don't care about working for Google. Many people prefer to have freedom and ownership of their own ideas and developments instead of becoming code monkey #12,000 in a large media corporation.
If it was GPL, then we all could have benefitted from it, not just Google.
Excuse me, but please get off my Pennisetum Clandestinum, eh!
Is the bad guy the one that bought the software? Or the one that sold it? It's not like anyone was forced to do anything against their will.
And how did this software get under the control of the non-profit? Is the prof getting a salary from them?
That the summary says Google "snapped up" the software seems to suggest that Google snatched it out of their hands or something. I've got a feeling that money changed hands somewhere along the line. Somebody got paid, and I'm betting it was a bundle. Anybody who's smart enough to write an important bit of software ought to be able to read a contract before he signs it. And if he thought that just because an organization is non-profit it means that it's not looking to get a pile of cash then maybe he's been vacationing on Pluto for the past few decades or doesn't read the business section of the newspaper. If he didn't write the software to make money, then he shouldn't cry because he didn't make money. If he wanted to make money from his software, then he should have asked a few questions before releasing the project.
I'm among the most anti-big corporation commenters around here, but I'm more intrigued by what's not in this article than by what's there. I'm not ready to hang an evil jacket on Google just for buying something that was for sale.
You are welcome on my lawn.
The problem with relying on rank ordered correlations alone for significance testing is the data dredging fallacy. Just by random chance a certain number of correlations with a given distribution will have a certain level of correlation. Frequently you can rely on the other correlations to give you an idea of the "random" distribution of such correlations but really to do it properly you must generate a bunch of random correlations where the variables have the kind of non-normality you want to test for significance and then see how that "Monte Carlo" sample looks. It's a real pain.
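One simple way to get a Monte Carlo reference sample is a permutation test: shuffle one variable many times and see how often chance alone matches the observed correlation. This is a sketch of that related technique, not necessarily the exact procedure described above; it keeps each variable's own (possibly skewed) distribution instead of simulating from a chosen one. The data, trial count and seed are arbitrary choices for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def monte_carlo_p_value(xs, ys, trials=2000, seed=0):
    """Estimate how often shuffled data gives |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # breaks any real link between xs and ys
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)

# Synthetic, heavily skewed data with a strong built-in relationship.
xs = [math.exp(i / 3) for i in range(20)]
ys = [2 * x + (i % 3) for i, x in enumerate(xs)]

p = monte_carlo_p_value(xs, ys)
print(f"estimated p-value: {p:.4f}")
```

When many correlations are dredged at once, the same shuffled sample can also show how large the biggest chance correlation tends to be.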
Seastead this.
I always thought statistics was boring, but the video by the prof http://video.google.com/videoplay?docid=2670820702819322251 really intrigued me. Statistics makes so much sense when presented properly... the numbers not only make sense but also show their relation to other statistics, giving a much broader view. I'm sure a tool like this would be a boon to decision makers...
This is kind of evil. I think Google should reward this guy too.
Of course then they would have less money for the gourmet food for their employees.
A while ago, these guys came to Google to give a techtalk. Here it is: http://video.google.com/videoplay?docid=7996617766640098677
... Google can only buy other products.
Talk about "overrated"!!
... and you've got something going. I know exactly where my most effective PPC ads (Google/MSN) are for my small business: my Perfect Customer is
* late 20s/early 30s
* female
* elementary school teacher
* teaches English/ESL
* looking for an activity for teaching sight words
* needs it for this Friday
* searching during either her lunch hour, a prep period, or from home after she puts the kids to bed
I sell Bingo Card Creator (http://www.bingocardcreator.com), which conveniently has sight words bingo built into it. There are perhaps 100 people in the country who fit my Perfect Customer profile in any given week and I sell to about 3-4 of them.
Not that I'd do it, because it's creepy, but if there were some way I could send a message to every one of my Perfect Customers on Tuesday saying "Are you too busy this week? I've got your Friday lesson done already. $24.95 and you'll be done in 3 minutes.", oh boy, the things I could do with that.
(P.S. What a wonderful age we live in where a small businessman like me can get hyper-targeted advertising for less than $100 a month, and KNOW it to be effective at driving sales.)
Help poke pirates in the eyepatch, arr.
Probably not something Google has paid billions of dollars for.
It crams five axes into a single window, using the "usual" two (x & y) axes, plus color, size and animation for the other three axes. Works fine when you use something size related for size, time for animation, and something discrete for color, as in the example.
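As a rough sketch of that five-channel mapping (x, y, color, size, animation frame), the data side could look something like this. It is not Trendalyzer's actual code; the field names, palette and numbers are all hypothetical, and scaling bubble area (not radius) to population is my own choice.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Bubble:
    x: float       # axis 1: horizontal position
    y: float       # axis 2: vertical position
    color: str     # axis 3: a discrete category
    radius: float  # axis 4: a size-related quantity

def build_frames(rows, palette):
    """Group rows by year (axis 5: animation time) and map each row to a bubble."""
    frames = defaultdict(list)
    for row in rows:
        frames[row["year"]].append(
            Bubble(
                x=row["income"],
                y=row["life_expectancy"],
                color=palette[row["region"]],
                # Radius chosen so the bubble's AREA equals the population.
                radius=math.sqrt(row["population"] / math.pi),
            )
        )
    return dict(sorted(frames.items()))

# Hypothetical rows; every value is invented for illustration.
rows = [
    {"year": 2001, "income": 1200.0, "life_expectancy": 61.0, "population": 5e6, "region": "north"},
    {"year": 2000, "income": 1100.0, "life_expectancy": 60.0, "population": 4.9e6, "region": "north"},
    {"year": 2000, "income": 300.0, "life_expectancy": 48.0, "population": 2e7, "region": "south"},
]
frames = build_frames(rows, palette={"north": "blue", "south": "red"})
for year, bubbles in frames.items():
    print(year, len(bubbles), "bubbles")
```

Playing the frames in order gives the animation; each frame is an ordinary bubble chart.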