Slashdot Mirror


Google Launches a Data Prediction API

databuff writes "Google has released a data prediction API. The service helps users leverage historical data to make predictions that can guide real-time decisions. According to Google, the API can be used for prediction tasks ranging from product recommendations to churn analysis (predicting which customers are likely to switch to another provider). The API involves three simple steps: upload the data, train the model, then generate predictions. The API is currently available on an invitation-only basis." Google also recently announced several other API additions, including Buzz, Fonts, and Storage.
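The three steps in the summary (upload the data, train the model, generate predictions) can be sketched locally. This is only a toy illustration: `ToyPredictor` and its `upload`, `train` and `predict` methods are hypothetical names, not Google's API, and the word-count naive Bayes model is an assumption, since the summary does not say which algorithm the service uses.

```python
from collections import Counter, defaultdict
from math import log


class ToyPredictor:
    """Hypothetical local stand-in for an upload/train/predict workflow."""

    def __init__(self):
        self.rows = []
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of rows
        self.vocabulary = set()

    def upload(self, rows):
        # Step 1: store labelled examples as (label, text) pairs.
        self.rows = list(rows)

    def train(self):
        # Step 2: count how often each word appears under each label.
        for label, text in self.rows:
            words = text.lower().split()
            self.label_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocabulary.update(words)

    def predict(self, text):
        # Step 3: score each label with add-one smoothed log-probabilities.
        total_rows = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = log(count / total_rows)
            for word in text.lower().split():
                seen = self.word_counts[label][word]
                score += log((seen + 1) / (total_words + len(self.vocabulary)))
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up training rows, purely for illustration.
model = ToyPredictor()
model.upload([
    ("spam", "buy cheap pills now"),
    ("spam", "cheap pills cheap"),
    ("ham", "meeting notes attached"),
    ("ham", "see you at the meeting"),
])
model.train()
guess = model.predict("cheap pills")
```

Keeping the three steps as separate methods mirrors the workflow described in the summary; a hosted service would presumably put a network call behind each one.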

70 comments

  1. well? by snmpkid · · Score: 0, Insightful

    Does it use users' Wifi sniffer captures to aid in this prediction?

  2. I Predict ... by WrongSizeGlass · · Score: 2, Insightful

    ... that Google will do their own analysis on your data. They're nothing if not thorough.

    1. Re:I Predict ... by Anonymous Coward · · Score: 0

      Think of all the great things Google could do knowing that you have a sequence of numbers, and the next one may or may not be 100!

  3. Smells like... by PhongUK · · Score: 0

    neural networks to me!

  4. In Soviet Amerika: by Anonymous Coward · · Score: 0, Funny

    Data prediction launch YOU!

    Yours In Astrakhan,
    Kilgore T.

    P.S.: Maybe Google can scrape invitation-only users' predictions for its bond trading floor.

  5. Three simple steps? How about four. by olsmeister · · Score: 2, Funny

    1. Upload the data.
    2. Train the model.
    3. Generate predictions.
    4. PROFIT!!!!

  6. Let's see if predict the end by noidentity · · Score: 1

    of thi

  7. Psychohistory ? by Vapula · · Score: 3, Interesting

    What about feeding it historical events, training it on the outcomes of those events, and trying to get a glimpse of which way the future will evolve?

    1. Re:Psychohistory ? by 0100010001010011 · · Score: 3, Interesting

      Or use the last half of your data set as blind data. Train the model on 1900-1990 and see if it can predict 1990-2000.

      How far can you predict? 1%, 10%, 50%?

      If you really want to see how good it is, feed it stock market data and see how well it predicts that.
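The blind-data idea above (train on the early years, score on the later ones) might look roughly like this. It is a minimal sketch: the series is synthetic, the 90/10 split is an assumption standing in for 1900-1990 versus 1990-2000, and the straight-line trend is just a placeholder model, not anything the Prediction API is known to use.

```python
from statistics import mean


def chronological_split(series, train_fraction=0.9):
    """Split a time-ordered list into (train, test) without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def fit_linear_trend(values):
    """Least-squares fit of values against their position in the list."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def mean_absolute_error(actual, predicted):
    """Average size of the forecast miss on the held-out points."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic yearly series standing in for 1900-2000 (101 points).
series = [2.0 * year_index + 5.0 for year_index in range(101)]

train, test = chronological_split(series, train_fraction=0.9)
slope, intercept = fit_linear_trend(train)

# Forecast the held-out years by extending the fitted line.
forecast = [slope * (len(train) + i) + intercept for i in range(len(test))]
error = mean_absolute_error(test, forecast)
```

The split is deliberately not shuffled: shuffling would let the model see the "future" during training, which is exactly what the blind test is meant to rule out. Real series such as stock prices will not be this well behaved.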

    2. Re:Psychohistory ? by Coren22 · · Score: 1

      There is a 97.5% chance that this could all end badly.

      Good book :)

      --
      APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
    3. Re:Psychohistory ? by MozeeToby · · Score: 1

      Stock market data by itself is insufficient to predict the stock market because of all the external variables. It would be impossible to predict the post 9-11 crash for instance because there is nothing in the markets that changed leading up to it. It would be difficult to predict the more recent meltdown because it was caused by a combination of lax oversight, repealed laws, semi-legal trading techniques, and a culture of over borrowing. It's possible that you may be able to predict the minute to minute changes, maybe even day to day. But long term trends are almost impossible.

    4. Re:Psychohistory ? by E+IS+mC(Square) · · Score: 1

      Nothing new there. The risk analysis used by most Wall Street firms to calculate their risk exposure does just that. And we all know how that turned out.

    5. Re:Psychohistory ? by Kilrah_il · · Score: 3, Interesting

      The nice thing about the stock market is that when everything is fine the analysts say that their models are great, but when something unexpected happens they go all "but we couldn't have foreseen that. Except for this unexpected incident, our models are great!". The problem is that these "unforeseen incidents" are what drives most of the extreme changes in the stock market, and more generally, in our entire society.
      Just look at 9/11 (to use your example): It not only affected the economy, it affected (and still affects) our entire lives - from airport searches, to the USA PATRIOT Act, to wars in Iraq and Afghanistan.
      These extreme events are called Black Swans ( http://en.wikipedia.org/wiki/Black_swan_theory ) and I do recommend the book by the same name by Nassim Nicholas Taleb. Fascinating reading (if a bit repetitive sometimes :) ).
      The bottom line: Trying to predict the future from past events is fine, until it breaks down, and it does so more often than we care to imagine.

      --
      Whenever in an argument, remember this.
    6. Re:Psychohistory ? by fusiongyro · · Score: 2, Interesting

      +1! "Past performance is not a predictor of future success." Taleb is my hero. Everyone should read Fooled by Randomness, which I didn't find repetitive at all.

    7. Re:Psychohistory ? by S-100 · · Score: 2, Insightful

      Like I heard a seasoned stock trader once say: "Technical analysis works great, until it doesn't".

  8. Dear Google, by Anonymous Coward · · Score: 0

    plz predict for me if this turing machine haltz, kthksbye.

  9. Dear google.. by cntThnkofAname · · Score: 3, Funny

    Given my family history... is there a girl for me?

    1. Re:Dear google.. by 0100010001010011 · · Score: 3, Funny

      Every male in your family tree has had sex at least once, so the odds look good for you.

    2. Re:Dear google.. by asukasoryu · · Score: 1

      He is the only recorded male in his family tree. All women in his family were artificially inseminated. All the sperm donors were virgins.

      --
      There are more things in heaven and earth than are dreamt of in your philosophy.
    3. Re:Dear google.. by cntThnkofAname · · Score: 1

      that may be relevant if I was a male...

    4. Re:Dear google.. by Anonymous Coward · · Score: 0

      Then my guess: insufficient data. Must supplement with additional observation.

    5. Re:Dear google.. by uburoy · · Score: 1

      Mandatory xkcd reference : http://xkcd.com/605/

    6. Re:Dear google.. by 0100010001010011 · · Score: 1

      Odds don't look so good, there was this one woman a LONG time ago. Kid but no sex. Check ancestry.com, I think her name was Mary.

    7. Re:Dear google.. by cntThnkofAname · · Score: 1

      lol jp guys, I'm a dude, come on this is /. ffs :P

    8. Re:Dear google.. by Anonymous Coward · · Score: 1, Funny

      Past performance is not a guarantee of future results.

    9. Re:Dear google.. by Anonymous Coward · · Score: 0

      talk about a biased sample

  10. I can do that too by suso · · Score: 0, Offtopic

    Despite history (and having really good access to historical information), will people keep making stupid choices, voting for someone who screws them in the end and buying products that they think will make them happy but that end up at the next garage sale for 90% off?

    Yes

  11. Re:Three simple steps? How about four. by dingen · · Score: 3, Funny

    Holy shit, you... you... you figured out step 3!!

    --
    Pretty good is actually pretty bad.
  12. Gambling API by psbrogna · · Score: 2, Funny

    I can't wait to take my Droid to Vegas once this launches!

    1. Re:Gambling API by Anonymous Coward · · Score: 0

      With the money you make, you can hire a doctor to extract it from wherever the casino security guard leaves it.

  13. This will be a success! by Thanshin · · Score: 1

    Taking into account they released it, and they probably used it to predict its own success; this will either:

    - work, and be a success
    - not work, and fail.

    The future is here!

  14. Data mining by JayJayEm · · Score: 3, Informative

    When I used to work in the financial services industry we used to call this "data mining". The result is usually at best worthless and at worst dangerous as it is so often misused.

    It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

    1. Re:Data mining by LizardKing · · Score: 3, Interesting

      It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

      A friend of mine works as a quant at one of the big investment banks. He admitted that the models his team creates are useless at predicting the unexpected (as you'd probably expect). Adding in a degree of randomness rarely produces better models, as there are too many possible sources of such unpredictability and the reactions to them depend on many unquantifiable forces. This results in models that are OK at telling traders what they want to know - that they're doing the right thing by all doing the same thing. As soon as something undesirable or unexpected happens, then all hell breaks loose and the traders panic. Having mulled this over for a bit, I suggested his job was pointless, to which he agreed, but pointed out that the pay's great. So much wasted mathematical genius.

    2. Re:Data mining by E+IS+mC(Square) · · Score: 2, Interesting

      Often misused, definitely. But that does not invalidate the importance of any tool, including data mining.

      One good example is Netflix recommendation engine. I know it's far from perfect (as there is nothing perfect about prediction), but is it useful? Hell yeah. It's the best recommendation engine I have used and have benefited greatly from.

      The problem is when it's applied to areas where the stakes are higher - like risk analysis by the investment banks.

      And that brings me to mention an interesting (old) and related read - "Fooled by Randomness" by Nassim Taleb.

    3. Re:Data mining by Bakkster · · Score: 1

      It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

      I was just thinking that this automation will save unscrupulous scientists all the trouble of fudging the models to make the prediction fit their expected results.

      --
      Write your representatives! Repeal the 2nd Law of Thermodynamics!
    4. Re:Data mining by afeeney · · Score: 1

      I prefer the more direct: "Numbers are like people. Torture them enough and they'll say whatever you want to hear."

      More seriously, though, a solid predictive system usually needs both the qualitative and the quantitative analyses. These tools can inform decision-making, but can't make the decisions for anybody, unless the decisions are in the same discrete closed system. There aren't that many entirely closed systems in the world.

    5. Re:Data mining by laddiebuck · · Score: 1

      The way one of my co-workers puts it: If you torture the data long enough, it'll tell you anything you want to hear.

  15. Re:Three simple steps? How about four. by L4t3r4lu5 · · Score: 1

    What is listed as step 4. is actually step 5. There wasn't much of a wait involved at all, so we skipped it to keep things simple.

    Think of the step you're thinking of as being more of an extension of step 3... "3.b) ..." if you will.

    That's the power of Cloud Computing.

    --
    Finally had enough. Come see us over at https://soylentnews.org/
  16. Available to US developers only by inkhorn · · Score: 2, Informative

    Google require you to have a current Storage-For-Developers account, which is only available for US parties currently.

  17. ACNielsen is banging their collective heads... by Anonymous Coward · · Score: 0

    Could be bad news for the already ailing ACNielsen. Based on my experience there, I'd guess that many companies that use the services of ACNielsen would also be willing to plug their data into an API like this and not only compare the output, but compare it to the outcome. If the API does a satisfactory job, they'll drop Nielsen like a ton of hot bricks.

    Nielsen has some slick and useful software, but making their own API like this is the kind of ingenious thing that they could really use about now.

    1. Re:ACNielsen is banging their collective heads... by Anonymous Coward · · Score: 0

      Perhaps I'm mistaken, but I thought Nielsen's claim to fame was their ability to match up household attributes (ethnicity, income, age) to consumer behavior. Anyone with a suitable BI platform can do the rest, but it's not worth nearly as much without that household data, which is their bread and butter.

    2. Re:ACNielsen is banging their collective heads... by Anonymous Coward · · Score: 0

      My experience was that their "bread and butter" was that once they had *your* data, they were also allowed to use that data to get better results for any competitors that might be using Nielsen.

      The short version: The data I worked with *was* regional based, but not influenced by household income. However, it *was* influenced by competitor data for the same region. The exception to this was when sales data was side-by-side with regional income data and then the only influence was how the analysts read it.

      With the data that I worked with, it all boiled down to, "If I release a coupon in region A on date x/y/zzzz..." - or - "If I start an ad campaign in region A on date x/y/zzzz, how will it affect my sales?"

      All that being said, I only worked with a slice of their operation.

  18. Prediction Battle Prediction by Ukab+the+Great · · Score: 1

    I predict that within the next year someone's blog or the Wall Street Journal will feature a cage match between Google's Prediction API, a chimp with a dartboard, and a magic 8-ball.

  19. I Predict ... by L4t3r4lu5 · · Score: 0

    a riot!

    --
    Finally had enough. Come see us over at https://soylentnews.org/
  20. Eye trick by adeft · · Score: 1

    I know that word is "churn" but the first 3 times I read it as "chum". Anyway, is this similar logic to how google is able to advertise based on what is discussed in your email?

  21. Great, day trading here I come! by kalirion · · Score: 1

    What could possibly go wrong?

    1. Re:Great, day trading here I come! by dward90 · · Score: 1

      Given that actively managed funds often outperform the market at large, I would actually expect you to do decently with this strategy, even if Google only gives you random stocks.

      --
      My other sig is clever.
    2. Re:Great, day trading here I come! by cmiller173 · · Score: 1

      Well, you could lose everything and kill yourself ... worst case.

    3. Re:Great, day trading here I come! by dward90 · · Score: 1

      What an awful mis-type there. Managed funds UNDERperform the market.

      --
      My other sig is clever.
  22. Filler sentence by Anonymous Coward · · Score: 0

    The service helps users leverage historical data to make predictions that can guide real-time decisions.

    This sentence hurts my brain in how vague it is. You could say the same thing about Excel, Lotus 1-2-3, your kid's history homework, my filing cabinet, or the library. If it was removed from the summary, no meaning would be lost.

  23. Cue privacy outcry in 3... 2... 1... by MagikSlinger · · Score: 1

    Please post your privacy concerns in the form of an outraged screed. :-)

    --
    The bitter lessons of a veteran coder: http://bitterprogrammer.blogspot.com
  24. Data mining certainly not worthless by gnieboer · · Score: 2, Informative

    It's absolutely data mining, but it's far from worthless.

    Every time you go to Amazon and it recommends something to you, guess what, that's data mining using basically the same techniques that this service will use. And as you might expect, that equates to big $$$ for them (or else they wouldn't be bothering).

    Many many fields use the technology, particularly the medical fields for analyzing the relationships between a large number of input variables (which may or may not be correlated) and some desired output variable. Spam filters, Google Search itself... all data mining algorithms. Nah, no money to be made there...

    Now, the reality isn't normally as simple as 'upload the data, train the model, and generate predictions'. It takes time to figure out which factors to include, ETL the training data from the actual source(s), plug in algorithm parameters, and carefully validate your output model. Most models I've worked on have taken several iterations to get right, as you learn more about your input data relationships as you use the model.

    And your second sentence is sadly true, if management wants a certain output, then the endeavor is pointless. But when used appropriately (and it's on the experts to explain the limitations of the tech to the users), this stuff is really powerful.

    But will a lot of businesses be willing to send their 10 year history of accepted/declined credit card transactions with all the related demographic data to the cloud? Or their medical scenarios with the medical details of each patient? I think not. The type of data most mining projects use is critically sensitive. So I predict this will be limited to experimental users 'playing around', nothing more.

    1. Re:Data mining certainly not worthless by JayJayEm · · Score: 2, Insightful

      OK - I'll admit it - I did engage in a little bit of hyperbole.

      But you have to admit that "at best worthless" has a better ring to it than "at best, when combined with a qualitative analysis of the model itself, and some testing with out of sample data, can be a useful tool in decision making".

      You are right that no investment bank will go anywhere near this.

  25. This Has Already Been Done... by Aaron.SD · · Score: 1

    ... "Ask the Magic 8-Ball" http://av.vet.ksu.edu/flash/8ball/

  26. Here's my FREE data prediction API: by proc_tarry · · Score: 1

    import random

    def predict(data):
        del data  # throw the input away unread
        prediction = random.random()
        return prediction

    1. Re:Here's my FREE data prediction API: by mujadaddy · · Score: 1

      Quick!

      Patent it in Germany!

      Actually, change the last line to "return whatClientWantsToHear" and you've really got something!

      --
      Populus vult decipi, ergo decipiatur...
      "Force shits upon Reason's back." - Poor Richard's Almanac
  27. Hmm... I smell an internet-scale prank opportunity by pdxp · · Score: 2, Funny

    Google probably wants to use the data for their own analysis. So, I suggest all of Slashdot team together and forge a large volume of the most bullshit data that will convince Google that, without a doubt, they need to make every first search result named "Frosty P1ss!" linked to goatse in order to make their customers happy.

  28. Amazon has one too. by Animats · · Score: 1

    Now I see why the Amazon Cloud people have been so insistent that people in Hacker Dojo's machine learning class run problems on their "cloud".

    This stuff is actually fairly routine by now. It's much the same technology that's behind spam filters.

  29. Psychohistory? by ufbrett · · Score: 1

    The post made me think of Asimov's old Foundation series books.

  30. Acting on prediction by gmuslera · · Score: 1

    The API predicts that there will be an empty niche/opportunity in a day, then everyone that uses it jumps there, so the prediction fails because the niche becomes overcrowded. It is very easy to turn predictions for everyone into predictions for no one if all try to take advantage of that knowledge.

  31. What to Expect when the Price of Gas Falls below 0 by KPexEA · · Score: 1
  32. Re:Three simple steps? How about four. by Barryke · · Score: 1

    mod parent parent up.
    mod parent up.

    --
    Hivemind harvest in progress..
  33. More information is not necessarily better by Stupid+McStupidson · · Score: 1

    It's interesting to see this coming, as in google becoming a digital Hari Seldon. But while it's good to have plenty of info on which to base decisions, it's becoming what in the Army is referred to as "paralysis by analysis". At some point, you need to trust your instincts, and do it. Poring over the amount of data google can provide, filtering what is relevant (google isn't perfect), and then deciding what to do would likely take longer than going with your gut, or the smaller amount of available data, and then adjusting from there.

  34. "Shall I compare thee..." by ivoras · · Score: 1

    Well? Can it predict the rest? :)

    --
    -- Sig down
  35. Classification algorithms as web service by msbmsb · · Score: 1

    The use of the word "predict" is for ease of understanding for the business market and those not familiar with machine learning. Many of the comments here are getting lost in that word. The algorithms behind the API are most likely the same basic ones that have been around for a long time: naive Bayes, SVM, kNN, etc. The actual novelty of this service is that it puts these methods within easy reach of people who otherwise wouldn't know where to start looking, or wouldn't know how to use one of the many libraries already available, much less implement something themselves.

    See also: http://mlcomp.org/ for a service that allows you to try out different classification algorithms on your own data sets.
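For readers who have not met these classifiers, kNN is probably the easiest of the three to show in a few lines. The sketch below is a plain nearest-neighbour majority vote on made-up churn data; the feature values, the labels and the `knn_predict` helper are all hypothetical, and nothing here reflects what the Prediction API actually runs.

```python
from collections import Counter
from math import dist


def knn_predict(training_rows, query, k=3):
    """Return the majority label among the k training rows closest to query.

    training_rows is a list of (features, label) pairs, where features is a
    tuple of numbers with the same length as query.
    """
    nearest = sorted(training_rows, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up customers: (monthly support calls, months since last purchase).
training_rows = [
    ((1.0, 1.0), "stays"),
    ((1.2, 0.8), "stays"),
    ((0.9, 1.1), "stays"),
    ((5.0, 5.0), "churns"),
    ((5.2, 4.8), "churns"),
    ((4.9, 5.1), "churns"),
]

label = knn_predict(training_rows, (4.5, 5.0))
```

There is no training step at all here, which is why kNN is cheap to set up but slow to query on large data sets: every prediction scans the whole training set.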

  36. Wow, a RESTful API by mzechner · · Score: 1

    that's actually a pretty nice idea. The thing seems to have some caveats though: only categorical labels are allowed, training sets are limited to 100 MB and no sparse features can be used. There's also no info on whether things like cross-validation are done and what algorithm will be chosen. I also wonder about how fast the prediction phase will be. Still pretty neat.
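Since the service does not say whether it cross-validates, one option is to do it on your own side before uploading anything. The following is a rough sketch of k-fold cross-validation with categorical labels; the data set is synthetic, and the 1-nearest-neighbour classifier is an arbitrary stand-in chosen only to keep the example self-contained.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, test
        start += size


def nearest_neighbour_label(train_rows, x):
    """1-nearest-neighbour on a single numeric feature."""
    return min(train_rows, key=lambda row: abs(row[0] - x))[1]


def cross_validated_accuracy(rows, k=5):
    """Fraction of rows labelled correctly when each fold is held out once."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(rows), k):
        train_rows = [rows[i] for i in train_idx]
        correct += sum(
            nearest_neighbour_label(train_rows, rows[i][0]) == rows[i][1]
            for i in test_idx
        )
    return correct / len(rows)


# Synthetic rows: one numeric feature and a categorical label.
rows = [(x, "low" if x < 10 else "high") for x in range(20)]
accuracy = cross_validated_accuracy(rows, k=5)
```

Every row is used for testing exactly once and never appears in its own training fold, so the score is less flattering than accuracy measured on the training data itself.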

  37. Completely useless by Bitmanhome · · Score: 1

    I asked the Google Prediction API what the next Google API would be, and it said "Google Prediction API".

    --
    Not that this wasn't entirely predictable.
  38. And by mahadiga · · Score: 1

    Past performance does not guarantee future results.

    --
    I'd like to buy homeland for our 10 million people. http://twitter.com/mahadiga