Slashdot Mirror


Google Launches a Data Prediction API

databuff writes "Google has released a data prediction API. The service helps users leverage historical data to make predictions that can guide real-time decisions. According to Google, the API can be used for prediction tasks ranging from product recommendations to churn analysis (predicting which customers are likely to switch to another provider). The API involves three simple steps: upload the data, train the model, then generate predictions. The API is currently available on an invitation-only basis." Google also recently announced several other API additions, including Buzz, Fonts, and Storage.
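The three steps in the summary can be sketched locally. This is not Google's API, whose calls are not described here; `ToyPredictor`, its method names, and the churn rows are hypothetical stand-ins, and the model is a plain nearest-neighbour vote chosen only for illustration.

```python
# Hypothetical local stand-in for the "upload, train, predict" flow.
# Nothing here is Google's actual API; all names and data are made up.
from collections import Counter
import math


class ToyPredictor:
    def __init__(self, k=3):
        self.k = k
        self.rows = []        # "uploaded" (features, label) pairs
        self.scale = []       # per-feature spread, filled in by train()
        self.trained = False

    def upload(self, rows):
        """Step 1: store labelled historical rows."""
        self.rows = [(tuple(features), label) for features, label in rows]
        self.trained = False

    def train(self):
        """Step 2: compute per-feature spread so no feature dominates distance."""
        n_features = len(self.rows[0][0])
        self.scale = []
        for i in range(n_features):
            values = [features[i] for features, _ in self.rows]
            spread = max(values) - min(values)
            self.scale.append(spread if spread else 1.0)
        self.trained = True

    def predict(self, features):
        """Step 3: majority vote among the k nearest historical rows."""
        if not self.trained:
            raise RuntimeError("call train() before predict()")

        def dist(other):
            return math.sqrt(sum(((x - y) / s) ** 2
                                 for x, y, s in zip(other, features, self.scale)))

        nearest = sorted(self.rows, key=lambda row: dist(row[0]))[:self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Made-up churn history: (months as customer, support calls last quarter)
history = [
    ((2, 5), "churn"), ((3, 4), "churn"), ((1, 6), "churn"),
    ((24, 0), "stay"), ((36, 1), "stay"), ((18, 0), "stay"),
]

model = ToyPredictor(k=3)
model.upload(history)                          # step 1
model.train()                                  # step 2
new_customer_guess = model.predict((2, 4))     # step 3
loyal_customer_guess = model.predict((30, 0))
```

A real service would hide step 2 behind a remote call; the sketch only shows how the three stages depend on each other.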

18 of 70 comments

  1. I Predict ... by WrongSizeGlass · · Score: 2, Insightful

    ... that Google will do their own analysis on your data. They're nothing if not thorough.

  2. Three simple steps? How about four. by olsmeister · · Score: 2, Funny

    1. Upload the data.
    2. Train the model.
    3. Generate predictions.
    4. PROFIT!!!!

  3. Psychohistory ? by Vapula · · Score: 3, Interesting

    What about feeding it historical events, training it on the outcomes of those events, and trying to get a glimpse of which way the future will evolve?

    1. Re:Psychohistory ? by 0100010001010011 · · Score: 3, Interesting

      Or hold out the last part of your data set as blind data. Train the model on 1900-1990 and see if it can predict 1990-2000.

      How far can you predict? 1%, 10%, 50%?

      If you really want to see how good it is, feed it stock market data and see how well it predicts that.
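The blind-data idea can be sketched as a time-ordered holdout: fit on the early years only, then score the predictions on later years the model never saw. The yearly series below is synthetic, and the straight-line fit is an assumed stand-in model; neither comes from the Google service.

```python
# Time-ordered holdout: train on 1900-1989, test blind on 1990-2000.
# The series is synthetic and the linear fit is an assumed stand-in model.
import random

random.seed(0)
years = list(range(1900, 2001))
# Made-up data: a steady upward trend plus noise.
values = [0.5 * (y - 1900) + random.gauss(0, 2) for y in years]

train = [(y, v) for y, v in zip(years, values) if y < 1990]
test = [(y, v) for y, v in zip(years, values) if y >= 1990]


def fit_line(points):
    """Ordinary least squares for v = slope * y + intercept."""
    n = len(points)
    mean_y = sum(y for y, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    cov = sum((y - mean_y) * (v - mean_v) for y, v in points)
    var = sum((y - mean_y) ** 2 for y, _ in points)
    slope = cov / var
    return slope, mean_v - slope * mean_y


def mean_abs_error(points, slope, intercept):
    """Average absolute gap between observed and predicted values."""
    return sum(abs(v - (slope * y + intercept)) for y, v in points) / len(points)


slope, intercept = fit_line(train)   # the model never sees the test years
train_error = mean_abs_error(train, slope, intercept)
test_error = mean_abs_error(test, slope, intercept)
print(f"slope={slope:.3f} train MAE={train_error:.2f} blind MAE={test_error:.2f}")
```

The blind error is the number that matters. This toy series keeps its trend, so the extrapolation holds up; real stock data gives no such guarantee.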

    2. Re:Psychohistory ? by Kilrah_il · · Score: 3, Interesting

      The nice thing about the stock market is that when everything is fine, the analysts say that their models are great, but when something unexpected happens, they go all "but we couldn't have foreseen that. Except for this unexpected incident, our models are great!". The problem is that these "unforeseen incidents" are what drive most of the extreme changes in the stock market and, more generally, in our entire society.
      Just look at 9/11 (to use your example): It not only affected the economy, it affected (and still affects) our entire lives, from airport searches to the USA PATRIOT Act to wars in Iraq and Afghanistan.
      These extreme events are called Black Swans ( http://en.wikipedia.org/wiki/Black_swan_theory ) and I do recommend the book by the same name by Nassim Nicholas Taleb. Fascinating reading (if a bit repetitive sometimes :) ).
      The bottom line: Trying to predict the future from past events is fine until it breaks down, and it does so more often than we care to imagine.

      --
      Whenever in an argument, remember this.
    3. Re:Psychohistory ? by fusiongyro · · Score: 2, Interesting

      +1! "Past performance is not a predictor of future success." Taleb is my hero. Everyone should read Fooled by Randomness, which I didn't find repetitive at all.

    4. Re:Psychohistory ? by S-100 · · Score: 2, Insightful

      Like I heard a seasoned stock trader once say: "Technical analysis works great, until it doesn't".

  4. Dear google.. by cntThnkofAname · · Score: 3, Funny

    Given my family history... is there a girl for me?

    1. Re:Dear google.. by 0100010001010011 · · Score: 3, Funny

      Every male in your family tree has had sex at least once, so the odds look good for you.

  5. Re:Three simple steps? How about four. by dingen · · Score: 3, Funny

    Holy shit, you... you... you figured out step 3!!

    --
    Pretty good is actually pretty bad.
  6. Gambling API by psbrogna · · Score: 2, Funny

    I can't wait to take my Droid to Vegas once this launches!

  7. Data mining by JayJayEm · · Score: 3, Informative

    When I worked in the financial services industry, we called this "data mining". The result is usually at best worthless and at worst dangerous, as it is so often misused.

    It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".
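The saying can be demonstrated with pure noise: search enough random "indicators" and one of them will usually appear to track whatever target you pick. Everything below is synthetic, and the sizes (20 observations, 1,000 candidate series) are arbitrary choices for illustration.

```python
# Data dredging on pure noise: search many random series for the one
# that best "explains" a random target. All data here is synthetic.
import random

random.seed(42)
N_OBS = 20            # observations per series (arbitrary)
N_CANDIDATES = 1000   # random "indicators" to search through (arbitrary)


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


target = [random.gauss(0, 1) for _ in range(N_OBS)]
candidates = [[random.gauss(0, 1) for _ in range(N_OBS)]
              for _ in range(N_CANDIDATES)]

# In sample: keep whichever candidate correlates most strongly with the target.
correlations = [pearson(c, target) for c in candidates]
best = max(correlations, key=abs)

# Out of sample: new observations of both series are fresh, independent noise,
# so the "relationship" found above has no reason to persist.
new_target = [random.gauss(0, 1) for _ in range(N_OBS)]
new_indicator = [random.gauss(0, 1) for _ in range(N_OBS)]
out_of_sample = pearson(new_indicator, new_target)

print(f"best in-sample |r| = {abs(best):.2f}, out-of-sample r = {out_of_sample:.2f}")
```

The winning correlation comes from the search itself, not from the data. That is why a result that was only ever checked in sample deserves suspicion.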

    1. Re:Data mining by LizardKing · · Score: 3, Interesting

      It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

      A friend of mine works as a quant at one of the big investment banks. He admitted that the models his team creates are useless at predicting the unexpected (as you'd probably expect). Adding in a degree of randomness rarely produces better models, as there are too many possible sources of such unpredictability and the reactions to them depend on many unquantifiable forces. This results in models that are OK at telling traders what they want to know - that they're doing the right thing by all doing the same thing. As soon as something undesirable or unexpected happens, then all hell breaks loose and the traders panic. Having mulled this over for a bit, I suggested his job was pointless, to which he agreed, but pointed out that the pay's great. So much wasted mathematical genius.

    2. Re:Data mining by E+IS+mC(Square) · · Score: 2, Interesting

      Often misused, definitely. But that does not invalidate the importance of any tool, including data mining.

      One good example is the Netflix recommendation engine. I know it's far from perfect (as there is nothing perfect about prediction), but is it useful? Hell yeah. It's the best recommendation engine I have used, and I have benefited greatly from it.

      The problem comes when it's applied to areas where the stakes are higher, like risk analysis by the investment banks.

      And that brings me to an interesting (old) and related read: "Fooled by Randomness" by Nassim Taleb.

  8. Available to US developers only by inkhorn · · Score: 2, Informative

    Google requires you to have a current Storage-For-Developers account, which is currently available only to US parties.

  9. Data mining certainly not worthless by gnieboer · · Score: 2, Informative

    It's absolutely data mining, but it's far from worthless.

    Every time you go to Amazon and it recommends something to you, guess what, that's data mining using basically the same techniques that this service will use. And as you might expect, that equates to big $$$ for them (or else they wouldn't be bothering).

    Many, many fields use the technology, particularly the medical fields, for analyzing the relationships between a large number of input variables (which may or may not be correlated) and some desired output variable. Spam filters, Google Search itself... all data mining algorithms. Nah, no money to be made there...

    Now, the reality normally isn't as simple as 'upload the data, train the model, and generate predictions'. It takes time to figure out what factors to include, ETL the training data from the actual source(s), plug in algorithm parameters, and carefully validate your output model. Most models I've worked on have taken several iterations to get right, as you learn more about your input data relationships while you use the model.

    And your second sentence is sadly true: if management wants a certain output, then the endeavor is pointless. But when used appropriately (and it's on the experts to explain the limitations of the tech to the users), this stuff is really powerful.

    But will a lot of businesses be willing to send their 10-year history of accepted/declined credit card transactions, with all the related demographic data, to the cloud? Or their medical scenarios with the medical details of each patient? I think not. The type of data most mining projects use is critically sensitive. So I predict this will be limited to experimental users 'playing around', nothing more.

    1. Re:Data mining certainly not worthless by JayJayEm · · Score: 2, Insightful

      OK - I'll admit it - I did engage in a little bit of hyperbole.

      But you have to admit that "at best worthless" has a better ring to it than "at best, when combined with a qualitative analysis of the model itself, and some testing with out-of-sample data, can be a useful tool in decision making".

      You are right that no investment bank will go anywhere near this.

  10. Hmm... I smell an internet-scale prank opportunity by pdxp · · Score: 2, Funny

    Google probably wants to use the data for their own analysis. So, I suggest all of Slashdot team together and forge a large volume of the most bullshit data that will convince Google that, without a doubt, they need to make every first search result named "Frosty P1ss!" linked to goatse in order to make their customers happy.