Slashdot Mirror


Boiling Down Books, Algorithmically

destinyland writes "A year ago, Aaron Stanton harangued Google over his new project, a web site analyzing patterns in books to generate infallible recommendations. In March he finally finished a prototype which he showed to Google, Yahoo, and Amazon, and he's just announced that he's finally received a big contract which 'gives us a great deal of potential data to work with.' The 25-year-old's original prototype examined over 200 books, plotting 729,000 data points across 30,293 scenes — but its universe of analyzed novels is about to become much, much bigger."

9 of 177 comments

  1. Just one more erosion.... by zappepcs · · Score: 5, Insightful

    The difference between now and 100 years ago becomes more apparent each day. Then, owning books was a sign of affluence, of intelligence. Now? Everything is open to question, and should be. Analyzing books and other public material is just another step in putting intelligence out there for everyone, not just those who can afford it. I applaud it, and all the dangers it brings. Such hurdles are necessary, but we must assault them to overcome barriers that should no longer exist.

    1. Re:Just one more erosion.... by Anonymous Coward · · Score: 5, Insightful

      Knowledge, not intelligence.

    2. Re:Just one more erosion.... by blahplusplus · · Score: 5, Insightful

      What really hits a nerve with me is why the scientific community hasn't opened up all their journals for others to read. I imagine many retired and amateur scientists, engineers, hobbyists, etc., would have a lot of insight into many engineering and scientific problems, and would make many discoveries as well. Intelligence is not limited to the credentialed, those of high status, or the currently employed. Many discoveries happen simply through exposure to as many minds as possible, and through finding connections and errors in others' work.

    3. Re:Just one more erosion.... by Sir Holo · · Score: 5, Informative

      blahplusplus: What really hits a nerve with me is why the scientific community hasn't opened up all their journals for others to read.

      We scientists would absolutely love to have all of the journals opened up for free access to everyone. But, you see, the publishers own the copyright to our articles. The system requires us to give them the copyright in order to get our stuff published. Then you, me, and everybody else have to pay to read recent research.

      Thankfully, some established journals are going open-access.

      That's very promising. But the fact remains that publishers such as Elsevier own the copyright to many decades' worth of scientific literature. And they're not about to give any of it away.

    4. Re:Just one more erosion.... by smittyoneeach · · Score: 5, Insightful

      If you wish to spend your nights reading information from 2+ years ago, that is your problem. The rest of us want today's information, and now. Good luck with the personal library.

      It's getting to the point that you need a 2+ year filter just to dampen the noise in the signal.
      And let's give a shout out to all of the library homiez. While I'm affluent enough to afford the occasional impulse book at the store with the built-in coffee shop, I do recall many an hour of random wandering in the public library in my youth.

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    5. Re:Just one more erosion.... by Z34107 · · Score: 5, Funny

      I dunno, man. Pretty much every point you covered is Wiki-able. [Citation needed]

      --
      DATABASE WOW WOW
    6. Re:Just one more erosion.... by Virtual_Raider · · Score: 5, Interesting

      the idea of finding books you should read but don't know about seems a problem particularly poorly suited to an automated solution.

      Er... -1,Wrong* : You don't seem to be considering the impact of statistical analysis and Very Large Sets of Data (C)(TM). It's becoming increasingly possible not only to know that 125K other people all over the world bought books B, C and D along with book A that you purchased, but also to index and analyse their content, so the recommendations will be even easier to fine-tune.

      Imagine this: On the first iteration (first purchase) it can only recommend, out of the blue, those books most consistently purchased along with the one you chose. But on subsequent transactions it can remember what you bought and compare the contents of the books. Now if you bought The Silmarillion, Kontakto and The Unfolding of Language over time, it would be possible to suggest that you read Shakespeare's works in their original Klingon once it realizes that you are as interested in languages as in fictional civilizations.
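      The two-step idea above (co-purchase counts first, then a content comparison against what you already bought) can be roughed out as below. This is only a sketch under made-up assumptions: the function names, the "feature tag" sets standing in for analysed book content, and the way the two scores are multiplied together are all hypothetical, not anything the site in the story is known to do.

```python
from collections import Counter
from itertools import permutations


def co_purchase_counts(baskets):
    """Count how often each ordered pair of books shares a basket."""
    counts = {}
    for basket in baskets:
        for a, b in permutations(set(basket), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def jaccard(x, y):
    """Overlap of two tag sets: 0.0 (disjoint) to 1.0 (identical)."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def recommend(history, baskets, features, top_n=3):
    """Rank unowned books by co-purchase count, boosted by content overlap.

    history  -- books the reader already bought
    baskets  -- past purchases by other readers (lists of titles)
    features -- hypothetical mapping of title -> set of content tags
    """
    counts = co_purchase_counts(baskets)
    owned = set(history)

    # Step 1: candidates are books bought together with anything owned.
    score = Counter()
    for book in owned:
        for other, n in counts.get(book, {}).items():
            if other not in owned:
                score[other] += n

    # Step 2: boost each candidate by its similarity to the combined
    # content profile of the reader's history (assumed weighting).
    profile = set().union(*(features.get(b, set()) for b in owned))

    def final(book):
        return score[book] * (1 + jaccard(profile, features.get(book, ())))

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(score, key=lambda b: (-final(b), b))[:top_n]
```

      With only co-purchase counts, two books bought equally often alongside yours would tie; the content step is what lets the one that resembles your earlier purchases come out ahead.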

      I agree with you that the day an algorithm can make value judgements on the artistic merits of any work is still far ahead, but there was just recently a story about this Firefox plug-in that summarizes user reviews. Combine the two and...

      * Didn't we have this conversation before, or is it just a popular .sig? If there were a "-1,Wrong" moderation, you would be told that the info is wrong, but you would lose any insight provided by a direct reply from somebody who bothers to correct you AND post the right facts. With Slashdot being a discussion forum, it's in its best interest to actually promote discussion, so you most likely will never see that mod option implemented.

      --
      +Raider of the lost BBS
  2. If you already read, you don't need this... by thereofone · · Score: 5, Insightful

    ...and if you do not read, you won't want this.

  3. Re:Newspeak by log1385 · · Score: 5, Informative

    From the FAQ:
    "Does 1984 really match the U.S. Patriot Act?
    No, that is an easter-egg. A bit of a joke on our part."

    --
    Seek and ye shall find.