Slashdot Mirror


Google To Offer Free Database Storage for Scientists

An anonymous reader writes "Google has revealed a new project aimed at the scientific community. Called Palimpsest, the site research.google.com will play host to 'terabytes of open-source scientific datasets'. It was originally previewed for scientists last August. 'Building on the company's acquisition of the data visualization technology, Trendalyzer, from the oft-lauded, TED presenting Gapminder team, Google will also be offering algorithms for the examination and probing of the information. The new site will have YouTube-style annotating and commenting features.'"

8 of 107 comments

  1. mining for ads by spud603 · · Score: 5, Funny

    So will they be mining the data for contextual ads?
    I'd be curious what their algorithms think my data says I want to buy...

  2. OMG WTF THIS SUX by User+956 · · Score: 5, Funny

    The new site will have YouTube-style annotating and commenting features.

    And hopefully the commentary will be just as insightful and poignant!

    --
    The theory of relativity doesn't work right in Arkansas.
  3. Are they insane? by Hognoxious · · Score: 5, Funny

    Why would you want to store a scientist in a database?

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    1. Re:Are they insane? by jma05 · · Score: 5, Funny

      > Why would you want to store a scientist in a database?

      So that these geeks can have normal relationships.

  4. Fantastic for Students and New Researchers by cheesethegreat · · Score: 5, Informative

    If this actually happens, and researchers are willing to make their data-sets open source, it would be a huge boon for budding researchers. It would allow students to do more than just work with a sample dataset out of a textbook. Graduate students learning how to do advanced modeling would be able to work with real datasets, vastly improving their skillset and employability. Just consider these two lines on a CV, and ask yourself which one jumps out at you.

    "Designed a model for the dataset on the CD-ROM included with the Modeling Organic Systems textbook"

    "Designed a model for the WISK-III heart output dataset published in 2006."

    New entrants to a field would have instant access to enormous amounts of data very quickly and easily. Although the big kudos comes when you can do totally original work (new data, new analysis), a researcher who could come up with a new critique of older papers and studies would definitely get themselves noticed.

    Overall, this is a really positive step for everyone on the lower rungs of the scientific ladder, and especially positive for those with limited resources.

    1. Re:Fantastic for Students and New Researchers by cortex · · Score: 5, Insightful

      As a neural engineering researcher who routinely generates terabyte-size datasets, I have to say that I both like this idea and think it is unlikely to succeed. I would love to have a place to store large datasets and access them from wherever I am. However, since these datasets will be open sourced, I will be extremely unlikely to put any dataset on Google until I am certain I have extracted all of the publishable findings from it. I think that most researchers, after putting years of effort and a lot of money into acquiring a dataset, will also think twice about open sourcing their data. If the TOS were to include some means for controlling publications which resulted from analysis of the data, then it might be more likely to succeed.

    2. Re:Fantastic for Students and New Researchers by Gromius · · Score: 5, Insightful

      As a researcher myself (particle physics), I echo others' comments in this thread that a) it's a nice idea but b) it isn't going to happen. There are three main problems: the first two are solvable, the third isn't.

      1) trivially, 3TB is nowhere near enough to store my data

      Bit of a non-issue for the overall concept, but if Google wants my data, they really are going to have to up the storage by a few orders of magnitude.

      2) as others stated, we work really really hard to acquire our data; research is about 10% inspiration, 90% perspiration. We are not giving up our data till we have milked it for all it's worth.

      This again is solvable: we release our data after we have all the publishable results we can think of and then let others have a crack. Somebody might find something useful, and if not, well, it's great for younger scientists as you say. At the very least, people can reconfirm results at a later date more easily. Main reason I like it.

      3) The deal killer, for my field and I suspect others: it is really really difficult to understand our data and it's really easy to misinterpret it.

      New particles have been "discovered" so many times by grad students (and some professors who should know better) in particle physics data that I'm terrified of what somebody with no training outside the system might conclude from the data. At CDF (a Fermilab expt) it took us (800 physicists) about 2-3 years to understand the data from the experiment enough to get proper physics results out of it. Even now, it takes a newcomer about a year to get up to speed, and that's with help from all the experts. But it's very easy to think you understand things after a few weeks when in fact you're missing some incredibly subtle point, and so I'm sure we would be flooded by bogus results due to misinterpretations of the data if we released it.

      Anyway, this all comes from a particle physics viewpoint, but I suspect quite a few other fields will be similar.

  5. Re:Horrible Idea - What are the TOS? by hostguy2004 · · Score: 5, Informative

    Google are offering this service to store PUBLIC DOMAIN data. If people don't want to release the data as public domain, then this ain't the service for them. See http://en.wikipedia.org/wiki/Public_Domain

    --
    In Soviet Russia ^H^H^H America, The bank finances YOU!