The Astronomical Event Search Engine
eldavojohn writes "Google has signed on with the Large Synoptic Survey Telescope project that will construct a powerful telescope in Chile by 2013. Google's part will be to 'develop a search engine that can process, organize, and analyze the voluminous amounts of data coming from the instrument's data streams in real time. The engine will create "movie-like windows" for scientists to view significant space events.' Google's been successful at turning its search technology to several different media and realms. Will they be successful with helping scientists tag and catalog events in our universe?" The telescope will generate 30 TB of data a night, for 10 years, from a 3-gigapixel CCD array.
Will they be successful with helping scientists tag and catalog events in our universe? Will they defeat the monster and get the girl? And will they be home in time for tea? Find out next on GoogleTrek.
Seriously though, processing the equivalent of three-quarters of the LoC every night is nothing to be sneezed at. Over the course of those 10 years that's about 110 petabytes (30 TB * 365.25 * 10) of unprocessed data.
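A quick check of that total, as a minimal sketch. It assumes the 30 TB/night figure from the summary, an observing night every single calendar day (no weather or maintenance downtime), and decimal units (1 PB = 1000 TB):

```python
# Rough survey volume, assuming 30 TB every night with no downtime.
TB_PER_NIGHT = 30          # from the story summary
NIGHTS_PER_YEAR = 365.25   # assumption: every night is an observing night
YEARS = 10

total_tb = TB_PER_NIGHT * NIGHTS_PER_YEAR * YEARS
total_pb = total_tb / 1000  # decimal petabytes

print(f"{total_tb:,.0f} TB = about {total_pb:.0f} PB")
```

Any real downtime would pull the number below 110 PB, so treat it as an upper bound.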
Indeed!
(read in the voice of Pete Puma)
Gonna need a lotta DLT's. A whole lotta DLT's
You can have my SIG when you pry it from my cold, dead hands.
Now Google will be serving up advertisements on Uranus.
Just wondering if Google can provide the right tool. Yea, they can design a front end. Yep, they can give content. But can they really deliver the information you need w/o a whole pile of ebay ads?
I had that idea first!
At the rate at which our storage capabilities are growing, 30 TB will be considered nothing. We're approaching consumer-grade hard drives with a capacity of 1 TB. We'll likely see 2 TB consumer-grade hard drives by the end of 2008. Remember, that's consumer-grade. Google will no doubt be able to afford far higher quality drives with larger storage capacities. And by 2017, 30 TB drives will be considered minuscule.
In 1997, 1 GB hard drives were the largest available for the average consumer. Now it's 2007, and we have 850 GB hard drives available in most tech retail stores. That's an 850x increase over a decade. It's likely we will see that trend continue over the next decade. So it's more than likely by the time this project is nearing its end, we'll be dealing with 700 TB hard drives, and that's at the low end of the market.
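The extrapolation above can be sketched in a few lines. This is only the naive version of the argument: it assumes the 1997 to 2007 growth factor simply repeats for another decade, which nothing guarantees, and it uses decimal units:

```python
# Naive extrapolation: assume the 1997-2007 growth factor repeats unchanged.
size_1997_gb = 1
size_2007_gb = 850
growth_per_decade = size_2007_gb / size_1997_gb   # 850x

size_2017_gb = size_2007_gb * growth_per_decade
size_2017_tb = size_2017_gb / 1000                # decimal terabytes

# Equivalent yearly growth factor implied by 850x over ten years.
annual_growth = growth_per_decade ** (1 / 10)

print(f"2017 estimate: {size_2017_tb:.1f} TB")
print(f"implied annual growth: {annual_growth:.2f}x")
```

The estimate comes out a little over 700 TB, and the implied yearly factor is close to a doubling every year.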
Will they be successful with helping scientists tag and catalog events in our universe?
That depends. Can you sell advertising doing that?
Push Button, Receive Bacon
I saw a documentary not long ago about doing just this: photographing the same piece of sky repeatedly, only with longer intervals than 30 seconds. Anything moving would automagically be flagged by the software and its vector computed. Correct me if I'm wrong, but from what I can tell of this project, it's going to do exactly that (and more), but on a larger scale, and with better accuracy?
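For anyone wondering what "flag anything that moved and compute its vector" looks like, here is a toy sketch of frame differencing. The function names and the tiny 4x4 frames are made up for illustration; this is not the LSST pipeline, and real software also has to align the images and reject noise and cosmic rays:

```python
# Toy frame differencing: flag pixels that changed between two exposures
# of the same field, then estimate a motion vector for the moving source.
# Hypothetical helper names; real pipelines do far more than this.

def changed_pixels(before, after, threshold):
    """Return (row, col) positions where |after - before| exceeds threshold."""
    return [
        (r, c)
        for r, row in enumerate(before)
        for c, value in enumerate(row)
        if abs(after[r][c] - value) > threshold
    ]

def brightest(frame, positions):
    """Pick the position with the highest pixel value in the given frame."""
    return max(positions, key=lambda rc: frame[rc[0]][rc[1]])

# Two exposures: a static star at (0, 0) and one source that moves
# from (1, 1) to (2, 3) between them.
exposure_1 = [
    [9, 0, 0, 0],
    [0, 7, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
exposure_2 = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 7],
    [0, 0, 0, 0],
]

flagged = changed_pixels(exposure_1, exposure_2, threshold=3)
start = brightest(exposure_1, flagged)            # where the source was
end = brightest(exposure_2, flagged)              # where it is now
vector = (end[0] - start[0], end[1] - start[1])   # (rows, cols) per interval

print(flagged, vector)
```

The static star never shows up in the flagged list because it subtracts out, which is the whole point of the technique.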
Is arranging adwords to not get in the way of viewing planetary nebula.
Take the cheese to sickbay, the doctor should see it as soon as possible - B'Elanna Torres, "Learning Curve"
It's not all about storage, though. This also requires a colossal amount of CPU time, which makes Google a good fit: they know how to store and access massive amounts of data, and they know how to parallelise.
A short time ago, I was pointing out to people that Hubble was only a billion dollars, so Google could buy a hundred of them, and cripes, lots of big, dopey, slothful corporations could buy even thousands of them. Funny, though, that Google will now be at least part of one.
I hope that during the day astronomers will point the telescope at a women's changing room.
Hopefully though by 2013 this will be a lot easier.
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
I wonder what the digital zoom is like on that camera.
"If any question why we died, Tell them because our fathers lied."
It occurs to me it may be possible to speed the actual processing part up by splitting the Gigapixel images up into ever smaller quadrants, treating them as textures, and using shaders to do the actual heavy lifting.
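The splitting half of that idea is easy to sketch. The code below only does the recursive quadrant subdivision; handing each tile to a shader is left out, and `max_side` is a hypothetical texture-size limit rather than anything from the project:

```python
# Recursively split an image rectangle into quadrants until each tile is
# small enough for one worker (GPU texture, CPU core, cluster node).
# max_side is a hypothetical limit, not a figure from the LSST project.

def split_into_tiles(x, y, width, height, max_side):
    """Return a list of (x, y, width, height) tiles covering the rectangle."""
    if width <= max_side and height <= max_side:
        return [(x, y, width, height)]
    # Only halve a dimension that is still too large.
    half_w = (width + 1) // 2 if width > max_side else width
    half_h = (height + 1) // 2 if height > max_side else height
    tiles = []
    for dy, h in ((0, half_h), (half_h, height - half_h)):
        for dx, w in ((0, half_w), (half_w, width - half_w)):
            if w > 0 and h > 0:
                tiles.extend(split_into_tiles(x + dx, y + dy, w, h, max_side))
    return tiles

tiles = split_into_tiles(0, 0, 8192, 8192, max_side=2048)
covered = sum(w * h for _, _, w, h in tiles)
print(len(tiles), covered == 8192 * 8192)
```

Any filter that looks at neighbouring pixels would need the tiles to overlap slightly at the edges, which this sketch ignores.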
In another thread about the collapsed Pillars of Creation I questioned the value of that type of research... who cares if they collapsed or not. I asked... where is the value in that particular research.
This, however, provides real, obvious value. I couldn't care less about the astronomical events... I guess there is some physics and maths and stuff that can be done... but the database and algorithms and computing systems needed to process all of this data will drive innovation, particularly since it's being done by a company like Google that gets innovation. This is good science... pure science that requires real innovation to work and will return real benefits.
That's a lot of data, but it's less than 1/10 as much data as the Large Hadron Collider will put out, and the LHC is supposed to be coming online within a year, not in six years. By the time the Large Synoptic Survey Telescope comes online, the LHC may have produced more data than the Large Synoptic Survey Telescope will over the life of the project.
I'd be interested to know more about the data handling methods they have in place for the LHC. I don't think they'll be using Excel.
*Note the correct, non-Freudian-slip spelling of "hadron"
Can anyone tell me how to set my sig on Slashdot?
30 TB per night: say a night is 12 hours, so it's going to take around 694 megabytes per second to write that data
(30,000 GB / 43,200 s = 0.694 GB/s). I wonder what kind of hard drives they have.
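The same arithmetic as a minimal sketch, assuming the data arrives evenly over a 12-hour night and decimal units (1 TB = 1,000,000 MB):

```python
# Sustained write rate, assuming 30 TB spread evenly over a 12-hour night.
TB_PER_NIGHT = 30
NIGHT_HOURS = 12            # assumption: length of an observing night

seconds = NIGHT_HOURS * 3600
mb_per_second = TB_PER_NIGHT * 1_000_000 / seconds

print(f"{seconds} s, {mb_per_second:.0f} MB/s")
```

A shorter night pushes the required rate up in proportion, so 694 MB/s is on the gentle side.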
-aespe-
But can it work for pr0n? To my understanding some users can generate nearly that much raw pr0n data every frustrated night, it'd be great if Google could release this groundbreaking (earth moving?) software for those poor users.
"Just a fox, a whisper."
But will it run Linux?
The real purpose of Google's involvement is to scan the skies for evidence of other Google-like entities, so they can gang up on us carbon-based lifeforms and take over the galaxy.
Don't think you can seduce us with your efficient search engine and high stock value. We're onto you!!
Soylent Green is peoplicious!
Hm, Google searching space... I'm waiting for the time Google will search in people's bodies and catalog their illnesses.
The shop I'm at has been working the image processing and data storage problem for PanSTARRS, another sky survey project that is a bit further along (they have a test scope up and running on Maui). It's interesting to me that both projects are at once using conventional solutions and thinking outside of the box.
Conventional: LSST will use a single large telescope and detector; PanSTARRS (as it stands) intends to use a dedicated compute cluster for data reduction.
Novel: LSST is leaning towards distributing its data reduction task over Google's huge server farm; PanSTARRS will use four off-the-shelf 1.8m telescopes, each with a 1.4GP detector, mounted together to image the same piece of sky, and merging the overlapping images in post processing.
When I was working on the project, one of PanSTARRS's requirements was to finish analyzing one night's viewing before the following sunset. Early on, the principal investigators decided to solve the image storage issue by not storing the images permanently. Instead, once the science for a night's imaging had been extracted (astrometry, LEO or supernova detection, etc.), the original images would hit the bit bucket. Whether they've stuck with that I don't know.
Luke, help me take this mask off
Impressive...
Ruby Neural Evolution of Augmenting Topologies
factor 966971: 966971
The telescope will generate 30 TB of data a night, for 10 years, from a 3-gigapixel CCD array.
I bet it makes dull viewing. Sort of like the recent Ashes Tests in Australia. If you're English.
That's a lot of info.
No, that's a lot of data. Info is the result of analysing the data.
I reckon they should use holographic storage so that they could catalog more data than would otherwise be practicable (to look for long-term trends).
In fact, it makes watching a cricket match absolutely riveting in comparison. And with cricket there's always the possibility of a spaceship landing to unload a bunch of robots in search of the trophy... which is probably the reason why people watch it, just in case it happens when they do.
I just worry that with Google "helping" the imagery could be locked up so not everyone has free and equal access to the data.
It's routine in physics collider experiments and seismic exploration to collect several terabytes a day. The limiting factor seems to be data management.
Urasshole
I am working on Solar Dynamics Observatory which is scheduled to fly in 2008. That is next year...
High cadence @ 16 megapixels.
It will produce 2TB/day, every day that the Sun shines in space.
The 5-year mission will produce 3 petabytes of data, but satellites/missions often run much longer than their designed life.
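As a rough cross-check of the mission total, here is a sketch that assumes a flat 2 TB on every day of the five years, with no gaps, and decimal units. Under those assumptions the round 3 PB figure is on the low side:

```python
# Mission data volume, assuming 2 TB every day for the full design life.
TB_PER_DAY = 2
DAYS_PER_YEAR = 365.25
MISSION_YEARS = 5

mission_tb = TB_PER_DAY * DAYS_PER_YEAR * MISSION_YEARS
mission_pb = mission_tb / 1000   # decimal petabytes

print(f"{mission_tb:.1f} TB = {mission_pb:.2f} PB")
```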
We have a lot of the same issues and problems that LSST has, but we are studying a much more dynamic object (the Sun), which changes very rapidly (relatively speaking).
My boss wants to be able to watch a whole day's images as a movie...
I very seriously wish that Moore's Law would get its butt in gear so desktop (laptop?) machines with several terabytes of RAM were commonplace.
I am working on an ultra-high resolution movie tool to play back time series data as movies, but you VERY quickly run into the disk to RAM bottleneck with this amount of data.
Even 16GB of RAM is puny.
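To put a number on "puny", here is a back-of-the-envelope sketch. The frame size is an assumption: uncompressed 16-megapixel frames at a hypothetical 2 bytes per pixel, in decimal units. Only the 2 TB/day and 16 GB figures come from the post:

```python
# How much of one day's data fits in RAM at once?
# Assumptions (not from the post): uncompressed frames, 2 bytes per pixel.
PIXELS_PER_FRAME = 16_000_000      # "16 megapixels"
BYTES_PER_PIXEL = 2                # hypothetical 16-bit samples
DAY_BYTES = 2 * 10**12             # 2 TB/day
RAM_BYTES = 16 * 10**9             # 16 GB

frame_bytes = PIXELS_PER_FRAME * BYTES_PER_PIXEL
frames_in_ram = RAM_BYTES // frame_bytes
frames_per_day = DAY_BYTES // frame_bytes
fraction_in_ram = RAM_BYTES / DAY_BYTES

print(frames_in_ram, frames_per_day, f"{fraction_in_ram:.1%}")
```

Whatever the real frame format is, the last ratio does not depend on it: 16 GB holds under one percent of a day, so everything else has to stream from disk while the movie plays.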