The Astronomical Event Search Engine
eldavojohn writes "Google has signed on with the Large Synoptic Survey Telescope project that will construct a powerful telescope in Chile by 2013. Google's part will be to 'develop a search engine that can process, organize, and analyze the voluminous amounts of data coming from the instrument's data streams in real time. The engine will create "movie-like windows" for scientists to view significant space events.' Google's been successful at turning its search technology on several different media and realms. Will they be successful at helping scientists tag and catalog events in our universe?" The telescope will generate 30 TB of data a night, for 10 years, from a 3-gigapixel CCD array.
I actually did a small, insignificant portion of LSST's computation feasibility study at LLNL during my internship there a couple of summers ago. And yeah, the computational requirements were nothing to sneeze at. I'm not sure where they are now, since the specs seemed to change every month, but when I left the CCD array was up to 3 gigapixels of 16-bit greyscale. I believe the observing cadence (at that time; again, everything was changing on a regular basis) was two such exposures of the same piece of sky every 30 seconds. Wish I could have stayed... ahh well. I did get a really nice full-color research poster (that I had to design) out of it though!
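For a rough sense of scale, here is a back-of-the-envelope sketch using only the figures quoted in this thread (3 gigapixels, 16 bits per pixel, two exposures every 30 seconds, 30 TB a night for 10 years). The 10-hour observing night and the 365 nights per year are my own assumptions, not project numbers, and the function names are made up for illustration.

```python
# Back-of-the-envelope data rates from the figures quoted in this thread.
PIXELS = 3e9             # 3-gigapixel CCD array
BYTES_PER_PIXEL = 2      # 16-bit greyscale
EXPOSURES_PER_VISIT = 2  # two exposures of the same patch of sky
VISIT_SECONDS = 30       # cadence as recalled in the comment above
TB_PER_NIGHT = 30        # figure from the story summary
YEARS = 10               # figure from the story summary
NIGHT_HOURS = 10         # ASSUMPTION: length of an observing night
NIGHTS_PER_YEAR = 365    # ASSUMPTION: observing every single night


def raw_exposure_gb():
    """Size of one raw exposure in (decimal) gigabytes."""
    return PIXELS * BYTES_PER_PIXEL / 1e9


def raw_rate_gb_per_s():
    """Sustained raw pixel rate while observing, in GB/s."""
    return raw_exposure_gb() * EXPOSURES_PER_VISIT / VISIT_SECONDS


def raw_night_tb():
    """Raw pixel volume for one assumed night, in TB."""
    return raw_rate_gb_per_s() * NIGHT_HOURS * 3600 / 1e3


def survey_total_pb():
    """Total volume at the quoted 30 TB a night, in PB."""
    return TB_PER_NIGHT * NIGHTS_PER_YEAR * YEARS / 1e3


if __name__ == "__main__":
    print(f"one exposure:  {raw_exposure_gb():.1f} GB")
    print(f"raw rate:      {raw_rate_gb_per_s():.2f} GB/s")
    print(f"raw per night: {raw_night_tb():.1f} TB")
    print(f"whole survey:  {survey_total_pb():.1f} PB")
```

Under those assumptions the raw pixels alone come to roughly half of the quoted 30 TB a night. Nothing in this thread says what accounts for the rest, so treat the comparison as a sanity check rather than a breakdown.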
Education is a better safeguard of liberty than a standing army.
Edward Everett (1794 - 1865)
The LHC will produce more data, but we also don't care about most of it. The vast majority of it is junk. The "interesting" physics (particles like W and Z bosons, top quarks, the Higgs, etc.) is about 10^-9 of the events. It is a huge needle-in-a-haystack problem, and we throw out most of the data. We have many experts and professors who design "triggers" which, based on a subset of information that can be delivered to them in a reasonable time, decide whether a given proton-proton collision contains new physics. Many theorists these days are making dents in walls with their heads trying to think of ways these triggers might be missing important information, so that we can suggest changes before it's too late. This is a lot of dedicated silicon, FPGAs, VME crates, etc. Slashdotters should drool. Anyway, we throw out the vast majority of information.
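To make the trigger idea concrete, here is a toy sketch in Python. It is in no way how the real triggers work (those live in the dedicated hardware mentioned above). It only shows the shape of the decision: look at a small, fast summary of each event and keep or discard the whole event without reading the bulky readout. Every name, the fake energy distribution, and the threshold are invented for illustration, and the toy keeps a far larger fraction than the 10^-9 quoted above.

```python
import random

# HYPOTHETICAL threshold, chosen only so the toy keeps very few events.
ENERGY_THRESHOLD_GEV = 100.0


def make_event(rng, readout_size=16):
    """Fake collision: a small fast summary plus a bulkier full readout."""
    return {
        "summary_energy_gev": rng.expovariate(1 / 10.0),  # mean of 10
        "full_readout": [rng.random() for _ in range(readout_size)],
    }


def trigger(event):
    """Decide from the fast summary alone; never reads full_readout."""
    return event["summary_energy_gev"] > ENERGY_THRESHOLD_GEV


def run(n_events, seed=0):
    """Generate n_events fake collisions and return only the kept ones."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_events):
        event = make_event(rng)
        if trigger(event):
            kept.append(event)  # everything else is dropped for good
    return kept


if __name__ == "__main__":
    n = 100_000
    kept = run(n)
    print(f"kept {len(kept)} of {n} events")
```

The worry described above shows up even in the toy: anything interesting that does not stand out in the summary is discarded and can never be recovered later.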
By comparison, LSST is trying to store everything. Scroll up for an interesting comment about calibrating ambient brightness and seeing. I can't answer which will deliver more information, but both are incredibly interesting challenges.
Data challenges abound. We have designed the LHC Grid to distribute this information. There will be several data warehouses located around the world at national labs and universities. Even after the triggers decide what is "interesting", more sophisticated algorithms, with access to all the data in a single proton-proton collision, are applied. Then humans are applied to the data, and we will try to dig new signals out of it.
In all this we expect to find (among other things) the origin of mass and Dark Matter, and we're working hard to prepare for the onslaught of data. :)
-- Bob
1^2=1; (-1)^2=1; 1^2=(-1)^2; 1=-1; 1=0.
The shop I'm at has been working the image processing and data storage problem for PanSTARRS, another sky survey project that is a bit further along (they have a test scope up and running on Maui). It's interesting to me that both projects are at once using conventional solutions and thinking outside of the box.
Conventional: LSST will use a single large telescope and detector; PanSTARRS (as it stands) intends to use a dedicated compute cluster for data reduction.
Novel: LSST is leaning towards distributing its data reduction task over Google's huge server farm; PanSTARRS will use four off-the-shelf 1.8m telescopes, each with a 1.4GP detector, mounted together to image the same piece of sky, with the overlapping images merged in post-processing.
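As an illustration of what merging overlapping images can buy you, here is a minimal sketch of one common stacking technique, a per-pixel median. I am not claiming this is what PanSTARRS actually does. It assumes the frames are already aligned and the same size, and the function name is made up.

```python
from statistics import median


def stack_frames(frames):
    """Per-pixel median of equally sized, already-aligned 2-D frames.

    ASSUMPTION: alignment (registration) happened earlier; this only
    combines pixel values that already sit at the same row and column.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != rows or any(len(row) != cols for row in frame):
            raise ValueError("frames must all share one shape")
    return [
        [median(frame[r][c] for frame in frames) for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # Four tiny fake frames of the same patch of sky; the last one has
    # a single wildly bright pixel (think cosmic-ray hit) at row 0, col 0.
    frames = [
        [[10, 20], [30, 40]],
        [[11, 21], [29, 40]],
        [[10, 19], [30, 41]],
        [[9000, 20], [31, 40]],
    ]
    for row in stack_frames(frames):
        print(row)
```

The point of using a median rather than a plain average is that one bad value among four barely moves the result, so a glitch seen by only one detector mostly disappears from the combined image.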
When I was working on the project, one of the PanSTARRS requirements was to finish analyzing one night's viewing before the following sunset. Early on, the principal investigators decided to solve the image storage issue by not storing the images permanently. Instead, once the science for a night's imaging had been extracted (astrometry, LEO or supernova detection, etc.), the original images would hit the bit bucket. Whether they've stuck with that, I don't know.
Luke, help me take this mask off