The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS."
The LHC will generate several PB of data per year, as will the Large Synoptic Survey Telescope. Projects on this scale aren't all that uncommon.
"Seven Deadly Sins? I thought it was a to-do list!"
1 Petabyte = 1,000 Terabytes
1 LoC = 10 Terabytes
100 LoC = 1,000 Terabytes
======
100 LoC = 1 Petabyte
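For anyone who wants to redo the conversion above, here is a quick sketch in Python. It assumes decimal units (1 PB = 1,000 TB) and the rough 10 TB per Library of Congress (LoC) figure quoted above; the constant and function names are made up for illustration.

```python
# Assumptions: decimal units (1 PB = 1,000 TB) and the rough
# 10 TB-per-LoC figure from the comment above; neither is exact.
TB_PER_PB = 1_000
TB_PER_LOC = 10


def petabytes_to_loc(petabytes):
    """Convert petabytes to 'Libraries of Congress' (hypothetical helper)."""
    terabytes = petabytes * TB_PER_PB
    return terabytes / TB_PER_LOC


print(petabytes_to_loc(1))  # prints 100.0
```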
http://labs.google.com/papers/bigtable.html
WalMart's data warehouse is already 4 petabytes: http://storefrontbacktalk.com/story/080307walmart.php
Just because I can hook a shark from a boat, I do not offer to wrestle it in the water.