Internet Archive Gets 4.5PB Data Center Upgrade
Lucas123 writes "The Internet Archive, the non-profit organization that scrapes the Web every two months in order to archive web page images, just cut the ribbon on a new 4.5 petabyte data center housed in a metal shipping container that sits outside. The data center supports the Wayback Machine, the Web site that offers the public a view of the 151 billion Web page images collected since 1997. The new data center houses 63 Sun Fire servers, each with 48 1TB hard drives running in parallel to support both the web crawling application and the 200,000 visitors to the site each day."
One would assume that something like this does regular off-site backups, which must add up to a hell of a lot. Could someone with experience in such matters shed a little insight into the logistics of backing up such a vast system?
I have no idea how much 4.5 PB is until it's given in units of Libraries of Congress.
Does lusting after all their space make me a peta-phile?
Life==Jeopardy. All the answers are right in front of us - the hard part is coming up with the correct question.
So all one needs to do to "own the internet" is drive a big rig and ... lift the container off their parking lot?
I can now theoretically steal "the internet" with a flatbed truck and a lift. There's something to be said for conventional data centers: They're rather hard to load onto a truck and drive off with.
#fuckbeta #iamslashdot #dicemustdie
Are there any resources that let us see websites from 1996, 95, 94, or 93? I would love to revisit the web as it appeared when I first discovered it (1994 at psu.edu).
"I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall
The Internet Archive also works with about 100 physical libraries around the world whose curators help guide deep Internet crawls. The Internet Archive's massive database is mirrored to the Bibliotheca Alexandrina, the new Library of Alexandria in Egypt, for disaster recovery purposes.
Incidentally: FileFront is closing in five days, taking with it any files that aren't hosted elsewhere.
I am told that many of the Half-Life mods hosted there are not available anywhere else, so get while the getting is good...
... of a 4.5 petabyte datacenter in a shipping container in transit.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
... one would assume that something like this does regular off-site backups, which must add up to a hell of a lot ...
As I recall from one of Brewster's talks: Part of the idea was that you can install redundant copies of this data center around the world and keep 'em synced.
You can ship 4.5 petabytes over a single OC-192 link in about 71 days.
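For anyone who wants to check the link math, here is a minimal sketch. It assumes decimal petabytes and the nominal OC-192 line rate of 9.95328 Gbit/s; the function names are made up for illustration. At full line rate the transfer works out to roughly 42 days, so a figure of about 71 days would correspond to something like 60% effective throughput. That reading is my assumption, not something stated above.

```python
# Rough transfer-time estimate. All constants are assumptions, not from TFA.
PETABYTE = 10**15                  # decimal petabyte, in bytes
OC192_LINE_RATE_BPS = 9.95328e9    # nominal OC-192 line rate, bits per second
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes, link_bps, efficiency=1.0):
    """Days needed to move size_bytes over a link running at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bps * efficiency) / SECONDS_PER_DAY


def implied_efficiency(size_bytes, link_bps, days):
    """Fraction of the line rate needed to finish the transfer in `days`."""
    return size_bytes * 8 / (link_bps * days * SECONDS_PER_DAY)


if __name__ == "__main__":
    size = 4.5 * PETABYTE
    print(f"at line rate: {transfer_days(size, OC192_LINE_RATE_BPS):.1f} days")
    print(f"efficiency implied by 71 days: "
          f"{implied_efficiency(size, OC192_LINE_RATE_BPS, 71):.2f}")
```

Passing a lower `efficiency` to `transfer_days` shows how quickly protocol overhead or a shared link stretches the schedule.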
TFA says "...eight racks filled with 63 Sun Fire x4500 servers with dual- or quad-core x86 processors running Solaris 10 with ZFS. Each Sun server is combined with an array of 48 1TB hard drives." (emphasis mine)
I would guess this means there's an X4500 with 24TB in local disks and 48TB in attached storage per machine. (24+48)*63 = 4,536TB, which does give us the quoted 4.5PB.
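The arithmetic behind that guess can be sketched quickly. This assumes decimal units (1PB = 1000TB), and the 24TB local plus 48TB attached split is only the guess above, not a confirmed configuration. For comparison, the same calculation with just the 48 drives per server mentioned in the summary comes to about 3.0PB.

```python
# Capacity check. The per-server splits are assumptions taken from the
# comment's guess and from the story summary, not confirmed specs.
SERVERS = 63
TB_PER_PB = 1000  # decimal units


def total_pb(tb_per_server, servers=SERVERS):
    """Total raw capacity in petabytes for a fleet of identical servers."""
    return tb_per_server * servers / TB_PER_PB


if __name__ == "__main__":
    print(f"24TB local + 48TB attached: {total_pb(24 + 48):.3f} PB")
    print(f"48TB per server only:       {total_pb(48):.3f} PB")
```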
Blessed are the pessimists, for they have made backups.
This seems to be an exact use case for the X4500-type system, which as far as I'm aware is pretty unique.
Indeed. Sun is on a density kick. Check out the X4600, which does for processing power what the X4500 did for storage.
In both cases, there actually are competing products that are sort of the same. The most conspicuous difference is that the Sun versions cram the whole caboodle into 4 rack units per system, about half the space required by their competitors.
More absurdly-dense Sun products:
http://www.sun.com/servers/x64/x4240/
http://www.sun.com/servers/x64/x4140/
The point of these systems is that they take up less expensive rack space than equivalent competitors. They're also "greener": if you broke all that storage and computing power down into less dense systems, you'd need a lot more electricity to run them and keep them cool. That not only saves money, it gives the owner the ability to claim they're working on the carbon footprint.
They're keeping the offsite backup distributed around the Internet, using the World-Wide Web to store it in real time.
Part of it may even be on *your* machine! We've really got to stop Brewster from leeching all your storage and make him store his backup himself - this business of using the originals to back up the backup just isn't sustainable!
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks