Internet Archive Gets 4.5PB Data Center Upgrade
Lucas123 writes "The Internet Archive, the non-profit organization that scrapes the Web every two months in order to archive web page images, just cut the ribbon on a new 4.5 petabyte data center housed in a metal shipping container that sits outside. The data center supports the Wayback Machine, the Web site that offers the public a view of the 151 billion Web page images collected since 1997. The new data center houses 63 Sun Fire servers, each with 48 1TB hard drives running in parallel to support both the web crawling application and the 200,000 visitors to the site each day."
One would assume that something like this does regular off-site back-ups, which must add up to a hell of a lot. Could someone with experience in such matters shed a little insight into the logistics of backing up such a vast system?
I have no idea how much 4.5 PB is until it's given in units of Libraries of Congress.
Does lusting after all their space make me a peta-phile?
Life==Jeopardy. All the answers are right in front of us - the hard part is coming up with the correct question.
So all one needs to do to "own the internet" is drive a big rig and ... lift the container off their parking lot?
I can now theoretically steal "the internet" with a flatbed truck and a lift. There's something to be said for conventional data centers: They're rather hard to load onto a truck and drive off with.
#fuckbeta #iamslashdot #dicemustdie
Are there any resources that let us see websites from 1996, 95, 94, or 93? I would love to revisit the web as it appeared when I first discovered it (1994 at psu.edu).
"I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall
The Internet Archive also works with about 100 physical libraries around the world whose curators help guide deep Internet crawls. The Internet Archive's massive database is mirrored to the Bibliotheca Alexandrina, the new Library of Alexandria in Egypt, for disaster recovery purposes.
Incidentally: FileFront is closing in five days, taking with it any files that aren't hosted elsewhere.
I am told that many of the Half-Life mods hosted there are not available anywhere else, so get while the getting is good...
... of a 4.5 petabyte datacenter in a shipping container in transit.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
So where does the 4.5PB come into this?
... one would assume that something like this does regular off-site back-ups, which must add up to a hell of a lot ...
As I recall from one of Brewster's talks: Part of the idea was that you can install redundant copies of this data center around the world and keep 'em synced.
You can ship 4.5 petabytes over a single OC-192 link in about 71 days.
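For anyone who wants to check the OC-192 figure, here is a rough back-of-the-envelope sketch in Python. It assumes decimal petabytes (10^15 bytes) and the nominal OC-192 line rate of 9,953.28 Mbit/s. The function name and the efficiency parameter are made up for illustration and are not anything from the post above.

```python
# Back-of-the-envelope transfer time over an OC-192 link.
# Assumptions (mine): 1 PB = 10**15 bytes, and the nominal OC-192
# line rate of 9953.28 Mbit/s before any protocol overhead.

OC192_LINE_RATE_BPS = 9.95328e9  # bits per second
PETABYTE = 10**15                # bytes, decimal convention
SECONDS_PER_DAY = 86400


def transfer_days(num_bytes, rate_bps, efficiency=1.0):
    """Days needed to move num_bytes at rate_bps.

    efficiency is a hypothetical fudge factor (0..1) for overhead,
    retransmits, and sharing the link with other traffic.
    """
    seconds = num_bytes * 8 / (rate_bps * efficiency)
    return seconds / SECONDS_PER_DAY


days_at_line_rate = transfer_days(4.5 * PETABYTE, OC192_LINE_RATE_BPS)
days_at_59_percent = transfer_days(4.5 * PETABYTE, OC192_LINE_RATE_BPS, 0.59)

print(f"full line rate: {days_at_line_rate:.1f} days")
print(f"59% effective:  {days_at_59_percent:.1f} days")
```

At full line rate this works out to roughly 42 days, so a figure of about 71 days would correspond to something like 59% effective throughput. That is a guess at the assumption behind the number, not something the post states.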
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
63 servers * 48 disks of 1 TB each = 3024 TB. According to the announcement on archive.org, 3 petabytes would be right.
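The same arithmetic as a small Python sketch, in case anyone wants to play with the numbers. It assumes decimal units (1 PB = 1000 TB) and counts raw capacity only, with no allowance for RAID, spares, or filesystem overhead. The variable names are just for illustration.

```python
import math

# Figures from the story summary.
SERVERS = 63
DISKS_PER_SERVER = 48
TB_PER_DISK = 1

# Assumption (mine): decimal units, so 1 PB = 1000 TB.
TB_PER_PB = 1000

raw_tb = SERVERS * DISKS_PER_SERVER * TB_PER_DISK
raw_pb = raw_tb / TB_PER_PB

# How many 48 TB servers it would take to reach the headline 4.5 PB raw.
servers_for_headline = math.ceil(4.5 * TB_PER_PB / (DISKS_PER_SERVER * TB_PER_DISK))

print(f"raw capacity: {raw_tb} TB ({raw_pb} PB)")
print(f"servers needed for 4.5 PB raw: {servers_for_headline}")
```

So 63 boxes give about 3 PB raw, and reaching 4.5 PB raw with the same hardware would take 94 of them.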
The new data center houses 63 Sun Fire servers
That's not very specific. "Sun Fire" is a brand that for a while got applied to all of Sun's rack-mount servers (except for NEBS-compliant servers, which were and are called "Sun Netra"). A little confusing, of course, which is why they've started calling new SPARC boxes "Sun SPARC Enterprise" to differentiate them from those mangy x64 "Sun Fire" systems. Except that there are still SPARC systems called "Sun Fire", so I guess the confusion factor didn't get any better...
Anyway, the specific server being used here is the Sun Fire X4500, a system with no fewer than 48 1 TB disks in a 4U space. Notice that this model is EOLed; presumably iarchive got a deal on some remaindered machines.
The shipping container is something we've seen before.
They're keeping the offsite backup distributed around the Internet, using the World-Wide Web to store it in real time.
Part of it may even be on *your* machine! We've really got to stop Brewster from leeching all your storage and make him store his backup himself - this business of using the originals to back up the backup just isn't sustainable!
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks