New Linux Petabyte-Scale Distributed File System

Posted by samzenpus on Wednesday May 5, 2010 @12:14PM from the check-it-out dept.

An anonymous reader writes "A recent addition to Linux's impressive selection of file systems is Ceph, a distributed file system that incorporates replication and fault tolerance while maintaining POSIX compatibility. Explore the architecture of Ceph and learn how it provides fault tolerance and simplifies the management of massive amounts of data."

History by Alcoholic+Dali · 2010-05-05 12:22 · Score: 4, Informative

Ceph was designed by Sage Weil (of WebRing fame), who is also one of the founders of DreamHost. They will likely be using it internally soon, if they aren't already. http://en.wikipedia.org/wiki/DreamHost
1. Re:History by TooMuchToDo · 2010-05-05 12:35 · Score: 4, Informative
  
  http://www.dreamhost.com/jobs.html
  
  FILE SYSTEMS SOFTWARE ENGINEER
  Los Angeles, CA
  New Dream Network has a vacancy for a Senior File Systems Software Engineer in Los Angeles, CA. Minimum requirements – Master’s degree in Computer Science or Computer Engineering, minimum of 2 years experience in storage programming, and background in Linux kernel programming, file systems development, network programming and Operating Systems design.
  Qualified applicants should send a plain text resume to cephjobs@dreamhost.com
Is it ready for primetime? by Meshach · 2010-05-05 12:30 · Score: 5, Informative

The headline in the Ceph wiki: "Ceph is under heavy development, and is not yet suitable for any uses other than benchmarking and review."

--
"Maybe this world is another planet's hell"
Aldous Huxley
Re:Is data integrity really necessary for large da by CoderJoe · 2010-05-05 12:36 · Score: 5, Informative

Google's GFS/BigTable stack is built on a distributed filesystem: if a node goes down, the data that was on that node gets copied to other nodes to keep the replication count up.
Facebook is using Apache Cassandra, which adopts a similar design.
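
To make the "keep the replication count up" idea concrete, here is a minimal sketch of re-replication after a node failure. It is not how GFS, Cassandra, or Ceph actually implement it; the function name `rereplicate`, the chunk-to-nodes map, and the target of 3 copies are all assumptions made for illustration.

```python
def rereplicate(placement, failed, live_nodes, target=3):
    """Return a new placement with copies on `failed` replaced.

    placement:  dict mapping a chunk id to the set of nodes holding a copy
    failed:     name of the node that went down
    live_nodes: iterable of nodes that are still reachable
    target:     desired number of copies per chunk (hypothetical default)
    """
    repaired = {}
    for chunk, nodes in placement.items():
        # Copies that survived the failure.
        survivors = set(nodes) - {failed}
        # Live nodes that do not already hold this chunk, in a stable order.
        candidates = sorted(
            n for n in live_nodes if n not in survivors and n != failed
        )
        # Copy the chunk to new nodes until the target count is restored
        # or there are no more nodes to copy to.
        while len(survivors) < target and candidates:
            survivors.add(candidates.pop(0))
        repaired[chunk] = survivors
    return repaired


placement = {"c1": {"a", "b", "c"}, "c2": {"b", "d", "e"}}
repaired = rereplicate(placement, failed="b", live_nodes=["a", "c", "d", "e", "f"])
```

A real system would also have to copy the bytes, pick targets by load and failure domain, and handle the case where every copy of a chunk was lost; the sketch only tracks which nodes should hold each chunk.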
Re:Is data integrity really necessary for large da by ProfMobius · 2010-05-05 13:44 · Score: 5, Informative

First, Facebook and Google data cannot be regenerated, because they are personal things like emails, messages, posts, etc.
Second, other sectors produce large amounts of data besides your favourite networking website. One example is the LHC. It is going to produce terabytes of data per DAY (15 petabytes per year). Another is space telescopes. That data can't just be 'regenerated'. One day's worth of data is incredibly expensive to produce.
Distributed file systems are already there, and people use them. Maybe not on your level of computer usage.
When you don't know what you are talking about, I think it is better to just keep quiet.

--
EULA : By reading the above message, you agree that I now own your soul.
Re:Linux® by tomhudson · 2010-05-05 15:32 · Score: 2, Informative

Definitely looks weird. I always write it in all-lowercase. But apparently the trademark is either all-caps ("LINUX®") or the standard capitalized form ("Linux®").
Someone should remind them to register "linux®" (all lowercase) before Darl tries to. A capital first letter just doesn't look right.
Re:Linux® by John+Hasler · 2010-05-05 15:55 · Score: 2, Informative

A word mark is always registered as all upper case. Lower and mixed case are still covered.

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
Nope by avm · 2010-05-05 16:33 · Score: 2, Informative

Nothing special at all. It only means Taco used sequential instead of randomised integers for user ids, which in turn can be viewed as a very loose chronology of user registrations.
In other words, no.
Re:"Enterprisey" design? Yet no scrubbing? by Anonymous Coward · 2010-05-05 17:35 · Score: 2, Informative

"Did I miss it, or did they really forget that crucial part?"
You missed it. There is a scrubbing mechanism in Ceph.
Re:pet-a-byte? by SlothDead · 2010-05-06 00:02 · Score: 2, Informative

Tera -> Tetra -> 4 -> 1000^4
Peta -> Penta (like Pentagram) -> 5 -> 1000^5
Exa -> Hexa (like Hexagon) -> 6 -> 1000^6
Zetta -> Setta (like 7 in many languages) -> 7 -> 1000^7
Yotta -> Otta -> 8 -> 1000^8
Or use 1024 if you don't like IEEE/IEC norms...
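
The mnemonic above is easy to check with a few lines of Python. This is only a sketch: the names `SI_EXPONENTS` and `bytes_for` are made up for illustration, and the `binary` flag simply swaps the base from 1000 to 1024 as the last line of the comment suggests.

```python
# Exponent of 1000 for each prefix, following the mnemonic in the comment.
SI_EXPONENTS = {"tera": 4, "peta": 5, "exa": 6, "zetta": 7, "yotta": 8}


def bytes_for(prefix: str, binary: bool = False) -> int:
    """Number of bytes in one unit of `prefix`.

    binary=False uses powers of 1000 (the SI meaning of the prefix);
    binary=True uses powers of 1024 instead.
    """
    base = 1024 if binary else 1000
    return base ** SI_EXPONENTS[prefix]


petabyte = bytes_for("peta")               # 1000^5
pebibyte = bytes_for("peta", binary=True)  # 1024^5
```

With base 1024 the gap widens as the prefix grows: at the peta level the binary unit is already about 12.6% larger than the decimal one.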