PVFS2 - a High-Performance Parallel File System
neillm78 writes "As part of the development team, we're announcing PVFS2 version 1.0 here in Pittsburgh at the SC2004 conference! PVFS2 is a GPL/LGPL based parallel file system for cluster-based applications. It logically groups any number of storage servers into a coherent file system for use by client nodes, specifically tailored to handle efficient access to large shared files. PVFS2 supports access via an MPI-IO interface for high-performance parallel applications, but you can still mount it like a regular GNU/Linux file system for traditional serial applications and management. The PVFS2 project is conducted jointly between The Parallel Architecture Research Laboratory at Clemson University and The Mathematics and Computer Science Division at Argonne National Laboratory. Please feel free to give it a try!"
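For readers wondering how "any number of storage servers" can look like one file system, here is a rough sketch of round-robin striping, one common way parallel file systems spread a large file across servers. The announcement does not describe the layout, so everything here is an assumption for illustration: the function names `locate` and `split_request` are made up, and the 64 KiB strip size is just a placeholder value.

```python
# Illustrative sketch only: round-robin striping of one file over several servers.
# The names and the 64 KiB strip size are assumptions, not taken from PVFS2 itself.

def locate(offset, num_servers, strip_size=64 * 1024):
    """Map a byte offset in the logical file to (server index, offset in that server's data)."""
    if offset < 0 or num_servers < 1 or strip_size < 1:
        raise ValueError("offset must be >= 0; num_servers and strip_size must be >= 1")
    strip = offset // strip_size           # which strip of the logical file
    server = strip % num_servers           # strips are dealt out round-robin
    local_strip = strip // num_servers     # how many earlier strips this server already holds
    local_offset = local_strip * strip_size + offset % strip_size
    return server, local_offset


def split_request(offset, length, num_servers, strip_size=64 * 1024):
    """Break one contiguous read/write into (server, local offset, byte count) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        server, local_offset = locate(offset, num_servers, strip_size)
        # Stop at the end of the current strip or the end of the request, whichever is first.
        chunk = min(strip_size - offset % strip_size, end - offset)
        pieces.append((server, local_offset, chunk))
        offset += chunk
    return pieces


if __name__ == "__main__":
    # A 200 KB request starting at byte 100, spread over 4 hypothetical servers.
    for piece in split_request(100, 200_000, num_servers=4):
        print(piece)
```

Under this kind of layout a large request touches several servers at once, which is where the bandwidth gain for big shared files would come from; a request smaller than one strip still lands on a single server.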
PVFS (in its first incarnation), despite some instability (due more to the fact that our first cluster was cheap-o COTS hardware), really helped drive down the load on our clusters by removing the need to perform NFS writes to a single head node for scratch space. The setup is extremely simple and the code base was really small.
I plan on evaluating PVFS2 for our new clusters along with Lustre and GFS although I have heard nothing about the latter two operating over the MPI-ROMIO subsystem (which would definitely offer a performance increase).
The kernel is called Linux. Yeah, you may compile it with GCC, but come on, people! It's a Linux-specific kernel module. Leave the GNU/ out of it.
That said, nice job! I love to see the capabilities of Linux expanded in new directions like this. Cool work. I wish I had time to work on cool projects like that.
set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
I found that gigabit NFS was usually much faster with files smaller than 1 MB. I guess that's because, either way, you still had to go through one server to set up each FS operation. NFS had been around longer; the Sun implementation was hard to beat.
Has the metadata server been sped up at all, or made distributed with some kind of coherency-sync backend?
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
This is exciting and all, but the really important thing about PARL is that they were the only ones at Clemson willing to host our site.
</SELF-PLUG>
Direct away from face when opening.
..but..uhh..not to appear too lame, because I probably don't understand it.. but...could this be used in conjunction with something like BitTorrent so that big files like ISOs or whatnot could be shared more easily cross-platform? Do you understand what I am asking? An Esperanto for computers, with large numbers of people working all over?
Sorry.
I've been skimming the documentation for this.
Does anyone use this for big, transparent file storage networks?
I've been looking for something better than "a bunch of NFS servers with some code to redirect each client to his storage". That is a pain to manage, and it has lots-'n-lots of points of failure...
I've noticed that the metadata is not on a single node anymore, but it's not replicated yet either. I could live with this reliability problem if it could give me the transparency to just add a server when needed and not worry about wasted space on old servers...
Am I the only one who thinks they overlooked this in the design? After all, accidents happen, and you can't possibly expect everyone to have backup copies of everything. The recycle bin idea simply does not work, because it does not preserve the directory structure the files had before deletion, and it can actually make the system less secure, since everything has to be deleted twice.