PVFS2 - a High-Performance Parallel File System

Posted by timothy on Tuesday November 9, 2004 @01:49PM from the good-nodes-are-still-available dept.

neillm78 writes "As part of the development team, we're announcing PVFS2 version 1.0 here in Pittsburgh at the SC2004 conference! PVFS2 is a GPL/LGPL based parallel file system for cluster-based applications. It logically groups any number of storage servers into a coherent file system for use by client nodes, specifically tailored to handle efficient access to large shared files. PVFS2 supports access via an MPI-IO interface for high-performance parallel applications, but you can still mount it like a regular GNU/Linux file system for traditional serial applications and management. The PVFS2 project is conducted jointly between The Parallel Architecture Research Laboratory at Clemson University and The Mathematics and Computer Science Division at Argonne National Laboratory. Please feel free to give it a try!"
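
For readers wondering how several storage servers can present one large file, here is a minimal sketch of round-robin striping, the general technique parallel file systems use to spread a file across servers. It is an illustration only, not PVFS2 code: the strip size, the server count, and the function names `locate` and `split_request` are all assumptions made up for this example.

```python
# Hypothetical round-robin striping sketch; NOT taken from PVFS2.
STRIP_SIZE = 64 * 1024  # assumed strip size in bytes
NUM_SERVERS = 4         # assumed number of storage servers


def locate(offset, strip_size=STRIP_SIZE, num_servers=NUM_SERVERS):
    """Map a logical file offset to (server index, offset in that server's local data)."""
    strip = offset // strip_size        # which strip of the file holds this byte
    server = strip % num_servers        # strips are dealt out to servers round-robin
    local_strip = strip // num_servers  # earlier strips already stored on that server
    return server, local_strip * strip_size + offset % strip_size


def split_request(offset, length, strip_size=STRIP_SIZE, num_servers=NUM_SERVERS):
    """Break one contiguous request into (server, local_offset, nbytes) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        server, local = locate(offset, strip_size, num_servers)
        # Stop at the end of the current strip or the end of the request.
        nbytes = min(strip_size - offset % strip_size, end - offset)
        pieces.append((server, local, nbytes))
        offset += nbytes
    return pieces
```

Under these assumptions, a large sequential read turns into pieces that land on different servers, so the servers can work on them at the same time. That is the intuition behind "efficient access to large shared files".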

I hope the meta-data performance improved... by Ayanami+Rei · 2004-11-09 14:29 · Score: 2, Interesting

I found that gigabit NFS was usually much faster with files smaller than 1MB. I guess because either way, you still had to go through one server to set up each FS operation. NFS had been around longer; the Sun implementation was hard to beat.

Has the meta-data server been sped up at all, or made distributed with some kind of coherency-sync backend?

--
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON

Re:Been following it for a while... by mikefe · 2004-11-09 19:29 · Score: 2, Interesting

So it works over the network without needing a network block device layer?

That would mean it should compete on the level of OpenAFS, Intermezzo and CODA for fault tolerant network filesystems -- except it would have internode locking which the others don't at the moment.

That would also mean it doesn't directly compete at the same level as GFS (which is targeted at configurations of servers connected by a SAN or similar).

Is this project set on integrating with the mainline kernel? What has/will happen on that front?

This also looks perfect for an active/active LinuxHA failover cluster -- if it has redundancy, which any clustering filesystem should have. Right now the LinuxHA project is integrating GFS into their stack of interwoven sub-projects.

After looking at the site, it looks like it would be good for server-to-server connections, and not good for server-to-workstation connections. For instance, it doesn't look like it has any caching functionality like OpenAFS does, and it looks like each node needs to have a copy of some of the cluster data (or does that end at the meta-data nodes?). PVFS2 looks like it has a similar architecture to Lustre, except PVFS2 is developed openly.

--
There: Something at a specific location.
Their: Owned by someone.
Please make sure your english compiles.

Re:Been following it for a while... by rizzy · 2004-11-10 02:47 · Score: 2, Interesting

That would mean it should compete on the level of OpenAFS, Intermezzo and CODA for fault tolerant network filesystems -- except it would have internode locking which the others don't at the moment.

That's an interesting thought, but at no time have we ever thought of ourselves as a replacement for those file systems. The ones you mention are general purpose file systems whereas PVFS2 is meant to be a fast file system for parallel applications.

except it would have internode locking which the others don't at the moment.

I'm not sure what you mean here. We have no locking anywhere -- which is exactly why we can deliver such high performance. Scientific applications often don't need a locking subsystem getting in their way.
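
To illustrate why such applications can do without locks, here is a small sketch of the common access pattern: each writer owns a disjoint byte range of one shared file, so no two writes ever overlap. This is not PVFS2 or MPI-IO code. It uses plain POSIX positional writes from Python on a local temporary file, with threads standing in for parallel processes; `BLOCK`, `WRITERS`, `write_block` and `run_demo` are names invented for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # bytes owned by each writer (illustrative value)
WRITERS = 4   # threads standing in for parallel application processes


def write_block(fd, rank):
    """Write this writer's block at an offset no other writer touches."""
    data = bytes([ord("A") + rank]) * BLOCK
    # Positional write: no shared file pointer is moved, and no lock is taken.
    os.pwrite(fd, data, rank * BLOCK)


def run_demo():
    """Let every writer fill its own region, then return the whole file's contents."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        with ThreadPoolExecutor(max_workers=WRITERS) as pool:
            # Consume the iterator so any worker exception is raised here.
            list(pool.map(lambda rank: write_block(fd, rank), range(WRITERS)))
        return os.pread(fd, BLOCK * WRITERS, 0)
```

Because the regions never overlap, the result is the same whatever order the writes complete in. Applications whose writers do touch the same bytes would need to coordinate among themselves.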

Is this project set on integrating with the mainline kernel? What has/will happen on that front?

There really isn't much for us *to* integrate into the kernel. We do have a VFS interface, but it acts primarily as a way to convert kernel system calls into userspace PVFS2 calls. Yes, there are lots of "file system in userspace" projects, but by making something that works just for PVFS2, we can get better performance.

This also looks perfect for an active/active LinuxHA failover cluster -- if it has redundancy, which any clustering filesystem should have. Right now the LinuxHA project is integrating GFS into their stack of interwoven sub-projects.

Funny you should mention LinuxHA. I spent some time this summer setting it up with PVFS2. If you really care about redundancy, you can invest in shared storage solutions (SCSI and FireWire drives can be shared between two hosts simultaneously -- if you buy the really expensive stuff). With shared storage, you've got a way to tolerate node failure. You're still screwed if something eats your big expensive hard drive, granted. We're working on software replication.

PVFS2 looks like it has a similar architecture to Lustre, except PVFS2 is developed openly.

Thanks for noticing! While I understand why CFS has taken the approach they have, we really feel that the HPC community (and Linux in general) needs a file system that's free software.

Re:Redundancy ? by REggert · 2004-11-10 04:09 · Score: 2, Interesting

I use Andrew File System (specifically, http://www.openafs.org/) for my files, since I was used to using it at school, and I'm fond of its access control system. It allows you to designate redundant sites for your volumes for backup or load balancing purposes. However, its major downside is that it's optimized for reads but not for writes (PVFS would probably work better if you need optimal write performance), and it can be a real bitch to set up for the first time. I've also yet to figure out how to get it to work through my NAT, though it's supposed to be possible. It beats the hell out of NFS (v2, at least, I haven't really taken a look at NFS v3) in terms of reliability, security, and scalability, though.

--
cp /dev/zero ~/signature.txt