Looking for Portable MPI I/O Implementation?
rikt writes "I am trying to implement MPI I/O for our CFD product. I am facing a problem with the portability of the generated data files. MPI2 interface describes a way to achieve this either by using 'external32' or user defined data representations. The problem is that ROMIO, the most widely available MPI I/O implementation, has not implemented support for any data representation other than 'native'. Do you know of any MPI I/O implementation that supports this, and is available on various platforms? I know IBM and Sun supports this, but I am looking for a solution on Linux and Windows (both 32 & 64 bit) as well."
I'm a geek who does administration and programming in Windows and Linux realms, am fairly aware of my acronym soup and yet this left me, um, cold. For those who don't feel like doing the research:
MPI: Message Passing Interface, a standard for message passing in parallel processing environments.
MPI-2: Extended version of MPI.
MPI-IO: Parallel input/output extensions for MPI, included in MPI-2.
ROMIO: An implementation of these extensions.
CFD: Computational Fluid Dynamics (a good candidate for parallel processing, thus the interest in the above).
Of course, the fact I had to look them up means I have no idea about implementations, but at least others won't have to wonder what all that was about.
Sig under construction since 1998.
Go away and read.
I hate the drivel questions asked on /. This type of question is great; it usually means I'll go off and read up on this stuff.
Think of it this way.
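typedef struct MessageStruct {
    int type;  /* which kind of message this is */
    int size;  /* total size of the message in bytes */
    int data;  /* payload (type assumed here); repeat for each part of the message struct */
} Message;

/* Sender: fill in the header fields and the payload, then send. */
Message msg;
msg.type = messageType;
msg.size = sizeof(Message);
msg.data = someData;
SendMessage(&msg, msg.size);

/* Receiver: check the size before treating the raw message as a Message. */
MPIMessage raw;
if (raw.size == sizeof(Message)) {
    Message *received = (Message *)&raw;
}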
I think something along these lines should work. Just make a struct for each type of message your app has. Then check the size and type elements of the structs to determine which type of message you have received. You can also just make a struct with only a type and a size field, copy the first 8 bytes of the message into that, and use that to determine the type of message. I'm sure I am missing some implementation details, but something like this should handle your problem.
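The "copy the first 8 bytes" idea can be sketched in C as below. This is only a sketch: the message kinds (MSG_PING, MSG_VELOCITY), the struct names, and classify_message are made up for illustration, and it assumes sender and receiver share the same struct layout, int size, and byte order, which is exactly the "native" situation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message kinds; a real application would define its own. */
enum { MSG_PING = 1, MSG_VELOCITY = 2 };

/* Common header: every message struct begins with these two fields. */
typedef struct {
    int type;  /* one of the MSG_* constants */
    int size;  /* total size of the message in bytes, header included */
} MessageHeader;

typedef struct {
    MessageHeader header;
    int sequence;
} PingMessage;

typedef struct {
    MessageHeader header;
    double u, v, w;
} VelocityMessage;

/*
 * Copy the leading header bytes out of a raw buffer and check that the
 * advertised type and size match one of the known message structs.
 * Returns the message type, or -1 if the buffer is not recognised.
 */
int classify_message(const void *buffer, size_t length)
{
    MessageHeader header;

    assert(buffer != NULL);
    if (length < sizeof header)
        return -1;
    memcpy(&header, buffer, sizeof header);

    if (header.type == MSG_PING &&
        header.size == (int)sizeof(PingMessage) &&
        length >= sizeof(PingMessage))
        return MSG_PING;
    if (header.type == MSG_VELOCITY &&
        header.size == (int)sizeof(VelocityMessage) &&
        length >= sizeof(VelocityMessage))
        return MSG_VELOCITY;
    return -1;
}
```

Once the type is known, the receiver can memcpy the whole buffer into the matching struct. Note that this does nothing for portability of the bytes themselves; it only tells you which struct you are looking at.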
"Those that start by burning books, will end by burning men."
I take it you've already read section 7.5 in MPIv2. If you haven't, now's the time!
Unfortunately, I know of no other MPI I/O implementations, other than ROMIO, that can simply be plugged into an existing MPI stack. You might want to ask around at the new project OpenMPI, a new-from-the-ground-up MPI implementation that is currently in development. I'd be curious to learn the level of MPI I/O support that they claim!
Assuming you are stuck with an MPI stack that only supports the "native" representation, the problem you face becomes one of data representation in general. As you know, there are bajillions of different ways of storing floating-point numbers, and if you write them to disk in native form, the files will only be valid for exactly that kind of CPU.
As a last resort, a brute-force solution is to write the numbers as human-readable text, and then parse them back in accordingly. It's a waste of file space, but there's no ambiguity in the datatype representation, and it is very tolerant of floating-point differences between machines.
-1.2345234523452345
2.345634563456365e+13
-3.2121212121e-24
And so on.
This shouldn't be much of a hotspot in your code, since ideally it would only be done at start, stop, and checkpoint time. Also, if you need parallelism, and don't care about wasted file space or future precision improvements, you could use a fixed-length string for each number (with much padding), thus allowing you to read your numbers random-access instead of sequentially.
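Here is a rough C sketch of the fixed-length idea. The 24-character field width, the function names, and the one-number-per-line layout are my own assumptions for illustration; the file is assumed to be opened in binary mode so that every record has the same size on every platform.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed record layout: a 24-character number field plus a newline. */
#define FIELD_WIDTH 24
#define RECORD_SIZE (FIELD_WIDTH + 1)

/* Write count doubles, one per fixed-width line. Returns 0 on success. */
int write_values(FILE *fp, const double *values, size_t count)
{
    assert(fp != NULL && values != NULL);
    for (size_t i = 0; i < count; i++) {
        /* %.16e prints 17 significant digits, enough to round-trip an
           IEEE 754 double, and never needs more than 24 characters. */
        if (fprintf(fp, "%*.16e\n", FIELD_WIDTH, values[i]) != RECORD_SIZE)
            return -1;
    }
    return 0;
}

/* Read the value stored in record number index. Returns 0 on success. */
int read_value(FILE *fp, size_t index, double *out)
{
    char record[RECORD_SIZE + 1];

    assert(fp != NULL && out != NULL);
    /* Every record has the same size, so its offset is a multiplication. */
    if (fseek(fp, (long)(index * RECORD_SIZE), SEEK_SET) != 0)
        return -1;
    if (fread(record, 1, RECORD_SIZE, fp) != RECORD_SIZE)
        return -1;
    record[RECORD_SIZE] = '\0';
    *out = strtod(record, NULL);  /* strtod skips the leading padding */
    return 0;
}
```

Because the offset of record i is just i * RECORD_SIZE, each process could seek to its own slice of the file without reading anything before it.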
Hope this helps!
Josh
Dr. Demento On The 'Net!
Now, we move on to the portable I/O. The vast majority of scientific software (which is, in turn, the bulk of MPI-based software) uses the Hierarchical Data Format. There are two versions worthy of mention - HDF5 and Parallel HDF. Both support MPI in their operations. Compile HDF5 with MPI support, and you have something that will support platform-independent atomic and compound data types.
Of all the options, HDF5 (from the NCSA) is the most widely used. I would say that the majority of scientific and distributed software out there that uses platform-independent typing uses HDF. So does the grid computing system Globus. The other platform-independent complex data typing libraries, CDF (from NASA) and NetCDF (from UniData), are rarely used. Indeed, the next generation of NetCDF - version 4 - will be built on top of HDF5. There's a link to the development site and the source code on Freshmeat.
Less widely used, but still very significant, is the Transparent Parallel I/O Environment. I am not 100% sure if this supports MPI; it's been a while since I've used it and I never put in the dependencies on Freshmeat for it.
Depending on what is being done, PETSc may also be worth checking out. This supports MPI-based differential equations.
Globus can use MPI for communication and then handle the I/O directly. This means you only have to write your interface for one API, not one API per type of operation. Main problem is that Globus has a fairly large footprint, so you might not want to do that unless the project is large enough to warrant that kind of sophistication.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
I was going to suggest string representations, too... I am working on an MPI project that deals with passing a lot of stuff around, and found that the method of structure passing in MPI caused us to have to represent the structure specifically byte-by-byte anyway, so we have just stuck with doing everything as character arrays in specific formats...
The main benefit for us was that our message passing code became generic, and we got the side effect of passing large values between machines without respect for endianness or word size.
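For what it's worth, a minimal C sketch of that kind of character-array format could look like this. The Sample struct, its field names, the 64-byte buffer, and the space-separated layout are all invented for the example, not what our project actually uses.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed fixed buffer size; large enough for the three fields below. */
#define WIRE_SIZE 64

/* Hypothetical payload, for illustration only. */
typedef struct {
    int cell_id;
    long long step;
    double pressure;
} Sample;

/* Encode the sample as three space-separated text fields.
   Returns 0 on success, -1 if the text would not fit. */
int encode_sample(const Sample *s, char wire[WIRE_SIZE])
{
    assert(s != NULL && wire != NULL);
    int n = snprintf(wire, WIRE_SIZE, "%d %lld %.16e",
                     s->cell_id, s->step, s->pressure);
    return (n > 0 && n < WIRE_SIZE) ? 0 : -1;
}

/* Parse the text fields back into a sample. Returns 0 on success. */
int decode_sample(const char wire[WIRE_SIZE], Sample *s)
{
    assert(s != NULL && wire != NULL);
    int fields = sscanf(wire, "%d %lld %lf",
                        &s->cell_id, &s->step, &s->pressure);
    return fields == 3 ? 0 : -1;
}
```

The buffer is plain characters, so it could be sent as WIRE_SIZE elements of MPI_CHAR, and neither byte order nor word size on the two ends affects what the receiver parses.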
hope that helps,
dave
Have you tried reversing the polarity?
"It's too bad that stupidity isn't painful." - Anton LaVey
http://www.verarisoft.com/
I work there and I worked on our MPI-IO implementation. I'm sure we'd like to find a way to help you out if you aren't against paying for the software.
So it's a place for socially impaired, would-be geeks wasting their early adulthood modding cases with cold cathode tubes? Or is it a place for outrageous infomercial placements between the odd "geekly spun" drivel on some rehashed, long-gone fad? Or is it a place where, once again, would-be IT geeks dispense their judgements on the latest Linux install screenshots... bitterly chastising whoever dares to perturb their self-assessed expertise with questions and topics they can't even fathom? Your attitude is what makes up generalist TV: nothing too complicated to challenge the audience, as it might be offended by it.
I wonder who is the mastermind behind all the crap I do - Altan
- We properly wrapped it such that I/O requests are of type MPI_Request, not MPIO_Request. Hence, you can actually progress IO requests, generalized requests, and point-to-point requests in a single MPI_WAITALL (or MPI_TESTALL, or any of the other variants)
- Our MPI-2 IO support is based on a component framework -- so replacing ROMIO is not only easy, it's encouraged! We had always intended ROMIO to be a stopgap solution until we could implement "something better" (as yet to be defined). We would love to have someone with expertise in this area to a) help define what a "better" component interface should be (our ROMIO interface is a simple one-to-one function mapping), and b) write one or more components to implement this in a generic and/or proprietary way.
That's a long way of saying: "E-mail me and let's talk."