Seeking Multi-Platform I/O Libraries?
An Anonymous Coward asks: "I'm just getting ready to plunge into a new project, and joy of joys have been given complete freedom when it comes to the implementation language - so long as the program will build and run on both x86 Linux and Windows. Now, I don't need a GUI, this is systems stuff only (processing binary executables in fact, so lots of bitfiddling and big nasty algorithms over hairy data structures) so pretty much all I need are standard IO libraries. C is currently at the top of my list..but what other language should I be looking at? I'm happy to learn a new one, and have the go ahead to do it..like I say, they want absolute speed. Can someone suggest a better language? C++ is out, it does come with a speed hit (using C++ properly anyway, not as a souped-up C). If I'm gonna take the speed hit, I may as well consider something like Ocaml which might let me claw the speed back with better algorithms and data structures.."
[ in my best announcer voice ]:
Let's get ready to RUMBLE!!!
--
Evan
"$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
> Now, I don't need a GUI, this is systems stuff only (processing binary executables
> in fact, so lots of bitfiddling and big nasty algorithms over hairy data structures)
Sooo, you are writing a Virus Scanner ...
You will likely find that algorithmic improvements will gain you more speed than IO library efficiency, as long as you avoid VB. Heck, I'd even strongly look at Java with a good JIT. Don't write off anything 'till you've tried it.
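To make the point about algorithms concrete, here is a minimal sketch in Python (picked only because it is short; the function names and the sample bytes are made up for illustration). Both functions find the 4-byte words that occur more than once in a buffer. The first compares every chunk with every later chunk; the second counts them in a single pass with a hash table. No I/O library tuning will close a gap like that on large inputs.

```python
from collections import Counter

def repeated_words_quadratic(data, width=4):
    """Compare every chunk with every later chunk: O(n^2) comparisons."""
    chunks = [data[i:i + width] for i in range(0, len(data) - width + 1, width)]
    repeated = set()
    for i, a in enumerate(chunks):
        for b in chunks[i + 1:]:
            if a == b:
                repeated.add(a)
    return repeated

def repeated_words_linear(data, width=4):
    """Count chunks in one pass with a hash table: O(n) on average."""
    counts = Counter(data[i:i + width]
                     for i in range(0, len(data) - width + 1, width))
    return {chunk for chunk, n in counts.items() if n > 1}

# Made-up sample: two words appear twice, one appears once.
sample = b"\x90\x90\x90\x90" + b"ABCD" + b"\x90\x90\x90\x90" + b"WXYZ" + b"ABCD"
print(repeated_words_quadratic(sample) == repeated_words_linear(sample))
```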
meh.
I know it's a bit of a stretch, but consider Python. Prototype the heck out of the system in Python, profile the application, then recode the bottlenecks in C. Use SWIG to generate your interfaces. Easier to program, easier to extend, easier to read/maintain. Shorter programming time, too.
You'll be happier, your fellow programmers will be happier, your successor programmers will be happier, and the chewy parts of your code will still be really fast. Think about it.
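The "profile first" step might look roughly like the sketch below. It uses the standard cProfile and pstats modules; checksum_slow, load and main are hypothetical stand-ins for your real code, not anything from an actual project. Whatever dominates the report is the candidate for a C rewrite.

```python
import cProfile
import io
import pstats

def checksum_slow(data):
    """Deliberately naive byte-by-byte loop: the kind of hot spot to look for."""
    total = 0
    for byte in data:
        total = (total + byte) & 0xFFFFFFFF
    return total

def load(size):
    """Stand-in for reading the input; just builds a predictable buffer."""
    return bytes(i & 0xFF for i in range(size))

def main():
    return checksum_slow(load(50_000))

# Run main() under the profiler and keep its return value.
profiler = cProfile.Profile()
result = profiler.runcall(main)

# Render the statistics into a string, sorted by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats()
print(report.getvalue())
```

Only after a report like this points at a specific function does it make sense to reach for C.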
Hi, yes! I have a similar question: What is the best language for What I Want To Do? It needs to be able to handle floating point numbers. I don't want to use perl, because it's slower than C but messier than Smalltalk. Also, should I use vi or emacs to edit my source?
But seriously.. Every language provides standard support for file IO, unless it's totally half-assed*.
If you actually want to get helpful answers, you might provide a little more information. For instance: How much analysis will you be doing on the files? How much data are you dealing with? Is this probably going to be blocking on input all the time, or does run speed actually matter? How large and/or complicated will your program be? Does cost of deployment really matter?
* Or halfway totally assed, or whatever.
Trees can't go dancing
So do them a big favor
Pretend dancing stinks!
Yes and no. Python is a nice language, and potentially useful. However, where I work we have a "legacy" system written in python with SWIG (interacting with C++).
[Note: I'm only posting anonymously to protect my identity. There are certain political factions at work that read /. and would be very unhappy with me discussing this.]
The problem is that under large workloads (which is normal for us) you end up with python spending more time marshalling and unmarshalling objects. It's a PITA. I blame this mostly on SWIG (which I am NOT a fan of. Don't get me started on what the maintainers consider good development practice.)
Python's a great choice if you can do it all natively. It's also a great language to prototype in and then "translate" to another language like C++ or C or Java. (depending on task and preference.) But I wouldn't do the python+swig thing.
Yes, I would really recommend O'Caml. Here's why:
If you just write the same program you would have written in C, the speed will be quite good, probably about 20% CPU-slower than C. (And if your program is IO-heavy, you might not notice this at all.)
If you have any sort of limited time or interest (as most projects do), you'll be able to write a much better program in O'Caml than you would in C, because:
- Because it's safe, you won't need to ever spend time tracking down or debugging core dumps or memory leaks. Because it's statically typed, a large percentage of bugs are caught at compile-time.
- If your program is interacting with the network, you won't need to worry about buffer overflows, format string bugs, or most of the common security problems.
- O'Caml has a much richer core language than C, with support for algebraic datatypes, pattern matching, higher-order functions, threads, modules, and objects. You can do a lot of great stuff with these.
- O'Caml has a nicer (though not as nice as, say, SML) module system, which keeps your program from getting unmanageable, and helps isolate faults to a particular module.
And by better, I also mean faster -- development wisdom says that algorithms and data structures are what matter most, not just the instruction-level efficiency of your code.
Of course, if you don't know the language, then it will have a higher startup cost for you. But I think it's worth it; you'll learn a different programming style that can help you think in new ways even when you're writing code in Old School languages. =)
Of course, it has a portable IO lib - just because the corresponding module for more low level stuff is called "Unix" doesn't mean that it isn't available on Windows as well, with some restrictions.
Programming can be fun again. Film at 11.
I'm really very curious why you decided that c++ is out. I understand that the common (mis)perception is that c++ is slower - but let me ask this: Have you ever benchmarked it? If not, then I strongly suggest that you don't discount c++ out of hand. It has the cross-platform io facility of which you speak (streams), already has all the (completely debugged) algorithms and advanced data structures. Look, nothing is going to be faster than c (except for hand-tuned assembly) - If you absolutely need every little bit of performance, then don't bother with a language other than c. But, if you're looking for a language nearly as fast, with a complete template and streams library, that's portable, then you ought to seriously consider c++. (btw, I've written extensive projects in c++ (25000+ lines) - There isn't much performance difference, and the benefits to using it far outweigh any other penalties.)
Try using Occam with a transputer board or 2 plugged into your PC
Wouldn't it be nice if schools got all the money they wanted and the army had to hold jumble sales for guns
Java is very portable and can do all that bit fiddling just as well as C. The syntax is very similar to C, so it shouldn't take long to adapt.
Once you have written the program for Linux, the exact same code would work on Windows. Write the program once, not twice. Save yourself some time.
You won't have to worry anywhere near as much about messing up a pointer somewhere or about allocating the wrong amount of memory.
Performance? If you're worried about performance, then you have not used a recent copy of Java. Find Java 1.3 or 1.4 and try it for yourself. I've got a Java program that scans through about 6,500 Novell user accounts in under two minutes. Performance is not a problem unless you want speedy GUI.
Since you're not needing a GUI, I think Java would be an excellent choice.
Ouch! The truth hurts!
If you know C best, use C. If you know Java best, use Java. Ditto for Perl.
Really.
The better you know a language, the faster you will be able to write your app, the more optimized it will be, fewer bugs, etc. This is common sense.
(I was going to have a really smart-assed comment on Logo, but I'll reserve that for later....)
Use a threaded compiled FORTH implementation, if you fancy learning a new language. In fact, you can start off by writing yourself a little FORTH nucleus in C to bootstrap itself. You can easily add new primitives (i.e. machine code) by simply writing a new function and plugging it in to the dictionary, or even a C function with inline assembler, if speed is that important. You can write your own words to allocate and initialise memory for all the data structures you need etc. It'll be a great learning experience.
I'm out of my tree just now but please feel free to leave a banana.
This is more than just a language question. It looks like you're starting to get the standard responses already for Java, C++, etc.
But all of these opinions presume that you're fairly experienced in these languages. Ignore them.
Language experience/familiarity is THE factor here, so don't discount it. Someone who has been eating and breathing Java would likely produce speedier code than someone who is just learning C, for example.
Your employer/client wants SPEED. This project involves hairy and complicated bit fiddling. I would suggest NOT using this project to learn a new language, for the risks outweigh the rewards in this situation.
If you choose to use a new language for this critical job, you're setting yourself up for disappointment. Do not forget that you're going to have to go through all the growing pains associated with a new language. You're going to spend weekends tracking down (and learning from) all the newbie mistakes one makes with a new language. You are going to encounter new and unfamiliar bugs at all levels - logical design, physical design, semantic, syntactic.
Do you really want to spend your nights and weekends figuring out what the heck is throwing some particular JAVA exception seemingly at random? Why your C++ function template specialization is being ignored?
Learning a new language is exhilarating, but that will quickly turn to FRUSTRATION when you run into that weekend-long show-stopper bug.
With your product being measured by performance, and with deadlines looming... When it comes down to crunch-time, I think the choice is OBVIOUS!!
Choose a different, fun project to learn a new language. But for this product you're delivering, I would encourage you to stick with the tools you know and love.
Best,
Captain Abstraction
stdio and UNIX I/O (depending on what you need) will work on both Windows and UNIX.
From the C language perspective, and ignoring GUIs, Windows and UNIX are very similar. Windows has a UNIX compatibility layer so for the most part you should be okay with I/O functions.
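The one trap worth knowing about is the open mode: on Windows, a file opened in text mode gets newline translation, so binary data has to be opened with the "b" flag ("rb" in C's fopen, and likewise in most languages built on it). A minimal sketch in Python, assuming all you want is to tell ELF from DOS/PE executables by their leading bytes; sniff_executable, write_temp and the sample payloads are made up for illustration:

```python
import os
import tempfile

def sniff_executable(path):
    """Guess the container format from the first bytes of a file."""
    # "rb" means binary mode: bytes come back exactly as stored on disk,
    # with no newline translation on any platform.
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"\x7fELF":      # ELF files start with 0x7F 'E' 'L' 'F'
        return "ELF"
    if magic[:2] == b"MZ":       # DOS/PE executables start with 'M' 'Z'
        return "MZ/PE"
    return "unknown"

def write_temp(payload):
    """Hypothetical helper: write payload to a temp file, return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return path

# Made-up stand-ins for real executables; only the leading bytes matter here.
samples = {
    "elf": write_temp(b"\x7fELF" + b"\r\n\x1a\x00" * 4),
    "pe": write_temp(b"MZ" + b"\x00" * 62),
    "text": write_temp(b"#!/bin/sh\n"),
}
results = {name: sniff_executable(path) for name, path in samples.items()}
for path in samples.values():
    os.remove(path)
print(results)
```

The same source runs unchanged on Linux and Windows, which is really the whole point of sticking to the standard library.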
Even though both NT and Linux are POSIX compliant, the implementations have enough quirks to cause trouble, especially with regard to multi-threading libraries. As long as you use C or C++ (or any language that does not provide both a rich threading interface and good runtime support), consider using the NSPR libraries, which are meant to provide a rich set of cross-platform interfaces.
#include <stdio.h>
Says it all really.
Cheers,
Ian
I had a lot of good experiences using memory-mapped files. If you need random access to file body as opposed to sequential (streams), pick whatever has MM files in it. That would be C, C++, or Java SDK 1.4.
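For anyone who hasn't used them, random access through a mapping looks roughly like this. The sketch uses Python's standard mmap module rather than any of the languages above, purely to keep it short; the temp file and its contents are made up, and the idea carries over to mmap() on Linux or the file-mapping calls on Windows.

```python
import mmap
import os
import struct
import tempfile

# Build a made-up 4096-byte file: the byte values 0..255 repeated 16 times.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(bytes(range(256)) * 16)

with open(path, "rb") as f:
    # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        size = len(view)
        last_byte = view[size - 1]                       # index like a byte array
        (word,) = struct.unpack_from("<I", view, 1024)   # little-endian u32 at offset 1024
        offset = view.find(b"\xfe\xff")                  # search without explicit reads

os.remove(path)
print(size, last_byte, hex(word), offset)
```

There are no seek/read pairs anywhere: the OS pages data in as it is touched, which is why this suits jumping around inside a file.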
Nevertheless, C++ can be fast, powerful, and simple as well. People have problems with C++ if they don't understand it well or if they work with people who don't understand it well. That is a real problem (most commercial and open source C++ programs and libraries are awful), but don't blame the language.
Ok, so Java isn't the greatest at performance, but it is cross-platform.
Apache 2.0 is based on an excellent platform independent IO library (and many other cross platform data types, data structures, etc), the Apache Portable Runtime. It's written in C, and it's fast.
http://apr.apache.org/
Ummmm... no.
IMHO Python has 3 flaws that make it unacceptable.
1. Using indentation instead of brackets to define blocks. I used to think this was a good idea, then I actually looked at code written this way. Most editors have a "find matching" feature which I find handy to navigate code. No brackets and that feature is almost useless. Now where did that function start?
2. A lack of variable declarations. First, variable declarations are free documentation. They're a really nice way to know what variables someone else's code uses. Secondly, typos in long variable names can create annoying bugs. If the compiler/interpreter will find them for me so much the better.
3. Function definitions in the class definition. This makes class definitions tend to be multiple screens which obscures the layout of the class.
Your "speed" priority, and the binary processing bit, got me almost sold, and then
I saw O'Caml!!
You quiche eating wanker, how COULD you forget assembly? Isn't that what programming is
all about? And WHY are you comparing C, a fine assembly macro language, to O'Caml, a
shitty ML dialect used by equally hard-wanking mathematicians and abstractly thinking
creatures? If these wankmaticians knew how the world operated, they would not
have invented recursion let alone APPROVED of induction as a sane, cornerstone
principle in their so called "art". Induction is only possible as long as
the "counter" register can hold your index, and recursion is the crackwhore narcissistic
twin sister of iteration (there is nothing she does, iteration can't do with
a well placed label and a jump.)
Listen to me son, read Quine, Boole and DeMorgan, get the manual to your processor,
and "script" at the level of the ONE TRUE ABSTRACTION LAYER.
How can you use it improperly? C++ is an object capable language, not a strict object oriented language. If you want to use objects, then fine. If not, then please don't.
Object oriented development is a tremendous thing, useful for many things, and a marvel of overcoming complexity through abstraction.
BUT, OOP is not the solution for everything. There are many problems that don't need an object structure, and should be written another way. Above all, drop the notion that C++ should be used only a certain way to be proper. The latest cool feature of C++, the Standard Template Library, isn't even object oriented - it's GENERIC, because that type of programming just was the right thing to do for that library.
If tits were wings it'd be flying around.
Have you actually tried using Python? If you have, it's probably for not enough time or using the wrong tools
1. Using indentation instead of braces kills the religious "coding convention" wars before they have a chance to start. It's easy to read, it makes what you read and what the parser read consistent (Never chased a mismatched indentation/braces case, have you?), and it just plain works. Where did that function start? Any editor worth its while can tell you that, most of them already have a macro that does this. If you ever used Scintilla/SciTE you'd probably never go back to "find matching" only style editors unless you were forced to - collapsing functions makes a lot of sense even in the curly brace world (more so in Python's indentation world).
2. There are add-ons that can enforce that, but that would be missing the point. The Python interpreter and language specification go to some length to catch this kind of error, and although it's a long way from e.g. C or Java, it caters for the common cases. Typos in long variable names may create annoying bugs, but ones that are _always_ easy to identify and fix. True, they wait for run time rather than compile time; personally, the number of bugs of this kind that I get is consistently low enough for this not to matter (and, since Python code tends to be an order of magnitude shorter than any other language except Lisp or APL, it's more than worth it. Plus, there's a Lint for Python if you insist). Variable declarations are NOT free documentation. "Object my_object = new Blah();" is not more informative than "my_object = Blah()". It's the variable's name that's the documentation, rarely its type.
3. Oh jesus. C++, Java, Smalltalk, LISP and just about any other language do this too. What language are you using? Plus, try scintilla and you'll be amazed at what a GOOD language sensitive editor can do (for any of the above languages).
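For anyone following the argument in point 2, here is a small sketch of what a misspelled variable name actually does in Python. The function names and the numbers are made up. A typo where the name is read fails loudly the moment that line runs; a typo where the name is assigned just binds a new variable, and that second case is the one a lint tool is useful for.

```python
def total_size_misread(section_sizes):
    total_size = 0
    for size in section_sizes:
        total_size += size
    return totl_size   # typo on a read: NameError when this line runs

def total_size_misassigned(section_sizes):
    total_size = 0
    for size in section_sizes:
        totl_size = total_size + size   # typo on a write: silently binds a new name
    return total_size

# The misread name is reported at run time, with the offending name in the message.
try:
    total_size_misread([1, 2, 3])
    outcome = "no error"
except NameError as exc:
    outcome = f"NameError: {exc}"

# The misassigned name raises nothing; the function just returns the wrong total.
silent_result = total_size_misassigned([1, 2, 3])
print(outcome, silent_result)
```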
Try TCL.
For me, using TCL my performance increased by 60%
(especially when using its [Incr TCL] OO Extension)
TCL works on most unices, Windows, Mac, VMS, Palm Pilot...
Tk graphical library is so successful that other languages
(perl, prolog, python) are using it.
NASM is a great multi-platform assembler with useful macro syntax, where you could do defines for Windows and Linux so you call the same macro names for both. Do the low-level stuff in ASM as .dll's, .so's, and you can write the rest of your code in your language of choice and call your libs.
"As flies to the wanton boys are we to the gods; they kill us for sport." - William Shakespeare, King Lear
If all this should have a reason, we would be the last to know.
QBasic.
You may want to try AT&T's sfio, coauthored by David Korn of the shell by the same name fame.
I stand somewhat corrected.
1. I don't have that kind of trouble with braces. Though I will accept that a good editor can handle that.
2. Variable declarations make fairly decent free documentation since all the variable names tend to be in one place.
3. However C++ does not require it.
As far as editors go. I use vi/vim. There may be a mode for collapsing things. I haven't noticed it yet.
A smart C++ programmer can use template metaprogramming in a library like Blitz++ to automatically build code optimised for the job. To write the equivalent code in C is possible but it's much more laborious and harder to maintain.
There are good reasons not to use C++. Performance isn't one of them.
-- SIGFPE
If you're really looking for performance, then you should look no further than C#. I wrote a C application and a C# application to compare the performance. The C app was over 1500% slower than the C# app. Then of course, I did have some infinite loops in the C app ;-)
1. Bad signature
2. ?????
3. Profit
Write it in Intel assembler. All the bit fiddling you want or not. Then simply cobble up some I/O code for the target platforms you want to run on. Should take you about an afternoon - if you know what you are doing, which it sure doesn't sound like from your dumb question.
The best tool for low-level IO and cross-platform stuff is VB. A lot of people don't realize it's cross platform, but it can be used to develop applications for windows 3.1, 95, 98, Me, XP, and NT 4/5.
You can choose to use a general-purpose language which has a good spread of capabilities, or you can go with a best of breed language in the area you are trying to work in.
For general projects, I use a mix of Python and C++. I'd say the best of breed languages for text would be Perl, math would be Haskell, and for getting down to the metal would be Assembler.
For what you are trying to do, the no-brainer choice would be souped-up C, i.e. C which uses a few C++ features to make your life easier.
"Well, put a stake in my heart and drag me into sunlight."
As you're going for Windows, Linux and x86, I think Borland's Kylix would be a nice fit.
http://www.borland.com/kylix
Some of the K programming maxims are that memmap is better than read/write (the native file I/O is memmap), operating over bulk data is better than scalar data (the language is built around bulk operators), and terse code is good.
There is a warning, though. K is very elite and may be too elite for you (it was for me at first), but it is very easy to learn.
I coded in QBasic for 6 years before I learned Pascal, with never an indention. Now that I've moved on to "real, professional" coding in C, C++ and Java, I can't help but look back and notice that the stuff I could do in QBasic was a lot cooler than the stuff I can do in C and Java today.
:(
I'm a worse programmer today, and the worst part is, I can't remember any of it...
ACE is an OO Network Programming Toolkit in C++
In the company I work for our server products are just a recompile away from multiple UN*X flavors, including GNU/Linux, and Windows, using ACE.
Maybe it can help you.
No one's mentioned Borland's tools, but I think they'd fit the bill. Borland has great compiler technology, and it will compile and run cleanly across Linux and Windows (possibly with a few {$IFDEF}s). It has an I/O library that's as capable as C's (maybe a bit more wordy sometimes). Developing and debugging in Kylix is *much* quicker, in my experience, than using gcc/gdb. It's truly compiled, the compiler is lightning fast, and the integrated debugger is quite a bit more efficient than gdb based solutions.
FORTRAN 95.
Vim 6 has very nice folding functionality.
john
Take a look at this guy's page, some interesting benchmarks between a number of computer languages for a number of well known algorithms.
Use a Common Lisp compiler. As fast as C, as easy as, well, Lisp.
Lisp's reputation for inefficiency stems from the 60s/70s. It's actually blindingly fast.
Given the stated requirements, Ada 95 should be in the trade space. Only downside I can think of is that while there are several vendors for the Windows side, I am only aware of a single vendor for a Linux Ada 95 compiler (www.gnat.com).
You can download a non-supported version (windows and Linux) from
ftp://ftp.cs.nyu.edu/pub/gnat
or wait a few weeks for gcc 3.1 to be released (since the Ada 95 GNAT backend will now be included)
--- Liberty in our Lifetime
Don't bother with K - just a bastardized FORTH. Give me a break. How much do you make selling the commercial language 'K'?
Too elite for you or K.
It aint just another rave drug. You too can be a coder...
It has a few things going for it.
** Faster than a turtle
** Anyone can code it
** Doesn't show up during random drug testing