Python-to-C++ Compiler
Mark Dufour writes "Shed Skin is an experimental Python-to-C++ compiler. It accepts pure, but implicitly statically typed, Python programs, and generates optimized C++ code. This means that, in combination with a C++ compiler, it allows for translation of pure Python programs into highly efficient machine language. For a set of 16 non-trivial test programs, measurements show a typical speedup of 2-40 over Psyco, about 12 on average, and 2-220 over CPython, about 45 on average. Shed Skin also outputs annotated source code."
Until he addresses mixed types in n-tuples, this won't be useful for very many people.
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
As a UNIX admin, I was saddled with one of these kinds of things years ago, a DEC-BASIC to C compiler for UNIX. The output code quality was incredibly bad: machine generated variable and function names, bizarro nested struct/union/struct data structures, 400-line functions peppered with calls to 1-line functions. Completely unreadable. Thank $DEITY that project died quickly.
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
Why not pure assembler ?
See, it's all well and good to compile python to speed it up. The problem is, people are now saying that they can write efficient code in python just because it magically translates to C++, and because this translator is faster than other python compilers.
This won't be meaningful until a converted python script is compared to efficient code written natively in C++ in the first place.
StoneCypher is Full of BS
This is a good step to make Python run a bit faster, but I don't think it'll really make a huge difference.
The best way to get some speed and still keep the nice Python functions and layout is just to export the most heavily used functions to native code (C/C++).
I don't know if it's possible to take the C++ output and optimize it separately, but that way you would at least have a good start toward native code.
In short: Better, fast and easy, but not the best (if you can write native code)
My blog: http://www.redcode.nl
Among python programmers, I'm curious - how many use psyco (another python performance enhancement tool) for their projects? I fiddled with it a while ago (it didn't work because of a C module that it didn't like), but never had a compelling reason to go back to it. Performance optimization has never been important enough for my applications to merit the effort.
It's not wasting time, I'm educating myself.
I will have to explore it more, but it will be intriguing to see how they handle things like pointers and structs that are not in python.
Uh, why would they have to? This goes from Python to C++, not vice versa. If there are no pointers or structs in the Python code, why would they have to handle them? Certainly, it's quite possible that some Python variable types will be converted to pointers or structs in the output code, but that's orthogonal to the issue of Python not having them natively.
If you were trying to go from C++ to Python, then you'd have to convert C++ pointers and structs to some sort of Python data type, and your comment would make sense. As it is, I'm not sure what you were trying to say.
"The legitimate powers of government extend only to such acts as are injurious to others." Thomas Jefferson.
Well, it's not quite as bad as it sounds. He's seemingly only really forbidding incompatible mixed types in the same variable, a usage that isn't exactly extremely common.
A more significant roadblock, IMO, is that he can't handle mixed types in 3+-tuples, which is very common.
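To make the two restrictions being discussed concrete, here is a rough sketch in plain Python. It is based only on how the limitations are described in this thread, not checked against Shed Skin itself, and every name in it is made up for illustration.

```python
# Hypothetical examples; none of these names come from Shed Skin.

def consistent():
    """Implicitly statically typed: 'count' only ever holds an int."""
    count = 0
    for n in [1, 2, 3]:
        count += n
    return count


def incompatible():
    """The same variable holds an int and then a str.

    This is the 'incompatible mixed types in the same variable' case
    described above as forbidden.
    """
    value = 1
    value = "one"
    return value


# A 3-tuple with mixed element types (str, int, float), which is the
# everyday pattern the comment above says is not handled.
record = ("widget", 3, 2.5)
name, quantity, price = record
total = quantity * price

print(consistent(), incompatible(), total)
```

Both functions run fine under CPython; the point is only that the first one has a single obvious type per variable, while the second function and the tuple do not.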
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
Why would one ever need to do that? The goal is not to write C++ in Python, it's to compile Python to machine code via an intermediate Python -> C++ compilation.
"The way we can tell it's C# instead of Haskell is because it's nine lines instead of two." -- wadler
bzerodi's point, made with Zen-like simplicity, is that language choice should be made to minimize programmer time, not machine time. I am at least a factor of ten more productive with Python than with C or C++. I am also far more confident in the correctness of what I write per line of Python than with what I write per line of C/C++.
Yes, I have wasted some time staring at the shell waiting and waiting for it to return from some complicated Python routine. I know that compiled C would be faster, and hand-rolled assembler would be faster still. But I say to myself: hey, I wrote this code in a single afternoon, how many weeks of hair-pulling would it take to re-engineer this - and make it bug-free - in C? When I put it that way, I don't mind waiting the extra minutes for Python to do my dirty work.
As a previous poster mentioned, the ability to handle tuples of mixed-types is critical. I look forward to seeing great things from Shed Skin in the future.
Dictionaries are for loosers.
surely the best way to speed it up is to compile it straight to object code... c++ has to be compiled and just adds an intermediate step which will make things harder to debug...
Donald 'Duck' Dunn: We had a band powerful enough to turn goat piss into gasoline.
After four hours of tweaking, our expert C++ programmer was finally able to write something that beat our ten lines of Python code that took under five minutes to write. And it didn't beat it by much, whereas the first pass at a C++ version was an order of magnitude slower.
Which is why languages like python were written in the first place. They pretty much just make the underlying C calls anyways, but do so in a way that handles buffer overflows, pointers, etc., that pretty much make C/C++ so troublesome, hazardous, and hard to learn. I like java (a lot really), but nothing beats a good scripting language, like perl or python, to handle tasks like text manipulation. Python is especially good at using libraries, such as the imaging library, which are written in C anyways. How much faster can you get calling a C library from C than from python? I honestly don't know, but I can't imagine it's that much more. But when you add in speed of development, safety, and even portability, it's powerful.
Python's OOP is also a feature that makes it far more attractive than perl for me. Perl does OOP, but it's not as clean as python's, and I don't think it supports all the OOP features either. Doing GUI's is not the strength of any scripting language, but it depends on what you need to do. You can write a native frontend and embed python into a C or even a java application.
My problem? I was perfectly gruntled, until some numbnuts came by and dissed me.
Why? Read the linked page? Says it all. Violates most any Python code of any complexity out there. So if it doesn't convert Python code from the real world, what is it for? Making Python coders learn enough about C++ to remember the limitations and write/rewrite Python code to use it?
What the Python C/C++ interested people REALLY need is a book written by a group of Python AND C/C++ masters which teaches the two simultaneously showing complementary methods of doing any given thing working from beginner to advanced and I DON'T mean "How to turn your n00b Python code into C/C++ hotness" sort of viewpoint. I mean both taught simultaneously in synch showing how they can interchange and complement.
Software tricks for converting? Ultimately worse than not having them because it leads to horrible obfuscation because we don't know exactly what is going on when 13,412 lines of Python is turned into C++ because WE DIDN'T WRITE IT AND WE NEVER LEARNED C/C++. "Say Mike, that's great but you're the company code cowboy and you don't do C++ natively and I sure as hell don't read it being management so exactly what happens if this needs to be fixed? We've gone from importing open source code you couldn't read to writing our own open source code you can't read."
If my grammar and spelling are off, I am [distracted/tired/careless] (take your pick)
I love Python, but I hate the dynamic typing. It can be handy at times, but 99% of the time you make a variable to hold one kind of thing. Having the static typing would both improve performance (because the interpreter knew what you were up to) but would also eliminate bugs (because it would complain when I tried to set a double to "And now press...").
I'd love to see Python get optional static typing.
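As a rough illustration of what optional static typing can look like: Python 3 accepts type annotations on function signatures, but the interpreter does not enforce them on its own, so catching the bug described above takes either a separate checker or a manual check. The function names below are made up for the example.

```python
def set_price(amount: float) -> float:
    """Annotated as float, but the interpreter does not enforce it."""
    return amount


def set_price_checked(amount: float) -> float:
    """Hypothetical helper that enforces the annotation by hand."""
    if not isinstance(amount, float):
        raise TypeError(f"expected float, got {type(amount).__name__}")
    return amount


# The unchecked version happily accepts a string where a double was meant.
unchecked = set_price("And now press...")

# The checked version complains, which is the behavior asked for above.
try:
    set_price_checked("And now press...")
    rejected = False
except TypeError:
    rejected = True

print(repr(unchecked), rejected)
```

This only shows the bug-catching half of the wish; whether annotations buy any speed depends entirely on the implementation running the code.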
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
"Doing GUI's is not the strength of any scripting language..."
;)
This is why projects like pyGTK exist
"A truly wise man realizes he knows nothing."
Sorry, but without more details it would seem to me that your "expert" C++ guy wasn't an expert. Can you describe the problem a little better? If what you say is true, I as a long term C++ programmer would consider switching, but I've looked at python, and I simply don't believe you. I'll grant that C++ is a nightmare for beginners with more pitfalls than an Indiana Jones movie, but once you know them, writing poorly performing code is unlikely.
http://rareformnewmedia.com/
Well, C# has unsafe arrays, while VB.NET only exposes them quite indirectly through the marshalling API. Some other language implementations also use some dose of reflection/late binding to implement certain features. You can sometimes avoid using these features, but this will sometimes result in code that is "non-idiomatic" in that language. I like the .NET framework, but it's no panacea for a language-agnostic future.
MSIL is machine code for a virtual machine rather than a physical one. This distinction makes no difference to the point the GP was making.
In this world nothing is certain but death, taxes and flawed car analogies.
It is worth mentioning that one of the original implementations of C++ (if not the very first) was "cfront", a C++-to-C converter. I see this as a much easier way to get a new language implemented quickly, as you can take advantage of the common functionalities already implemented in the target language of the converter. Although Python is not a new language, using it as a compiled language is new, and thus I believe it is comparable to being a new language for this argument. C++ and Python have a lot in common, which makes C++ a very suitable target language for a Python-to-[compiled_language] converter.
If this converter proves to be successful, I believe that a GCC frontend will be written eventually. There are probably potential optimizations that would be difficult or impossible to implement any other way.
Some may think that the dynamic nature of Python may preclude its inclusion in GCC. Technically, all that would need to be done is to have a runtime to handle dynamic things, similar to how Objective-C (for which there is GCC support) has a runtime to handle message passing and late binding. However, a large portion of the potential efficiency of a compiled version of the language would be lost to these dynamic capabilities; luckily, a compiler can detect when things are implicitly static (in fact, this converter is limited to implicitly static constructs), and optimise them to be truly static at compile-time.
As another poster already said, file I/O is a bottleneck regardless of ANY language. So, try something different. Real-time h264 decoding for example.
This sig does not contain any SCO code.
"Times faster" is a unitless quantity.
http://outcampaign.org/
Python is a terrific prototyping language (and lots of other things besides.) As a C++ coder I've been using it for prototyping stuff that will eventually be integrated into a larger application and therefore MUST be translated to C++. So what I'd like to see is a tool (written in Perl, just for the fun of having a linguistic threesome) that just does a light gloss on Python syntax to get me most of the way to human-readable C++. That would be far more useful (to me) than this thing, which sounds more like f2c, whose output could cause brain damage in humans and cancer in rats, or possibly the other way around.
Blasphemy is a human right. Blasphemophobia kills.
"boo", a .NET language, allows dynamic typing by specifying 'duck' type. It achieves near-c# speed because all other data are statically typed.
It's a great language -- combining the benefits of Python, Ruby, and C# -- and it's wonderful for proto-typing in the .NET world.
Assume that it takes:
- 4 hours to write a given program in python, 32 hours to write same program in C++
- 10 seconds to run the python program, but just 2 seconds to run the faster C++ program
- the program is run 20 times a day
- assume the developer time costs as much as the time of the person that runs it
Ok, so it'll take 630 days of running this program for the faster C++ program to make up for the extra time to develop it. So, if you can wait two years for a payback then C++ is the way to go, otherwise code it in python.
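For anyone who wants to check the 630-day figure, here is the arithmetic as a small sketch. The numbers are the ones assumed in the list above; the function and variable names are just for illustration.

```python
def break_even_days(py_dev_hours, cpp_dev_hours, py_run_s, cpp_run_s, runs_per_day):
    """Days of use before the faster program repays its extra development time.

    Assumes, as above, that an hour of developer time costs the same as
    an hour of the time of the person running the program.
    """
    extra_dev_seconds = (cpp_dev_hours - py_dev_hours) * 3600
    seconds_saved_per_day = (py_run_s - cpp_run_s) * runs_per_day
    return extra_dev_seconds / seconds_saved_per_day


# 28 extra hours = 100800 seconds; 8 seconds saved x 20 runs = 160 seconds/day.
days = break_even_days(4, 32, 10, 2, 20)
print(days)  # 630.0
```

Changing any single input moves the answer a lot: at 200 runs a day instead of 20, the same calculation gives 63 days.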
There that was easy. Ok, any other simple problems out there? Which editor you should use? What's just the right amount of comments per program? Which is better - cvs or subversion?
As have I, but I'd certainly rather manage in languages that support first order data structures, "for each" loops for iterations, proper disjunctive types, pattern matching, and so on. C++ is better than it used to be, but all the data structures and algorithms in the standard library barely hold a candle to the expressive power of many functional programming and "scripting" languages.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Indeed, VB.net and C# have very similar features and capabilities, and if there are big performance differences between them, it's because the authors of one of the compilers screwed up.
But the other posters were arguing that their performance and capabilities should be identical because they both compile to MSIL, and in fact that any language that does so would have equal performance and capabilities. Which is just silly; hence my silly IRock.net example. For a less silly example, Managed C++ certainly has different capabilities than VB.net or C#.
VB.net and C# produce very similar performance, because they are very similar to begin with. Not because their existing compilers target the same virtual machine.
Oh, hell.
That'll teach me to hit submit without checking the preview. I lost a big and important chunk of the reply after operator< because I forgot to write out the entity for <. Here's a repaste; yay form buffers, boo no edit button for the first five minutes of a post.
-----------------
That's the wrong comparison to make, because it assumes that the C++ programmer has unlimited time to make his C++ code efficient and correct.
Well, yes and no. I actually got into this else-thread; there are a hell of a lot of programming jobs where the programmer actually does have time to make their code correct. Not everything behaves like the web; when you're writing a video game, an operating system, a database, embedded or realtime control software, or in fact many many other things, performance just isn't sacrificable.
(And, actually, in most situations, a programmer has all the time in the world to make their software correct; the number of software houses which will willingly release software containing flaws is vanishingly small. Most released flaws are the result of bad development practice and insufficient testing methods, not short schedules.)
In real life, programmers have time constraints, and under given time constraints, the Python program will often be faster than the C++ program.
Yeah. If you'd take a look at the numbers, though, the vast bulk of software is actually embedded software. Embedded software can't tolerate execution delays. Behind embedded software, the next largest group is in-house software; that kind of software generally can't tolerate production delays. Those two groups are a wonderful example of the extreme divergence in needs in development - python is exactly the right thing to do for the second group, but exactly the wrong thing to do for the first.
As far as the Python program often being faster than the C++ program, that sounds an awful lot like an expectation, rather than experience.
In fact, even without time constraints, C++ code often ends up far less efficient than the optimum possible, simply because using the optimal algorithm or memory management strategy is so hard in C++ that programmers can't do it.
Yeah. Now it's time for me to call bullshit. This is such a weirdly interesting myth, that algorithms and memory management in C++ are harder than they are in other languages, that I don't really know what to say.
See, implementing algorithms and memory management in C was really, really hard - you had to, well, keep track of a pointer and an offset. (Because most people confuse the overflows that come from bad engineering practices with something that's just magically too hard to do, presumably because they've never seen anything harder, and they can't imagine a world wherein garbage collection and pointers aren't the alpha and omega of memory management. Wait'll you try COBOL, FORTRAN or machine assembly, all of which are still common languages - in fact, all moreso, if you count all assembly languages as one, than python.)
That said, in C++ it's nowhere near that difficult. If you want garbage collection and can't be bothered to write new in the constructor and delete in the destructor, bust out a smart pointer. Get a container; it'll self allocate just fine, and ridiculously efficiently. Need something more complex, like pooling? No problem - pooling is two lines of code in operator new, or you could just use the policy class in Boost. Algorithms are in fact so fundamental to the design of C++ that there's a specific section on them in the standard, defining how they are to be implemented such that they magically and correctly attach to any correct container. They are trivially easy to implement; a naive bubble sort which is correct can be implemented in a single line of code, in a way that will work for any user defined type implementing operator< and on any ordered container.
Exactly what memory management
StoneCypher is Full of BS