The Evolution of Python 3
chromatic writes to tell us that O'Reilly has an interview with Guido van Rossum on the evolutionary process that gave us Python 3.0 and what is in store for the future. "I'd like to reiterate that at this point, it's a very personal choice to decide whether to use 3.0 or 2.6. You don't run the risk of being left behind by taking a conservative stance at this point. 2.6 will be just as well supported by the same group of core Python developers as 3.0. At the same time, we're also not sort of deemphasizing the importance and quality of 3.0. So if you are not held back by external requirements like dependencies on packages or third party software that hasn't been ported to 3.0 yet or working in an environment where everyone else is using another version. If you're learning Python for the first time, 3.0 is a great way to learn the language. There's a couple of things that trip over beginners have been removed."
and make everyone happy with Python 5.6
Shouldn't that be intelligent design? Otherwise we'd have way more python flavors.
I don't think will be a problem any more
I just started learning Python a few weeks ago when I got laid off from my QA job. I imagine I'm not to the point yet where the language differences between 2.x and 3.x are going to matter?
-- http://ninthagenda.com/
There's a couple of things that trip over beginners have been removed.
Ah, yes, I remember python tripping over me. It's actually pretty impressive that a snake figured out how to trip, why take out that feature? It seems like you're knocking it back a notch in the evolution toward legs.
The name Python originally came from Monty Python, so you're about 18 years late on that joke.
Spelling mistakes, grammatical errors, and stupid comments are intentional.
For those interested, IBM is running a primer series on the new language/runtime features.
There's also this older (but still relevant) PEP that explains things that did not change between the 2.x series and 3.0.
Personally, I'm not looking forward to migrating existing code bases (especially non-trivial ones) to 3.0, but I'm planning to do all new development against it (of course assuming that the various packages I use have ports).
For Python trivia lovers, here is the actual moment in time when 3.0 was let loose on the world. I'm such a sentimental geek :)
Web2.0: I love when people Flickr my cuil and digg my boingboing until my google is reddit and I start to yahoo
So if ( you are ( not ( held back by ( external requirements like ( dependencies on packages ) or ( third party software that hasn't been ported to 3.0 yet )))) or ( working in an environment where everyone else is using another version )).
The above sentence fragment is apparently a verbal quotation where Guido van Rossum forgot he used the word "if" when he was somewhere in the middle.
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
Python 3 mostly changes things that deal with unicode (i.e. it uses unicode as its "text" object, like Java).
If you don't care about unicode that much (e.g. you mostly deal with development tools, iso-latin1/ascii encoded files...) there is absolutely no rush to hop on the bandwagon. And perhaps you just hate unicode as a concept ;-).
I predict that the bandwagon will start rolling ~ Q2 / 2009, when toolkits like PyQt4 for 3.0 are materializing.
Save your wrists today - switch to Dvorak
I don't think so. One of the design ideas for python, IIRC, was to force "proper indentation" for "proper documentation".
I thought the same as you once, but I changed my mind. Now the braces just plainly, insanely annoy me.
Moreover, you cannot imagine how much time is wasted in typing something that has absolutely no meaning: you have to indent anyway. Braces are just a waste of your time.
So whitespace block delineation is finally out, in favor of braces? :P
This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
I think that whenever a group releases a new version of their language, they should strive to make it (mostly) backwards compatible. Not only does Python 3.0 change the way specific functions work, but it also removes specific language conventions and creates new ones in their places. This means that very large projects have a lot of work to do to bring their code over to the new specification.
The question is: is this work worth the upgrade to python 3.0? I'd say on the whole, the changes do not contribute enough to the usability of the language to make it ultimately a worthwhile transition to make. I haven't seen really any compelling features in Python 3.0 that would provide enough incentive for me to spend hours of grunt work making all my code workable in Python 3.0.
</my two cents>
Web Hosting: Unlimited storage and bandwidth: $5/month
On the other hand, I've spent at least a full work week of my life fixing problems due to whitespace. Guido made a major fuck up there- by removing braces but not strictly defining whitespace, he's created a language where it's possible to have two identical looking pieces of code do very different things. If he had said that it must be indented by exactly 1 tab or exactly 4 spaces or whatever other measure and everything else would throw a syntax error, it would have been fine. As it is I'd say about 15-20% of the time I spent doing Python was spent fixing these kinds of bugs.
I still have more fans than freaks. WTF is wrong with you people?
Are Linux distributions that include packaged python versions and apps going to permit both 2.x and 3.x python versions to co-exist so all the apps (including local additions) don't have to be ported on the same day?
Yes.
Example program:
class MyClass(object): #{
    def myfunction(self, arg1, arg2): #{
        for i in range(arg1): #{
            print i
        # whoops, forgot to close that bracket!
    #}
#}
I think it's ridiculous that people's biggest complaint about Python is that it's whitespace sensitive. Any frustrations with it are easily solved by using the proper tools.
Learn a text editor, and this isn't an issue. In emacs, C-c C-q will properly tabify the function you're in, and tabs should behave mostly sane thereafter.
Sweet. Now with all that work that you've done in getting python to support braces, can you make it not depend on whitespace? I'm sure it won't take that much more effort.
There's no way an Intelligent Designer could think that
is a clearer syntax than
Of course, raw beginners don't know that % means format, but there was a time when I didn't know that / means division either. Will they deprecate all operators because they might confuse a beginner? I think there should be some reasonable limits to that everything-is-an-object thing.
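The two snippets being compared did not survive in this comment. Purely as an illustration, and not a reconstruction of what the poster wrote, here is how the older % operator and the str.format() method (available in 2.6 and 3.0) look side by side. The variable names and values are made up.

```python
# Hypothetical values, chosen only for illustration.
name, count = "spam", 3

# Older style: the % operator applied to a format string.
old_style = "%s appears %d times" % (name, count)

# Newer style: the format() method on the string object.
new_style = "{0} appears {1} times".format(name, count)

print(old_style)
print(new_style)
```

Both lines produce the same text; the argument in the thread is only about which spelling reads better.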
can you make it not depend on whitespace? I'm sure it won't take that much more effort.
It would be easy to create a preprocessor to do that, but life's too short. I'll leave the exercise to someone that cares enough.
Of course the list would be pretty long (good thing I don't have to list it), and of course Unicode is very significant, but I think there are other things just as significant if not more. Example: everything's an iterator now, not just a list.
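A minimal sketch of what "everything's an iterator now" looks like in practice, assuming Python 3. The names are illustrative only.

```python
numbers = range(5)                        # a lazy range object, not a list
squares = map(lambda n: n * n, numbers)   # a one-shot iterator, not a list
keys = {"a": 1, "b": 2}.keys()            # a view onto the dict, not a copy

first = next(squares)        # consumes the first value
rest = list(squares)         # consumes everything that is left
leftover = list(squares)     # the iterator is now exhausted

print(first, rest, leftover)
```

Code that relied on getting a real list back (indexing the result of map(), for instance) has to wrap the call in list(), which is one of the more common porting chores.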
BTW, Python 2.x has all the unicode support you need to write a correct application. You just have to use u'unicode strings' instead of 'strings' in a lot more places. Python 3.0 has just switched the default, which will make it easier for application developers to get it right. And that's VERY important. In both versions you have to think about encodings.
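To make the "switched the default" point concrete, here is a small sketch assuming Python 3, where a plain quoted string is text and bytes have to be asked for explicitly. The sample string is arbitrary.

```python
# In Python 3 a plain literal is text (str); escapes are used here so the
# example does not depend on the encoding of this page.
text = "na\u00efve caf\u00e9"

# Going to bytes always means choosing an encoding.
data = text.encode("utf-8")

print(type(text).__name__, len(text))   # counts characters
print(type(data).__name__, len(data))   # counts bytes

# Decoding with the same encoding gets the text back.
round_trip = data.decode("utf-8")
```

The two accented characters take two bytes each in UTF-8, which is why the byte length is larger than the character count.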
My prediction is about 18 months before Python 3.0 is considered the default. My team, in general a pretty early adopter of technology, won't be using it for at least 9 months, waiting for our dependency stack to fill in.
My fulltext search library, Hypy, on the other hand, should have Python 3.0 support any day now.
It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
So if someone on the internet told you that stretching your ass felt good, would you do it? Or would you consider your own experiences and preferences?
Do you even lift?
These aren't the 'roids you're looking for.
No, unfortunately they did the worst thing they could do in that respect. Nearly all the changes introduced will make longer lines of code. I think they are trying to make sure that you will need to use the line continuation backslash, which completely negates the advantages of whitespace formatting.
It seems that their definition of "clean syntax" is Java-like, rather than Perl-like. I never went to the extreme of playing "Perl Golf", but a concise syntax is one of the best ways to make readable code. I started using C rather than Pascal because of that. I switched from Perl to Python when I rewrote some Perl programs in Python and realized that, despite the somewhat longer code, Python was clearer to read. But I still miss the =~ regular expression match operator from Perl.
An optimal programming language should be well balanced. Not like APL, where a page of code can be reduced to a single character, but it's like learning to write in Chinese. Not like Java either, where you must write several pages of declarations before anything useful comes out. C is very close to the ideal, if you take the effort to understand how a computer works before you start to program. Perl is pretty good, if you resist the temptation to show off your ability. Python was almost there, the perfect compromise between readability and conciseness. Until 3.0, when they went astray...
I love Python. I hate Py3k.
All programming languages higher level than machine code that I've encountered, except for a few esolangs, use whitespace as a delimiter.
I would be interested to see an example of Python code where a change to whitespace causes two identical-looking pieces of code to do two different things.
The only issues that *ever* come up in such a scenario is a SyntaxError, and pretty much the only reason they ever happen en masse is due to indiscriminate copy-and-paste coding.
Syntax errors can barely even be called bugs, and in any significant project the amount of time you're going to spend dealing with them is easily dwarfed by the *real* bugs that are a natural part of the development process.
If developers are spending a truly inordinate amount of time on whitespace issues, it can only be due to lack of discretion and attention to detail, which I would be willing to wager is increasing the number of "real" bugs emerging as well.
have you been seen on slash?
I've edited Python in vi, Notepad, SciTE, Geany, and other editors without any problem. Never used emacs though. If whitespace is causing bugs in your team's code you need to (a) introduce process or (b) lose some dead weight from your team. For (a) you can standardise on editor and whether to use tabs or spaces, or you can get the coders to end a whitespace block with a comment, eg # endif. I've only been using Python a couple of years but my experience so far suggests the problem is with you and not the language.
Phillip.
Property for sale in Nice, France
There is a standard
I'm the guy with the unpopular opinion
If he had said that it must be indented by exactly 1 tab or exactly 4 spaces or whatever other measure and everything else would throw a syntax error, it would have been fine. As it is I'd say about 15-20% of the time I spent doing Python was spent fixing these kinds of bugs.
I have to assume that most of your time doing python has been spent copy/pasting code off the web. I've been coding python nearly daily for a couple years now. I've rarely made indentation errors, none in the last few months, and only once have I ever had an indentation error that took more than 10 seconds to debug. The thing is, most indentation errors are so visibly clear that it's really quite hard to make them.
If you're actually having problems with multiple spaces looking like tabs, you can use the -t option to make it throw an error if you use a mixture of tabs and spaces, but it really shouldn't be that hard.
I agree. I'm a Python fan but "use a proper text editor" is passing the buck big-style. Guido should have just mandated the use of spaces rather than tabs: everything renders spaces the same.
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
Gawd, you all do realize that the code needed to re-delimit python code to braces instead of spaces or tabs is about half a page, right? You can have it both ways. Hint: Every time you hit an open bracket, increase a value by four. This value is the number of spaces before each line except where the line ends in '\' (a line continuation character). When you hit the close bracket, you decrease the number of spaces you emit by four. Simple, no?
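The hint above can be sketched roughly as follows. This is a deliberately naive version under stated assumptions: a block opens with a trailing "{", closes with a line containing only "}", braces never appear in strings or dict literals, and line continuations are ignored. The function name braces_to_indent is made up for the example.

```python
def braces_to_indent(source, step=4):
    """Turn brace-delimited pseudo-Python into indented Python (naive sketch)."""
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            out.append("")
            continue
        if line == "}":
            # A closing brace only lowers the indentation for later lines.
            depth -= 1
            continue
        opens_block = line.endswith("{")
        if opens_block:
            # Replace the trailing brace with the colon Python expects.
            line = line[:-1].rstrip() + ":"
        out.append(" " * (step * depth) + line)
        if opens_block:
            depth += 1
    return "\n".join(out)


braced = "def f(n) {\nfor i in range(n) {\nprint(i)\n}\nreturn n\n}"
converted = braces_to_indent(braced)
print(converted)
```

A real preprocessor would need a tokenizer to tell block braces from dict and set literals, which is where the "half a page" estimate starts to grow.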
CS majors know the time/space tradeoff, but they never get taught the 3rd, crucial, tradeoff of the set: comprehension!
for whitespace bugs in python. If your programmers insist on using their own personal editors with their own personal tab expansion preferences - then ban tabs. All fixed. Easily automated. Use a CVS script to reject *.py with tab chars.
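A rough sketch of the kind of check such a commit script might run. It only looks at leading whitespace (tabs inside string literals are left alone), and the function name is invented for the example; wiring it into CVS or any other version control hook is not shown.

```python
def lines_with_tab_indent(source):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    bad = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Everything before the first non-blank character is the indentation.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


sample = "a = 1\n\tb = 2\n    c = 3\n  \td = 4\n"
print(lines_with_tab_indent(sample))
```

A hook would read each changed .py file, call this function, and refuse the commit if the returned list is not empty.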
I have also been bitten by C bugs caused by white space. Someone with a different tab stop had entered the code incorrectly, but it looked correct in my editor (with standard unix 8 space tab stops). Never did notice the misaligned brace until running it through pretty print...
I completely disagree. He should have made tab mandatory. Not space. This would have made it all much easier. Because 2 spaces, 3 spaces, look all very similar. but one tab is one tab.
"Freiheit ist immer auch die Freiheit des Andersdenkenden" - Rosa Luxemburg, 1871 - 1919
Um my bad on starting this little flame war I had no idea.
Agree completely. Though I have a heavily brace-oriented background, I learned Python while ignorant that you could use braces to contain code blocks, and I've embraced the tab delimitation completely.
I've simply never had a major tab problem, and while I don't write terribly complicated code, I nest the hell out of things sometimes. I develop on Windows and use PythonWin and there's just never a problem with indentation. I totally don't get people who troll (not accusing you!) on the topic.
I posted about this before in a previous Python 3.0 article and a lot of people attacked me. However I very much feel that Python's treatment of Unicode as UTF-16 is a HUGE problem that will cause no end of pain. I think a far cleaner solution to Unicode is to do the following:
- Make unmarked plain quoted strings produce byte strings just like they do now. Unless there are backslashes, the contents are precisely the bytes that are in the input file. Keep the automatic casting of byte strings to unicode strings.
- Force the encoding to be UTF-8 by default, or at least make it trivial to turn this mode on (in Python2.x the default init deletes the api to do this!)
- The sequence \uXXXX in a byte string constant should turn into the correct UTF-8 sequence. And the sequence \xXX in a Unicode string should be interpreted as bytes and converted from UTF-8 to unicode. This is necessary so that a string constant can easily be changed between bytes and Unicode.
- We must have lossless conversion of UTF-8 to UTF-16. The most popular method I have seen is to turn invalid bytes into 0xd8xx (which is invalid UTF-16 as it is lower-half surrogate pairs). Oddly enough this makes the UTF-16 api useless because the reverse conversion is not lossless. I have looked into this and it may be fixable but is complex: the to-UTF-8 converter must not translate a sequence of these to a legal UTF-8 sequence and instead convert that sequence to the typical 3-byte encoding of that number, and the from-UTF-8 converter must treat these typical 3-byte encodings as invalid byte sequences except when they are arranged such that the back converter would make them! This is messy but I see no other way to be able to use backends that insist on UTF-16 (in particular Windows filenames and its clipboard).
The reason for this is that real Python programs need to handle arbitrary data that is *PROBABLY* UTF-8. Note that by "PROBABLY" I mean that the programmer really really wants to think of it as a sequence of unicode characters, not as a "byte sequence", but it must NOT compare any two different byte sequences as being equal.
I'm very afraid that Python3.0 as designed will encourage byte sequences to be treated as ISO-8859-1 rather than UTF-8 (because when you set the translation to that it is lossless and no errors are thrown, and \xXX does the same thing in both constants). IMHO this would be very, very bad for internationalization efforts. Believing the programmers will not take this easy solution, and instead rewrite their interfaces to the new byte/unicode naming and correctly handle exceptions thrown by converters is, I think, quite ignorant.
I am not joking or trolling about this. This has bitten me already and forced us to change all our use of Python from Unicode to byte strings. And we are just reading metadata from image files. Searching for comments on Python 3.0 on the web, it is apparent that web programmers are encountering this far more often and are very worried about this, and they certainly are trying to handle many orders of magnitude more data from sources that may be actively trying to exploit security holes.
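For readers arriving later: a lossless round trip of the kind asked for above did become available as the "surrogateescape" error handler (PEP 383, Python 3.1), which postdates this discussion. The sketch below assumes Python 3.1 or newer; it maps each undecodable byte to a lone surrogate in the U+DC80..U+DCFF range and back. The sample bytes are arbitrary. The last lines show why ISO-8859-1 is tempting: it never fails, whatever the bytes are.

```python
# Valid UTF-8 for "cafe" with an accented e, a space, then a stray 0xFF byte.
raw = b"caf\xc3\xa9 \xff"

# Strict decoding refuses the stray byte.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape keeps the bad byte as a lone surrogate code point...
text = raw.decode("utf-8", errors="surrogateescape")

# ...and encoding with the same handler restores the original bytes.
restored = text.encode("utf-8", errors="surrogateescape")

# ISO-8859-1 (latin-1) round-trips every possible byte without complaint,
# even when the data was never latin-1 in the first place.
every_byte = bytes(range(256))
latin1_round_trip = every_byte.decode("latin-1").encode("latin-1")
```

The text produced this way is not valid Unicode for interchange, so it is meant for passing data through (file names, metadata) rather than for display.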
Is lisp an esolang now?
The problem with whitespace is that it breaks code. Take a snippet of code, copy it and paste it into a different indentation level. In a block oriented language the code will continue to work exactly as intended and at no point will the code be invalid. In Python, on the other hand, the code breaks as soon as you paste and you have to move that broken code back into usable form manually. Now a proper editor can help with that, but it won't stop Python code from temporarily breaking.
That Python is extremely lax about whitespace of course just makes this problem worse.
And of course that isn't just theoretical. One of the worst coding experiences I had in any language ever was refactoring a piece of Python code, since the old way of working (just copying stuff around, adding function names to it and then hitting auto-indent) completely broke, and I ended up constantly fixing whitespace and being extremely careful with my copy&paste, since the code broke after every second operation.
As a KDE fan, I have to say just how jealous I am that other software development communities actually have common sense.
Apache: "Our newest is Apache 2, but you can use our rock-solid Apache 1 if you want." ... sigh ....
Python: "Our newest is Python 3, but you can use our rock-solid Python 2 if you want."
KDE: "What!? You're still using KDE 3? But we told all our developers to drop all KDE 3 and move on to our newest KDE 4, which just came out with the second release candidate version of the beta for our alpha version! Get with the times, man!"
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
A good editor should re-indent the pasted code automatically. In VIM you can use :set ai, si.
Free Manning, jail Obama.
I bet the thought process went something like this:
Guido: Hmm, should I use spaces or tabs for indentation?
Colleague 1: Spaces, of course. Spaces look the same everywhere!
Colleague 2: I disagree. One space is too small to visually indent code. Tabs FTW!
Guido: Why, I'll just do both.
Worst is the cut and paste from one source into another.
So true, but for reasons that probably escape you.
The point is that it is a flawed design that promotes inadvertent errors in code, just like C's '=' and '==' operators are too easy to carelessly mix up (especially when switching between other languages that use '=' for equality tests). I like Python and the white space delimiting is liberating but it is unfortunately implemented in an ad hoc way that is susceptible to easily missed breakage. A better language wouldn't depend on the sort of higher level practices you suggest to guard against these sorts of mistakes.
I am becoming gerund, destroyer of verbs.
echo "set et ts=4 sw=4" >> ~/.vimrc
You'll thank me in the morning.
I'm still a diehard C coder at heart, but I'll admit that braces as a syntactic measure are just plain bad (unless you're in a Lisp-variant, where a paren _is_ the whitespace, ffs). It's why reference-counting is insufficient for being a singular GC mechanism, and why, if compilers were built like garbage collectors, work efficiency would plummet.
Seriously, a decent editor that can swap out tab commands for an N-length block of spaces will alleviate your indentation worries. If you're worried about bytecode and compile-time efficiency (aka, the mythical "zomg whitespace==compile inefficiency!" fallacy), you wouldn't be using an interpreted language in the first place, and you'd also know that Python+Psyco won't ever get you the same hardware optimizations as a true compiled language.
At that point, you're indenting for readability and maintainability anyway. Braces add nothing to syntax and actually add in avenues for compile-time error. I usually see this sort of issue as being with someone who isn't using Emacs or Vi properly [I'm a Vim user, but I hear tell that (setq default-tab-width) achieves the same thing in your emacs conf]. If you aren't using a capable editor, that's a fault of your editor, not of the language design principle.
It's been ages since I took compilers or checked out a copy of the G++ source, but IIRC, preamble whitespace is insignificant if you use a line terminator (aka, ";" for most C-lang expressions). Once the tokenizer kicks in, that whitespace becomes irrelevant to the expression because you know where expression and block delineations are. However, if you're typing that way in your CVS check-ins for maintenance sake anyway, why not lose the terminators and make the whitespace relevant? It all compiles the same and makes it easier for jyumang beans to read.
Never attribute to Hanlon that which can be adequately attributed to Heinlein.
# vim:ts=4:sw=4:softtabstop=0:smarttab :-)
--
The early bird catches the worm. The worm that sleeps late lives to see another day.
If you don't like the built in editor, I've enjoyed the Easy Eclipse for Python distribution as well.
http://www.easyeclipse.org/site/distributions/python.html
Guido made a major fuck up there- by removing braces but not strictly defining whitespace
Stop. First, the whitespace rule in Python *is* strictly defined.
The formal, exact, unambiguous specification of how Python interprets whitespace is in the official language reference - Lexical analysis.
It's pretty wordy, but I've studied it and it's quite precise. The relevant section is here:
"Firstly, tabs are replaced (from left to right) by one to eight spaces such that the total number of characters up to and including the replacement is a multiple of eight"
This is exactly the same as the default behaviour of Unix `expand`.
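The quoted rule is short enough to write out. This is a sketch of the rule as quoted, not the interpreter's actual tokenizer code; the function name indent_width is invented here.

```python
def indent_width(line, tabsize=8):
    """Width of the leading whitespace, with each tab advancing to the
    next multiple of tabsize, as in the rule quoted above."""
    width = 0
    for ch in line:
        if ch == " ":
            width += 1
        elif ch == "\t":
            # One to eight spaces, whatever reaches the next multiple of 8.
            width += tabsize - (width % tabsize)
        else:
            break
    return width


for sample in ["\tx", "    \tx", "        x", "\t    x"]:
    print(repr(sample), indent_width(sample))
```

Under this rule a single tab, four spaces plus a tab, and eight spaces all count as the same indentation, which is exactly why code mixing them can look different in an editor configured for 4-column tabs.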
[Guido has] created a language where it's possible to have two identical looking pieces of code do very different things.
It depends what you mean by "looking". To you, perhaps 1 tab looks the same as 4 spaces. To me, maybe it looks the same as 2 spaces. To Jeff, maybe it looks like a red dot in his specially-configured editor. To Python, it happens to look the same as 8 spaces.
DO NOT MIX TABS AND SPACES. Then, I guarantee you that any two pieces of code which look the same to you (whether they use tabs or spaces) will also look the same to Python. (You don't have to enforce this across a whole file, just on a per-block basis, but it's best if your whole project has an agreed indentation standard).
That's silly. Then you'd be at Guido's whim; you'd have to indent the way he chose. This way, you can choose any indentation you like. Tabs, 2 spaces, 4 spaces, 3 tabs if you like. As long as you are internally-consistent, Python will be happy.
My second point to you: If you are pasting code from somewhere into your code, and you do not fix up indentation so it matches the surrounding code, you are worse than Hitler. Or at least very lazy. I don't care if you are using Python or C or Brainfuck.
If you carelessly paste 1-tab-indented code into a surrounding block which is 4-tab-indented, and don't fix it up, then how do you think I will feel when I open it in my editor configured to expand tabs to 2 spaces instead. It will be totally unreadable -- and this is why we indent in the first place (in any language, that is).
Python forces you to tidy this up, and that can only be a good thing. If your code is confusing Python, it's probably confusing a bunch of other readers as well.
2 spaces, 3 spaces, look all very similar. but one tab is one tab.
On your machine, one tab might be 8 spaces. But on someone else's, it might be 2 or 3.
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
The point is that it is a flawed design that promotes inadvertent errors in code, just like C's '=' and '==' operators are too easy to carelessly mix up
May I suggest you try Ada or Eiffel then, which go to some trouble to try and weed out any such errors as early as possible (usually at compilation time).
Craft Beer Programming T-shirts
For the "=" and "==" thing, do equality tests backwards, like:
3==foo
If you accidentally put
3=foo
It'll throw an error about an undeclared variable (presuming you don't have a string variable named "3") when you try to compile.
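For comparison with the C trick above: in Python the mix-up cannot get past the compiler in either order, because a plain assignment is a statement and is not allowed inside a condition. A small sketch, assuming Python 3; the helper name compiles is made up.

```python
def compiles(source):
    """Return True if the source compiles, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


comparison_ok = compiles("if foo == 3: pass")   # a normal equality test
assignment_ok = compiles("if foo = 3: pass")    # '=' typed by mistake
print(comparison_ok, assignment_ok)
```

Only compilation is checked here, so the undefined name foo does not matter.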
upon the advice of my lawyer, i have no sig at this time
This is easy to demonstrate
for i in myarray:
** Do some stuff here, use spaces to delimit. Note we are already inside a function or class. That is, we are not at the first indent level
print "Hello world" # Note this line is tab delimited. It looks like it's at the right indent level but it's not.
Now you expect the code to print hello world a load of times but it will actually do it only once.
It's easy to extrapolate this to less trivial problems.
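A hedged sketch of how this kind of mix-up can be caught mechanically. It assumes Python 3, which (unlike the 2.x interpreter discussed in this thread) rejects indentation whose meaning depends on the tab width and raises TabError. The source string is made up: the loop body is indented with eight spaces and the last line with a single tab.

```python
SOURCE = (
    "def show(items):\n"
    "    for item in items:\n"
    "        total = item\n"          # eight spaces
    "\tprint('Hello world')\n"        # one tab
)


def check(source):
    """Compile the source and report whether the indentation was accepted."""
    try:
        compile(source, "<mixed-indent>", "exec")
    except TabError:
        return "TabError"
    return "compiled"


mixed_result = check(SOURCE)
spaces_only_result = check(SOURCE.replace("\t", "        "))
print(mixed_result, spaces_only_result)
```

In Python 2 the same effect comes from running the interpreter with the -tt option, which turns inconsistent tab usage into an error.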
No, and it uses whitespace as a delimiter, and, at least in many dialects, differentiates between kinds of whitespace (at least, between newlines and everything else.)
E.g., in some lisps:
is not the same as:
Though they differ only in whitespace.
Sure, it's called C. Since both are Turing-complete languages they're functionally-equivalent, and what's more C-like than C? :)
If you want a C-like higher level language, I'd recommend C# with Mono, though if your country's patent system is broken it may not be entirely safe to use. It supports a good part of what makes Python so cool, like lambda functions, being able to pass functions as arguments or even return them, though they aren't as easy as in Python. Still, it's the closest we've got if you want your curly braces.
No problem is insoluble in all conceivable circumstances.
Is there a Python clone that uses C style formatting?
See http://www.emacswiki.org/emacs-en/PyIndent
You're exactly right. But unless tabs are only one space, it's still going to be easy to see the difference between 2 tabs and 3 tabs.
Stop Global Warming!
Just say no to irreversible processes!
A good editor should re-indent the pasted code automatically. In VIM you can use :set ai, si.
Was this taken directly from the sendmail book of configuration file design? Ai! Si, señor!
but a tab is still a tab. It might be represented by X spaces depending on your setting (eg I have 1 tab = 4 spaces).
But if you go through code or grep or whatever, you can always say ^\t and be sure that you get what you want.
With spaces? did he intend with 2? 3? 6? how do you then do this?
"Freiheit ist immer auch die Freiheit des Andersdenkenden" - Rosa Luxemburg, 1871 - 1919
Even though it is flamebait, this one deserves a reply. It is very important to understand that 3.0 AND 2.6 are being supported at the same time by the same core developers. Code breaks only if you go for 3.0. Stay in 2.6 and you will be fine.
That being said, I do agree that this raises questions regarding the feeling that we now have two distinct languages. I really have mixed feelings on this one, but I am sure that no script got broken as you have the option of not going for 3.0 and still get support.
-- dnl
Well, if you want the long options, you can use :set autoindent smartindent
What? just so you can write less readable code?