Slashdot Mirror


The Evolution of Python 3

chromatic writes to tell us that O'Reilly has an interview with Guido van Rossum on the evolutionary process that gave us Python 3.0 and what is in store for the future. "I'd like to reiterate that at this point, it's a very personal choice to decide whether to use 3.0 or 2.6. You don't run the risk of being left behind by taking a conservative stance at this point. 2.6 will be just as well supported by the same group of core Python developers as 3.0. At the same time, we're also not sort of deemphasizing the importance and quality of 3.0. So if you are not held back by external requirements like dependencies on packages or third party software that hasn't been ported to 3.0 yet or working in an environment where everyone else is using another version. If you're learning Python for the first time, 3.0 is a great way to learn the language. There's a couple of things that trip over beginners have been removed."

215 comments

  1. Combine them.... by thetoadwarrior · · Score: 4, Funny

    and make everyone happy with Python 5.6

    1. Re:Combine them.... by morgan_greywolf · · Score: 3, Funny

      Lemme guess... you're a student of the Sun Microsystems'-sponsored Bill Joy School of Version Numbering?

    2. Re:Combine them.... by Radish03 · · Score: 4, Funny

      Either that or he's a Winamp developer.

    3. Re:Combine them.... by morgan_greywolf · · Score: 2, Funny

      Same thing.

    4. Re:Combine them.... by Anonymous Coward · · Score: 0

      Personally, I'd be satisfied with Python 3.11 for Workgroups.

    5. Re:Combine them.... by danieltdp · · Score: 1

      Because, after all, the difference between python 3.0 and 2.6 is almost nothing. Just 0.4

      (ducks)

      --
      -- dnl
    6. Re:Combine them.... by DaVince21 · · Score: 1

      It'll be arriving soon, assuming that Python 3.0 has enough bugs and problems to get there.

      --
      I am not devoid of humor.
  2. Evolution? by El_Muerte_TDS · · Score: 2, Funny

    Shouldn't that be intelligent design? Otherwise we'd have way more python flavors.

    1. Re:Evolution? by Abreu · · Score: 1

      Well, Python has a distinct, well known creator, so I guess it does qualify

      --
      No sig for the moment.
    2. Re:Evolution? by Anonymous Coward · · Score: 0

      No, because there's a very limited amount of resources available so all the weaker variations quickly died off.

    3. Re:Evolution? by Anonymous Coward · · Score: 1, Insightful

      Can't evolution be controlled? I guess that is the same basic idea as in intelligent design, but it can be called evolution even though somebody steers where it's heading.

    4. Re:Evolution? by Anthony_Cargile · · Score: 1

      It depends on if Python can actually fork() on Windows yet without using Cygwin.

    5. Re:Evolution? by wilder_card · · Score: 1

      There were a lot more flavors, but 3.0 and 2.6 ganged up and ate them all. Don't even ask about poor version 3.14159...

    6. Re:Evolution? by Anonymous Coward · · Score: 0

      Erm, it's called selective breeding (or artificial selection), and it's been going on for about 10,000 years.

    7. Re:Evolution? by flyingfsck · · Score: 1

      "Python species"

      There, fixed it for you! ;)

      --
      Excuse me, but please get off my Pennisetum Clandestinum, eh!
    8. Re:Evolution? by Anonymous Coward · · Score: 0

      Only apps that utilize the POSIX subsystem on Windows can call fork().

      (Trivia: Even Windows 7 still comes with the OS/2 subsystem. Look for a file called os2ss.exe)

    9. Re:Evolution? by Anthony_Cargile · · Score: 2, Interesting

      Hmm, thanks for that little tip. I still have the leaked NT code from around 2004, and linking to that (granted it remained the same) might just be a good practice for some POSIX-esque cross-windows features, perhaps even undocumented, in future apps. Thanks!

    10. Re:Evolution? by Shadow+Wrought · · Score: 2, Funny

      Don't even ask about poor version 3.14159...

      I hear it bit its tail and now just slowly slithers around in a circle...

      --
      If brevity is the soul of wit, then how does one explain Twitter?
    11. Re:Evolution? by MrNaz · · Score: 1

      Only 10,000? You're one of those crazy creationists, aren't you?!

      --
      I hate printers.
  3. Re:Roland Piquepaille: a case study in madness by Anonymous Coward · · Score: 4, Funny

    I don't think that will be a problem any more

  4. Probably not an issue for beginners? by NinthAgendaDotCom · · Score: 2, Interesting

    I just started learning Python a few weeks ago when I got laid off from my QA job. I imagine I'm not to the point yet where the language differences between 2.x and 3.x are going to matter?

    --
    -- http://ninthagenda.com/
    1. Re:Probably not an issue for beginners? by morgan_greywolf · · Score: 3, Informative

      Unless you're trying to learn how to code Django apps, which won't work on Python 3.0. Neither will a lot of other 3rd party modules.

    2. Re:Probably not an issue for beginners? by Random+BedHead+Ed · · Score: 3, Informative

      Not really. Keep learning 2.x as you were. Quoth Guido:

      http://www.artima.com/weblogs/viewpost.jsp?thread=211200

      Ignore 3.0 for now. If you encounter it, there is a handy script to help update your code, though most of the language is unchanged. The biggest gotcha is that print is now a function print() rather than a statement.

    3. Re:Probably not an issue for beginners? by joetainment · · Score: 2, Interesting

      They will matter a little bit. Some very basic things (like the way the print function works) have changed. However, there's plenty of information on their website, and they've made tools to make migration easier. Those little problems should be easy enough to work around if you use 3.x, or you could keep using 2.x for a while until everything else catches up.
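
For readers who have not seen the change yet, here is a minimal sketch of what print looks like as a function, assuming Python 3. The `buffer` and `result` names are just for illustration; `sep`, `end` and `file` are the keyword arguments that the function form accepts.

```python
import io

# In-memory text stream standing in for a real file (illustrative only).
buffer = io.StringIO()

# Python 2 statement form was:  print "2.x", "3.x"
# In Python 3, print is an ordinary function with keyword arguments.
print("2.x", "3.x", sep=" -> ", end="!\n", file=buffer)

result = buffer.getvalue()
print(result, end="")  # end="" avoids adding a second newline
```

Because the separator and line ending are plain keyword arguments, there is no special syntax to remember for them.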

    4. Re:Probably not an issue for beginners? by Anonymous Coward · · Score: 0

      Almost no significant third party modules work with Py3k, particularly those that rely on C extensions. No wxPython, no Twisted, etc. That's the only reason I'm using 2.6 (and I still have to compile a Win32 binary for pyOpenSSL myself).

    5. Re:Probably not an issue for beginners? by morgan_greywolf · · Score: 1

      No Twisted? Well, no Python 3.0 for a bunch of my apps for a while... I did mention Django, though, specifically because the Django devs have no definitive plans to update it to 3.0 at this time and it's not inconceivable that they might never update it for 3.0.

    6. Re:Probably not an issue for beginners? by ultrabot · · Score: 1

      No Twisted? Well, no Python 3.0 for a bunch of my apps for a while... I did mention Django, though, specifically because the Django devs have no definitive plans to update it to 3.0 at this time and it's not inconceivable that they might never update it for 3.0.

      Sounds like FUD - Django will update to 3.0, it will just take time:

      http://docs.djangoproject.com/en/dev/faq/install/#can-i-use-django-with-python-3-0

      The upgrade won't be exceedingly hard either. It could be done by people who are not part of the Django core team, even... Summer of Code 2009?

      --
      Save your wrists today - switch to Dvorak
    7. Re:Probably not an issue for beginners? by TheoMurpse · · Score: 1

      Oh darnit. I've been programming amateurely (as an amateur?) for twenty years, starting with BASIC as a kindergartener or preschooler.

      I've always disliked print() and preferred print as a statement. Likely because of my background in BASIC and the fact that I learned C++ and cout before I learned C and printf().

      Can someone tell me why it was changed to print()? Philosophical reason or pragmatic? Just to conform with the more popular Java/JavaScript/C convention or something?

    8. Re:Probably not an issue for beginners? by Mozk · · Score: 1

      The biggest gotcha for me was that sockets now use byte arrays instead of strings.

      --
      No existe.
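
A minimal sketch of the socket change mentioned above, assuming Python 3 and a platform where `socket.socketpair()` is available. The message text and the UTF-8 encoding are arbitrary choices for the example.

```python
import socket

# A connected pair of sockets, so the example needs no network or server.
left, right = socket.socketpair()
try:
    message = "Hello world."
    # Python 3 sockets carry bytes, so text is encoded before sending...
    left.sendall(message.encode("utf-8"))
    # ...and recv() hands back bytes that must be decoded to get str again.
    data = right.recv(1024)
    text = data.decode("utf-8")
finally:
    left.close()
    right.close()
```

In Python 2 the same code would have passed `str` objects straight through, which is why this tends to surprise people porting network code.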
    9. Re:Probably not an issue for beginners? by hobbit · · Score: 1

      Why should print have special status as an operator? cout doesn't.

      --
      "Wise men talk because they have something to say; fools, because they have to say something" - Plato
    10. Re:Probably not an issue for beginners? by Anonymous Coward · · Score: 1, Insightful

      That link is 18 months old, in which he says "I expect it'll be two years before you'll need to learn Python 3.0", so if he followed the advice he should start learning 3.0.

    11. Re:Probably not an issue for beginners? by Otter · · Score: 2, Insightful

      Can someone tell me why it was changed to print()? Philosophical reason or pragmatic?

      Philosophical. Pragmatists like myself can't stand it.

    12. Re:Probably not an issue for beginners? by DiegoBravo · · Score: 1

      The obvious reason is to provide a consistent I/O class hierarchy where standard output is inherited from some other general stream.

      Another common reason (I don't know if it's directly applicable in this case) is to assign behavior dynamically. Say you have an object and you will call the method whose name is held in m="f" (the method name is resolved at execution time), so you execute something like object."f"(). If the action you actually need is printing data, you could set m="print" and that's all. But if print is a language statement (like for or other syntax constructs), this useful practice is impossible.

      Please note that this is just the general idea, nothing of actual Python syntax.
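
The general idea above can be sketched in actual Python with `getattr`, which looks a method up by name at run time. This is only an illustration: the `Report` class and its `f` and `show` methods are hypothetical names invented for the example.

```python
import io


class Report:
    """Hypothetical object whose methods are chosen by name at run time."""

    def __init__(self, stream):
        self.stream = stream

    def f(self, value):
        self.stream.write("f got %s\n" % value)

    def show(self, value):
        # Works because print is a function that accepts a target stream.
        print(value, file=self.stream)


out = io.StringIO()
report = Report(out)

# The method name is plain data, resolved only when the loop runs.
for m in ("f", "show"):
    getattr(report, m)("data")

output = out.getvalue()
```

Since print is now an ordinary callable, it can sit behind a name lookup like any other function or method.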

    13. Re:Probably not an issue for beginners? by dodongo · · Score: 3, Insightful

      The reason is consistency. If I have:

      print "Hello world."

      and

      print("Hello world.")

      Which do you suppose is easier to find-and-replace to:

      o.write("Hello world.")

      The second, of course, because you can do it without a wildcard.

    14. Re:Probably not an issue for beginners? by TheoMurpse · · Score: 1

      That makes sense, but you could still get around it presently by doing something like m=lambda x:print x or something like that, though. Personally, I'd think this ability makes the fact that you're changing a big piece of syntax of greater weight than the inability to assign m=print that way.

      But then again, IANA computer scientist. I've just always disliked print() from an aesthetic perspective.
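
One caveat on the lambda idea: in Python 2 a lambda body has to be an expression, and print was a statement, so `lambda x: print x` is itself a syntax error there. A workaround sketch that runs on either version is to call `sys.stdout.write` instead. The name `m` follows the comment above; the rest is assumed for illustration.

```python
import contextlib
import io
import sys

# An expression-only stand-in for print: write() is a normal method call,
# so it is allowed inside a lambda even where print is a statement.
m = lambda x: sys.stdout.write(str(x) + "\n")

# Capture stdout so the result can be inspected (Python 3 only helper).
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    m("Hello world.")

captured = buf.getvalue()
```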

    15. Re:Probably not an issue for beginners? by grumbel · · Score: 1

      Can someone tell me why it was changed to print()? Philosophical reason or pragmatic?

      One reason for the change is simply that a language should be consistent, and with print() it becomes that, since print is now an ordinary function. The old print statement, on the other hand, was pure magic: stuff built into the syntax of the language that is different from any normal function for no good reason. And that kind of magic is something you want to reduce to a minimum, since it's ugly and just makes your language look weird.

      On a more practical point of view, you can now write:

      a = print
      a("Hello World")

      Which you couldn't before, since as said, print used to be magic.

    16. Re:Probably not an issue for beginners? by TheoMurpse · · Score: 1

      And that kind of magic is something you want to reduce to a minimum

      So Python is Gandalf?

    17. Re:Probably not an issue for beginners? by Anonymous Coward · · Score: 0

      While messy-looking, the following is actually valid syntax and will call sys.stderr.write("hi"):

      print >> sys.stderr, "hi"

      It was always possible to do a replace without a wildcard.

      And the required parentheses around print always trip me up when playing in the python 3.0 interpreter.

    18. Re:Probably not an issue for beginners? by mgiuca · · Score: 1

      o.write("Hello world.\n")

      print has an implicit '\n' at the end. Gotcha :)
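
To make the newline point concrete, here is a small sketch, assuming Python 3, that compares the function-call replacement for `print >> stream, "hi"` with a direct `write()`. The helper names `via_print` and `via_write` are made up for the example, and an in-memory stream stands in for `sys.stderr`.

```python
import io


def via_print(stream):
    # Python 3 spelling of the old `print >> stream, "hi"`.
    print("hi", file=stream)


def via_write(stream):
    # write() adds nothing, so the trailing newline must be explicit.
    stream.write("hi\n")


a, b = io.StringIO(), io.StringIO()
via_print(a)
via_write(b)

same = a.getvalue() == b.getvalue()
```

Passing `end=""` to print would drop the implicit newline and make it match a bare `write("hi")` instead.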

    19. Re:Probably not an issue for beginners? by Just+Some+Guy · · Score: 1

      The reason is consistency.

      I agree, but for different reasons. Since old-Python's print is a statement, you can't pass it around to other functions. Instead, you have to write a named wrapper function to execute it:

      >>> def dosomething(function, value): function(value)
      ...
      >>> dosomething(print, '234')
      File "<stdin>", line 1
      dosomething(print, '234')
      ^
      SyntaxError: invalid syntax
      >>> def printfunc(value): print value
      ...
      >>> dosomething(printfunc, '234')
      234

      This is incredibly annoying and not consistent with most of the rest of the language. I can't think of any other statement you'd want to pass around, but print is definitely useful that way.

      --
      Dewey, what part of this looks like authorities should be involved?
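
For comparison, a sketch of the same session under Python 3, where no wrapper is needed. `dosomething` is the function from the comment above; the output capture is only there so the result can be checked.

```python
import contextlib
import io


def dosomething(function, value):
    function(value)


buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    # print is passed like any other callable...
    dosomething(print, "234")
    # ...including to built-ins such as map().
    list(map(print, ["a", "b"]))

captured = buf.getvalue()
```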
    20. Re:Probably not an issue for beginners? by m50d · · Score: 1

      They should have got consistency by going the other way. Once you've used a language where you can simply write "function argument" for a function call, brackets seem so clunky.

      --
      I am trolling
    21. Re:Probably not an issue for beginners? by Anonymous Coward · · Score: 0

      You're telling me statement versus function is not pragmatic?

    22. Re:Probably not an issue for beginners? by coldwd · · Score: 1

      Actually that comment is from over a year & a half ago. From TFA it looks like Guido now recommends learning 3.0 if you can:

      It's easier to learn the differences between 2.6 and 3.0 after you've learned 3.0 than to go the other way. If you learned Python 2.6, you'd probably use a book that had 2.5 on the cover and it was written for 2.2 or 2.3 and sort of somewhat updated by the author. A lot of those textbooks actually still use idioms that already were deprecated in the 2.3 or 2.4 timeframe. It's quite possible that if you're using 2.6 that you're actually writing a dialect of the language that is mostly compatible with 2.3 or something that old which is I think about five years old by now. On the other hand, if you learn 3.0, in order to be able to work with 2.6, you only have to unlearn a few things because many 3.0 features have actually been backported to 2.6, or were already available. There's a handful of things that are essentially different like print statement versus print function.

      I'd switch tracks to 3.0 if possible as you'll be a bit more future-proof :)

      --
      "I wish I had a Kryptonite cross, because then you could keep both Dracula AND Superman away." --Jack Handy
    23. Re:Probably not an issue for beginners? by Anonymous Coward · · Score: 0

      Not really. Keep learning 2.x as you were. Quoth Guido:

      http://www.artima.com/weblogs/viewpost.jsp?thread=211200

      Ignore 3.0 for now. If you encounter it, there is a handy script to help update your code, though most of the language is unchanged. The biggest gotcha is that print is now a function print() rather than a statement.

      Well, that is a quote from 2007 saying "I expect it'll be two years before you'll need to learn Python 3.0".

    24. Re:Probably not an issue for beginners? by danieltdp · · Score: 1

      In the article at the top of this thread he said the opposite. He argues that many books teach old dialects, so you end up learning 2.2 idioms, for example. On the other hand, if you go for 3.0, it is fairly easy to learn what changes when going back to 2.6. This way you avoid being tainted by old dialects that linger in books that get new editions as Python versions go by. Note that this is Guido's argument, not mine. It's in TFA.

      --
      -- dnl
    25. Re:Probably not an issue for beginners? by dodongo · · Score: 1

      Good catch :)

    26. Re:Probably not an issue for beginners? by DaVince21 · · Score: 1

      For logical reasons, I'd think? It conforms with all the other functions, after all.

      --
      I am not devoid of humor.
  5. Trip over beginners? by AaxelB · · Score: 3, Funny

    There's a couple of things that trip over beginners have been removed.

    Ah, yes, I remember python tripping over me. It's actually pretty impressive that a snake figured out how to trip, why take out that feature? It seems like you're knocking it back a notch in the evolution toward legs.

    1. Re:Trip over beginners? by VirusEqualsVeryYes · · Score: 1

      So if you are not held back by external requirements like dependencies on packages or third party software that hasn't been ported to 3.0 yet or working in an environment where everyone else is using another version. If you're learning Python for the first time, 3.0 is a great way to learn the language. There's a couple of things that trip over beginners have been removed.

      Like basic grammatical structure, for instance? When did Palin become a Python dev?

    2. Re:Trip over beginners? by Anonymous Coward · · Score: 0

      First Ruby on Rails, now Python on Legs. What next?

    3. Re:Trip over beginners? by lewp · · Score: 4, Funny

      Well, Common Lisp stole my bike.

      --
      Game... blouses.
    4. Re:Trip over beginners? by Internalist · · Score: 1

      It seems like you're knocking it back a notch in the evolution toward legs.

      Evolution isn't goal-directed...

      --
      Research is what I'm doing when I don't know what I'm doing. -- Wernher von Braun
    5. Re:Trip over beginners? by Anonymous Coward · · Score: 0

      And Scheme stole my car?

    6. Re:Trip over beginners? by Anonymous Coward · · Score: 0

      ...and Scheme stole my car.

  6. Im working on a Python clone by Daswolfen · · Score: 0, Redundant

    I call it Monty:Python

    --
    Don't rush me, Sonny. You rush a miracle man, you get rotten miracles.
    1. Re:Im working on a Python clone by LighterShadeOfBlack · · Score: 2, Informative

      The name Python originally came from Monty Python, so you're about 18 years late on that joke.

      --
      Spelling mistakes, grammatical errors, and stupid comments are intentional.
  7. Getting into Python 3 by dedazo · · Score: 4, Informative

    For those interested, IBM is running a primer series on the new language/runtime features.

    There's also this older (but still relevant) PEP that explains things that did not change between the 2.x series and 3.0.

    Personally, I'm not looking forward to migrating existing code bases (especially non-trivial ones) to 3.0, but I'm planning to do all new development against it (of course assuming that the various packages I use have ports).

    For Python trivia lovers, here is the actual moment in time when 3.0 was let loose on the world. I'm such a sentimental geek :)

    --
    Web2.0: I love when people Flickr my cuil and digg my boingboing until my google is reddit and I start to yahoo
  8. This sentence fragment. by HTH+NE1 · · Score: 2, Funny

    So if ( you are ( not ( held back by ( external requirements like ( dependencies on packages ) or ( third party software that hasn't been ported to 3.0 yet )))) or ( working in an environment where everyone else is using another version )).

    The above sentence fragment is apparently a verbal quotation where Guido van Rossum forgot he used the word "if" when he was somewhere in the middle.

    --
    Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
    1. Re:This sentence fragment. by MichaelSmith · · Score: 1

      It's going to be a pain getting the indenting right in there.

    2. Re:This sentence fragment. by HTH+NE1 · · Score: 1

      Yeah, I thought about indenting it python-style (after I'd already hit Submit, natch) but it would be harder to read wrapped in <ecode></ecode> .

      And I should have had another set of parentheses around "( dependencies on packages ) or ( third party software that hasn't been ported to 3.0 yet )" to associate it stronger with "like".

      --
      Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
    3. Re:This sentence fragment. by atraintocry · · Score: 1

      If you code in Notepad, maybe :)

  9. In all seriousness by jgtg32a · · Score: 0, Flamebait

    Is there a Python clone that uses C style formatting?

    Maybe I'm just whiny, but the braces and everything are just easier to read.

    1. Re:In all seriousness by bb5ch39t · · Score: 2, Insightful

      I don't think so. One of the design ideas for python, IIRC, was to force "proper indentation" for "proper documentation".
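
As a rough illustration of how indentation does the job braces do elsewhere, the standard `tokenize` module reports block structure as explicit INDENT and DEDENT tokens. A minimal sketch, assuming Python 3; the `source` string is an arbitrary example.

```python
import io
import tokenize

source = (
    "def f(x):\n"
    "    if x:\n"
    "        return 1\n"
    "    return 0\n"
)

# generate_tokens() takes a readline callable that yields str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Each new block opens with INDENT and closes with DEDENT, much like { and }.
indents = sum(1 for tok in tokens if tok.type == tokenize.INDENT)
dedents = sum(1 for tok in tokens if tok.type == tokenize.DEDENT)
print(indents, dedents)
```

The two nested blocks here produce two INDENT and two DEDENT tokens, so the parser sees balanced delimiters even though none are typed.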

    2. Re:In all seriousness by Anonymous Coward · · Score: 2, Insightful

      I thought the same as you once, but I changed my mind. Now the braces just plainly, insanely annoy me.

      Moreover, you cannot imagine how much time is wasted in typing something that has absolutely no meaning: you have to indent anyway. Braces are just a waste of your time.

    3. Re:In all seriousness by Anonymous Coward · · Score: 0

      You could try "from __future__ import braces", but I get the funny feeling this won't be implemented any time soon. Call it a hunch.
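
For the curious, the hunch can be checked without typing it into a file. A small sketch, assuming CPython, that compiles the import from a string and keeps the error text:

```python
try:
    compile("from __future__ import braces", "<example>", "exec")
    message = None
except SyntaxError as exc:
    # CPython answers this particular import with a fixed joke message.
    message = str(exc)

print(message)
```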

    4. Re:In all seriousness by AuMatar · · Score: 3, Insightful

      On the other hand, I've spent at least a full work week of my life fixing problems due to whitespace. Guido made a major fuck up there- by removing braces but not strictly defining whitespace, he's created a language where it's possible to have two identical looking pieces of code do very different things. If he had said that it must be indented by exactly 1 tab or exactly 4 spaces or whatever other measure and everything else would throw a syntax error, it would have been fine. As it is I'd say about 15-20% of the time I spent doing Python was spent fixing these kinds of bugs.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    5. Re:In all seriousness by ultrabot · · Score: 5, Funny

      Yes.

      Example program:

      class MyClass(object): #{
              def myfunction(self, arg1, arg2): #{
                      for i in range(arg1): #{
                              print i
                      # whoops, forgot to close that bracket!
              #}
      #}

      --
      Save your wrists today - switch to Dvorak
    6. Re:In all seriousness by Anonymous Coward · · Score: 3, Insightful

      I think it's ridiculous that people's biggest complaint about Python is that it's whitespace sensitive. Any frustrations with it are easily solved by using the proper tools.

      Learn a text editor, and this isn't an issue. In emacs, C-c C-q will properly tabify the function you're in, and tabs should behave mostly sane thereafter.

    7. Re:In all seriousness by KovaaK · · Score: 1

      Sweet. Now with all that work that you've done in getting python to support braces, can you make it not depend on whitespace? I'm sure it won't take that much more effort.

    8. Re:In all seriousness by ultrabot · · Score: 1

      can you make it not depend on whitespace? I'm sure it won't take that much more effort.

      It would be easy to create a preprocessor to do that, but life's too short. I'll leave the exercise to someone that cares enough.

      --
      Save your wrists today - switch to Dvorak
    9. Re:In all seriousness by AuMatar · · Score: 0, Troll

      I think that it's ridiculous that a language's readability depends on the tool used to read it. It's a sign the language is broken. A proper language would use an easily distinguishable delimiter- anything other than a whitespace. It could be { [ ( or (@$^(*#*@(&#*(@&#^& for all I care. If you need a special tool to read it, it's flawed. I should be able to write my code in emacs, vi, nano, pico, ed, or notepad for that matter without having to spend any time messing with the setup. And my coworkers should be able to use whatever tool they want, even if they are heretical vi users. Nor should I have to know the 1 billion features of emacs. I have better things to waste braincells on.

      By the way, what would you do if you were in an environment that didn't have emacs- say editing on an embedded/mobile device? Or were working off of printouts? Or just didn't have it on the machine and couldn't install it (no network, network outage, improper permissions)?

      I also love that they should behave "mostly sane" thereafter. So even with tools it isn't promised to work right? No thanks.

      And I am speaking from experience here- I worked on a Python project in a team environment. It was a disaster, the whitespace thing caused daily bugs. There's no excuse for the amount of time and productivity it caused us to lose when the solution exists and is 4 or 5 decades old.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    10. Re:In all seriousness by DragonWriter · · Score: 1

      A proper language would use an easily distinguishable delimiter- anything other than a whitespace.

      All programming languages higher level than machine code that I've encountered, except for a few esolangs, use whitespace as a delimiter.

    11. Re:In all seriousness by Moebius+Loop · · Score: 2, Insightful

      I would be interested to see an example of Python code where a change to whitespace causes two identical-looking pieces of code to do two different things.

      The only issue that *ever* comes up in such a scenario is a SyntaxError, and pretty much the only reason they ever happen en masse is indiscriminate copy-and-paste coding.

      Syntax errors can barely even be called bugs, and in any significant project the amount of time you're going to spend dealing with them is easily dwarfed by the *real* bugs that are a natural part of the development process.

      If developers are spending a truly inordinate amount of time on whitespace issues, it can only be due to lack of discretion and attention to detail, which I would be willing to wager is increasing the number of "real" bugs emerging as well.

      --
      have you been seen on slash?
    12. Re:In all seriousness by horza · · Score: 4, Insightful

      I've edited Python in vi, Notepad, SciTE, Geany, and other editors without any problem. Never used emacs though. If whitespace is causing bugs in your team's code you need to (a) introduce process or (b) lose some dead weight from your team. For (a) you can standardise on editor and whether to use tabs or spaces, or you can get the coders to end a whitespace block with a comment, eg # endif. I've only been using Python a couple of years but my experience so far suggests the problem is with you and not the language.

      Phillip.

    13. Re:In all seriousness by AlexMax2742 · · Score: 4, Informative
      --
      I'm the guy with the unpopular opinion
    14. Re:In all seriousness by SleepingWaterBear · · Score: 4, Insightful

      If he had said that it must be indented by exactly 1 tab or exactly 4 spaces or whatever other measure and everything else would throw a syntax error, it would have been fine. As it is I'd say about 15-20% of the time I spent doing Python was spent fixing these kinds of bugs.

      I have to assume that most of your time doing python has been spent copy/pasting code off the web. I've been coding python nearly daily for a couple years now. I've rarely made indentation errors, none in the last few months, and only once have I ever had an indentation error that took more than 10 seconds to debug. The thing is, most indentation errors are so visibly clear that it's really quite hard to make them.

      If you're actually having problems with multiple spaces looking like tabs, you can use the -t option to get a warning when a file mixes tabs and spaces inconsistently (or -tt to make it an error), but it really shouldn't be that hard.
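
Here is a sketch of the kind of tab/space mix being argued about in this thread, assuming Python 3. In an editor that shows tabs four columns wide, the last line of `source` appears to line up with the `if`; Python 2 counted a tab as advancing to the next multiple of eight columns, so it would have placed that line inside the `if` block. Python 3 refuses to guess and raises `TabError`.

```python
source = (
    "for i in range(2):\n"
    "    if i:\n"
    "        print('in if')\n"      # indented with eight spaces
    "\tprint('where am I?')\n"      # indented with one tab
)

try:
    compile(source, "<example>", "exec")
    outcome = "compiled"
except TabError:
    # Raised when tabs and spaces are mixed in a way whose meaning
    # depends on how wide a tab is taken to be.
    outcome = "TabError"

print(outcome)
```

`TabError` is a subclass of `IndentationError`, so code that already catches indentation problems will see this one too.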

    15. Re:In all seriousness by Anonymous Coward · · Score: 0

      Care to provide an example of said code that looks identical but does two completely different things, perhaps? Because I couldn't do it.

      i=1
      while i==1:
        print "test"
        i = 2

      output: test

      i=1
      while i==1:
          #note the extra space for the retarded
          print "test"
          i = 2

      output: test

      As it is, I'd say you're full of shit and you're just a freak for another language trolling this thread, but hey. Everyone's entitled to their opinion. Even if it's stupid.

    16. Re:In all seriousness by Anonymous Coward · · Score: 0

      I think that it's ridiculous anyone still mods you up, it's so obvious that you're trolling it's...well...ridiculous. The more people criticize your opinion the more inflated and retarded your claims about loss of "productivity" due to fucking WHITESPACE become. If the "whitespace thing caused daily bugs" in your little band of fake Python developing friends it was because they weren't fucking writing in the language properly, or they didn't know how to make a straight, vertical line out of familiar keyboard characters. Either way the problem seems to be their stupidity rather than a problem with the language. "No excuse," blah blah blah blah. You probably just got done doing your BSD is dead cut-and-paste job on a GNAA forum you dumb fuck. No one's falling for it any more.

    17. Re:In all seriousness by hobbit · · Score: 1

      I agree. I'm a Python fan but "use a proper text editor" is passing the buck big-style. Guido should have just mandated the use of spaces rather than tabs: everything renders spaces the same.

      --
      "Wise men talk because they have something to say; fools, because they have to say something" - Plato
    18. Re:In all seriousness by Anonymous Coward · · Score: 0

      This is a serious issue that seems to get dismissed regularly by the Python crowd. As you pointed out, the issue is the definition of white space, and like you I have experienced this debugging nightmare myself.

      It is especially troublesome when using code from a third party written in an unknown editor. Basically one needs to have the editor you are using set up to display the various white spaces in differing colors or in other manners so that you can visually see where your code blocks should be. Worst is the cut and paste from one source into another.

      You also hinted on the solution here to the problem that would be a clear and unambiguous definition as to what is white space. Given that it wouldn't take much for a good editor programmer to come up with a really helpful Python mode.
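
Short of a dedicated editor mode, a few lines of Python can at least report which indentation style each line uses. This is a rough sketch, not an existing tool: `indent_styles` is a hypothetical helper name, and it only looks at leading whitespace.

```python
def indent_styles(source):
    """Group 1-based line numbers by the kind of leading whitespace used."""
    styles = {"spaces": [], "tabs": [], "mixed": []}
    for number, line in enumerate(source.splitlines(), start=1):
        # Everything before the first non-space, non-tab character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent:
            continue
        if " " in indent and "\t" in indent:
            styles["mixed"].append(number)
        elif "\t" in indent:
            styles["tabs"].append(number)
        else:
            styles["spaces"].append(number)
    return styles


sample = "if x:\n    a = 1\n \tb = 2\n\tc = 3\n"
report = indent_styles(sample)
print(report)
```

Running something like this over third-party code before pasting it in would show at a glance whether it matches the style of the file it is going into.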

    19. Re:In all seriousness by TheCouchPotatoFamine · · Score: 1

      Gawd, you all do realize that the code needed to re-delimit Python code with braces instead of spaces or tabs is about half a page, right? You can have it both ways. Hint: every time you hit an open bracket, increase a value by four. This value is the number of spaces before each line, except where the line ends in '\' (a line continuation character). When you hit the close bracket, you decrease the number of spaces you emit by four. Simple, no?

      --
      CS majors know the time/space tradeoff, but they never get taught the 3rd, crucial, tradeoff of the set: comprehension!
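
The hint above can be sketched roughly as follows. This is a toy under loud assumptions: an opening brace must be the last character of a line, a closing brace must sit on a line of its own, and braces inside strings or dict literals are not handled, nor are line continuations. `braces_to_indent` is a hypothetical name.

```python
def braces_to_indent(source, width=4):
    """Turn brace-delimited pseudo-Python into indented Python (toy version)."""
    out = []
    depth = 0
    for raw in source.splitlines():
        line = raw.strip()          # original indentation is ignored entirely
        if not line:
            continue
        if line == "}":
            depth -= 1              # close bracket: emit fewer spaces from now on
            continue
        if line.endswith("{"):
            out.append(" " * (width * depth) + line[:-1].rstrip() + ":")
            depth += 1              # open bracket: emit more spaces from now on
        else:
            out.append(" " * (width * depth) + line)
    return "\n".join(out) + "\n"


braced = (
    "def f(n) {\n"
    "  total = 0\n"
    " for i in range(n) {\n"
    "total += i\n"
    "}\n"
    "return total\n"
    "}\n"
)
code = braces_to_indent(braced)
namespace = {}
exec(code, namespace)
```

The sloppy indentation in `braced` is deliberate: only the braces decide the block structure, and the converter emits consistent four-space indentation.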
    21. Re:In all seriousness by gullevek · · Score: 1

      I completely disagree. He should have made tabs mandatory, not spaces. This would have made it all much easier, because 2 spaces and 3 spaces look very similar, but one tab is one tab.

      --
      "Freiheit ist immer auch die Freiheit des Andersdenkenden" - Rosa Luxemburg, 1871 - 1919
    22. Re:In all seriousness by Anonymous Coward · · Score: 0

      I think that it's ridiculous anyone still mods you up, it's so obvious that you're trolling it's...well...ridiculous.

      Pot, meet kettle.

      The GP had a legitimate point. YOU are the troll.

    23. Re:In all seriousness by dodongo · · Score: 1

      Agree completely. Though I have a heavily brace-oriented background, I learned Python unaware that you could use braces to contain code blocks, and I've embraced the tab delimitation completely.

      I've simply never had a major tab problem, and while I don't write terribly complicated code, I nest the hell out of things sometimes. I develop on Windows and use PythonWin and there's just never a problem with indentation. I totally don't get people who troll (not accusing you!) on the topic.

    24. Re:In all seriousness by Count+Fenring · · Score: 1

      Is lisp an esolang now?

    25. Re:In all seriousness by Anonymous Coward · · Score: 0

      I agree. It is a major deficiency of the language to not have braces and a major egotistical mind-game to attempt to foist your numb-nuts design on the world. I, as well as the team I was on, spent many wasted hours debugging whitespace problems. Braces are trivial to match with any reasonable editor. This reminds me of the pidgin people who know how to define the world's best UI and keep ending up with crap. Oh yeah, let's not forget microsoft's "\" for directory delimiters just so they could be different than unix. What a fucking joke. Or how about Apple's "puck" mouse. The list continues on with people being morons. Let Guido Sarduchi have his laughs. I am sure he fucks up the indentation all the time too.

    26. Re:In all seriousness by grumbel · · Score: 1

      The problem with whitespace is that it breaks code. Take a snippet of code, copy it, and paste it into a different indentation level. In a block-oriented language the code will continue to work exactly as intended and at no point will the code be invalid. In Python, on the other hand, the code breaks as soon as you paste and you have to move that broken code back into usable form manually. Now a proper editor can help with that, but it won't stop Python code from temporarily breaking.

      That Python is extremely lax with whitespace of course just makes this problem worse.

      And of course that isn't just theoretical. One of the worst coding experiences I had in any language ever was refactoring a piece of Python code, since the old way of working (just copying stuff around, adding function names to it and then hitting auto-indent) completely broke, and I ended up constantly fixing white space and being extremely careful with my copy&paste, since the code broke after every second operation.
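
      For what it's worth, the manual repair step described above can be scripted. This is a sketch only, using the standard library's textwrap module; the helper name reindent and the sample snippet are made up, and it assumes the pasted block is internally consistent.

```python
import textwrap


def reindent(snippet, target_indent):
    """Strip the snippet's common leading whitespace, then indent it uniformly.

    Tabs are expanded to 8-column tab stops first so that dedent() compares
    like with like.
    """
    normalized = textwrap.dedent(snippet.expandtabs(8))
    return textwrap.indent(normalized, " " * target_indent)


# A block copied from eight columns deep, to be pasted four columns deep.
pasted = "        total = 0\n        for n in values:\n            total += n\n"
print(reindent(pasted, 4))
```

      The relative indentation inside the block survives; only the common prefix changes.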

    27. Re:In all seriousness by he-sk · · Score: 2, Interesting

      A good editor should re-indent the pasted code automatically. In VIM you can use :set ai, si.

      --
      Free Manning, jail Obama.
    28. Re:In all seriousness by he-sk · · Score: 1

      I bet the thought process went something like this:

      Guido: Hmm, should I use spaces or tabs for indentation?
      Colleague 1: Spaces, of course. Spaces look the same everywhere!
      Colleague 2: I disagree. One space is too small to visually indent code. Tabs FTW!
      Guido: Why, I'll just do both.

      --
      Free Manning, jail Obama.
    29. Re:In all seriousness by Anonymous Coward · · Score: 1, Insightful

      Worst is the cut and paste from one source into another.

      So true, but for reasons that probably escape you.

    30. Re:In all seriousness by wiredlogic · · Score: 1

      The point is that it is a flawed design that promotes inadvertent errors in code, just like C's '=' and '==' operators are too easy to carelessly mix up (especially when switching between other languages that use '=' for equality tests). I like Python and the white space delimiting is liberating, but it is unfortunately implemented in an ad hoc way that is susceptible to easily missed breakage. A better language wouldn't depend on the sort of higher-level practices you suggest to guard against these sorts of mistakes.

      --
      I am becoming gerund, destroyer of verbs.
    31. Re:In all seriousness by ins0m · · Score: 1

      echo "set et ts=4 sw=4" >> ~/.vimrc

      You'll thank me in the morning.

      I'm still a diehard C coder at heart, but I'll admit that braces as a syntactic measure are just plain bad (unless you're in a Lisp-variant, where a paren _is_ the whitespace, ffs). It's why reference-counting is insufficient for being a singular GC mechanism, and why, if compilers were built like garbage collectors, work efficiency would plummet.

      Seriously, a decent editor that can swap out tab commands for an N-length block of spaces will alleviate your indentation worries. If you're worried about bytecode and compile-time efficiency (aka, the mythical "zomg whitespace==compile inefficiency!" fallacy), you wouldn't be using an interpreted language in the first place, and you'd also know that Python+Psyco won't ever get you the same hardware optimizations as a true compiled language.

      At that point, you're indenting for readability and maintainability anyway. Braces add nothing to syntax and actually add in avenues for compile-time error. I usually see this sort of issue as being with someone who isn't using Emacs or Vi properly [I'm a Vim user, but I hear tell that (setq default-tab-width) achieves the same thing in your emacs conf]. If you aren't using a capable editor, that's a fault of your editor, not of the language design principle.

      It's been ages since I took compilers or checked out a copy of the G++ source, but IIRC, preamble whitespace is insignificant if you use a line terminator (aka, ";" for most C-lang expressions). Once the tokenizer kicks in, that whitespace becomes irrelevant to the expression because you know where expression and block delineations are. However, if you're typing that way in your CVS check-ins for maintenance sake anyway, why not lose the terminators and make the whitespace relevant? It all compiles the same and makes it easier for jyumang beans to read.

      --
      Never attribute to Hanlon that which can be adequately attributed to Heinlein.
    32. Re:In all seriousness by kdart · · Score: 1

      # vim:ts=4:sw=4:softtabstop=0:smarttab :-)

      --
      The early bird catches the worm. The worm that sleeps late lives to see another day.
    33. Re:In all seriousness by StormyWeather · · Score: 1

      If you don't like the built in editor, I've enjoyed the Easy Eclipse for Python distribution as well.

      http://www.easyeclipse.org/site/distributions/python.html

    34. Re:In all seriousness by Anonymous Coward · · Score: 0

      ts=8 you heretic

    35. Re:In all seriousness by Anonymous Coward · · Score: 0

      Yeah. Substituting curly braces with #endif is sooo smart. SMRT!

    36. Re:In all seriousness by mgiuca · · Score: 2, Insightful

      Guido made a major fuck up there- by removing braces but not strictly defining whitespace

      Stop. First, the whitespace rule in Python *is* strictly defined.

      The formal, exact, unambiguous specification of how Python interprets whitespace is in the official language reference - Lexical analysis.

      It's pretty wordy, but I've studied it and it's quite precise. The relevant section is here:

      "Firstly, tabs are replaced (from left to right) by one to eight spaces such that the total number of characters up to and including the replacement is a multiple of eight"

      This is exactly the same as the default behaviour of Unix `expand`.
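
      The quoted rule can be reproduced with str.expandtabs, which uses the same tab-stop arithmetic. A small sketch; the helper name indentation_width is made up for illustration.

```python
def indentation_width(line, tabsize=8):
    """Return the column where the first non-whitespace character starts.

    Each tab advances to the next multiple of `tabsize`, as in the rule
    quoted from the language reference.
    """
    leading = line[: len(line) - len(line.lstrip(" \t"))]
    return len(leading.expandtabs(tabsize))


# These three prefixes look different in most editors,
# but all of them measure the same under the rule.
for sample in ("\tx = 1", "    \tx = 1", "        x = 1"):
    print(repr(sample), indentation_width(sample))
```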

      [Guido has] created a language where it's possible to have two identical looking pieces of code do very different things.

      It depends what you mean by "looking". To you, perhaps 1 tab looks the same as 4 spaces. To me, maybe it looks the same as 2 spaces. To Jeff, maybe it looks like a red dot in his specially-configured editor. To Python, it happens to look the same as 8 spaces.

      DO NOT MIX TABS AND SPACES. Then, I guarantee you that any two pieces of code which look the same to you (whether they use tabs or spaces) will also look the same to Python. (You don't have to enforce this across a whole file, just on a per-block basis, but it's best if your whole project has an agreed indentation standard).

      If he had said that it must be indented by exactly 1 tab or exactly 4 spaces or whatever other measure and everything else would throw a syntax error.

      That's silly. Then you'd be at Guido's whim; you'd have to indent the way he chose. This way, you can choose any indentation you like. Tabs, 2 spaces, 4 spaces, 3 tabs if you like. As long as you are internally-consistent, Python will be happy.

      My second point to you: If you are pasting code from somewhere into your code, and you do not fix up indentation so it matches the surrounding code, you are worse than Hitler. Or at least very lazy. I don't care if you are using Python or C or Brainfuck.

      If you carelessly paste 1-tab-indented code into a surrounding block which is 4-tab-indented, and don't fix it up, then how do you think I will feel when I open it in my editor configured to expand tabs to 2 spaces instead. It will be totally unreadable -- and this is why we indent in the first place (in any language, that is).

      Python forces you to tidy this up, and that can only be a good thing. If your code is confusing Python, it's probably confusing a bunch of other readers as well.

    37. Re:In all seriousness by hobbit · · Score: 1

      2 spaces, 3 spaces, look all very similar. but one tab is one tab.

      On your machine, one tab might be 8 spaces. But on someone else's, it might be 2 or 3.

      --
      "Wise men talk because they have something to say; fools, because they have to say something" - Plato
    38. Re:In all seriousness by Coryoth · · Score: 1

      The point is that it is a flawed design that promotes inadvertent errors in code, just like C's '=' and '==' operators are too easy to carelessly mix up

      May I suggest you try Ada or Eiffel then, which go to some trouble to try and weed out any such errors as early as possible (usually at compilation time).

    39. Re:In all seriousness by Anonymous Coward · · Score: 0

      I speak from experience as well. 3 teams for a total of 20 developers, with python. None of the problems you present about whitespaces. Situation was mixed, with people using eclipse, vi and emacs.

      I think that your worst problems are lack of adaptation, monkey typing and religious stance.

    40. Re:In all seriousness by Anonymous Coward · · Score: 0

      I think these block formatting holy wars rank very highly on the scale of pathetic geek arguments, but if you have to use a process with one language and not with another, doesn't that point to one language having a difficulty the other does not have?

    41. Re:In all seriousness by compro01 · · Score: 1

      For the "=" and "==" thing, do equality tests backwards, like:

      3==foo

      If you accidentally put

      3=foo

      It'll throw an error about an undeclared variable (presuming you don't have a string variable named "3") when you try to compile.
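
      For comparison, Python avoids this particular mix-up at the grammar level: a plain '=' is not allowed inside a condition, whichever way round it is written. A quick check using the built-in compile(); the helper name compiles is made up.

```python
def compiles(source):
    """Return True if `source` is syntactically valid Python, else False."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles("if foo == 3:\n    pass\n"))  # comparison
print(compiles("if foo = 3:\n    pass\n"))   # accidental assignment
print(compiles("if 3 = foo:\n    pass\n"))   # the "backwards" form
```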

      --
      upon the advice of my lawyer, i have no sig at this time
    42. Re:In all seriousness by jibster · · Score: 3, Insightful

      This is easy to demonstrate

          for i in myarray:
              # Do some stuff here, using spaces to indent. Note we are already inside a function or class; that is, we are not at the first indent level.

              print "Hello world"  # Note: this line is tab-indented. It looks like it's at the right indent level, but it's not.

      Now you expect the code to print "Hello world" a load of times, but it will actually do it only once.

      It's easy to extrapolate this to less trivial problems.

    43. Re:In all seriousness by DragonWriter · · Score: 1

      Is lisp an esolang now?

      No, and it uses whitespace as a delimiter, and, at least in many dialects, differentiates between kinds of whitespace (at least, between newlines and everything else.)

      E.g., in some lisps:

      (foo ; fop (bar baz)
      )

      is not the same as:

      (foo ; fop
      (bar baz))

      Though they differ only in whitespace.

    44. Re:In all seriousness by Draek · · Score: 1

      Sure, it's called C. Since both are Turing-complete languages they're functionally-equivalent, and what's more C-like than C? :)

      If you want a C-like higher-level language, I'd recommend C# with Mono, though if your country's patent system is broken it may not be entirely safe to use. It supports a good part of what makes Python so cool, like lambda functions and being able to pass functions as arguments or even return them, though they aren't as easy as in Python. Still, it's the closest we've got if you want your curly braces.
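
      For readers who haven't seen the Python side of that comparison, here is a small illustrative sketch of passing and returning functions. The names make_scaler and apply_to_all are made up.

```python
def make_scaler(factor):
    """Return a new function that multiplies its argument by `factor`."""
    return lambda value: value * factor


def apply_to_all(func, items):
    """Call `func` on every item and collect the results in a list."""
    return [func(item) for item in items]


double = make_scaler(2)                  # a function returned from a function
print(apply_to_all(double, [1, 2, 3]))   # a function passed as an argument
print(apply_to_all(len, ["a", "bb"]))    # built-ins work the same way
```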

      --
      No problem is insoluble in all conceivable circumstances.
    45. Re:In all seriousness by Anonymous Coward · · Score: 0

      I'll accept your point that it shouldn't be that hard. The point is it shouldn't be *at all* in the first place, i.e., the problem should not even exist.
      Shooting yourself in the foot with a pellet gun because it ain't that hard might be better than using a shotgun, but it's still shooting yourself in the foot.

    46. Re:In all seriousness by Chosen+Reject · · Score: 1

      You're exactly right. But unless tabs are only one space, it's still going to be easy to see the difference between 2 tabs and 3 tabs.

      --
      Stop Global Warming!
      Just say no to irreversible processes!
    47. Re:In all seriousness by dacut · · Score: 1

      A good editor should re-indent the pasted code automatically. In VIM you can use :set ai, si.

      Was this taken directly from the sendmail book of configuration file design? Ai! Si, señor!

    48. Re:In all seriousness by Anonymous Coward · · Score: 0

      you can standardise on editor

      You're kidding, right? We can't even standardize on spelling. Besides, editors aren't really the problem. If I were to drop a block of Python into a Trac ticket which sends mail to another dev running Outlook, what he gets is going to be unusable without a lot of error-prone repairs to the whitespace. Web-hosted discussion forums are notorious for the same thing.

    49. Re:In all seriousness by gullevek · · Score: 1

      but a tab is still a tab. It might be represented by X spaces depending on your setting (eg I have 1 tab = 4 spaces).

      But if you go through code or grep or whatever, you can always say ^\t and be sure that you get what you want.

      With spaces? did he intend with 2? 3? 6? how do you then do this?
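
      The grep idea extends to spaces as well if you classify the whole leading run instead of matching a single character. A rough sketch using the re module; the function names are made up.

```python
import re

LEADING = re.compile(r"^[ \t]*")  # the run of spaces/tabs at the start of a line


def classify_indent(line):
    """Return 'none', 'tabs', 'spaces', or 'mixed' for a line's leading whitespace."""
    prefix = LEADING.match(line).group(0)
    if not prefix:
        return "none"
    if set(prefix) == {"\t"}:
        return "tabs"
    if set(prefix) == {" "}:
        return "spaces"
    return "mixed"


def lines_by_indent(source):
    """Map each indentation kind to the 1-based numbers of the lines using it."""
    report = {}
    for number, line in enumerate(source.splitlines(), start=1):
        if line.strip():  # skip blank lines
            report.setdefault(classify_indent(line), []).append(number)
    return report


sample = "def f():\n\tx = 1\n    y = 2\n \tz = 3\n"
print(lines_by_indent(sample))
```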

      --
      "Freiheit ist immer auch die Freiheit des Andersdenkenden" - Rosa Luxemburg, 1871 - 1919
    50. Re:In all seriousness by onto_dry_land · · Score: 0

      You are going to have to give a real example. If the tab delimited line looks like it is at the right indent level it will behave as if it is on the right indent level.

      One problem some people seem to have is that they display a tab in some non-standard way, e.g. with a tab stop of four columns instead of the correct eight columns. Then identical looking code will indeed behave in different ways. But that problem is not unique to whitespace or to Python. If you have a file viewer that displays the character "q" as "f", then the words "foo" and "qoo" will look the same but behave differently in just about any language.

    51. Re:In all seriousness by louiswins · · Score: 1

      Well, if you want the long options, you can use :set autoindent smartindent

    52. Re:In all seriousness by Anonymous Coward · · Score: 0

      so what version is that? i tried it out...

      myarr = [1,2,3]
      for i in myarr:
                      a = i # Spaces
                      print "Hello" # Tabbed

      this prints "Hello" 3 times AS EXPECTED. It even gives a warning about inconsistent tab/whitespace use when running python -t
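
      Under Python 3, the version this story is about, the same spaces-then-tab layout should not run at all: the compiler rejects indentation whose meaning depends on the tab width. A sketch that checks this with compile(); the helper name check_indentation is made up.

```python
def check_indentation(source):
    """Compile `source` and report how the interpreter treats its indentation."""
    try:
        compile(source, "<example>", "exec")
    except TabError as exc:          # TabError is a subclass of IndentationError
        return "TabError: " + exc.msg
    except IndentationError as exc:
        return "IndentationError: " + exc.msg
    return "ok"


# Eight spaces on one line of the block, a single tab on the next.
spaces_then_tab = "for i in [1, 2, 3]:\n        a = i\n\tprint('Hello')\n"
# The same block indented with spaces throughout.
spaces_only = "for i in [1, 2, 3]:\n        a = i\n        print('Hello')\n"

print(check_indentation(spaces_then_tab))
print(check_indentation(spaces_only))
```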

    53. Re:In all seriousness by jaavaaguru · · Score: 1

      What? just so you can write less readable code?

  10. Unicode & toolkits by ultrabot · · Score: 1

    Python 3 mostly changes things that deal with unicode (i.e. it uses unicode as its "text" object, like Java).
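
    A short illustration of that split in Python 3, where str holds text and bytes holds encoded data. The sample string is arbitrary.

```python
text = "na\u00efve caf\u00e9"   # str: a sequence of Unicode code points
data = text.encode("utf-8")     # bytes: the encoded form used for files and sockets

print(type(text).__name__, len(text))   # length counted in characters
print(type(data).__name__, len(data))   # length counted in bytes
print(data.decode("utf-8") == text)     # decoding round-trips

# The two types no longer mix implicitly.
try:
    text + data
except TypeError as exc:
    print("TypeError:", exc)
```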

    If you don't care about unicode that much (e.g. you mostly deal with development tools, iso-latin1/ascii encoded files...) there is absolutely no rush to hop on the bandwagon. And perhaps you just hate unicode as a concept ;-).

    I predict that the bandwagon will start rolling ~ Q2 / 2009, when toolkits like PyQt4 for 3.0 are materializing.

    --
    Save your wrists today - switch to Dvorak
  11. Oh good. by powerlord · · Score: 4, Insightful

    There's a couple of things that trip over beginners have been removed.

    So whitespace block delineation is finally out, in favor of braces? :P

    --
    This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
    1. Re:Oh good. by SatanicPuppy · · Score: 5, Insightful

      I'll preface this by saying that I program primarily in brace-based languages.

      Braces suck in the worst possible way as a method of delineation. Let me give an example:

      while(...){if(...){if(...){}elseif(...){}}}

      That's clearly the suck, so we break it out like:

      while(...){
          if(...){
              if(...){
              }elseif(...){}
          }
      }

      ...at which point we realize that the braces are basically useless, since the code is unreadable without the whitespace. Python just forces you to use a readable formatting, and it's not all that hard to get used to.

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
    2. Re:Oh good. by Kingrames · · Score: 1

      No,theyremovedthewhitespace,butdidn'taddanybraces.

      --
      If you can read this, I forgot to post anonymously.
    3. Re:Oh good. by hwyhobo · · Score: 5, Insightful

      at which point we realize that the braces are basically useless, since the code is unreadable without the whitespace.

      No, it means the code is hard to read, but it still works. You can reformat that block, or you can change the spaces (tabs, number of spaces), and it will still work. In Python, it may look okay, and be readable, but it won't work.

      I guess it is a matter of priorities.

      BTW, I like Python (and have almost given up on Perl 6), but the white space thing drives me crazy.

      --
      End anonymous moderation and posting on /.
    4. Re:Oh good. by salimma · · Score: 1

      Copy-and-pasting is problematic too, because the pasted code might (in fact, is likely to) end up at the wrong indentation level.

      --
      Michel
      Fedora Project Contribut
    5. Re:Oh good. by garaged · · Score: 1

      That's why Python (and most programming languages) promote the DRY philosophy.

      --
      I'm positive, don't belive me look at my karma
    6. Re:Oh good. by costas · · Score: 4, Insightful

      The whitespace issue is a red-herring: most people get used to it quickly and it's not as strict as it sounds (you can mix-and-match tabs and spaces, as long as you are consistent for each *block*; not even an entire .py file). There's two real-world problems with it: copy-and-paste and generating Python code. Both are much less common than looking at badly-formatted code that it takes a bit to mentally parse which brace-delineated languages have.

    7. Re:Oh good. by hwyhobo · · Score: 1

      While the DRY philosophy may be quite useful in large programs, in small utilities, particularly if they do not necessarily run on the same system, attempting to "librarize" or "functionize" everything may not be practical, and may in fact defeat the readability of your code (of which Python is proud), and also introduce unnecessary bloat.

      Just the fact that you have to fear copy & paste is an indicator of bad design to me.

      --
      End anonymous moderation and posting on /.
    8. Re:Oh good. by larry+bagina · · Score: 1

      indent(1) much?

      The python FAQ's explanation for whitespace block levels is basically that braces/indentation might confuse you. Well, braces don't confuse me. But if you're going to pretend whitespace block indentation is good, then VB-style WHILE ... END, IF .. END IF, etc must be even better.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    9. Re:Oh good. by hwyhobo · · Score: 1

      Both are much less common than looking at badly-formatted code that it takes a bit to mentally parse which brace-delineated languages have.

      There is nothing that prevents an organization from instituting coding standards, just like Technical Publications and Marcomm groups have their own Writing Standards (Guides). It is up to the management to punish non-adherence. However, breaking the program by design because someone missed the number of spaces or copied & pasted a few lines of code just rubs me the wrong way.

      YRMV, of course.

      --
      End anonymous moderation and posting on /.
    10. Re:Oh good. by drauh · · Score: 1

      uh, no. whitespace block indentation means blocks have no delimiters. using "END IF" etc means you're adding a closing delimiter. python says: you don't need it.

      really, though, this stuff is minor. any good editor will handle all this for you, whether with braces or without. the other language i like is LISP, which is diametrically opposite python in its use of ()s everywhere.

      --
      This is a tautology.
    11. Re:Oh good. by jellomizer · · Score: 1

      I was always a fan of the white space.
      Having worked with a lot of languages, I find that forcing good form is actually a nice feature. When the language doesn't enforce proper layout, an application of any size will undoubtedly start getting sloppy. Adding an if statement or a while loop as a bug fix and not taking time to properly indent is quite common. And over, say, the 20-year life span of an application, those simple fixes become sloppy code that is hard to read.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    12. Re:Oh good. by robot_love · · Score: 1

      Wow. I've read a lot about coding and programming, but that's the first time I've ever seen someone claim that a programmer's reluctance to copy and paste was a bad sign.

      Methinks you have chosen a rather strange hill to die on...

      --
      .there is enough of everything for everyone.
    13. Re:Oh good. by hwyhobo · · Score: 1

      Huh? I think you've read into my post what you wanted, not what it said.

      --
      End anonymous moderation and posting on /.
    14. Re:Oh good. by hwyhobo · · Score: 1

      Besides, this is getting seriously off-topic, so I will continue it (if you wish) when a more suitable thread arises.

      --
      End anonymous moderation and posting on /.
    15. Re:Oh good. by Jeremi · · Score: 1

      BTW, I like Python (and have almost given up on Perl 6), but the white space thing drives me crazy.

      I suppose there is no reason someone couldn't write some SVN hooks that would automatically add curly braces to Python code as it was being checked out from the repository, and automatically remove them again (and correctly indent the text) as Python code was being checked back in. And, of course, update the Python interpreter with a flag to optionally require curly braces instead of indentation, as a way of delimiting blocks of text.

      Given that, everybody could use their own favorite method of code formatting...

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
    16. Re:Oh good. by bitMonster · · Score: 2, Interesting
      Somebody wrote an encoding module to support braces in python code as kind of a joke. It's called pybraces.

      The code that implements it is surprisingly short.

    17. Re:Oh good. by cibyr · · Score: 1

      Cut-and-pasting then. It's still a problem (though one that should be solved by the editor, not the language IMHO).

      --
      It's not exactly rocket surgery.
    18. Re:Oh good. by Unoti · · Score: 1

      Parent is right. It sounds daunting before you actually work with it all the time. But in practice, you set up your editor to do the right thing, and it's never a problem again for you except like once every few months.

    19. Re:Oh good. by cryptoluddite · · Score: 1

      There's two real-world problems with it: copy-and-paste and generating Python code.

      And posting code on forums, or any other place that doesn't assign meaning to invisible characters. Python is like the cat of programming languages... always freaking out about some imaginary thing. You'll be writing your code all nice and friendly like and then accidentally hit tab instead of spaces and then out of the blue Rrrwrwr!

      Both are much less common than looking at badly-formatted code that it takes a bit to mentally parse which brace-delineated languages have.

      Any programming editor has a simple command to reindent or reformat the code. If you're writing your program in Word you're in trouble no matter what language.

      On the other hand
      ___adding
      ______somekind of
      ___workaround when posting
      is reallly annoying.

    20. Re:Oh good. by Anonymous Coward · · Score: 0

      You know, I would buy that argument except for all the other places where Python *does* require matching parens/braces/curlies. It's as if you're only willing to apply that argument to statements but ignore function calls, dictionaries, lists etc...

      It also annoys me that lambda is crippled because of this decision. I know, I know, Python programmers don't understand^wneed lambda anyways.

    21. Re:Oh good. by grumbel · · Score: 1

      Except of course that making copy&paste hard is doing the exact inverse of that philosophy, since it makes refactoring code hard, you no longer can move code around, slap a function name on top and be sure it will work afterwards, in Python you have to be extremely careful not to break any whitespace or the code might break in very non-obvious ways.

    22. Re:Oh good. by Nevyn · · Score: 1

      The whitespace issue is a red-herring: most people get used to it quickly

      You have data to back this up, I assume? I'm not certain that the majority of people I know using python hate the whitespace retardedness, but I know it's not a "red-herring" and it's far from 90%+ of people who don't hate it (people either don't care or hate it, with a very few who thinks it's a great idea ... IME).

      There's two real-world problems with it: copy-and-paste and generating Python code.

      Err ... and let's not forget that if you write a goddamn doc comment at the wrong level your program crashes. Or that python code tends to be much more crammed together because everyone is afraid of "lining up" blocks. Or that minimal patching is often impossible. Or, hell, that it's impossible to just position the cursor and start typing (nevermind, as you say, the fact moving code with cut/paste doesn't work). Or that half the people trying to use the editors they've been working with for years can't work the same way with python (for instance using TAB in xemacs to "reset indentation to correct level" just can't work in python) ... or that the other half tend to use a special python editor, just so they can work around the invisible syntax.

      Then there's the bits that seem to be there just to annoy developers, like the fact that it's almost impossible to write functions in ipython to test something out because the whitespace will almost certainly screw you over ... so instead you have to spend 2-10x as long writing the entire thing out in a script (along with all the state of where you are).

      Then as you pointed out there's the awesome insanity of TABs vs. SPACEs, so things that look different aren't! Luckily you can pass -tt, unluckily that affects all the modules you import (and what they import) ... hahahaha!

      And then, just when you are deciding between stabbing your eyes out, breaking the spacebar on GvR's keyboard or patching "from __future__ import braces" to actually work ... someone posts saying that "it's not a problem really".
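
      For anyone who hasn't tried it, the joke import mentioned above is a real easter egg in CPython, and it fails at compile time. A tiny sketch that captures the message instead of crashing; the helper name try_braces is made up.

```python
def try_braces():
    """Attempt to compile the joke import and return the error message, if any."""
    try:
        compile("from __future__ import braces\n", "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


print(try_braces())
```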

      --
      ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
    23. Re:Oh good. by he-sk · · Score: 1

      If you use Notepad, that is.

      --
      Free Manning, jail Obama.
    24. Re:Oh good. by Anonymous Coward · · Score: 0

      Yes, yes, we all know that argument, and it's right too...

      4 things:
      - select that line in any IDE, select reformat/correct indentation/whatever, sorted.
      - this reliance on white space makes Python the only web/email unsafe language. With any other language, if you cut and paste a snippet of code from anywhere, it'll work, no question. Not so in Python.
      - likewise, with any other language, the whole space/tab mess is a mere visual annoyance. In Python it creates bugs.
      - finally, it means that perlish one-liners are right out. Since that's my main (and key) use of Perl, Python is a non-starter.

    25. Re:Oh good. by Anonymous Coward · · Score: 0

      Be very afraid when people start making a virtue out of a necessity, and even more so when they start calling it, or invoking, a "paradigm", "pattern" or "philosophy".

      Summoning up DRY to dismiss the fact that cutting and pasting can break code is dishonest. What if I'm using cut-and-paste to move the code around? Is that bad too?

      This also applies to RAII, or the whole no-checked-exception "KISS" in C#.

    26. Re:Oh good. by Anonymous Coward · · Score: 0

      That's the reason (well, one of them) why Python lacks block delimiters: messy code indentation in other languages. Things turn out funny sometimes.

      There's no need to get rid of braces/delimiters since you use a decent indentation scheme... IMHO this should be *indeed* solved by the editor, not by the language ;-)

    27. Re:Oh good. by Anonymous Coward · · Score: 0

      OK, that's nice. Shall I wait for a few years more until Python syntax is like C/C++ and basically I don't need to learn anything "new"?

    28. Re:Oh good. by mgiuca · · Score: 1

      You can reformat that block, or you can change the spaces (tabs, number of spaces), and it will still work. In Python, it may look okay, and be readable, but it won't work.

      Why would you want to mess up the indentation?

    29. Re:Oh good. by Spacelem · · Score: 1

      No, this does not refer to code duplication, but rather copying someone's code from a forum (where you might post your code for error inspection) becomes tricky due to whitespace mangling. Alternatively copying a bit of code into another file for testing only a small section of the program can become error prone.

      Code reuse on the other hand is something to be encouraged wherever it is practical and useful.

    30. Re:Oh good. by krischik · · Score: 1

      Of course I would use:


      while ... loop
          if ... then
              if ... then
                    null;
              else
                    null;
              end if;
          end if;
      end while;

      Knowing at the end of a block which block ends is nice and can lead to more intelligent error messages.

      Of course braces don't give you that advantage either. They are indeed completely useless and good for nothing. Apart from: How do you do a null / {} statement in Python?

    31. Re:Oh good. by krischik · · Score: 1

      That's the reason (well, one of them) why Python lacks block delimiters: messy code indentation in other languages.

      Only I moved on to automatic code formatting: Cut-Copy-Auto_Format and all is well. Now that won't work in Python.

      Python's strange syntax solves a problem for which better solutions have been found in the meantime.

    32. Re:Oh good. by m50d · · Score: 1
      However, breaking the program by design because someone missed the number of spaces or copied & pasted a few lines of code just rubs me the wrong way.

      It's unpleasant in the short term, but gains you a lot in the long term when you have to maintain the code. It's very much in the python spirit.

      --
      I am trolling
    33. Re:Oh good. by AstronomicUID · · Score: 1

      Apart from: How do you do a null / {} statement in Python?

      while True: pass

      --
      You must write The Book, and then tear away belief. Only you can save the light of man --Gary Numan
    34. Re:Oh good. by asretfroodle · · Score: 1

      For null statements, Python uses the "pass" statement.
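
      A minimal sketch of where "pass" usually shows up, assuming Python 3. The names below (not_implemented_yet, Placeholder, count_even) are made up for illustration:

```python
# `pass` is a statement that does nothing. It fills a block that the
# grammar requires but that has no work to do.


def not_implemented_yet():
    pass  # empty function body; calling it simply returns None


class Placeholder:
    pass  # empty class body


def count_even(numbers):
    """Count the even numbers, explicitly doing nothing for odd ones."""
    count = 0
    for n in numbers:
        if n % 2:
            pass  # the "null statement" branch
        else:
            count += 1
    return count


print(count_even([1, 2, 3, 4]))
```

      In real code the odd branch would just be dropped; it is spelled out here only to mirror the null/else layout from the Ada-style example.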

  12. Backwards Compatibility by zwekiel · · Score: 2, Interesting

    I think that whenever a group releases a new version of their language, they should strive to make it (mostly) backwards compatible. Not only does Python 3.0 change the way things work in relation to specific functions, but it also removes specific language conventions and creates new ones in their places. This means that very large projects have a lot of work to do to bring their project over to the new specification.

    The question is: is this work worth the upgrade to python 3.0? I'd say on the whole, the changes do not contribute enough to the usability of the language to make it ultimately a worthwhile transition to make. I haven't seen really any compelling features in Python 3.0 that would provide enough incentive for me to spend hours of grunt work making all my code workable in Python 3.0.

    </my two cents>

    1. Re:Backwards Compatibility by ultrabot · · Score: 5, Interesting

      I think that whenever a group releases a new version of their language, they should strive to make it (mostly) backwards compatible.

      That's why they keep releasing 2.x versions that are backwards compatible.

      This means that very large projects have a lot of work to do to bring their project over to the new specification.

      Very large projects can stay with the 2.x series (along with a big portion of users) just fine.

      The question is: is this work worth the upgrade to python 3.0? I'd say on the whole, the changes do not contribute enough to the usability of the language to make it ultimately a worthwhile transition to make. I haven't seen really any compelling features in Python 3.0 that would provide enough incentive for me to spend hours of grunt work making all my code workable in Python 3.0.

      And you shouldn't, since it would probably be a waste of work. 2.x is a rock-solid series that is years away from obsolescence, and new serious projects started right now should pick 2.5 / 2.6. Try starting to use Py3 for projects where it fits - your command line scripts, self-contained internal applications... and ramp up the stakes when new libraries are ported.

      A programming language deserves a "cleanup" every now and then - this is such a thing. Hey, people have survived worse things, like gcc version changes, Qt3 => Qt4, Gtk 1 => Gtk2...

      --
      Save your wrists today - switch to Dvorak
    2. Re:Backwards Compatibility by MSG · · Score: 1

      A programming language deserves a "cleanup" every now and then - this is such a thing. Hey, people have survived worse things, like gcc version changes, Qt3 => Qt4, Gtk 1 => Gtk2...

      Not to mention Perl4 -> Perl5.

    3. Re:Backwards Compatibility by negative3 · · Score: 1

      Backwards compatibility is good but it was never a part of the plan for Python 3 from the beginning - that was declared from the start and has been known for years. 2.6 and 3.0 were released close to each other. The biggest worry I have is that 3.0 is SLOWER than 2.6 in the benchmarks I have seen.

      --
      "Physics is to math what sex is to masturbation." - Richard Feynman
    4. Re:Backwards Compatibility by dkf · · Score: 1

      Hey, people have survived worse things, like gcc version changes, Qt3 => Qt4, Gtk 1 => Gtk2...

      Not to mention Perl4 -> Perl5.

      Or Perl5 -> Perl6...

      (The bane of my life has been glibc versioning. Brought to you by the "Stable ABI? What's that?" school of programming.)

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
  13. Correction to Title by Anonymous Coward · · Score: 0

    The Intelligent Design of Python 3

  14. Re:Roland Piquepaille: a case study in madness by Anonymous Coward · · Score: 0

    You mean you dropped him just because of that? Go out and live a bit.

  15. Are distributions going to permit both at once? by Anonymous Coward · · Score: 1, Interesting

    Are Linux distributions that include packaged python versions and apps going to permit both 2.x and 3.x python versions to co-exist so all the apps (including local additions) don't have to be ported on the same day?

    1. Re:Are distributions going to permit both at once? by joe_cot · · Score: 2, Informative

      I currently have both python2.4 and python2.5 installed in Ubuntu. They're different packages, and can easily be installed alongside each other.

      For distributions with dependency management (Ubuntu, Debian, Fedora, any modern distribution), this isn't a hard issue -- in the distros I'm familiar with (Debian/Ubuntu) the different versions of python are just separate packages, apps have a list of dependencies and can list that they depend on a certain version of python, and the python package is just a dummy package that "depends" on the latest version of python.

      The same thing is done with different versions of Java, GTK, etc. When a toolkit or language makes a huge backward-incompatible change, it's rare that they can't just be installed alongside each other. Different 2.x versions of Python work just fine alongside each other, and I don't see how Python 3 would be any different.

    2. Re:Are distributions going to permit both at once? by amorsen · · Score: 1

      Are Linux distributions that include packaged python versions and apps going to permit both 2.x and 3.x python versions to co-exist so all the apps (including local additions) don't have to be ported on the same day?

      Fedora is trying hard to avoid that, because it is so difficult to have two versions installed in parallel. All python modules will have to be available in two (or more) versions, which is a royal pain. However, it is not clear that avoiding it will be possible, so a decision hasn't been made yet.

      --
      Finally! A year of moderation! Ready for 2019?
  16. Re:Roland Piquepaille: a case study in madness by Anonymous Coward · · Score: 0

    I believe the Slashdot editors collect a good deal more than $80 per article, and they just copy and paste a summary. Roland made it easier by copying whole articles, whereas the Slashdot editors leave that to the karma whores and kind ACs.

  17. Cue whitespace ranting from wannabees by Qbertino · · Score: 0, Flamebait

    We've heard it all. Cut it out already. Those ranting about Python's whitespace are the ones who don't know what they are talking about, because they have *never* programmed in Python, not even for half an hour. To all you suckers out there: Freaking write at least one simple bubblesort in Python before you go out on a limb and talk about stuff you don't know.

    --
    We suffer more in our imagination than in reality. - Seneca
    1. Re:Cue whitespace ranting from wannabees by larry+bagina · · Score: 1

      So if someone on the internet told you that stretching your ass felt good, would you do it? Or would you consider your own experiences and preferences?

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    2. Re:Cue whitespace ranting from wannabees by Anonymous Coward · · Score: 0

      Actually popular literature suggests that constricting, not stretching, your anus is the key to "good-bying depression".

  18. I agree.... by mangu · · Score: 1

    There's no way an Intelligent Designer could think that

    >>> print(format(10.0, "7.3g"))
         10

    is a clearer syntax than

    >>> print '%7.3g' % 10.0
         10

    Of course, raw beginners don't know that % means format, but there was a time when I didn't know that / means division either. Will they deprecate all operators because they might confuse a beginner? I think there should be some reasonable limits to that everything-is-an-object thing.

    1. Re:I agree.... by Anonymous Coward · · Score: 0

      results = "%7.3g" % 10.0
      print(results)

    2. Re:I agree.... by spiralx · · Score: 1

      Or use

      print("7.3g".format(10.0))

      like every normal person would.

    3. Re:I agree.... by Kent+Recal · · Score: 1

      It's still more visual noise. Imho the '%' syntax should stay the way it is - it's quick to type and easy to parse for a human.
      Furthermore it'd be great if they would add perl-style =~ regex-matches.

      Other than that they can objectify all they want, as far as I am concerned.

    4. Re:I agree.... by he-sk · · Score: 1

      WTF?! This notation makes the format string appear more important than the actual value. One could argue that the same happens in the old-style Python notation ("%7.3g" appears before the "10"), but that's just an artifact of the example and the old-style Python code is much more powerful.

      --
      Free Manning, jail Obama.
    5. Re:I agree.... by spiralx · · Score: 1

      No, the new style formatting is more powerful:

      "First, thou shalt count to {0}" # References first positional argument
      "My quest is {name}"             # References keyword argument 'name'
      "Weight in tons {0.weight}"      # 'weight' attribute of first positional arg
      "Units destroyed: {players[0]}"  # First element of keyword argument 'players'.
      "Harold's a clever {0!s}"        # Calls str() on the argument first
      "Bring out the holy {name!r}"    # Calls repr() on the argument first

      And my mistake, my example should've been one of these:

      print("{0:7.3g}".format(10.0))
      print("{number:7.3g}".format(number = 10.0))
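
      For anyone who wants to try the field syntax listed above, here is a small runnable sketch, assuming Python 3. The Cargo type and all the argument values are made up for illustration; only the format strings come from the list:

```python
from collections import namedtuple

# Hypothetical record type, used only to demonstrate attribute lookup.
Cargo = namedtuple("Cargo", ["weight"])

examples = [
    "First, thou shalt count to {0}".format(3),             # positional argument
    "My quest is {name}".format(name="to seek the Grail"),  # keyword argument
    "Weight in tons {0.weight}".format(Cargo(weight=12)),   # attribute of arg 0
    "Units destroyed: {players[0]}".format(players=[7, 2]), # index into a keyword arg
    "Harold's a clever {0!s}".format("sheep"),              # str() applied first
    "Bring out the holy {name!r}".format(name="hand grenade"),  # repr() applied first
    "{0:7.3g}".format(10.0),                                # format spec after the colon
]

for line in examples:
    print(line)
```

      The last entry produces the same seven-character, right-aligned result as the old "%7.3g" % 10.0 spelling.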

    6. Re:I agree.... by he-sk · · Score: 1
      Yes, you're right. After reading PEP 3101, I have to agree that the new style is better.

      Your short example tripped me up, but >>> "User ID: {uid} Last seen: {last_login}".format( ... uid="root", ... last_login = "5 Mar 2008 07:20") is very readable. Even though I think that "format" is a poor choice for the method name because you're doing more than string formatting, aren't you?

      --
      Free Manning, jail Obama.
    7. Re:I agree.... by he-sk · · Score: 1

      Accidentally hit the submit button.  The code should be

      >>> "User ID: {uid} Last seen: {last_login}".format(
      ... uid="root",
      ... last_login = "5 Mar 2008 07:20")

      --
      Free Manning, jail Obama.
    8. Re:I agree.... by spiralx · · Score: 1

      What would you use as a method name other than format, given that the method does variable substitution, type conversion, alignment, padding etc etc? :) Plus, it matches usage of the String class in both Java and .NET...

    9. Re:I agree.... by he-sk · · Score: 1

      Good question.  IMHO, substitute would be a better choice since that's always going on, isn't it?

      In Java format belongs to a Formatter class and there's DateFormatter etc., so the code would look different and convey another set of semantics.

      What's really wrong with the new Python syntax is that format is applied to the wrong entity.  In:

      "foo: {bar}".format(bar="baz")

      the data that is formatted is bar="baz" and "foo: {bar}" is really the representation.  So it should be:

      data = { key=value, ... }
      data.format("representation")

      I forget, does Python support inline creation of hash tables?  Then you could inline data:

      {key=value, ...}.format("representation")

      Now, if you change format to toString, the API makes sense.

      It's kinda like the if ("string".equals(variable)) idiom in Java.  It works and has the added benefit of protecting against NullPointerExceptions, but it's not really conveying the intent of the programmer at first glance.  Takes a while to get used to.

      --
      Free Manning, jail Obama.
    10. Re:I agree.... by spiralx · · Score: 1

      There's a String.format(formatstr, ...) method in Java since 1.5 (I think), although it does use the Formatter class internally.

      Yes, in Python you could do

      { "bar": "baz" }.format("foo: {bar}")

      if there were such a method. I can see where you're coming from, but the present usage is consistent with being "Pythonic" e.g.

      ", ".join([str(x * 2) for x in range(10)])

      can be read as "use ', ' to join the supplied list" and similarly

      "foo: {bar}".format(bar = "baz")

      can be read as "use 'foo: {bar}' to format the supplied values".

      The use is also consistent whether or not you use a string or an instance of Formatter to determine formatting, and means you don't need to add a new format method to both tuples and dictionaries so that ("baz", ).format("foo: {0}") also works.

      Just read X.m(Y) as "use X to do m on Y" instead :)

      As for if ("string".equals(variable)), well, that's just Java being crap really :)

      Personal favourite new(-ish) Python feature is its ternary operator:

      result = "True" if condition else "False"

      A bit odd at first, but much more understandable once you're used to it IMO...
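
      A short sketch of how the conditional expression compares with the statement form, assuming Python 2.5 or later. The function names are made up for illustration:

```python
def describe(condition):
    # Conditional expression: value_if_true if test else value_if_false
    return "True" if condition else "False"


def describe_with_statement(condition):
    # The same logic written as an ordinary if/else statement.
    if condition:
        result = "True"
    else:
        result = "False"
    return result


def safe_ratio(numerator, denominator):
    # Only the selected branch is evaluated, so the division never
    # runs when denominator is zero.
    return numerator / denominator if denominator else 0.0


print(describe(3 > 2), describe_with_statement(3 > 2), safe_ratio(1, 0))
```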

    11. Re:I agree.... by he-sk · · Score: 1

      Just read X.m(Y) as "use X to do m on Y" instead :)

      But that's really the opposite of OO semantics which is do m on X with Y. That's bad in a sense that it is not the intuitive thing to do for a beginner. I agree, once you get the idiom, it's very powerful.

      Your join example is of the same kind. But still better than have something like StringUtils.join(sep, collection) in Java.

      --
      Free Manning, jail Obama.
    12. Re:I agree.... by spiralx · · Score: 1

      Do repeat on X using Y as separators? ;) Ok, stretching it... but I think what you call OO semantics is just a convention. And the Pythonic way also has the advantage that you can see that you're using a string template to generate a string result.

      Gah, Java... may I never use it again. C# really is just so much nicer to use if I ever need to use a statically-typed language.

  19. Mostly? There's a lot else by xant · · Score: 1

    Of course the list would be pretty long (good thing I don't have to list it), and of course Unicode is very significant, but I think there are other things just as significant if not more. Example: everything's an iterator now, not just a list.
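
    A rough sketch of what that looks like in practice, assuming Python 3. Strictly speaking some of these are lazy views or lazy sequences rather than iterators, and the variable names are just for illustration:

```python
# map() hands back a lazy iterator instead of building a list up front.
squares = map(lambda x: x * x, range(5))
is_list = isinstance(squares, list)   # False in Python 3

first = next(squares)                 # consumes one item
rest = list(squares)                  # materialises whatever is left

# dict.keys() is a live view rather than a copied list.
prices = {"spam": 1, "eggs": 2}
keys = prices.keys()
prices["ham"] = 3                     # the view sees the new key
current_keys = sorted(keys)

print(is_list, first, rest, current_keys)
```

    When an actual list is needed, wrapping the result in list(...) gets the old behaviour back.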

    BTW, Python 2.x has all the unicode support you need to write a correct application. You just have to use u'unicode strings' instead of 'strings' in a lot more places. Python 3.0 has just switched the default, which will make it easier for application developers to get it right. And that's VERY important. In both versions you have to think about encodings.
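
    To make the switched default concrete, a minimal sketch assuming Python 3, where a plain literal is text and bytes need a b prefix. The sample word is arbitrary:

```python
text = "na\u00efve"            # str: a sequence of Unicode code points
data = text.encode("utf-8")    # bytes: the encoded form, chosen explicitly

text_length = len(text)        # counts code points
data_length = len(data)        # counts bytes; the i-with-diaeresis takes two

# Going back to text also needs an explicit encoding.
round_trip = data.decode("utf-8")

# Mixing the two types is an error rather than a silent conversion.
try:
    text + data
    mixing_raises = False
except TypeError:
    mixing_raises = True

print(text_length, data_length, round_trip == text, mixing_raises)
```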

    My prediction is about 18 months before Python 3.0 is considered the default. My team, in general a pretty early adopter of technology, won't be using it for at least 9 months, waiting for our dependency stack to fill in.

    My fulltext search library, Hypy, on the other hand, should have Python 3.0 support any day now.

    --
    It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
  20. Re:Mmm nothing like a fresh tranny bottom by Anonymous Coward · · Score: 0

    just be careful. I tried to pick up a tranny a couple weeks ago. Turned out he was just a crossdresser. Very embarrassing for both of us.

  21. Best scify movie, eveh! by Gilmoure · · Score: 0, Offtopic

    Python vs. Mansquito III!

    --
    I drank what? -- Socrates
  22. Unfortunately not by mangu · · Score: 1

    So whitespace block delineation is finally out, in favor of braces?

    No, unfortunately they did the worst thing they could do in that respect. Nearly all the changes introduced will make longer lines of code. I think they are trying to make sure that you will need to use the line continuation backslash, which completely negates the advantages of whitespace formatting.

    It seems that their definition of "clean syntax" is Java-like, rather than Perl-like. I never went to the extreme of playing "Perl Golf", but a concise syntax is one of the best ways to make readable code. I started using C rather than Pascal because of that. I switched from Perl to Python when I rewrote some Perl programs in Python and realized that, despite the somewhat longer code, Python was clearer to read. But I still miss the =~ regular expression match operator from Perl.

    An optimal programming language should be well balanced. Not like APL, where a page of code can be condensed to a single character, but it's like learning to write in Chinese. Not like Java either, where you must write several pages of declarations before anything useful comes out. C is very close to the ideal, if you take the effort to understand how a computer works before you start to program. Perl is pretty good, if you resist the temptation to show off your ability. Python was almost there, the perfect compromise between readability and conciseness. Until 3.0, when they went astray...

    I love Python. I hate Py3k.

    1. Re:Unfortunately not by Just+Some+Guy · · Score: 1

      Python was almost there, the perfect compromise between readability and conciseness. Until 3.0, when they went astray...

      So what, specifically, don't you like about Py3K? I've appreciated what I've seen so far. I don't mean that as a troll - I'm genuinely curious. What's less readable or concise for you?

      --
      Dewey, what part of this looks like authorities should be involved?
  23. ie by Anonymous Coward · · Score: 0

    import evolution

  24. Directed Evolution by CustomDesigned · · Score: 0, Troll

    Can't evolution be controlled?

    Of course it can. But then it isn't "evolution" in the religious sense that hard core atheists insist on. The official Dogma explicitly requires *undirected* chance plus natural selection as the ultimate origin of anything that appears to be designed. (Notice I said, "ultimate", nitpickers.)

    I mean really, philosophical materialism is just as silly as the "the universe must have been created in 7 revolutions of a certain planet as measured 14 billion (or 6000) years into its evolution" camp. ("Evolution" in the continuous change according to a set of rules sense). Did they ever consider that our physical time was itself one of the things being (allegedly) created? (Many Church Fathers did - e.g. Augustine)

    There are many meanings of "evolution" in common use, so discussions always end up in equivocation with straw and torn blue jeans all over the place.

    1. Re:Directed Evolution by Anonymous Coward · · Score: 0

      99% of the meanings were put there by creationist morons looking for something they can actually criticise.

  25. There is no excuse... by CustomDesigned · · Score: 1

    for whitespace bugs in python. If your programmers insist on using their own personal editors with their own personal tab expansion preferences - then ban tabs. All fixed. Easily automated. Use a CVS script to reject *.py with tab chars.

    I have also been bitten by C bugs caused by white space. Someone with a different tab stop had entered the code incorrectly, but it looked correct in my editor (with standard unix 8 space tab stops). Never did notice the misaligned brace until running it through pretty print...
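
    A minimal sketch of the kind of tab check that could sit behind such a script, assuming Python 3. The function names are made up, and wiring it into CVS or any other version-control hook is left out:

```python
def lines_with_tabs(source):
    """Return the 1-based numbers of the lines that contain a tab."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if "\t" in line
    ]


def check_files(paths):
    """Report files containing tabs; return True if every file is clean."""
    clean = True
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            offenders = lines_with_tabs(handle.read())
        if offenders:
            clean = False
            print("{0}: tab characters on lines {1}".format(path, offenders))
    return clean


sample = "def f():\n\treturn 1\n    # spaces only\n"
print(lines_with_tabs(sample))
```

    A hook would call check_files on the submitted *.py files and refuse the commit when it returns False.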

  26. I had no Idea by jgtg32a · · Score: 1

    Um my bad on starting this little flame war I had no idea.

  27. Unicode by spitzak · · Score: 1, Troll

    I posted about this before in a previous Python 3.0 article and a lot of people attacked me. However I very much feel that Python's treatment of Unicode as UTF-16 is a HUGE problem that will cause no end of pain. I think a far cleaner solution to Unicode is to do the following:

    - Make unmarked plain quoted strings produce byte strings just like they do now. Unless there are backslashes, the contents are precisely the bytes that are in the input file. Keep the automatic casting of byte strings to unicode strings.

    - Force the encoding to be UTF-8 by default, or at least make it trivial to turn this mode on (in Python2.x the default init deletes the api to do this!)

    - The sequence \uXXXX in a byte string constant should turn into the correct UTF-8 sequence. And the sequence \xXX in a Unicode string should be interpreted as bytes and converted from UTF-8 to unicode. This is necessary so that a string constant can easily be changed between bytes and Unicode.

    - We must have lossless conversion of UTF-8 to UTF-16. The most popular method I have seen is to turn invalid bytes into 0xd8xx (which is invalid UTF-16 as it is lower-half surrogate pairs). Oddly enough this makes the UTF-16 api useless because the reverse conversion is not lossless. I have looked into this and it may be fixable but is complex: the to-UTF-8 converter must not translate a sequence of these to a legal UTF-8 sequence and instead convert that sequence to the typical 3-byte encoding of that number, and the from-UTF-8 converter must treat these typical 3-byte encodings as invalid byte sequences except when they are arranged such that the back converter would make them! This is messy but I see no other way to be able to use backends that insist on UTF-16 (in particular Windows filenames and its clipboard).

    The reason for this is that real Python programs need to handle arbitrary data that is *PROBABLY* UTF-8. Note that by "PROBABLY" I mean that the programmer really really wants to think of it as a sequence of unicode characters, not as a "byte sequence", but it must NOT compare any two different byte sequences as being equal.
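
    For reference, a sketch of one way to get this kind of lossless round trip. It assumes a Python release newer than 3.0 (3.1 or later), where the "surrogateescape" error handler from PEP 383 is available; it is not part of 3.0 itself and the byte strings are made-up sample data:

```python
# Bytes that are *probably* UTF-8 but contain two invalid bytes.
raw = b"report-\xff\xfe.txt"

# Each undecodable byte is smuggled through as a lone low surrogate
# (0xFF becomes U+DCFF), so nothing is thrown away.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_trip = text.encode("utf-8", errors="surrogateescape")

# Different invalid byte sequences stay different after decoding.
other = b"report-\xff\xfd.txt".decode("utf-8", errors="surrogateescape")

print(round_trip == raw, text == other)
```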

    I'm very afraid that Python3.0 as designed will encourage byte sequences to be treated as ISO-8859-1 rather than UTF-8 (because when you set the translation to that it is lossless and no errors are thrown, and \xXX does the same thing in both constants). IMHO this would be very, very bad for internationalization efforts. Believing the programmers will not take this easy solution, and instead rewrite their interfaces to the new byte/unicode naming and correctly handle exceptions thrown by converters is, I think, quite ignorant.

    I am not joking or trolling about this. This has bitten me already and forced us to change all our use of Python from Unicode to byte strings. And we are just reading metadata from image files. Searching for comments on Python 3.0 on the web, it is apparent that web programmers are encountering this far more often and are very worried about this, and they certainly are trying to handle many orders of magnitude more data from sources that may be actively trying to exploit security holes.

    1. Re:Unicode by omuls+are+tasty · · Score: 2, Informative

      I've read your previous posts. You weren't making any sense back then, and you aren't making any now either. However, for some reason your trolling seems to go well with the mods every time.

      It's a fact that the Unicode support in Python is not perfect (see e.g. this post). However, every and any issue you might have with Python 3 internal representation of Unicode strings, you are bound to have with Python 2 as well. The only thing that has changed is that the unicode and str types got replaced with str and bytes types respectively, and that you can't mix the two anymore without explicitly encoding/decoding them. And both of these are good things (TM).

      Also, for someone who has supposedly spent so much time investigating Python Unicode support to great depths, it's rather funny that you don't know that Python does not use UTF-16 for internal representation of Unicode strings. It uses either UCS-2 (the default) or UCS-4, which are both fixed-length encodings. Unfortunately, the default one cannot represent characters outside the Basic Multilingual Plane, but hey - neither can say Java.

      And finally, your argument that "real Python programs need to handle arbitrary data that is *PROBABLY* UTF-8." is utter rubbish. If you need to handle non-ASCII textual data you either:

      1. need to know the encoding of the data explicitly (say an HTTP header, or just a general convention) OR
      2. need to perform heuristics on the data and try to guess the encoding, and then hope for the best
    2. Re:Unicode by Anonymous Coward · · Score: 0

      By far the most significant change related to Unicode is that some byte-oriented interfaces have been completely removed.

      AFAICT, there is no way to get the program's raw argv[]. You can only get Python's unicode version, and if any of the arguments aren't actually text in the locale's encoding, you lose. And there's nothing that your script can do about it; even if you change the encoding on the very first line of your script, it's too late.

      The problem here is that Guido is a unicode True Believer, and intends to advance the cause by causing grief for anything that doesn't fit the unicode world-view.

    3. Re:Unicode by spitzak · · Score: 1

      It may say "UCS-2" but Python on Windows uses UTF-16, by the simple fact that it copies the strings unchanged to the Windows API, and that is UTF-16.

    4. Re:Unicode by spitzak · · Score: 1

      The very paper you linked to shows that the conversion built into Python does UTF-16, not UCS-2:

          >>> char = u"\N{MUSICAL SYMBOL G CLEF}"
          >>> len(char)
          2

      If it did UCS-2 the assignment would have to produce some kind of error, or at least produce a 1-character incorrect string.

      It seems to me you don't know what you are talking about.

      And I really want you to explain how you plan to handle UTF-8 data that may have errors in it. Are you thinking that somehow the real world outside the program is some kind of magical perfect place that does not produce incorrect strings? Do you think people are going to catch errors? Or do you think, like I do, that a million Python programmers are going to give up on Unicode completely and treat all 8-bit data as ISO-8859-1, which is the easy solution?

    5. Re:Unicode by omuls+are+tasty · · Score: 1

      Again, no. G clef is code point 0x1D11E, outside of BMP (higher than 0xFFFF), which UCS-2 (unlike UTF-16) can't represent. As a result, you get nonsense when you try to use that code point in Python. Which is what I said in my previous post.

      How do you handle incorrect UTF-8? You report the error, use some other error handling scheme instead of 'strict' for decoding, or write a decoder of your own.
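
      A minimal sketch of the first two options, assuming Python 3. The sample bytes are made up (Latin-1 encoded text that is not valid UTF-8):

```python
raw = b"caf\xe9"   # 0xE9 starts a multi-byte UTF-8 sequence that never finishes

# Option 1: the default 'strict' handler raises, so the error can be reported.
strict_error = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    strict_error = exc

# Option 2: pick a different error handling scheme.
replaced = raw.decode("utf-8", errors="replace")   # bad byte becomes U+FFFD
ignored = raw.decode("utf-8", errors="ignore")     # bad byte is dropped

print(strict_error is not None, repr(replaced), repr(ignored))
```

      Both 'replace' and 'ignore' lose information, so neither lets you recover the original bytes afterwards.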

    6. Re:Unicode by omuls+are+tasty · · Score: 1

      As I understand a lot of work has been done to provide both string and bytes version for the various filesystem/IO APIs, so I wouldn't say that the Python devs intend to cause "grief for anything that doesn't fit the unicode world-view". But if there's no way to get a bytes version of argv then yes I'd say it's a genuine problem, especially on POSIX systems.

    7. Re:Unicode by spitzak · · Score: 1

      I find it very hard to believe that Python turned U+1D11E into a two-word string when it "does not do UTF-16". No plausible bug in UCS-2 converting would do that, it would either produce an error or a 1-word string.

      How do you handle incorrect UTF-8? You report the error, use some other error handling scheme instead of 'strict' for decoding, or write a decoder of your own.

      Yes you would like to think that. However that is NOT what programmers do, and you are living in fantasy land if you think they will.

      Just yesterday, completely by coincidence, a very smart programmer encountered an invalid UTF-8 encoding that they were trying to display, and the resulting "fix" was: she ran EVERY string through a filter she called "sanitize" that replaced EVERY byte with the high bit set with the string "\xNN" (where NN is the byte in Hex). As far as she was concerned, this "fixed" it, because English text still worked, and the ONLY non-English she ever encountered was an invalid UTF-8 encoding. Basically not only did she completely break UTF-8, she even broke ISO-8859-1 characters!

      Please you have to realize what people will really do!

      Also an anonymous poster points out that Python3.0 forces the argv command line arguments through the default conversion and there is NOTHING the Python program can do about this. This is absolutely one of the worst decisions possible! You basically are unable to put an invalid encoding on the command line, and if there is no way to force UTF-8 before this happens, you cannot put anything other than ISO-8859-1 on the command line! So much for being able to make a Python program that can delete or rename a file with invalid UTF-8 in its name.

    8. Re:Unicode by omuls+are+tasty · · Score: 1

      I find it very hard to believe that Python turned U+1D11E into a two-word string when it "does not do UTF-16". No plausible bug in UCS-2 converting would do that, it would either produce an error or a 1-word string.

      You should think a bit harder. U+1D11E is more than 2 bytes can encode. I'm not an expert on Python internals but you don't need to be one to make an educated guess how that gets encoded into 4 bytes (two UCS-2 characters). If Python did use UTF-16, it would not have trouble representing the character and it would report the length to be 1, not 2, but it does not do so because UTF-16 is variable-length and hence computationally much more expensive. For that very reason there's no language that I know of that does use UTF-16 to represent Unicode.

      Yes you would like to think that. However that is NOT what programmers do, and you are living in fantasy land if you think they will. Just yesterday...

      I'm not living in a fantasy land, you're just working with uneducated programmers. Encodings are not rocket science. And you and your programmers will likely have to learn to work with them if you're going to survive in any kind of non-English environment. Do you think Microsoft added wide byte support because they were kind?

      Also an anonymous poster points out that Python3.0 forces the argv command line arguments through the default conversion and there is NOTHING the Python program can do about this. This is absolutely one of the worst decisions possible! You basically are unable to put an invalid encoding on the command line, and if there is no way to force UTF-8 before this happens, you cannot put anything other than ISO-8859-1 on the command line! So much for being able to make a Python program that can delete or rename a file with invalid UTF-8 in its name.

      ISO-8859-1? I don't think that word means what you think it means. I think the word you're looking for is "ASCII". There are Latin-1 byte sequences that are not valid UTF-8 sequences, so using Latin-1 on the command line does not make you safe in any way.

      As I already said, I otherwise concur that the string-only argv is a blunder and that a bytes version should be provided for all the lowest-level APIs.

    9. Re:Unicode by spitzak · · Score: 1

      len of a UTF-16 non-BMP character had certainly better be 2! If you think len() of a variable-length encoding should do anything other than return the number of code units, you have certainly never worked with variable-length encodings. I can tell you that is USELESS, unless your purpose is to artificially make it impossible to use a variable-length encoding (I have seen this ridiculous approach used as "proof" that UTF-8 is unusable, but it is completely bogus. strlen() returns the number of code units and if you think otherwise you are an IDIOT).

      I don't think the Python authors are so stupid. The fact that len returns 2 indicates they have not gone off the deep end, and they are supporting UTF-16 in the correct way.
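
      To make the "code units" point concrete, here is a minimal sketch (not from the thread) that counts code units by encoding a string. The helper names are made up for illustration. Note that on Python 3.3 and later, len() on a str counts code points, so len("\U0001D11E") is 1 there; the value 2 discussed above comes from older narrow (UCS-2) builds.

```python
def utf16_code_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    # "utf-16-le" writes no byte-order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2


def utf8_code_units(text):
    """Number of 8-bit code units (bytes) needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


g_clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

print(utf16_code_units(g_clef))  # surrogate pair
print(utf8_code_units(g_clef))   # four-byte sequence
```

      This is roughly what a C-style strlen() would report for each encoding: storage units, not characters.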

      ISO-8859-1? I don't think that word means what you think it means

      What I meant by "ISO-8859-1" is the expected result when lazy programmers are forced to write their own decoders because Python is not providing a lossless UTF-8 decoder. I expect a very common solution will be "change every byte into the matching 16-bit word". In fact I was shocked when I saw an even *worse* solution which was "change every byte with the high bit set to "\xNN" and then run the normal converter". I had better learn to stop being shocked by the shoddy things programmers will do. I do not think you have experienced this or you would not be so blase about "oh they will implement their own decoders".

      Do you think Microsoft added wide byte support because they were kind?

      Microsoft forced wide bytes on us because a bunch of politically-correct idiots thought that there was some horrible problem if English gets the "better" shorter encodings. If they really wanted I18N they would have implemented UTF-8 like intelligent people had done before them in Plan9. Microsoft has done more damage to Unicode than 20 years of ASCII-only programmers could ever do with this boneheaded move.

      The only advantage of the Unix wars is that they delayed the same politically correct bullshit from appearing on Unix (Sun was certainly busy implementing "wide characters" when I was working there), which probably would have forced it onto the internet protocols. Now we only have to deal with this crap on Windows. And with stupid people writing Python, apparently.

    10. Re:Unicode by omuls+are+tasty · · Score: 1

      I have to apologize to you. I was under the wrong impression that UCS-2 uses the entire word range, but it doesn't; the surrogate range is not used, making it upwards-compatible with UTF-16. And since v2.2, Python can deal with surrogate pairs in \U escapes and some other places. While you get valid UTF-16, the support is half-baked. E.g. the G clef example is not recognized as a single character by unicodedata.name on my Windows build, but rather as two (invalid) UCS-2 characters. My Linux build works nicely, suggesting it was built with UCS-4 support.
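
      For reference, the surrogate-pair arithmetic behind the G clef example can be sketched as below. The function names are illustrative, not part of any Python API; the constants are the ones defined by UTF-16.

```python
import unicodedata


def to_surrogates(code_point):
    """Split a code point above U+FFFF into a (high, low) UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points have surrogate pairs")
    offset = code_point - 0x10000            # 20 significant bits remain
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1D11E)
print(hex(high), hex(low))
print(unicodedata.name(chr(from_surrogates(high, low))))
```

      On a build where str holds full code points, unicodedata.name() sees one character; on a narrow build it would be handed the two halves separately, which matches the behavior described above.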

      Maybe I'm an idiot, but I'm not so hot for code units. The behavior of len() itself is not that important, but I don't want to think about cutting off code points in the middle while iterating and slicing; it's a much bigger concern to me than round-tripping invalid UTF-8.

      I still don't understand the crux of your argument, really. Are you suggesting that UTF-8 should be used for internal representation? Now that I think about it, the fact that Microsoft chose another route suddenly makes it seem like it might be the cleverer choice ;) Still, Java, C# and Qt (the other languages/library I've used when dealing with Unicode) all use UCS-2/UTF-16, so the criticism could well apply there as well. If Python 3 provides bytes versions of low-level interfaces I don't see what the fuss is about. People who write broken software in Python 3 would still do it in Python 2.

    11. Re:Unicode by spitzak · · Score: 1

      I believe counting anything other than code units will lead to far more broken strings. The problem is that eventually somebody will use that number as an offset rather than calling the "count n unicode points" function. Far better to use offset at all times.

      I do believe that using UTF-8 as an internal representation is a very good idea, but that few have realized it except for K&R in Plan9. The main reason is that it is the easiest way to preserve invalid strings and to avoid the many security and other bugs when a function maps more than one string to the same object.

      However, another potential solution is a lossless conversion from UTF-8 to UTF-16 such as UTF-8b. This is problematic because a simple implementation will break the lossless conversion of UTF-16 to UTF-8, which in effect means you cannot use any UTF-16 API to your program any more. It may be possible to make lossless conversion both ways, but I have tried to figure this out and it is not easy. In any case, even if Python switched to UTF-8, this is going to be necessary if we are going to work around Windows' stupid decision to use UTF-16 for filenames.
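
      A UTF-8b style conversion later became available in Python itself: PEP 383 added the "surrogateescape" error handler in Python 3.1, after this thread. The following is a minimal sketch of the round trip; the two wrapper functions are hypothetical names for illustration.

```python
def decode_lossless(raw):
    """Decode bytes as UTF-8, mapping each undecodable byte 0xNN to U+DCNN."""
    return raw.decode("utf-8", errors="surrogateescape")


def encode_lossless(text):
    """Inverse of decode_lossless: turn U+DCNN escapes back into raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


raw = b"caf\xc3\xa9-\xff\xfe"      # valid UTF-8 followed by two invalid bytes
text = decode_lossless(raw)

print(ascii(text))                  # the invalid bytes appear as lone surrogates
print(encode_lossless(text) == raw)

try:
    text.encode("utf-8")            # the strict codec refuses lone surrogates
except UnicodeEncodeError as exc:
    print("strict encode failed:", exc.reason)
```

      The last step shows the cost mentioned above: the escaped string is no longer something a strict encoder will accept, so every boundary that writes it out has to opt in to the same error handler.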

      Certainly with the variable length there are now zero reasons to choose UTF-16 over UTF-8, but UTF-32 I suppose still has an argument for it.

      My primary concern with Python3 was that you can no longer write bytestring()=="string constant" without it producing an exception if the bytestring has invalid encoding in it. Actually I learned that Python 3.0 gives you a type conversion error always for this, that is a lot better, but it means an awful lot of Python software will not compile.
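
      As a hedged illustration of how bytes and str interact in Python 3: an equality test between the two simply never matches (running the interpreter with -b adds a BytesWarning), while mixing them in concatenation raises TypeError. The helper names below are made up for this sketch.

```python
def naive_equals(data, text):
    # bytes == str is always False in Python 3; no decoding is attempted.
    return data == text


def encoded_equals(data, text, encoding="utf-8"):
    # Encode the constant and compare bytes with bytes, so invalid input
    # in `data` can never raise a decoding error.
    return data == text.encode(encoding)


def try_concat(data, text):
    # Mixing the two types in an operation is where the TypeError shows up.
    try:
        return data + text
    except TypeError as exc:
        return "TypeError: %s" % exc


print(naive_equals(b"string constant", "string constant"))
print(encoded_equals(b"string constant", "string constant"))
print(encoded_equals(b"\xff\xfe", "string constant"))
print(try_concat(b"abc", "def"))
```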

      I am very concerned, just from previous experience, that unless Python makes it trivial to handle invalid UTF-8, most programmers will just punt and translate it as either ISO-8859-1 or even ASCII, which is extremely counter-productive if the intention is to encourage Unicode. And the converter throwing exceptions I think will lead to a lot of DOS attacks.

      I am also concerned that it is impossible to correctly change a string constant between unicode and bytes by putting a 'b' in front of it, due to different results for \x and \u. This I think will be a big obstacle for the "just change your code to use bytestrings" argument.
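
      The \x mismatch can be seen directly. In a str literal \xNN names the code point U+00NN, while in a bytes literal it names the raw byte 0xNN; bytes literals do not interpret \u escapes at all. A small sketch, with is_valid_utf8 as a hypothetical helper:

```python
def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


text_x = "\xe9"     # str: one code point, U+00E9
text_u = "\u00e9"   # str: the same code point, written with \u
bytes_x = b"\xe9"   # bytes: the single byte 0xE9

print(text_x == text_u)                     # both escapes agree inside str
print(text_x.encode("utf-8"))               # two bytes in UTF-8
print(text_x.encode("latin-1") == bytes_x)  # adding 'b' behaves like Latin-1
print(is_valid_utf8(bytes_x))               # and is not UTF-8 on its own
```

      So toggling the 'b' prefix on a literal containing \xNN escapes above 0x7F silently switches between "a code point" and "a byte that happens to match ISO-8859-1".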

    12. Re:Unicode by omuls+are+tasty · · Score: 1

      I disagree about UTF-8. My native tongue and both of its scripts (Latin and Cyrillic) use non-ASCII characters, and iteration and slicing pop up fairly often - much more often than invalid UTF-8. I'd really hate to have to think about code point boundaries while working on them; UCS-2/UTF-32 are much nicer in that regard.

      Again, I'm not saying that the issue of handling invalid UTF-8 is not important. It is of course completely trivial to handle invalid UTF-8 in Python - it's just not trivial to round trip it. Still, if Python provides bytes interface for the lowest level APIs I don't see a huge problem there - even though a standard library solution would be very nice.

      And I certainly don't see a big problem with string literals, just add .encode()?

    13. Re:Unicode by spitzak · · Score: 1

      Iterating is pretty easy in UTF-8. The problem is that many people think iterating is "increment an integer and then pass it to this function" and the only way to implement that is to iterate again all the way from the start of the string, which is inefficient. A "iterator object" would do the job and you could select from several iterators depending on whether you wanted combining characters, words, Korean syllables, etc.
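
      A minimal sketch of such an iterator over UTF-8 bytes follows. It is an illustration only: it assumes well-formed input and does not validate continuation bytes, overlong forms, or truncated sequences. Each step yields a byte offset, which is the value one would keep for later slicing.

```python
def iter_code_points(data):
    """Yield (byte_offset, code_point) pairs from well-formed UTF-8 bytes."""
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:                  # 0xxxxxxx: ASCII, one byte
            width, cp = 1, lead
        elif lead >> 5 == 0b110:         # 110xxxxx: two-byte sequence
            width, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:        # 1110xxxx: three-byte sequence
            width, cp = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:       # 11110xxx: four-byte sequence
            width, cp = 4, lead & 0x07
        else:
            raise ValueError("unexpected byte 0x%02x at offset %d" % (lead, i))
        for cont in data[i + 1:i + width]:
            cp = (cp << 6) | (cont & 0x3F)   # append 6 payload bits per byte
        yield i, cp
        i += width


sample = "a\u00e9\U0001D11E".encode("utf-8")
for offset, cp in iter_code_points(sample):
    print(offset, "U+%04X" % cp)
```

      The loop never restarts from the beginning of the string, which is the efficiency argument made above against indexing by integer.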

      The fact that lots of programmers think that you do string[++integer] to iterate is a huge problem and really the big obstacle to using UTF-8 or UTF-16. But it would help Unicode considerably if people were forced to use iterators, as they are the only way to correctly do decomposed characters and a number of languages. I think it would have been nice if Unicode had refused to implement precomposed characters to force people to do it correctly from the start.

      "constant".encode() does seem to work. Concatenation of the results can be used if you need to make a string constant containing both UTF-8 errors and readable source of Unicode characters. I still feel this is a lot less obvious than just adding/removing the 'b' in front of the string (i.e. I did not think of it) and thus I still feel the string constant issue should be addressed.

    14. Re:Unicode by omuls+are+tasty · · Score: 1

      Well you almost always use iterators in Python anyway, so yes the str class could do UTF-8 internally and implement an "intelligent" iterator. Slicing would require a bit more work, and generally the whole thing would be somewhat slower but who cares, Python isn't exactly the pinnacle of speed anyways. It's just a matter of priorities I guess.

      But why the heck would you want to create constant strings containing UTF-8 errors? The only need I see for creating purposely erroneous UTF-8 is for round tripping existing malformed identifiers (such as filenames), not to introduce new ones! And also, how would you treat \u escapes in byte strings? Always as UTF-8? According to the encoding of the file? Explicit is better than implicit...

    15. Re:Unicode by spitzak · · Score: 1

      Slicing would not be a problem if you cut at the iterator values. Also a huge amount of slicing is just to fit into fixed-size buffers that are glued together again later, this will not break UTF-8 if provisions are made to remember the parsing state between each block (this usually happens automatically, such as when writing to a pipe).
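
      The "remember the parsing state between each block" idea is available in the standard library as an incremental decoder. The sketch below cuts the input at arbitrary byte positions, including in the middle of a multi-byte sequence; decode_in_blocks is a hypothetical helper name.

```python
import codecs


def decode_in_blocks(data, block_size):
    """Decode UTF-8 bytes that arrive in fixed-size blocks."""
    # The decoder object buffers an incomplete trailing sequence between calls.
    decoder = codecs.getincrementaldecoder("utf-8")()
    pieces = []
    for start in range(0, len(data), block_size):
        pieces.append(decoder.decode(data[start:start + block_size]))
    # final=True makes a truncated sequence at the very end an error.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


original = "a\u00e9\U0001D11E" * 3
encoded = original.encode("utf-8")
for size in (1, 2, 3, 5):
    print(size, decode_in_blocks(encoded, size) == original)
```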

      The main reason for creating constant strings containing UTF-8 errors is because they are actually ISO-8859-1 or CP1252 strings. You are probably right that requiring a 'b' before them is acceptable. Most other uses don't require string constants but just preservation of errors, such as "write a Python program to delete this file with an invalid UTF-8 name".
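
      For the "delete this file with an invalid UTF-8 name" case, the bytes interfaces of the os module are enough: os.listdir() returns bytes names when given a bytes path, and os.remove() accepts them. This sketch assumes a POSIX filesystem that allows arbitrary bytes in names (it will not behave this way on Windows or on filesystems that enforce valid Unicode); remove_undecodable and demo are illustrative names.

```python
import os
import tempfile


def remove_undecodable(directory):
    """Delete entries in `directory` (a bytes path) whose names are not valid UTF-8."""
    removed = []
    for name in os.listdir(directory):       # bytes in, bytes out
        try:
            name.decode("utf-8")
        except UnicodeDecodeError:
            os.remove(os.path.join(directory, name))
            removed.append(name)
    return removed


def demo():
    """Create one good and one undecodable name in a temp dir, then clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        for name in (b"good.txt", b"bad\xff.txt"):
            with open(os.path.join(tmp_bytes, name), "wb"):
                pass
        removed = remove_undecodable(tmp_bytes)
        return removed, sorted(os.listdir(tmp_bytes))


print(demo())
```

      No decoding of the offending name is needed at any point, so nothing has to round-trip through a str.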

      I would hard-code the meaning of \uXXXX in byte strings to mean UTF-8. Most other encodings of any interest are 1-byte ones (and thus \xXX works and is already used by most programmers). There are the older Asian 2-byte encodings, but I think it is safe to require encode() or literal \xXX sequences to quote them. The reason for this is to make it easy for programmers to change their constants between bytes and unicode when using UTF-8.

      I would also make \xXX in Unicode strings mean "decode this as though it is UTF-8". The primary reason for this is because this is the only form that is portable to many languages such as C++. You are right that using encode() would work as well but because no compilation error is produced people are going to make mistakes without this.

  28. 2.6 just as well supported as 3.0? I'm jealous! by KWTm · · Score: 1

    As a KDE fan, I have to say just how jealous I am that other software development communities actually have common sense.

    Apache: "Our newest is Apache 2, but you can use our rock-solid Apache 1 if you want."
    Python: "Our newest is Python 3, but you can use our rock-solid Python 2 if you want."
    KDE: "What!? You're still using KDE 3? But we told all our developers to drop all KDE 3 and move on to our newest KDE 4, which just came out with the second release candidate version of the beta for our alpha version! Get with the times, man!" ... sigh ....

    --
    404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
    [GPG key in journal]
    1. Re:2.6 just as well supported as 3.0? I'm jealous! by drinkypoo · · Score: 0, Offtopic

      Don't feel bad, GNOME is racing KDE to the bottom. The new network manager MIGHT work properly which would be a nice change, but I wouldn't know because the thing is so fucking inscrutable. And for NO REASON WHATSOEVER gnome-panel was set to a "required" application so you can't quit the last gnome-panel... Well, there MIGHT be a reason. I think that logout doesn't quite work right without it. So pathetic. I ended up putting XP back on my laptop because Linux has become less reliable on this system as time has gone by (Ubuntu's fault? Someone else? No idea) and XP "just works", sad but true. And it's a Centrino/Quadro system, too! And you know what? Windows XP is about the most usable GUI (when it works) that there is today. It avoids all the retarded mistakes of OSX and has all the most key features. And it's not Vista :)

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  29. No sense forking Python when you can trick it by dhTardis · · Score: 1

    Is there a Python clone that uses C style formatting?

    See http://www.emacswiki.org/emacs-en/PyIndent

  30. Re:Roland Piquepaille: a case study in madness by Anonymous Coward · · Score: 0

    I bet you are that guy discussing with him the other day. The one who bought the low id account and confessed doing so.

    Get a life

  31. Re:Compatibility by danieltdp · · Score: 1

    Even though it is flamebait, this one deserves a reply. It is very important to understand that 3.0 AND 2.6 are being supported at the same time by the same core developers. Code breaks only if you go for 3.0. Stay on 2.6 and you will be fine.

    That being said, I do agree that this raises questions regarding the feeling that we now have two distinct languages. I really have mixed feelings on this one, but I am sure that no script got broken, as you have the option of not going for 3.0 and still getting support.

    --
    -- dnl