Python 2.6 to Smooth the Way for 3.0, Coming Next Month

Posted by ScuttleMonkey on Friday October 3, 2008 @10:49AM from the my-tab-key-still-hates-you dept.

darthcamaro writes "Some programming languages just move on to major version numbers, leaving older legacy versions (and users) behind, but that's not the plan for Python. Python 2.6 has the key goal of trying to ensure compatibility between Python 2.x and Python 3.0, which is due out in a month's time. From the article: 'Once you have your code running on 2.6, you can start getting ready for 3.0 in a number of ways,' Guido van Rossum said. 'In particular, you can turn on "Py3k warnings," which will warn you about obsolete usage patterns for which alternatives already exist in 2.6. You can then change your code to use the modern alternative, and this will make you more ready for 3.0.'"

184 comments

More ready? by RichiH · 2008-10-03 10:56 · Score: 0, Offtopic

English is not my first language, but isn't "more ready" wrong?
1. Re:More ready? by Anonymous Coward · 2008-10-03 10:57 · Score: 0
  
  It's not eloquent, but I don't think it's wrong.
2. Re:More ready? by Onaga · 2008-10-03 11:04 · Score: 4, Funny
  
  But which one is correcter?
3. Re:More ready? by cpicon92 · 2008-10-03 11:04 · Score: 1
  
  Technically the correct term would be readier, but that sounds a little awkward to some people. Generally the rule is: one syllable = [adjective]er; more than one syllable = more [adjective]. Unfortunately, very few people tend to adhere to this. They usually randomly pick one method or the other, or worse, they use both (more readier).
4. Re:More ready? by Anonymous Coward · 2008-10-03 11:23 · Score: 2, Interesting
  
  Technically the correct term would be readier, but that sounds a little awkward to some people. Generally the rule is: one syllable = [adjective]er; more than one syllable = more [adjective]. Unfortunately, very few people tend to adhere to this. They usually randomly pick one method or the other, or worse, they use both (more readier).
  Ready has two syllables.
5. Re:More ready? by Shin-LaC · 2008-10-03 12:12 · Score: 1
  
  GP missed an important part of the general rule: adjectives that end with "y" form the comparative with "ier" even if they are two syllables. Uglier, happier, prettier, etc.
6. Re:More ready? by Anonymous Coward · 2008-10-03 12:46 · Score: 0
  
  There's only one way to find out!
  FFFIIIGGGHHHTTT!!
i like python by demmer · 2008-10-03 11:02 · Score: 0, Insightful

because of that
1. Re:i like python by h4rm0ny · 2008-10-03 20:46 · Score: 1, Interesting
  
  I like Python for a whole lot of other reasons too. I am a programming language snob. I used to write device drivers in C, I respected the power of the language and how unforgiving it was. My first reaction to Python was "layout is part of the language? Ha!". But credit to me, I tried it out properly, and fuck me, it's fun! I needed to carry out some very repetitive operations on a web interface and naturally I didn't want to spend hours clicking buttons on a website. I thought to myself, I wonder how hard it is to manage cookies in Python. About forty minutes later I had a flexible and working Python script which was carrying out all my actions for me. And the forty minutes was mostly writing supporting code to compute the appropriate actions to send to the URL.
  
  Python is, quite simply, great. You only have to read from, say, the Python Cookbook to get a feel for how much thought has gone into the design of the language. I'm still a programming language snob, it's just that I found Python was well worth being proud of using. :D
  
  (Mind you, their online documentation could be better - PHP's site, for example, is so much friendlier).
  
  --
  
  Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
2. Re:i like python by AlXtreme · 2008-10-03 23:39 · Score: 4, Informative
  
  (Mind you, their online documentation could be better - PHP's site, for example, is so much friendlier).
  They're actually hard at work on that problem too. In addition to Python 2.6 being released, the Python documentation is now generated using Sphinx. See for example the new tutorial output. Big WTF the first time I saw it, but it's a decent improvement with more in the pipeline.
  
  --
  This sig is intentionally left blank
3. Re:i like python by sg_oneill · 2008-10-04 17:07 · Score: 1
  
  That's more to do with PHP having excellent documentation than Python's being poor.
  I do agree the formal stuff is a bit terse, but once you've mastered the tutorial, most of Python is really just working out the modules, and they tend to be pretty well documented.
  
  --
  Excuse the Unicode crap in my posts. That's an apostrophe, and slashdot is busted.
Not sure about this one by the+eric+conspiracy · 2008-10-03 11:03 · Score: 2

Why not just wait for 3.0 to make the changes? That way you'll only have to test everything once.
And if it's like some other languages you might have a long time to wait before 3.0.
1. Re:Not sure about this one by jeremiahstanley · 2008-10-03 11:11 · Score: 5, Insightful
  
  Because the development cycle is longer than that for derivative projects. Imagine if you could have a cycled and tested app that was ready from day 0...
  
  --
  Hire me...
2. Re:Not sure about this one by arevos · 2008-10-03 11:46 · Score: 4, Informative
  
  And if it's like some other languages you might have a long time to wait before 3.0.
  Given that the first release candidate of Python 3.0 is already out, I doubt we'll be in for a very long wait.
3. Re:Not sure about this one by AM088 · 2008-10-03 11:58 · Score: 3, Informative
  
  I think the point is that with 2.6, your old code will work but will tell you what to change. If you move to 3.0, unless you have those changes already, it just won't work.
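The "works, but tells you what to change" behaviour described above can be illustrated with the standard `warnings` module. This is only a sketch of the general mechanism, not of the interpreter's own Py3k warnings (which 2.6 enables with the `-3` command-line switch). The function `old_style_api` is a hypothetical legacy helper and `collect_warnings` is an illustrative name; neither comes from the thread.

```python
import warnings


def old_style_api(values):
    """Hypothetical legacy helper: still works, but announces its replacement."""
    warnings.warn(
        "old_style_api() is obsolete; use sorted() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return sorted(values)


def collect_warnings(func, *args):
    """Run func(*args) and return (result, list of warning messages raised)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = func(*args)
    return result, [str(item.message) for item in caught]


result, messages = collect_warnings(old_style_api, [3, 1, 2])
```

The call still returns its normal result; the warning is extra information that can be logged or, in a test run, turned into a failure.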
4. Re:Not sure about this one by Anonymous Coward · 2008-10-03 12:03 · Score: 0
  
  It's around 2 weeks till final is out. They're planning on doing 1 more RC I think, then it's over and done with. Code base is long since locked except for critical bugs.
5. Re:Not sure about this one by fyngyrz · 2008-10-03 12:07 · Score: 2, Insightful
  
  If you move to 3.0, unless you have those changes already, it just won't work.
  
  ...which is why some heavy python users, myself included, aren't going to use 2.6 or 3.0. I have huge amounts of python in operation, and the very last thing I'm going to do is break any of it with an incompatible language that happens to slightly resemble python (no matter who wrote it, and no matter what they call it, it isn't python if it can't run mundane python code.)
  Every once in a while we see one of these "brainstorms"; for example, Microsoft pulled VB from the office suite... only to put it back. Because the idea was stupid; there was a ton of production code / applications they flat out broke. Python's doing exactly the same thing, and it's not going to work out for the same reason(s.)
  If you're going to modify a language, you *must* do it in a compatible manner, otherwise what you're doing is making a new language that will require an entirely new community. Names notwithstanding, and resemblance beyond incompatibilities notwithstanding.
  
  --
  I've fallen off your lawn, and I can't get up.
6. Re:Not sure about this one by tazzzzz · 2008-10-03 12:54 · Score: 5, Informative
  
  ...which is why some heavy python users, myself included, aren't going to use 2.6 or 3.0. I have huge amounts of python in operation, and the very last thing I'm going to do is break any of it with an incompatible language that happens to slightly resemble python (no matter who wrote it, and no matter what they call it, it isn't python if it can't run mundane python code.)
  "slightly resemble python"? Python 3.0 code looks just like the Python that's been around for years. Maybe there's some handy new syntax (with), but it's still Python.
  This is not about fundamentally changing Python. This is about cleaning up warts, some of which have been around since Python 1.x.
  
  If you're going to modify a language, you *must* do it in a compatible manner, otherwise what you're doing is making a new language that will require an entirely new community. Names notwithstanding, and resemblance beyond incompatibilities notwithstanding.
  From what I've seen, the Python devs have put together about the best possible migration path while still actually making the changes that need to be made.
  Here's the picture, in case it's not clear: Python 2.6 is just as backwards compatible as the other 2.x releases. Which is to say that porting from 2.5 to 2.6 is pretty trivial. I'd expect any actively used and maintained library to be 2.6 compatible within weeks (and a great many probably didn't break at all).
  2.6 lets you use many of 3.0's features that don't break compatibility (and there are many). It also has a warnings mode to help you spot 3.0 incompatible code. And it lets you selectively turn on 3.0 features within a module.
  Want to start using the new print function?
  from __future__ import print_function
  Voila! The print keyword goes away and you have the new print function. Certainly bits of new Python 3.0 syntax work now as well:
  try:
      1/0
  except ZeroDivisionError as e:
      pass
  The "as e" bit is new.
  Finally, there's actually a "2to3" tool that makes many of the changes in an automated fashion.
  The single biggest change from a compatibility standpoint is that "foo" is a unicode object in 3.0 and a string (set of bytes) in 2.x. You can even prepare for that switch:
  from __future__ import unicode_literals
  foo = "foo" # this will be unicode
  bar = b"bar" # this is a set of bytes
  unibar = bar.decode("utf-8") # get a unicode from the bytes
  They have put *a lot* of thought into how to make this transition. People will gradually shift to 2.6, just as they did with 2.5. And, over time, they will change to using the new features. They'll probably upgrade to 2.7 (yes, there will be one), and use the new features even more. And eventually their code will just be 3.0 code and the switch will be a no brainer.
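A minimal sketch pulling together the opt-in features mentioned in the comment above, assuming Python 2.6 or later (the same file should also run unchanged on 3.x). The function name `safe_divide` and the variable names are illustrative only and do not come from the original post.

```python
from __future__ import print_function, unicode_literals


def safe_divide(numerator, denominator):
    """Divide two numbers, reporting division by zero instead of raising."""
    try:
        return numerator / denominator
    except ZeroDivisionError as error:  # the "as" spelling works on 2.6+ and 3.x
        print("cannot divide:", error)  # print is a function here, not a statement
        return None


text = "foo"                   # text (unicode) because of unicode_literals
raw = b"bar"                   # explicitly bytes
decoded = raw.decode("utf-8")  # bytes -> text
```

Note that the `from __future__` line has to be the first statement in the module, which is why it sits above everything else in the sketch.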
7. Re:Not sure about this one by the+eric+conspiracy · 2008-10-03 13:20 · Score: 0
  
  I think the point is that with 2.6, your old code will work but will tell you what to change. If you move to 3.0, unless you have those changes already, it just won't work.
  So you are saying that if I fix all the warnings in 2.6, my code will work 100% unchanged in 3.0? If not, why wouldn't I just wait for 3.0 and then just fix everything ONCE?
  And now there is a 2.7? Sounds like death by a thousand cuts.
8. Re:Not sure about this one by MightyYar · 2008-10-03 13:38 · Score: 2, Informative
  
  If not, why wouldn't I just wait for 3.0 and then just fix everything ONCE?
  Well, first of all, 2.6 and 3.0 come out at the same time and share many of the same new features... so there's no "just wait for 3.0" possible, it's either/or right now.
  The advantage is that if you have a big pile of 2.5 code right now, you can slowly turn on the "use 3.0 style" switches in 2.6 and migrate your code one little switch at a time over a long period of time.
  That way, a few years from now when they decide to stop supporting new features in the 2.x path and you really "must have" some new feature in the 3.x path, it will be significantly easier for you to switch if you've turned on the "use 3.0" switches previously.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
9. Re:Not sure about this one by sjames · 2008-10-04 03:16 · Score: 1
  
  Why not just wait for 3.0 to make the changes? That way you'll only have to test everything once.
  Because 2.6 and 3.0 have different objectives.
  2.6 is simply the next in the 2.x line and one of the new features is the ability to import 3.0 features from __future__. Otherwise, it'll be no bigger a transition than 2.4 to 2.5 was. Existing programs will likely run without any issues.
  3.0 is a bigger transition. It will drop a few things now considered mis-features (if we had known then...). Most current programs will break in 3.0 (but often in ways that are trivially fixable).
  The hope is that by continuing development of 2.x in parallel with 3.x, by the time 2.x is fully retired, programs written for it will have (through the __future__ support of 3.0 features and syntax) been gradually updated such that they have been made 3.x compatible.
  It's essentially a way to avoid leaving a lot of people with a significant codebase out in the cold while also avoiding getting Python stuck in the past.
10. Re:Not sure about this one by Anonymous Coward · 2008-10-06 10:14 · Score: 0
  
  Hey now, don't try to bring facts into this! Everyone *knows* Py3 is a completely incompatible language that will break everything! Everything, I say! La la la--I can't hear you!
  Sigh.
  I remember the same sorts of fearful, antagonistic, ignorant bitching back when 2.x started. People swore up and down that they'd never move beyond 1.6 because "it wasn't Python anymore". Python not only survived, it thrived.
  There will always be that vocal minority who oppose any change, regardless of benefit, just because they might have to learn something new, or update some dubious old code. The GP doesn't want to update to Py3? Fine. We'll be happy to let his/her code take its honored place next to all that obsolete Fortran IV code no modern compiler can handle.
  Keep up or get out of the way.
tough transitions by AceJohnny · 2008-10-03 11:06 · Score: 4, Interesting

These kind of compatibility switches are make-or-break. I'm glad there's Python 2.6 to try to ease the problem, but Py3k means that everybody who publishes python software will all of a sudden have to maintain 2 branches, for Python 2.X line and Python 3.X line.
This isn't the same as one software package having "legacy" and "bleeding edge" branches, because that's their own choice. In this case the underlying language is forcing them to choose.
Honestly, I'm not confident in the economics of such transitions, and believe Py3k will die out.

--
Misleading titles? Inflammatory blurbs? Keep in mind that Slashdot is a tabloid.
1. Re:tough transitions by demmer · 2008-10-03 11:08 · Score: 0
  
  Depends on how fast Ubuntu & co. include 3.x. When the target group of an application already has 3.x, there is no need to maintain the 2.x branch.
2. Re:tough transitions by imbaczek · 2008-10-03 11:22 · Score: 2, Interesting
  
  it'll take several years, but a critical mass will switch eventually IMHO.
3. Re:tough transitions by Anonymous Coward · 2008-10-03 11:31 · Score: 0
  
  Everybody who publishes python software will all of a sudden have to maintain 2 branches, for Python 2.X line and Python 3.X line.
  Is there any reason for people not just to upgrade their Python? If they are using a Linux distro, it will most likely happen automatically anyway..
  That is said as a python software publisher, who mostly feels like just upgrading the code.
4. Re:tough transitions by Anonymous Coward · 2008-10-03 11:32 · Score: 3, Insightful
  
  Honestly, I'm not confident in the economics of such transitions, and believe Py3k will die out.
  Why would Python 3.0 'die out'? Even if you don't believe existing projects will make the switch there's no reason why new projects won't want to have the considerable benefits of using Python 3.0.
5. Re:tough transitions by DragonWriter · 2008-10-03 11:46 · Score: 3, Insightful
  
  These kind of compatibility switches are make-or-break. I'm glad there's Python 2.6 to try to ease the problem, but Py3k means that everybody who publishes python software will all of a sudden have to maintain 2 branches, for Python 2.X line and Python 3.X line.
  No, they don't "have to" maintain two branches. They can choose to, or they can maintain one (which depends on their particular circumstance); if necessary (if it is an app and not a library) they can just distribute the right interpreter with the app.
  
  This isn't the same as one software package having "legacy" and "bleeding edge" branches, because that's their own choice.
  Yeah, actually, it is exactly the same as that, at least as long as bug-fixes and maintenance continues on Python 2.x: the "one software package" being the Python interpreter.
  And, yeah, if those maintaining python-based projects choose to maintain Python-2.x and Python-3.x based versions, that will also be an instance of exactly what you say it wouldn't be, as it will still be their own choice.
6. Re:tough transitions by GooberToo · 2008-10-03 12:06 · Score: 2, Funny
  
  Why would Python 3.0 'die out'?
  It's widely believed a large asteroid fell from the sky and wiped the mighty python 3.0 out. ;)
7. Re:tough transitions by GooberToo · 2008-10-03 12:12 · Score: 5, Insightful
  
  For whatever reason, people fail to understand python natively supports parallel installs. Furthermore, since python's preferred script magic is "#!/bin/env python", rather than, "#!/bin/python", the executing script will use the python that it finds in your path. Additionally, you can also tie python to a specific version as "python2.5". Want a different python? Change your path. A script requires a specific version of python? Change the script to require it. It's one line and trivial. It's at the top of the file, so there's no hunting even.
  New python releases only pose problems for the uninitiated, the ignorant, or the dumb.
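The path-lookup behaviour described in the comment above (an `env python` script line picks up whichever `python` comes first in the search path) can be sketched as follows. This assumes a POSIX system and Python 3.3+ for `shutil.which`; the helper names `make_fake_interpreter`, `first_on_path` and `demo` are hypothetical, and the "interpreters" are empty stand-in files, not real Python installs.

```python
import os
import shutil
import stat
import tempfile


def make_fake_interpreter(directory, name="python"):
    """Create a tiny executable file that stands in for an interpreter."""
    target = os.path.join(directory, name)
    with open(target, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(target, os.stat(target).st_mode | stat.S_IXUSR)
    return target


def first_on_path(name, directories):
    """Return the first executable called `name` in `directories`, in order."""
    return shutil.which(name, path=os.pathsep.join(directories))


def demo():
    """Show that reordering the search path changes which 'python' is found."""
    with tempfile.TemporaryDirectory() as old_dir, \
            tempfile.TemporaryDirectory() as new_dir:
        old_python = make_fake_interpreter(old_dir)
        new_python = make_fake_interpreter(new_dir)
        old_first = first_on_path("python", [old_dir, new_dir]) == old_python
        new_first = first_on_path("python", [new_dir, old_dir]) == new_python
        return old_first, new_first
```

The same lookup order is what makes the approach convenient (change the path, get a different interpreter) and, as later replies point out, what makes it fragile for system scripts.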
8. Re:tough transitions by Anonymous Coward · 2008-10-03 12:21 · Score: 2, Funny
  
  Honestly, I'm not confident in the economics of such transitions, and believe Py3k will die out.
  No wireless. Less space than a nomad. Lame.
9. Re:tough transitions by Anonymous Coward · 2008-10-03 12:32 · Score: 0
  
  I was going to add 'or the Perl programmer' but realized you already had.
10. Re:tough transitions by Anonymous Coward · 2008-10-03 13:16 · Score: 0
  
  Why bother with Python? Between Perl, Ruby, and Haskell, I have enough dynamism to make Python ashamed.
11. Re:tough transitions by xant · 2008-10-03 15:02 · Score: 2, Insightful
  
  Uh, it's almost exactly the opposite of what you're saying. You don't have to have a Python 3.x line; you can just deploy your code on Python 2.6, keep your working application working, and do all your new development and testing with Python 3.x warnings turned on. Then your next release is Python 3.0 compatible; or if you somehow fail to finish the Python 3.x upgrades in time for your next release, you don't have to release on Python 3.x, you can just keep using Python 2.6 even though your code is partially upgraded.
  Partially upgraded codelines are always the problem with major version upgrades, and the Python 2.6/3.0 future compatibility is designed precisely so that this problem is not a problem.
  Python has bent over backwards to make the upgrade as easy as possible for people with serious Python applications in production.
  
  --
  It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
12. Re:tough transitions by thogard · 2008-10-03 16:44 · Score: 1
  
  Funny thing is that none of my production code base even runs under 2.6. I'm moving stuff from a very old server to new hardware and so far I've had to move 2.1, 2.2, 2.3 and 2.4 over, and some stuff broke when using the newest version of some of the old versions. The result is now I have to spend lots of time maintaining programs that should not have to be maintained. I have never seen a project written in Python that meets its time or financial budget and stuff like this makes me want to ban the language from our systems completely. That also seems to be a trend with major open source projects that also never seem to get finished. There is a perpetual tweaking that must be done to keep things working and that is so wrong. Throw in the security issues and the maintenance costs of python code and there is no positive return on investment from a management point of view. Remember stability is good. Loading thousands of unauditable packages is bad.
13. Re:tough transitions by jgrahn · 2008-10-03 19:20 · Score: 3, Insightful
  
  For whatever reason, people fail to understand python natively supports parallel installs. Furthermore, since python's preferred script magic is "#!/bin/env python", rather than, "#!/bin/python", the executing script will use the python that it finds in your path. Additionally, you can also tie python to a specific version as "python2.5". Want a different python? Change your path. A script requires a specific version of python? Change the script to require it. It's one line and trivial. It's at the top of the file, so there's no hunting even.
  Changing my path is not practical. It's too broad. I'd have to write a shell script wrapper for the application which did 'env PATH=new_python:$PATH the_real_application "$*"' or something. And it's not just me; I'd have to communicate this to all other users of the system somehow. And changing one line of a script is not trivial, if I'm not root.
  All this may seem like minor things, but it adds up. And no other good language puts me in situations like that.
  
  New python releases only pose problems for the uninitiated, the ignorant, or the dumb.
  Or those of us who have been around for a while, and seen innocent backwards-incompatible changes become maintenance nightmares ... Ok, maybe not a nightmare in this case, but an inconvenience and annoyance which will keep being inconvenient and annoying for years, until the last Python 2.x dependency goes away.
  The best way to judge this would probably be to look at what Linux distributions like Debian want to do about Python 3.0. They ship one Python as the default (2.4 currently, for Debian) but provide others too. I bet even a change from 2.4 to 2.5 is a major migration for them.
14. Re:tough transitions by mark_hill97 · 2008-10-03 21:32 · Score: 1
  
  Actually, it's more like this to change to a different version: ln -s /usr/bin/python2.5 /usr/bin/python
15. Re:tough transitions by Anonymous Coward · 2008-10-03 22:15 · Score: 0
  
  It could die out in the sense that Python 2.x stays more popular among users of the language, and keeps attracting significantly more new work. If such a situation comes to pass, eventually Python 3k will fall behind in support for new technologies, and many of its early adopters will migrate to other languages, or even to Python 2. Of course, some will remain with Python 3k - there is no programming language so obscure that there isn't a community of die-hard developers around it, but few programming languages ever make it to the level where even Python 2 has made it.
  I'm not saying that any of the above is likely, and the Python developers are apparently making a very good effort to make the transition from Python 2 to Python 3k as painless as a transition to a backwards incompatible version can be, but even then such a transition can make a proportion of developers to move on to other alternatives, instead of sticking with Python.
16. Re:tough transitions by afd8856 · 2008-10-04 00:10 · Score: 1
  
  I find it really easy to use virtualenv (sometimes together with zc.buildout) to encapsulate applications and modules. In fact, I tend to cuss when a module that I want to try doesn't offer a way to be easily integrated with virtualenv (such as an egg or at least a subversion checkout with a working setup.py package file).
  
  --
  I'll do the stupid thing first and then you shy people follow...
17. Re:tough transitions by Nevyn · 2008-10-04 02:40 · Score: 1
  
  Furthermore, since python's preferred script magic is "#!/bin/env python", rather than, "#!/bin/python",
  
  It's possible that some of the python maintainers prefer that, but the distributions sure as hell don't. "Grab a random python binary that you hit first in my path" does not make for a reliable system. It destroys any idea of security (SELinux, setuid, consolehelper, etc. etc.), and I've seen more than a couple of bugs where applications stupidly used it and then someone wanted to try a newer python in /usr/local ... oops, random stuff starts breaking.
  
  --
  ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
18. Re:tough transitions by Bragador · 2008-10-04 03:21 · Score: 1
  
  Not only that but new users will pick up 3.0. Actually, I want to learn how to program and I'm waiting for 3.0 for that very reason.
19. Re:tough transitions by sjames · 2008-10-04 03:52 · Score: 1
  
  I'm glad there's Python 2.6 to try to ease the problem, but Py3k means that everybody who publishes python software will all of a sudden have to maintain 2 branches, for Python 2.X line and Python 3.X line.
  If it was even slightly hard to install 2 versions of Python at the same time, that might be true. However, that's not the case. I see nothing there that will FORCE a developer to maintain two versions of their Python software.
  Most will probably stick with 2.x for now, perhaps trying out 3.x or just importing from future and playing with updating their code. By the time 2.8 is out, insisting on at least 2.6 to run your code will be perfectly reasonable. At that point, start importing from future and actually updating. Once that's complete, you now support Python 2.x (where x>=6) and Python 3.x. with ONE branch.
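One way the "ONE branch" idea above could look in practice is a small compatibility shim at the top of a module. This is a sketch under the assumption that only Python 2.6+ and 3.x need to be supported; `PY2`, `text_type` and `to_text` are illustrative names, not an established API.

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
else:
    text_type = str


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding bytes with `encoding` if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)
```

Keeping the version check in one place means the rest of the code can call `to_text` without caring which interpreter line it is running on.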
20. Re:tough transitions by GooberToo · 2008-10-04 04:02 · Score: 1, Informative
  
  Changing my path is not practical. It's too broad. I'd have to write a shell script wrapper for the application which did 'env PATH=new_python:$PATH the_real_application "$*"' or something. And it's not just me; I'd have to communicate this to all other users of the system somehow. And changing one line of a script is not trivial, if I'm not root.
  You have a system admin problem not a python problem. If you can't run system installed software and your admin refuses to help, you have an admin problem. Making it a python problem when your admin isn't doing his job, doesn't really make it a python problem.
  All this may seem like minor things, but it adds up. And no other good language puts me in situations like that.
  You still have multiple ways to address the issue. It is trivial. Even with multiple users.
  Or those of us who have been around for a while, and seen innocent backwards-incompatible changes become maintenance nightmares ... Ok, maybe not a nightmare in this case, but an inconvenience and annoyance which will keep being inconvenient and annoying for years, until the last Python 2.x dependency goes away.
  Or you can trivially fix it as above and be done with it. You're making it a mountain when it isn't even a mole hill. If you have such problems, stop using that version. It really is that easy.
  Here's why your issues simply don't exist. For your situation to have occurred, you must have an admin that installs a new version of python and makes it the default system version. Furthermore, you must have multiple users using scripts installed system wide, which would have been installed by the same admin, which are now broken, and an admin that refuses to help make these system wide scripts which you can't edit, and can't run using the old version of python. And, that means you refuse to change your user environment. That's nothing but a bad admin and lazy users, pure and simple. Furthermore, it's unlikely that your admin would install a new python version as the default, install the non-default libraries, and decide the user base doesn't really need the new version and that the users requiring python in the first place don't need to run the scripts which are the entire purpose of having a new python install in the first place. In other words, nothing in your argument makes practical sense.
  And yes, those are run-on sentences. I used them on purpose to highlight your convoluted argument.
21. Re:tough transitions by GooberToo · 2008-10-04 04:04 · Score: 1
  
  When you have system dependencies, that's a little different. Just install your new python ensuring your old python is still the system default python. Change your path. You're done.
  The system scripts still run. Your new scripts now run using the new python. Oops... stuff works well and no issues exist.
22. Re:tough transitions by brunson · 2008-10-06 04:21 · Score: 1
  
  Honestly, I'm not confident in the economics of such transitions, and believe Py3k will die out.
  Just like PHP 5?
  
  --
  09F911029D74E35BD84156C5635688C0
  Jesus loves you, I think you suck
23. Re:tough transitions by brunson · 2008-10-06 04:36 · Score: 1
  
  It sounds like your programmers suck, we bring in projects early and under budget all the time, no matter what the language.
  But even if they didn't suck, you could just leave the older versions of python installed and let the old, sucky code run in its preferred version.
  
  --
  09F911029D74E35BD84156C5635688C0
  Jesus loves you, I think you suck
24. Re:tough transitions by Anonymous Coward · 2008-10-06 10:39 · Score: 0
  
  > And no other good language puts me in situations like that.
  Then I would argue that the languages you use are not all that "good."
  Just about any production-level language (which Python now is), has had at least one backward compatibility break during its life. Try running any non-trivial Java 1.0 code on a modern VM. Or compile Visual C++ 6 code in Visual Studio 2008. Not only have these languages undergone dramatic library changes, they've undergone basic syntax changes. Even the "mighty" Perl has broken old code pretty badly between versions.
  You say you've been around long enough to see small changes bloom into nightmares. Okay, then you've also been around long enough to know that languages *must* evolve to stay relevant. We even have a term for languages which don't: dead. Fortran is alive (unfortunately); Fortran IV is dead. Would you still want to use Fortran IV today? Is Fortran IV relevant to modern programming needs? Not so much.
  Languages either evolve, or get replaced. Which do you think costs more?
  BTW, I think you missed that the GP actually answered your problem for you: if you require Python 2.5, your magic script should invoke "python2", not "python". That's why Python installs with the version number on the end of the interpreter. Your script will then run until 2.x is no longer available for your platform (by which time any 3.x problems should have been long sorted).
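Besides pinning the interpreter name in the script's first line, a script can also check at startup which interpreter it actually got. The following is a minimal sketch of that idea; `meets_minimum` and `versioned_name` are hypothetical helper names, and the 2.6 minimum is only an example value.

```python
import sys


def meets_minimum(version_info, minimum=(2, 6)):
    """Return True if `version_info` is at least `minimum` (major, minor)."""
    return tuple(version_info[:2]) >= tuple(minimum)


def versioned_name(version_info):
    """Build the conventional versioned executable name, e.g. 'python2.5'."""
    return "python%d.%d" % tuple(version_info[:2])


# Refuse to continue on an interpreter that is too old for this script.
if not meets_minimum(sys.version_info):
    sys.exit("this script needs Python 2.6 or later, found "
             + versioned_name(sys.version_info))
```

A check like this gives a clear message instead of an obscure syntax or import error when the wrong interpreter is picked up.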
25. Re:tough transitions by Anonymous Coward · 2008-10-09 02:21 · Score: 0
  
  Change your code to work on Python3000 and stop whining.
What's new by ChienAndalu · 2008-10-03 11:12 · Score: 5, Informative

Here are the changes.
I really have to check out the multiprocessing package. Too bad that I have to wait for the print function and the new division handling.
1. Re:What's new by yuriyg · 2008-10-03 11:43 · Score: 2, Informative
  
  Too bad that I have to wait for the print function and the new division handling.
  Huh?
  from __future__ import print_function
  from __future__ import division
2. Re:What's new by ChienAndalu · 2008-10-03 11:44 · Score: 1
  
  ... wait for those features to be present by default ;-)
3. Re:What's new by mgiuca · 2008-10-04 02:44 · Score: 1
  
  from __future__ import division has actually worked since Python 2.2.
  It's just that Python 3.0 finally gives them an excuse to make it compulsory.
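The release history mentioned above can be read from the standard `__future__` module itself, which records for each feature when it became available as an opt-in and when it became (or will become) the default. A short sketch, with `feature_timeline` as an illustrative helper name:

```python
import __future__


def feature_timeline(name):
    """Return ((major, minor) of first opt-in, (major, minor) when default)."""
    feature = getattr(__future__, name)
    optional = feature.getOptionalRelease()[:2]
    mandatory = feature.getMandatoryRelease()[:2]
    return optional, mandatory


for name in ("division", "print_function", "unicode_literals"):
    optional, mandatory = feature_timeline(name)
    print("%s: opt-in since %d.%d, default in %d.%d"
          % ((name,) + optional + mandatory))
```

For `division` this reports an opt-in release of 2.2 and a mandatory release of 3.0, matching the comment above.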
Re:The Case Against Barack Hussein Obama by vtcodger · 2008-10-03 11:15 · Score: 0, Offtopic

*** Obama will castrate our military and destroy our nuclear deterrent. ... etc,etc,etc for thousands of tiresome words. ***
Sounds good to me. I reckon I'll vote for him.

--
You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
haha..Python,....do you get it ?? by Anonymous Coward · 2008-10-03 11:23 · Score: 0

haha! Python ! Do you get it ? Guido named it after MPFC, this is very amusing. Do you get it ? haha Python....it's MPFC! Haha, do you see ?
Cut the crap. by Anonymous Coward · 2008-10-03 11:26 · Score: 5, Interesting

These changes are NOT earth-shattering. 2.6 is mostly just going to add a few new features, the most important being the with statement. Most code written using Python idioms will be fine under 2.6 and 3.0. Now, if you tried to write Java-esque or C-esque code under Python, you might run into issues. Even then, I doubt it. They've been deprecating features for a while, and 3.0 is probably the point at which they'll be yanked...you've only had a year or two of DeprecationWarnings.
I'm not sure why people whine about a language evolving. Retain backwards compatibility to a fault and you end up with C++, which is crippled by C-isms. You either know your code well enough that you could make the small incremental changes along the way, or you simply don't upgrade.
Python most needs sane standard libraries. It is far too much of a "let's throw this in there" with three different naming conventions and no package organization. It is a shame, because the language itself is pretty powerful in the right hands.
1. Re:Cut the crap. by jimdread · 2008-10-03 11:32 · Score: 2, Insightful
  
  I'm not sure why people whine about a language evolving.
  
  It's because all their old code breaks. And that hurts.
2. Re:Cut the crap. by Anonymous Coward · 2008-10-03 11:57 · Score: 0
  
  He just explained why it won't. In the cases where it does break, it's usually trivial to make the changes to get it working. Really, I haven't seen a single case where it would force massive rewrites of code.
3. Re:Cut the crap. by slimjim8094 · 2008-10-03 11:59 · Score: 3, Insightful
  
  So don't use Python 3.0. If it's critical, you're not upgrading from a known working base anyways, right? And if it's not, this will hold your hand.
  
  --
  I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
4. Re:Cut the crap. by Anonymous Coward · 2008-10-03 12:55 · Score: 0, Interesting
  
  Well I'd guess you'd be in for a fair amount of digging if you want it to run on 3.X and you aren't using unicode for your text strings yet.
5. Re:Cut the crap. by RAMMS+EIN · 2008-10-03 13:03 · Score: 1
  
  Isn't there a simple solution to that? I mean, someone or some group could take it upon themselves to maintain the old incarnation of the language, and then old code would continue to run fine.
  
  --
  Please correct me if I got my facts wrong.
6. Re:Cut the crap. by Anonymous Coward · 2008-10-03 14:29 · Score: 0
  
  Because if we really wanted to rewrite the same program every year or so, we'd be using Microsoft languages and SDKs.
7. Re:Cut the crap. by xTantrum · 2008-10-03 15:53 · Score: 1
  
  I love python but god knows it's the whore of programming languages.
  
  --
  $action = empty(PHP) ? backToC() : unset(PHP) ; "when the concrete cases are understood, the abstractions are readily
8. Re:Cut the crap. by Anonymous Coward · 2008-10-03 16:19 · Score: 0
  
  Why would that be? Text strings will act the same way as they always did, they'll just be unicode. It solves a lot of Python 2.x headaches. I really can't see how it could cause any big problems.
9. Re:Cut the crap. by Anonymous Coward · 2008-10-03 22:26 · Score: 0
  
  Well, the problem of existing text files, databases, network interfaces etc. comes to mind. It isn't enough if your code works within the language if all its external interfaces suddenly change and you need to dig up some backwards compatibility layer and start using it all over your code.
10. Re:Cut the crap. by Anonymous Coward · 2008-10-04 10:34 · Score: 0
  
  Text files and databases will be read just as correctly as they were before, they are just internally represented as unicode if you use standard strings, and output as unicode when you write to file unless otherwise specified. Really, this SOLVES problems, any possible problem it creates is fairly trivial to fix. No backwards compatibility layer needed. Text was handled badly and inconsistently before, now it's handled well.
Really? by Peaker · 2008-10-03 11:29 · Score: 5, Insightful

What Python features broke for you between minor releases?
I find it pretty hard to believe any Python user would actually switch to Perl, and stick to it.
You sir, are probably making this story up :-)
1. Re:Really? by pongo000 · 2008-10-03 14:30 · Score: 1
  
  What Python features broke for you between minor releases?
  I can assure you that the one Python application I use regularly (trac) cannot be upgraded between minor versions without large-scale upgrades to dependent modules. It was an absolute nightmare upgrading a machine from 2.2 to 2.3...many hours spent tracking down modules that simply didn't work with 2.3.
  Coming from the perl world, having to deal with just one dependency nightmare with Python was enough to entice me to stay in the perl world...
  trac, however, is excellent software, so I put up with it being a Python app. I find it rather shameful that minor Python releases render so many modules as doorstops...
2. Re:Really? by love_encounter_flow · 2008-10-03 21:09 · Score: 1
  
  your post is not detailed in that part, but the incompatibilities that you talk about may very well stem from a very vexing problem with python, that is the version dependency of compiled c extensions. this probably does not matter for people on real computers, but windows folks will have to
  (*) get themselves visual c to re-compile an extension to fit the version of python; or
  (*) recompile all of python plus their extensions with a free alternative to msvc; or
  (*) hunt for ready-made binaries on the web.
  i believe the issue goes away when you use the ctypes module to access c executables, as then the c part and the python part work in a more decoupled fashion.
  as much as one may criticize windows folks for 'not being (able|willing) to compile c', it has to be said that
  (*) python is also here to make programming easier and more accessible and to take 'compilation' out of 'programming' where possible;
  (*) the choice of msvc as the standard c compiler for python on windows is a pity (as it is not a free tool);
  (*) reading how-to's on compiling *can* be quickly discouraging.
  i am not 100% sure but i think the necessity to recompile a given extension module for a new version of python is caused solely by a mismatch in the version numbers (the one in the extension and the one that python announces). this makes any troubles encountered when moving from python 2.x to 2.x+1 even more annoying. the area of smooth and easy upgrading of python c components has for sure a lot of openings for creative thinkers.
  if it was for 100% python modules alone, i would migrate from 2.5 to 2.6 and early on to 3.0 with few second thoughts. anything that has ever stopped me upgrading has been the availability of the binaries that i didn't want to lose in the process.
3. Re:Really? by david_thornley · 2008-10-04 03:52 · Score: 1
  
  the choice of msvc as the standard c compiler for python on windows is a pity (as it is not a free tool);
  
  Does the Express edition work? That's not free as in speech, but it is free as in beer. (I've never used either the express edition or Python on MS Windows, so I don't know myself.)
  
  --
  "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
String f**k up by spitzak · 2008-10-03 11:41 · Score: 3, Interesting

Reading the release, they have decided to really push 16-bit strings (they call this "Unicode" but it really is what is called UTF-16). I think this is a serious mistake.
The proper solution is to use 8-bit strings, but any functions that care (such as I/O) should treat them as being UTF-8. Most functions do not care and thus the treatment of "Unicode" and "bytes" are the same.
The problem with UTF-16 is you cannot losslessly convert a string that *might* be UTF-8 to UTF-16 and then back again. This is because any illegal UTF-8 byte sequences will be lost or altered. This is a MAJOR problem for code that wants to process data that is likely to be text but must not be altered under any circumstances; in effect such programs are forced to be ASCII-only, even though UTF-8 is purposely designed so that such programs could display all the Unicode characters. Note that bad UTF-16 (i.e. with mismatched surrogate pairs) can be losslessly converted to UTF-8 and back.
This has been a real pain so far in our use of Python, and I am quite alarmed to see that they are changing the meaning of plain quotes in 3.0 to "Unicode". This is really a serious step backwards, as we will be forced to tell anybody using our system to put 'b' before all their string constants and I suspect there will be a lot less automatic conversion of these strings to unicode when we want to display them. Note that Qt is also causing a lot of trouble here too.
1. Re:String f**k up by Animats · 2008-10-03 12:01 · Score: 4, Informative
  
  The problem is that there are three kinds of string-like objects in Python: UTF-16 strings, ASCII strings, and uninterpreted arrays of 8-bit bytes. Python 2.5 sort of supports all 3, with "array of bytes" the least well supported. Since this is a language without declarations, the semantics of this gets messy.
  The most common problem was that functions like ".read()" yielded strings, not arrays of bytes. This follows C standard library semantics, but is a bad fit to Python. In 3.0, ".read()" yields an array of bytes, not a string. If the data read is to be converted to a string, "decode" is required. That's the right answer.
  This is consistent with modern thinking about data representation. Consider SQL, which makes a similar distinction between "TEXT" and "BLOB".
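  The read-then-decode pattern described here can be sketched as follows. This assumes Python 3 and a stream opened in binary mode (files opened in text mode decode on the caller's behalf); `io.BytesIO` stands in for a real file and the payload is made up.

```python
import io

# Stand-in for a file opened in binary mode ("rb"); the payload is made up.
stream = io.BytesIO("naïve café".encode("utf-8"))

raw = stream.read()         # bytes: no encoding has been assumed yet
text = raw.decode("utf-8")  # str: the caller names the encoding explicitly

print(type(raw).__name__, len(raw))    # length counts bytes
print(type(text).__name__, len(text))  # length counts code points
```

  The two lengths differ because each accented letter takes two bytes in UTF-8 but counts as one character once decoded.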
2. Re:String f**k up by Anonymous Coward · 2008-10-03 12:17 · Score: 0
  
  So your customers are putting illegal UTF-8 in their string constants, and you're passing them around to functions that expect unicode input, and then Python and Qt are "causing trouble" when they won't accept it?
  Use bytes, do your own damn encoding/decoding after you've meditated long and hard on what the fuck kind of data you're really dealing with here, and stop being so damn half-assed. The API contracts are not to be broken.
3. Re:String f**k up by John+Millikin · 2008-10-03 12:24 · Score: 4, Informative
  
  Spoken like somebody that's never had to deal with encoding issues. Using UTF-8 internally is fine, but exposing it to the programmer is insane and error-prone. And if the programmer then proceeds to manipulate that raw byte buffer as a string, he's an idiot.
  
  The proper solution is to use 8-bit strings, but any functions that care (such as I/O) should treat them as being UTF-8. Most functions do not care and thus the treatment of "Unicode" and "bytes" are the same.
  You might not be aware of this, but computers are used for more than just transmitting text. I don't want my binary streams being rewritten to gibberish because some I/O routine was written to be too clever. Furthermore, not every system uses UTF-8. Some may even need to send data over a *gasp* network! Good luck getting every other computer in the world to start using UTF-8 immediately.
  
  The problem with UTF-16 is you cannot losslessly convert a string that *might* be UTF-8 to UTF-16 and then back again. This is because any illegal UTF-8 byte sequences will be lost or altered.
  If you try to convert bytes that aren't in UTF-8 using a UTF-8 codec, an error will be raised. This behavior is proper -- if you don't know what format your input is in, there's no way to perform text-based operations on it.
  
  This has been a real pain so far in our use of Python, and I am quite alarmed to see that they are changing the meaning of plain quotes in 3.0 to "Unicode".
  Every developer I know uses Unicode strings already. The new behavior is just one less character to type in front of literals.
  
  This is really a serious step backwards, as we will be forced to tell anybody using our system to put 'b' before all their string constants
  Otherwise said as: "We're too stupid to fix the glaring encoding errors in our product, so we'll just use bytes everywhere and pretend it's all working". Also, Unicode strings in Python are implemented with either UTF-16 or UCS-4 depending on platform.
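  Both behaviours argued over in this subthread exist, depending on the error handler passed to `decode`. A small sketch, assuming Python 3; the sample bytes are invented for illustration:

```python
bad = b"caf\xe9"  # Latin-1 bytes for "café"; not valid UTF-8

try:
    bad.decode("utf-8")  # default handler is errors="strict"
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print("strict decode failed:", exc.reason)

# errors="replace" never raises, but substitutes U+FFFD for the bad byte.
lossy = bad.decode("utf-8", errors="replace")
print(repr(lossy))
print(lossy.encode("utf-8") == bad)  # the original byte cannot be recovered
```

  The strict handler is the default, so the error is what happens unless the caller opts into a lossy conversion.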
4. Re:String f**k up by Anonymous Coward · 2008-10-03 12:39 · Score: 0
  
  I'm a bit confused here. It sounds like your data is a sequence of bytes, and not necessarily a sequence of valid characters. People expect a string to be a sequence of characters, and many many programs would break if strings could contain non-character garbage. Sometimes you need to work with raw byte sequences, and so python 3k provides the bytes type.
  Though using UTF-16 is no doubt a source of great pain for your project, it seems like a very niche issue you are having. However, I don't really understand exactly what your project is trying to do, so I could be missing a greater issue here. Could you clarify what your input data is, how you manipulate it, how you output it, and at what points string conversions screw everything up?
5. Re:String f**k up by belmolis · 2008-10-03 12:41 · Score: 4, Informative
  
  Python does not use UTF-16 strings; it uses UCS-2 strings. The difference is that in UCS-2, every character is represented by exactly two bytes, while in UTF-16, some characters, those outside Plane 0, are represented by a "surrogate" pair of two 16-bit code units, totaling four bytes. UCS-2 does not provide any representation for characters outside the BMP. In other words, UCS-2 is a straightforward fixed length encoding, while UTF-16 is a more complex variable-length encoding.
  Python can in fact use either of two internal representations for text: UCS-2 or UTF-32 = UCS-4. If you give the option --enable-unicode=ucs4 to configure when building Python, you will get a Python that supports all of Unicode rather than just the BMP.
6. Re:String f**k up by RAMMS+EIN · 2008-10-03 13:00 · Score: 1
  
  I think the real lesson here is that byte sequences and character sequences are not the same. Every character sequence can be encoded to a byte sequence (by using an appropriate encoding), and every byte sequence can be converted to a character sequence (by means of some decoding), but they are fundamentally different things. I wonder if we wouldn't be better off making this explicit, and providing distinct string (character sequence) and blob (byte sequence) types.
  
  --
  Please correct me if I got my facts wrong.
7. Re:String f**k up by spitzak · 2008-10-03 13:11 · Score: 2, Interesting
  
  No, think a little harder.
  Imagine a file system that names the files with strings of bytes.
  It is absolutely vital that if I ask for a list of files and then try to open them, that this all work, no matter what byte sequence has managed to get in there as a filename.
  It is also *nice* but nowhere near as vital that I be able to show these names to users and they read them as Unicode strings.
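  For what it is worth, the lossless round trip asked for here can be sketched with the `surrogateescape` error handler. Note the assumption: that handler arrived with PEP 383 in Python 3.1, so it is not part of the 3.0 release under discussion. The filename bytes below are made up.

```python
# A byte string that is not valid UTF-8, standing in for an odd filename.
raw_name = b"report-\xff\xfe.txt"

# "surrogateescape" maps each undecodable byte to a lone surrogate
# instead of raising an error or substituting U+FFFD.
as_text = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = as_text.encode("utf-8", errors="surrogateescape")

print(repr(as_text))
print(round_tripped == raw_name)
```

  The decoded string is fine for opening the file again, though printing it may still fail because lone surrogates cannot be encoded by the strict handler.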
8. Re:String f**k up by spitzak · 2008-10-03 13:18 · Score: 1
  
  Interesting. I was afraid they were making all these functions return strings. If they are returning bytes as well it would certainly make things a lot better. However I would expect them to have the same trouble I am having.
  Let's assume read returns a string of bytes. What I am worried about is that the following example text will not work as expected:
  if file.read()=="utf8 string" ...
  I expect this will automatically convert the result of file.read() to UTF-16 and then do the comparison. This will not produce the correct test if in fact the UTF-8 is an invalid encoding. Even if it turns the result into a string with error characters, it will still match the other string if it had error characters in the same place resulting from a different wrong utf-8 string.
  From your description it sounds like the following will do the correct thing, which is better than I thought from my reading:
  if file.read()==b"utf8 string" ...
  So at least this can be achieved. However I am worried that users will be tempted to type the incorrect code because it is easier.
  It's possible that the == test will not work unless the compared string is a bytes string, but I would think that would break far too many Python programs. The other possibility is that failures to convert the utf8 to Unicode will throw an error, but then you have just introduced a million DOS flaws into everybody's programs.
9. Re:String f**k up by spitzak · 2008-10-03 13:28 · Score: 2, Interesting
  
  You might not be aware of this, but computers are used for more than just transmitting text. I don't want my binary streams being rewritten to gibberish because some I/O routine was written to be too clever
  Thank you for explaining exactly why I want UTF-8 to be used, while thinking you were arguing against it.
  Data is NOT just text. Therefore we should not be mangling it because we think it is text. We have enough trouble with MSDOS inserting \r characters. This crap is a million times worse.
10. Re:String f**k up by spitzak · 2008-10-03 13:32 · Score: 2, Insightful
  
  People expect a string to be a sequence of characters. Please notice the first word in that sentence.
  "People" are not computers. "people" LOOK at the display. People are not trying to copy the data literally from one place to another or do comparisons of strings or read files that might (horrors) not contain correct UTF-8 data. There is no reason to mangle the data until the very last moment before it is put on the display.
  I can quite confirm that if you have more than one way to represent the same sequence (such as different ways of producing the same UTF-8 error) you will produce a MAJOR screw up, quite likely an exploitable security hole. It also is not nice if "copy" mangles data just because it had a sequence that could not be converted correctly to glyphs.
11. Re:String f**k up by spitzak · 2008-10-03 13:33 · Score: 0, Troll
  
  No, Python is using UTF-16 nowadays. At least be somewhat informed before trying to argue with me about this.
12. Re:String f**k up by spitzak · 2008-10-03 13:42 · Score: 2, Interesting
  
  Spoken like somebody that's never had to deal with encoding issues. Using UTF-8 internally is fine, but exposing it to the programmer is insane and error-prone. And if the programmer then proceeds to manipulate that raw byte buffer as a string, he's an idiot.
  The compiler will turn "unicode" into the utf-8 encoding. The programmer does not see \xnn sequences of the utf-8 bytes. Try some modern compilers with utf-8 support some day before you say anything stupid again.
  Any programmer that modifies UTF-16 as a raw array of words is an idiot. Besides surrogate pairs, there are combining characters and bidirectional indicators and lots of other trouble. In fact I prefer UTF-8 exactly because it discourages such misuse of strings, which are really made of words, sentences, etc.
  If you try to convert bytes that aren't in UTF-8 using a UTF-8 codec, an error will be raised. This behavior is proper -- if you don't know what format your input is in, there's no way to perform text-based operations on it.
  You have just introduced a massive DOS hole into your programs. Or do you really think you should run a "is this correct UTF-8" call before any attempt to convert? Sorry, it is not going to raise an error, it will instead convert to error UTF-16 characters.
  Every developer I know uses Unicode strings already. The new behavior is just one less character to type in front of literals.
  You know that Python will convert your bytes from UTF-8 to "Unicode" automatically when needed? No you didn't? Might want to study up on that...
  Otherwise said as: "We're too stupid to fix the glaring encoding errors in our product
  The encoding errors are not in our product. They are in the files we are attempting to read (metadata attached to images, mostly). Dumbass
13. Re:String f**k up by spitzak · 2008-10-03 13:51 · Score: 3, Insightful
  
  I think the lesson is that there is ONLY byte sequences.
  The fact that some code can interpret that byte sequence and draw something on the screen that the user thinks of as "text" is completely irrelevant and should not be a fundamental datatype of a programming language. This should be part of the code that draws the text. Imagine if every other type of data, such as image pixels, or sound samples, had a different IO routine and you could never read a file with the wrong routine because the conversion was lossy.
  The real problem is that everybody's mind has been polluted by decades of ASCII where there was no difference between characters and bytes. All I can suggest is to try to think of text as words or sentences. Nobody would suggest that it would be good to make all words use the same amount of storage, or that it is important that you be unable to split a string except at word boundaries. But there has been so much use of ASCII that people think this is important for "characters".
  I also believe there is a serious political-correctness problem. Otherwise logical programmers are consumed with guilt because Americans get the "better" short encodings, and therefore feel they have to punish themselves by making the conversion to i18n as painful as possible so that Americans have just as much trouble as anybody else. The fact that they have actually made I18N far harder for everybody and thus actually discouraged it is the ironic result of this guilt.
14. Re:String f**k up by Animats · 2008-10-03 13:56 · Score: 2, Informative
  
  From What's new in Python 3.0: The str and bytes types cannot be mixed; you must always explicitly convert between them, using the str.encode() (str -> bytes) or bytes.decode() (bytes -> str) methods.
  That's the right way to do it, but I agree that as a retrofit to existing code, it's a headache.
  Worse, it's a problem that's detected at run time, not compile time, at least with the CPython implementation.
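  The rule quoted from the release notes, and the comparison worried about earlier in the thread, can be tried directly. A minimal sketch, assuming Python 3 run without the `-b` warning flag:

```python
text = "utf8 string"
data = b"utf8 string"

# str and bytes never compare equal; nothing is converted implicitly.
print(text == data)

try:
    text + data      # mixing the two types fails at run time
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)

# Explicit conversion in either direction makes the comparison meaningful.
print(text.encode("utf-8") == data)
print(data.decode("utf-8") == text)
```

  So a comparison between the result of a binary read and a plain-quoted literal is simply false rather than silently converted, which is easy to miss when porting.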
15. Re:String f**k up by belmolis · 2008-10-03 14:03 · Score: 4, Informative
  
  In fact I am better informed than you are. When not compiled to use UCS-4, Python uses what is properly called UCS-2, with half-baked extensions for treating it as UTF-16. Certain functions know about surrogate pairs, such as those that convert between UTF-8 and the internal representation. However, such basic functions as len do not know about surrogate pairs. Try giving a character outside the BMP as the argument to len. It will return 2, not 1.
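  The distinction between code points and 16-bit code units can be made concrete without relying on the interpreter's internal storage. This sketch assumes a recent Python 3 (3.3 or later), where `len` counts code points regardless of build options; on the narrow 2.x builds described above, `len` would report 2 instead.

```python
ch = "\U00010000"  # first code point outside the BMP

utf16_bytes = ch.encode("utf-16-be")
utf16_units = len(utf16_bytes) // 2   # number of 16-bit code units
utf8_bytes = len(ch.encode("utf-8"))  # number of bytes in UTF-8

print(len(ch), utf16_units, utf8_bytes)
print(utf16_bytes)  # the surrogate pair D800 DC00
```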
16. Re:String f**k up by tazzzzz · 2008-10-03 14:09 · Score: 4, Informative
  
  Reading the release, they have decided to really push 16-bit strings (they call this "Unicode" but it really is what is called UTF-16). I think this is a serious mistake.
  The proper solution is to use 8-bit strings, but any functions that care (such as I/O) should treat them as being UTF-8. Most functions do not care and thus the treatment of "Unicode" and "bytes" are the same.
  I'm going to try once more, slightly differently. Two other people apparently have tried and failed.
  Python 3.0's handling of strings is basically the same as Java's, because it has proven to work quite well there.
  For webapps, and the rules may be a little different on the desktop, "best practices" in Python for some time have been that you use unicode objects everywhere internally when you are representing text. When you hit a boundary (a file on disk, the net), you encode that unicode string into whatever encoding makes sense (often UTF-8). So far, so good, I hope?
  Python's internal representation of unicode objects is only relevant in that you need it to support whatever code points you care about. I don't think there are any code points that you can represent in UTF-8 that Python will screw up after decoding/encoding. I'm sure there are many people who would be interested to see such a test case.
  If you have a bunch of bytes that *might* be UTF-8, you're screwed. "process data that is likely to be text but must not be altered"? What do you mean by text? 7-bit ASCII? UTF-8? And where is the text coming from? Unless you tell Python the encoding of the file, you're going to get bytes out, not unicode objects.
  The whole point is that Python unicode objects know how to represent code points. If you get a set of bytes from somewhere you *have* to know what encoding it is in order to be able to treat it as a bunch of text characters. Python unicode objects will not be "bad UTF-16". How they're stored is not generally important. What's important is that Python internally keeps track of the code points and will either successfully convert to whatever encoded sequence of bytes you want or it will raise an exception because the encoding you've chosen doesn't have one of the characters in your string.
  Python 3.0 makes this all clearer. When you talk about a "string", you're talking about a bunch of unicode characters. Anything else is a collection of bytes.
  By the way, you can specify what encoding a Python source file is in so that your string literals are all properly decoded.
  For further reading...
  http://www.joelonsoftware.com/articles/Unicode.html
17. Re:String f**k up by tazzzzz · 2008-10-03 14:14 · Score: 3, Informative
  
  Actually, this has been explicit in Python for some time. In Python 2.x, "string" objects are byte sequences and "unicode" objects are character sequences.
  What changes in Python 3.0 is that "unicode" objects have been renamed "string" and "string" objects have been renamed "bytes". So, not only is it explicit, but the naming makes more sense.
  The other related change is that string literals in your code are interpreted as Python 3.0 "string" objects ("unicode" in Python 2.x terminology), whereas previously you had to stick a 'u' in front of the string to get that behavior. And you can indeed specify the encoding of your source files, which is nothing new.
  All of this to say, you're right on the money and Python is already in the spot you describe as "better off".
18. Re:String f**k up by tazzzzz · 2008-10-03 14:27 · Score: 1
  
  I think the lesson is that there is ONLY byte sequences.
  The fact that some code can interpret that byte sequence and draw something on the screen that the user thinks of as "text" is completely irrelevant and should not be a fundamental datatype of a programming language.
  No, text is important and there certainly are more than byte sequences. Yes, byte sequences are important and they certainly still exist in Python 3.0 (and, in fact, you now get a mutable byte sequence type as a bonus).
  Let's say I have a webapp and there's a form with a state/province field. The user selects "California" from the list. The browser converts that set of characters to UTF-8 (because that's what's specified on the page) and then sends those bytes to the server. The web framework on the server properly spots the UTF-8 encoding, decodes it back into a bunch of characters.
  This sequence of steps allows me to validate that the characters "California" represent a valid state.
  If all I had was a series of bytes and not actual characters, I'd be SOL.
  >>> u"California".encode("rot-13")
  'Pnyvsbeavn'
  Pnyvsbeavn is a perfectly legit series of bytes to represent "California", but I clearly couldn't do any useful validation there unless I decode it.
  So, in many instances, the code does care about more than a sequence of bytes and "strings" containing "characters" are a very useful construct.
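  The rot-13 session above is Python 2 syntax. A rough Python 3 equivalent, assuming the `rot_13` codec is reached through the `codecs` module (it is a text-to-text transform there, so `str.encode` does not accept it), with a hypothetical list of valid states:

```python
import codecs

VALID_STATES = {"California", "Oregon", "Nevada"}  # hypothetical sample list

scrambled = codecs.encode("California", "rot_13")
print(scrambled, scrambled in VALID_STATES)   # not usable for validation

restored = codecs.decode(scrambled, "rot_13")
print(restored, restored in VALID_STATES)     # validation works after decoding
```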
19. Re:String f**k up by earthbound+kid · 2008-10-03 14:34 · Score: 1
  
  UCS-2 does not provide any representation for characters outside the BMP
  That's not quite correct. You can use characters outside the BMP, they just have messed up len and slices, since they're actually made of two pseudo-characters.
  
  >>> pb
  u'\U00010000'
  >>> len(pb)
  2
  >>> pb[0]
  u'\ud800'
  >>> pb[1]
  u'\udc00'
  >>> pb
  u'\U00010000'
  
  I would show that I was able to print it, but Slashdot hates Unicode.
20. Re:String f**k up by earthbound+kid · 2008-10-03 14:39 · Score: 2, Insightful
  
  The proper solution is to do what they did: hide from the programmer what internal format is used for strings. The only time programmers should know about the encoding is when they themselves explicitly select an encoding so that they can turn a bunch of bytes into a string or when they're sending the string out into the world as a bunch of bytes. Encode and decode explicitly at the edges. Internally, hide the implementation details. It's just basic OO.
21. Re:String f**k up by Anonymous Coward · 2008-10-03 14:46 · Score: 1, Informative
  
  I was on your side right up until you said:
  
  Dumbass
22. Re:String f**k up by spitzak · 2008-10-03 15:00 · Score: 1
  
  Well in a lot of ways that (not doing any automatic conversion) is the only correct solution if they really want plain quotes to be Unicode and not bytes/utf-8. It will be such a pain to fix existing code, though, that I would not have thought they would do that.
23. Re:String f**k up by spitzak · 2008-10-03 15:07 · Score: 1
  
  The fact that len returns 2 for a non-BMP character indicates that UTF-16 *is* being used. len is returning the number of words that the string occupies. This is a useful number (it indicates how much memory is needed to copy the string). The number of "characters" is completely useless, it causes crashes if you think it has something to do with memory usage, and it is useless for analyzing text unless you believe all the letters in Unicode are like fixed-pitch Latin letters.
  x.len() when x is a UTF-8 string should return the number of bytes as well, and in fact this is how Python works.
24. Re:String f**k up by spitzak · 2008-10-03 15:09 · Score: 1
  
  You are describing UTF-16. The characters outside the BMP take 2 words and thus len is 2.
25. Re:String f**k up by Anonymous Coward · 2008-10-03 16:13 · Score: 0
  
  x.len() when x is a UTF-8 string should return the number of bytes as well, and in fact this is how Python works.
  That would be len(x), not x.len(). Congrats, you've just demonstrated that you don't know jack about how Python works.
26. Re:String f**k up by Anonymous Coward · 2008-10-03 16:14 · Score: 0
  
  Character sequences are an extremely useful abstraction. 'q' is a roman letter, '2' is a digit, 'λ' is a Greek letter whose capital is 'Λ'. This is easy to think about with characters, but much harder with bytes (particularly with variable length encoding). Perhaps in your particular case you can get away with thinking of strings as sequences of bytes, but many times (I really think most times) it is extremely useful to abstract and think of strings as character sequences, and that's what the string type does. Strings should not be used to represent arbitrary binary data -- that's what bytes is for.
  From another comment of yours:
  
  There is no reason to mangle the data until the very last moment before it is put on the display.
  If you don't want python to abstract your data into a string of characters, then don't use string. If you are using the Greek and Cyrillic alphabet, changing capitalization, correcting spelling, or need to ensure that all characters are actual letters and not random garbage, don't use bytes.
27. Re:String f**k up by Anonymous Coward · 2008-10-03 16:48 · Score: 1, Interesting
  
  The number of "characters" is completely useless.
  Whawhawhaaaat? It is useless to know how many characters are in a string, but useful to know the size the string takes in memory in a completely memory-managed language?
  Either you are exaggerating excessively (to prove your point, I posit charitably?), have an extraordinarily insular view of the programming world, are a troll, or, I think most likely, are an intelligent and thoughtful programmer in the midst of temporary insanity brought about by becoming entrenched in defending a losing argument.
  Take a deep breath...No, really. Do it. Breathe in....wait a moment....and out.
  All of Slashdot has pounced on your message, is arguing against you, and insulting you. It's because we're jerks. Also there's a technical problem in a few of your comments, but mostly we're just jerks. Take a night off, cool off, remember that we are jerks and the insults are not really directed at you, admit to yourself that there were mistakes in your argument, learn from them, and move on.
  Ahh, and I just realized why you care about the size a string takes in memory. You are doing IO and are trying to use a string to store non-text data. Don't use a string. Use the bytes type instead.
28. Re:String f**k up by amorsen · 2008-10-03 20:02 · Score: 1
  
  Hiding is only good if it actually works. Once you leak information about the internal encoding to the program, you have lost. Such as the length of a one-character-string sometimes being 2 -- have one program depend on that, and you can never change the supposedly hidden encoding. Of course no one would be stupid enough to return 2 when asked for the length of certain one-character-strings...
  
  --
  Finally! A year of moderation! Ready for 2019?
29. Re:String f**k up by tepples · 2008-10-03 23:40 · Score: 1
  
  Otherwise said as: "We're too stupid to fix the glaring encoding errors in our product, so we'll just use bytes everywhere and pretend it's all working".
  Or "our handheld device has only 4 MB of RAM, and the version of Python provided by our system library vendor, which is UCS-4, would allow us to load one-fourth the text into an in-memory database".
  
  Also, Unicode strings in Python are implemented with either UTF-16 or UCS-4 depending on platform.
  How, when, and by whom is this decision to turn on --with-wide-unicode (UCS-4) made for each platform? What Google keywords should I have used?
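  Whoever made the choice, a running interpreter will tell you which build it is. A minimal sketch using sys.maxunicode (the name is_wide is just illustrative):

```python
import sys

# sys.maxunicode is the largest code point a single string element can hold.
# 0xFFFF indicates a narrow (UTF-16) build, 0x10FFFF a wide (UCS-4) build.
# Note: on Python 3.3 and later the value is always 0x10FFFF.
is_wide = sys.maxunicode > 0xFFFF
print("wide build" if is_wide else "narrow build")
```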
30. Re:String f**k up by VGPowerlord · 2008-10-04 03:28 · Score: 1
  
  (Note: I am not the grandparent)
  So, what if I'm from the UK using an editor that uses ASCII and I insert a £ into my python code or pull one from a data file? That's at code point 163 in ISO/IEC 8859-1... but if it's assumed to be utf-8, it'd be part of a multi-byte character because the first bit is set.
  
  --
  GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
31. Re:String f**k up by Animats · 2008-10-04 03:39 · Score: 1
  
  It might be helpful to run your programs through one of the more advanced Python compilers, like Shed Skin or PyPy, if and when they get converted to Python 3.0. They have implicit type analysis, and if you get data from "read" and apply a string operation without conversion, they will usually report that as a compile-time error. So you may get to find most or all of the errors up front. CPython, being a naive interpreter, will happily compile code that will always raise an exception at run time.
32. Re:String f**k up by mrvan · 2008-10-04 03:41 · Score: 1
  
  I'm using python in an environment with lots of external strings (from the web, from files), and the current mechanism is horrible. I end up with non-ASCII data in strings a lot if I'm not extremely careful with thinking about which string is ASCII and which is uninterpreted bytes, and have spent endless hours debugging silly decoding problems.
  If nothing else, having the read() methods return bytes and dealing with strings as unicode objects (regardless of internal encoding, I doubt that the python spec forces an interpreter to choose a unicode encoding for internal use) forces you to think about the encoding/decoding, which is a necessity in the post-ASCII world.
  I would think that they should also enable you to specify an encoding on opening a file, so you can do something like
  f = file.open('/tmp/bla', 'utf-8')
  mystring = f.read()
  rather than forcing you to do f.read().decode(), which I think is ugly and tedious.
  As a student, I always thought that the plethora of streamreaders, writers, stringwriters, bufferedwriters etc in java was a complete mess, but I definitely appreciate it now...
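  Something close to this exists: io.open (available from Python 2.6, and the same function as the built-in open() in Python 3) takes an encoding keyword. A minimal sketch, where the temporary file name is only a placeholder:

```python
import io
import os
import tempfile

# io.open accepts an `encoding` keyword, so reads return decoded text.
path = os.path.join(tempfile.mkdtemp(), "bla.txt")  # hypothetical file name

with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"caf\u00e9")

with io.open(path, "r", encoding="utf-8") as f:
    mystring = f.read()          # already decoded, no .decode() call needed

with io.open(path, "rb") as f:
    rawbytes = f.read()          # binary mode still yields undecoded bytes

print(len(mystring), len(rawbytes))
```

  Text mode with an explicit encoding gives characters; binary mode gives the bytes as stored on disk.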
33. Re:String f**k up by Tacvek · 2008-10-04 03:47 · Score: 1
  
  How, when, and by whom is this decision to turn on --with-wide-unicode (UCS-4) made for each platform? What Google keywords should I have used?
  Well that obviously varies by the platform. Under Debian GNU/Linux the decision would be made by the maintainer of the python package. But does it really matter? On what platform are you forced to use the python provided by the system vendor, rather than your own package?
  
  --
  Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
34. Re:String f**k up by tepples · 2008-10-04 09:41 · Score: 1
  
  On what platform are you forced to use the python provided by the system vendor, rather than your own package?
  On platforms that verify digital signatures on executables and where certificates aren't handed out like candy. But still, for applications deployed in Europe and the Americas (not east Asia), UCS-2/UTF-16 is still significantly larger than UTF-8.
35. Re:String f**k up by earthbound+kid · 2008-10-04 20:03 · Score: 1
  
  Here's the thing: that only happens in Python if you go outside the BMP, but even in the best character encoding scheme, unless you normalize, you can't tell if é is U+00E9 (Latin small letter e with acute) or e plus U+0301 (Combining acute accent). So, you can never really trust the length of a Unicode string.
  Would it be better if Python reported the length of non-BMP characters correctly? Yes. But, given how funky Unicode can be, it's an understandable trade off to make.
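  To make the normalization point concrete, here is a small sketch with the standard unicodedata module, assuming Python 3 string literals:

```python
import unicodedata

composed = "\u00e9"        # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"     # 'e' followed by COMBINING ACUTE ACCENT

print(len(composed), len(decomposed))    # the lengths differ
print(composed == decomposed)            # and the strings compare unequal

# After normalizing to NFC the two spellings become identical.
nfc = unicodedata.normalize("NFC", decomposed)
print(nfc == composed)
```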
36. Re:String f**k up by earthbound+kid · 2008-10-04 20:05 · Score: 1
  
  Well, in that case, Python must be using UTF-16 in version 2.6 for OS X and not UCS-2.
37. Re:String f**k up by spitzak · 2008-10-05 06:20 · Score: 1
  
  Are you sure it is doing this?
  In Python 2.5.2 this works:
  >>> u"abc"=="abc"
  True
  So it would appear some kind of conversion is done automatically.
  In my opinion this means programs will port easily, but it is going to open a whole lot of nasty holes as non-equal bytes strings can appear equal when converted to UTF-16.
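  For comparison, a sketch of the same test under Python 3, where no implicit conversion takes place (variable names are illustrative):

```python
text = "abc"     # str: a sequence of characters
data = b"abc"    # bytes: a sequence of raw byte values

same = (text == data)                          # no implicit conversion is done
decoded_same = (text == data.decode("ascii"))  # explicit decoding is required

print(same, decoded_same)
```

  In Python 3 the first comparison is simply False; the two types only compare equal after an explicit decode or encode.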
38. Re:String f**k up by spitzak · 2008-10-05 06:27 · Score: 1
  
  If "characters" are important, then the combining characters and invisible formatting ones in Unicode mean that UTF-32 and every other way of encoding Unicode is useless as well, they are *all* variable length. It is in fact far preferable to use UTF-8 as this forces programmers to understand variable length right away.
  I would also like a really clear explanation as to why "characters" are important, but "words", "sentences", "paragraphs", "lines", and all kinds of other structures that most readers of text think are important are ok to be variable-sized. Maybe we should be making *all* of them fixed-size, since they are "important".
  Currently use of UTF-16 is strongly biased against full Chinese and against any language that uses combining characters because it encourages a very Western interpretation of text as individual characters, despite the fact that a lot of the push for UTF-16 is due to a misguided attempt to be "fair" to foreign languages.
39. Re:String f**k up by spitzak · 2008-10-05 06:29 · Score: 1
  
  Sorry but I was pissed at him for calling me stupid: "We're too stupid to fix the glaring encoding errors in our product..."
  I should not be trading insults, you are right.
40. Re:String f**k up by spitzak · 2008-10-05 06:33 · Score: 1
  
  If you actually have the byte 163 in the file, it almost certainly will be an invalid UTF-8 encoding (it would have to be directly preceded by an accented letter in ISO-8859-1 for it to look like legal UTF-8).
  One of the big reasons why I want the strings to remain bytes is because of exactly this. Yes the compiler can convert, but, believe it or not, we really do read text produced by other programs, often with incorrect UTF-8 encoding. Only by leaving it as bytes can we properly analyze this. It is relatively ok if when we draw your string we get an error box where your pound sign is. It is NOT ok if when we read your string it is *converted* to an error box and the fact that you attempted to put a pound sign in is irretrievably lost!
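  A sketch of the three outcomes for such a byte, assuming Python 3 bytes and str (the sample data is made up):

```python
raw = b"price: \xa3" + b"5"     # 0xA3 is the pound sign in ISO-8859-1

try:
    raw.decode("utf-8")          # strict decoding rejects the stray byte
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# errors="replace" substitutes U+FFFD, so the original byte is lost.
lossy = raw.decode("utf-8", errors="replace")

# ISO-8859-1 maps every byte to a code point, so it round-trips exactly.
latin = raw.decode("iso-8859-1")

print(strict_ok, repr(lossy), repr(latin))
```

  Keeping the original bytes around is what makes the third option possible after the first one has failed.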
41. Re:String f**k up by spitzak · 2008-10-05 06:42 · Score: 1
  
  Why is it so important that "number of characters" (actually number of Unicode code points) is O(1), but "number of words", "number of sentences", "number of lines", "number of glyphs", and a zillion other possible questions are O(n)?
  This is the basic question that everybody here refuses to answer. They just blindly state that "it is really important for it to be fast to figure out the 'number of characters'".
  Please give an actual real example of source code where you *use* the "number of characters". You are either going to realize that you are not using the "number of characters" or you are going to make a fool of yourself, possibly by saying string[number_of_character(string)-1] or something. To avoid making a fool of yourself, please use at least two completely unrelated strings (where there is absolutely no relationship between the contents), where one of them is the one where you measured this "number of characters" and the other is the one you somehow apply this answer to, without in fact measuring a "number of characters" in this replacement string. Think very very very hard, to see if you can come up with an example where "number of bytes" (or "number of words" for UTF-16) would NOT work.
  IMHO the problem is that programmers have for decades been using ASCII where "number of characters" is O(1) and thus they think it is "important". In fact what is important is "number of bytes", despite your glib comment right at the start that even you seem to think it is unimportant.
42. Re:String f**k up by spitzak · 2008-10-05 08:33 · Score: 1
  
  Maybe I should clear this up a bit more.
  If your editor inserted the UTF-8 encoding of two bytes (0xc2,0xa3 I think) the result should be those same two bytes. However I/O routines when told to print the string should then decode the UTF-8 and produce the pound sign. If the compiler is producing something other than UTF-8 (such as current Python does if you put a 'u' before the quote) then the compiler does the conversion, not the I/O routine. My main argument is that I think this is a job for I/O, not the compiler, and I don't like Python changing the default.
  It is a requirement that if you actually put the 8 characters "\xc2\xa3" into the compiler input then you get the same two bytes. The primary reason for this is compatibility with existing compilers where this is the only reliable way to quote UTF-8. However this is also necessary so that you can make string constants with invalid UTF-8 encodings in them. I think doing u"\xc2\xa3" should produce a UTF-16 string with two characters in it as well.
  Giving the compiler the text "\xa3" must result in that byte being in the string, despite the fact that it is not a valid UTF-8 encoding. u"\xa3" should result in a UTF-16 string with a single pound sign in it.
  I think your question was if they inserted the pound sign as an actual 0xa3 byte in the source file. IMHO this should result in a single byte. Some people disagree, they say this should either produce an error (as it is not UTF-8) or it should be turned into the UTF-8 encoding of the pound or an error indicator. They may be right.
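  For reference, this is how the escape cases come out with Python 3 literals, where b"" is the byte string and plain "" is the Unicode string (a sketch; the variable names are illustrative):

```python
two_bytes = b"\xc2\xa3"               # bytes literal: exactly the two bytes written
as_text = two_bytes.decode("utf-8")   # decoding them gives a single pound sign
two_chars = "\xc2\xa3"                # str literal: two code points, U+00C2 and U+00A3
one_char = "\xa3"                     # str literal: one pound sign
lone_byte = b"\xa3"                   # one byte that is not valid UTF-8 by itself

print(len(two_bytes), len(as_text), len(two_chars), len(one_char), len(lone_byte))
```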
43. Re:String f**k up by tazzzzz · 2008-10-05 12:17 · Score: 1
  
  Maybe I should clear this up a bit more.
  If your editor inserted the UTF-8 encoding of two bytes (0xc2,0xa3 I think) the result should be those same two bytes. However I/O routines when told to print the string should then decode the UTF-8 and produce the pound sign. If the compiler is producing something other than UTF-8 (such as current Python does if you put a 'u' before the quote) then the compiler does the conversion, not the I/O routine. My main argument is that I think this is a job for I/O, not the compiler, and I don't like Python changing the default.
  The compiler also has to do I/O to read the file, and to do so successfully it needs to know what encoding your source file is in (and Python has had a mechanism for this for quite some time).
  What's good about this change, imho, is that people are now *forced* to consider what encoding their I/O is being done in if they want to do string-like things. For too long, too many people have just plain ignored the issue of encodings and run into problems at inopportune times. This change pushes people closer to best practices.
  By the way, when you put a "u" in front of the string in current Python versions (which is the default behavior in Python 3.0), it's not a matter of the compiler "producing something other than UTF-8". Rather, the compiler is using either your declared encoding for the file or your system default encoding to decode your source file and turn your literal string into a proper unicode string. If you stick UTF-8 encoded literals in your file and tell Python that it's a UTF-8 file, you will get a proper unicode object and you can convert to whatever encoding you want when you are presenting that literal externally.
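  A sketch of the declared-encoding mechanism, assuming Python 3 (where plain literals are Unicode). The source "files" are bytes objects standing in for files on disk, and literal_from is a hypothetical helper:

```python
# The same pound-sign literal, stored in two different source encodings.
utf8_source = b"# -*- coding: utf-8 -*-\ns = '\xc2\xa3'\n"
latin1_source = b"# -*- coding: iso-8859-1 -*-\ns = '\xa3'\n"

def literal_from(source):
    """Compile the source bytes and return the value bound to `s`."""
    namespace = {}
    exec(compile(source, "<example>", "exec"), namespace)
    return namespace["s"]

print(literal_from(utf8_source) == literal_from(latin1_source))
```

  Both sources yield the same one-character string, because the compiler decodes each file with the encoding it declares.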
44. Re:String f**k up by tazzzzz · 2008-10-05 12:31 · Score: 1
  
  No, think a little harder.
  Imagine a file system that names the files with strings of bytes.
  It is absolutely vital that if I ask for a list of files and then try to open them, that this all work, no matter what byte sequence has managed to get in there as a filename.
  It is also *nice* but nowhere near as vital that I be able to show these names to users and they read them as Unicode strings.
  While many file systems do likely represent file names with strings of bytes, odds are that the OS is using some kind of encoding for those filenames. After all, the OS should be able to display filenames to the user, right?
  So, it all boils down to what the Python 3.0 os.listdir (and related) routines return. I don't know the answer to that offhand (and I don't feel like building Python 3.0 to confirm). If Python has no idea what encoding the filenames are in, it has no choice but to return bytes objects.
  You can only ever get a unicode object if you know what encoding the source is, and that would go for filenames as well.
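  For what it is worth, in Python 3 releases os.listdir returns names of the same type as the path it is given. A sketch, assuming Python 3.2 or later for os.fsencode, with a throwaway directory and file name:

```python
import os
import tempfile

directory = tempfile.mkdtemp()
with open(os.path.join(directory, "example.txt"), "w") as f:
    f.write("x")

names_text = os.listdir(directory)                 # str path in, str names out
names_bytes = os.listdir(os.fsencode(directory))   # bytes path in, bytes names out

print(names_text, names_bytes)
```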
45. Re:String f**k up by Anonymous Coward · 2008-10-05 17:20 · Score: 0
  
  I get:
  [code]>>> len(pb)
  1
  >>> pb[0]
  u'\U00010000'
  >>> pb[1]
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  IndexError: string index out of range[/code]
46. Re:String f**k up by spitzak · 2008-10-06 04:55 · Score: 1
  
  I think the byte string should contain the literal bytes that are in the input source file, so in effect the string is in the encoding of the source file. However in most of my posts I am assuming the encoding is UTF-8.
  I agree that a u"" string should convert from the encoding the compiler is using to read the file to UTF-16 (or apparently UTF-32 on Linux). However I am very worried that doing this by default (rather than the previous requirement to put a u in front) is going to cause a lot of grief, because it is no longer a 1:1 mapping. Strings are used for things other than text and this is going to cause trouble. I also very much dislike the fact that this does not force people to make their input files be UTF-8, in fact it sounds like Python may not even default to UTF-8, because of incompatibility. This is very very bad IMHO.
47. Re:String f**k up by spitzak · 2008-10-06 05:03 · Score: 1
  
  The file system identifies files with an array of bytes (or 16-bit words on NTFS). There is no "encoding", that is a function of the I/O routines that draw this array of bytes on the screen to present it to the user. The fact is that no matter what you do, it is possible to put an invalid "encoding" into that array on that disk. We MUST be able to specify that invalid encoding filename in the api (at least so that it can be deleted).
  The only portable way to do os.listdir is to return arrays of bytes. On NTFS/Windows it should convert from UTF-16 to UTF-8. Fortunately invalid UTF-16 can be losslessly translated to UTF-8 and then back (as long as unmatched surrogate pair halves are preserved, which is easy to do). So this will be able to present and handle invalid filenames on both Unix and Windows.
  I am very afraid they will f**k it up and make os.listdir return "unicode" (which will really be on Windows whatever array of words the file is identified with, whether it is valid UTF-16 or not). They will then write some monstrous mess on Unix because they will quickly discover that outside the perfect academic world, invalid encodings do end up naming the files. Most likely they will have to put escape sequences of some sort into their UTF-16 to quote the exact bytes that were in the UTF-8.
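  An escaping scheme of roughly this kind later shipped as the "surrogateescape" error handler (Python 3.1 onward). A sketch of the round trip, with a made-up filename:

```python
raw_name = b"report-\xa3.txt"    # hypothetical filename containing an invalid UTF-8 byte

# The undecodable byte 0xA3 is smuggled through as the lone surrogate U+DCA3.
as_text = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_trip = as_text.encode("utf-8", "surrogateescape")

print(repr(as_text), round_trip == raw_name)
```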
48. Re:String f**k up by Anonymous Coward · 2008-10-06 10:52 · Score: 0
  
  Your post might have been helpful and interesting if you had not devolved into so much elitist name-calling.
  Did someone miss their nap-nap?
49. Re:String f**k up by earthbound+kid · 2008-10-06 21:25 · Score: 1
  
  Are you OS X or Linux or Windows? I'm on OS X. My understanding is that this varies from OS to OS.
50. Re:String f**k up by asretfroodle · 2008-10-08 13:57 · Score: 1
  
  Sometimes people do want to know the number of characters. Comparing the average word lengths between two documents for instance. Try not to get so worked up about it.
51. Re:String f**k up by spitzak · 2008-10-09 02:57 · Score: 1
  
  If you want to compute the average word length, you need to locate the spaces between the words. Since this requires scanning the text anyway, you can decode the UTF-8 at the same time.
  I want an explanation why "number of characters" has to be O(1) (constant time) rather than O(n) (takes time relative to the size of the string). Your example fails this test.
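  A sketch of that single pass, working directly on UTF-8 bytes. The function name is hypothetical and, as a simplifying assumption, only ASCII whitespace separates words:

```python
def average_word_length(data):
    """Average code points per word in UTF-8 bytes, found in one pass."""
    words = 0
    chars = 0
    in_word = False
    for byte in bytearray(data):
        if byte in (0x20, 0x09, 0x0A, 0x0D):   # ASCII whitespace ends a word
            in_word = False
            continue
        if not in_word:                         # first byte of a new word
            in_word = True
            words += 1
        if (byte & 0xC0) != 0x80:               # count only non-continuation bytes
            chars += 1
    return chars / float(words) if words else 0.0

print(average_word_length(u"caf\u00e9 \u03bb\u03bb".encode("utf-8")))
```

  Counting code points costs nothing extra here, since every byte has to be examined to find the spaces anyway.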
52. Re:String f**k up by asretfroodle · 2008-10-09 11:52 · Score: 1
  
  I'm not disputing the time complexity issue. Simply pointing out that there are valid reasons for looking at character counts rather than byte counts. You seemed to be getting quite excited about the fact that you perceive no practical difference between the two.
Re:Not really by Anonymous Coward · 2008-10-03 11:49 · Score: 0

nice one!
Not really by widman · 2008-10-03 12:00 · Score: 4, Interesting

You can keep your code compatible with both at the same time. Deprecated features are trivial to rewrite in most cases. There are even tools for this.
Doesn't matter by morgan_greywolf · 2008-10-03 12:08 · Score: 2, Interesting

Most distros already include the current and previous versions of Python. So Ubuntu, for instance, will include 2.6 and 3.0, and possibly 2.5 as well.
Furthermore, you can check to see what version of Python you're running under and make your code so that it accommodates both. This is all accessible via sys.version or sys.version_info
>>> sys.version
'2.5.1 (r251:54863, Jul 31 2008, 22:53:39) \n[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)]'
>>> sys.version_info
(2, 5, 1, 'final', 0)
With that knowledge, you just put all your version specific stuff in modules.
So you can do a:
import sys
major, minor, micro, release, release_num = sys.version_info
if (major > 3):
    import module_for_python_3_0
else:
    import module_for_python_2_x
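A runnable sketch of the same check, comparing sys.version_info against a tuple so that 3.0 itself is included. The branch names stand in for the version-specific imports, which are hypothetical modules:

```python
import sys

# Tuple comparison: any 3.x interpreter, including 3.0, takes the first branch.
if sys.version_info >= (3,):
    branch = "python3"      # a 3.x-specific module would be imported here
else:
    branch = "python2"      # a 2.x-specific module would be imported here

print(branch, sys.version_info[:3])
```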

--
My blog
1. Re:Doesn't matter by morgan_greywolf · 2008-10-03 12:20 · Score: 1
  
  Slashdot ate my equals sign somehow. change the (major > 3) to (major >= 3)
  
  --
  My blog
2. Re:Doesn't matter by kisielk · 2008-10-03 12:28 · Score: 1
  
  Another common pattern to use for this, as well as for libraries, is the following:
  try:
      import one_way_to_do_it
  except:
      import more_common_way_to_do_it
3. Re:Doesn't matter by morgan_greywolf · 2008-10-03 12:53 · Score: 1
  
  That can hide subtle problems, especially in your own modules. A module can fail to load and you'll be left scratching your head as to why, because your method will cause the interpreter to abort with an exception on the second module, even if the 'one_way_to_do_it' module is supposed to work.
  The better way would be to trap individual exceptions with something like this:
  try:
      import one_way_to_do_it
  except SomeException:
      handle_SomeException()
  except SomeOtherException:
      handle_some_other_exception()
  and so forth, so you get a better idea of what's going on. But I like my first way better because it tells the knowledgeable script user what's going on ;)
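  A concrete sketch of the narrower form, catching only ImportError so that any other failure inside the module still surfaces. It uses cPickle, which exists only on Python 2, with the standard pickle module as the fallback:

```python
try:
    import cPickle as pickle    # C-accelerated module, Python 2 only
except ImportError:
    import pickle               # standard module, available everywhere

# Either import provides the same interface to the rest of the program.
data = pickle.loads(pickle.dumps({"answer": 42}))
print(data)
```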
  
  --
  My blog
Module support for 3.0 is a long way off by Animats · 2008-10-03 12:10 · Score: 3, Insightful

Many essential third party libraries need to be converted for Python 3.0. I need M2Crypto (SSL support) and MySQLdb (MySQL support), neither of which is ready for Python 3.0, and neither of which has been updated in the last year or so.
My guess is that it will be three years before stock mainstream Linux distros come with Python 3.0 and a set of libraries that work with it.
1. Re:Module support for 3.0 is a long way off by Ixokai · 2008-10-03 19:48 · Score: 2, Informative
  
  This is quite true: but sort of irrelevant. Even the core developers on Python-dev have been seen to state on more than one occasion that they don't expect Python 3.0 to be the "standard" for a period of time that will stretch to years: one? three? The specifics don't exactly matter.
  That's why they've done the releasing of Python 2.6 and Python 3.0 in parallel (although 3.0 was recently delayed a little, the development of each have been hand in hand); they fully expect to maintain the 2.x line for awhile, and are already talking of 2.7.
  Each new iteration of 2.x will bring it closer to 3.0, and the third party modules will steadily become more and more available. Right now the IMHO biggest hurdle in the development of the modules for 3.0 is a lack of a serious conversion document from the point of view of the C internals. But they're even working on that.
  3.0 seems to be, more than anything else, a work yet in progress. Even when it's released, it's not expected that everyone will be converting their code over to 3.0. They don't expect people to *really* start using it in a standard way until 3.1, 3.2 or so -- and whatever version of 2.x accompanies it, which people will be converting from at that time, will come complete with additional features to help ease the transition.
  Personally, I find the strategy for migrating Python to 3.0 ... comforting. I don't necessarily agree with *all* of the changes done to 3.0, but most I quite like. Since I have a massive codebase at work that's currently running on 2.x, a major/incompatible change to "fix" the language is something that alarmed me early on.
  However, now I know that 2.x will be supported for quite awhile, and new releases will be made upon it to ease the way, I have a roadmap to follow that makes the burden significantly easier. Once we update our codebase to 2.6, I'll probably start slowly modifying things to activate more optional 3.x-isms, and by that time the myriad third party libraries will probably be supported.
  2.6 brings a number of interesting features to us; and allows us to start working slowly towards migrating to the 3.0 world. This is a -very- well thought out migration plan, IMHO.
Re:Braces by Anonymous Coward · 2008-10-03 12:18 · Score: 0

Yeah, it's true. You can actually start using it right now with "from __future__ import braces".
Anthony Baxter on Python 2.6 and 3.0 by xixax · 2008-10-03 12:37 · Score: 3, Informative

Anthony Baxter gave a pretty good talk on the implications at LCA 2008 earlier this year.
http://video.google.com/videoplay?docid=4264641260805367198&hl=en

--
"Everything is adjustable, provided you have the right tools"
Old news... by pdxp · 2008-10-03 12:47 · Score: 4, Interesting

3.0rc1 (beta) is already available and has been for some time now. The advantage of 2.6 is not as much its backward-compatibility but its ability to tell you exactly what needs to change (via runtime warnings) for 3.0 without actually breaking your code. I've been using both for months now, so this article isn't exactly hot news.
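In 2.6 those warnings are switched on with the -3 command-line option (python -3 script.py). The general mechanism is the standard warnings module; a minimal sketch of a call that warns but keeps working, where old_api and new_api are hypothetical names:

```python
import warnings

def old_api():
    """Hypothetical function that still works but announces its replacement."""
    warnings.warn("old_api() is deprecated; use new_api()",
                  DeprecationWarning, stacklevel=2)
    return 42

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")      # make sure the warning is not filtered out
    result = old_api()                   # the call itself still succeeds

print(result, [str(w.message) for w in caught])
```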
Re:Not really by Anonymous Coward · 2008-10-03 13:05 · Score: 0

Thanks!
I always aim to please.
DB access.. by Anonymous Coward · 2008-10-03 13:44 · Score: 0

Between 2.x series I saw DB access strategies change, for example. That's the prominent one that pushed me over the edge to try perl.
1. Re:DB access.. by GooberToo · 2008-10-04 04:21 · Score: 1
  
  Please explain. If you used DBAPI standard interfaces, it's unlikely anything you were using broke or changed. Most DBAPI packages do a pretty good job (all I've seen) of explaining which interfaces comply with the DBAPI spec and which interfaces don't. My guess is you didn't pay attention. That's a coder problem, not a language problem.
2. Re:DB access.. by Anonymous Coward · 2008-10-08 06:32 · Score: 0
  
  Durrrrrrrrr
Re:Hahahahahaha! by Anonymous Coward · 2008-10-03 15:07 · Score: 0

This is pretty much a given in both Ruby and Rails! In every case I know of, users had advance warning from their interpreters when anything was deprecated and would become obsolete within a few versions.
Sounds like Python is bragging about something their chief competitor has been doing better for a long time.
Umm no... Python has had DeprecationWarnings for longer than Ruby has existed.
Re:Hahahahahaha! by Jane+Q.+Public · 2008-10-03 16:26 · Score: 0, Troll

Okay, that's cool, but Ruby and Rails have also had "upgrades in preparation for", similar to this. This is hardly news. It is like they are bragging about something that everybody else does.
No problem... by jopsen · 2008-10-03 20:58 · Score: 1

Microsoft and lots of other proprietary software companies do that all the time... :)
Trolling by Maguscrowley · 2008-10-03 21:18 · Score: 1

This trolling problem is getting out of hand. I really think that we should consider banning suspect IP ranges and proxies. Near half of this page is trolling. It's making reading real comments prohibitively difficult, especially with people responding to -1 posts.
ARGH. by Anonymous Coward · 2008-10-03 21:29 · Score: 0

In the immortal words of Wolfgang Pauli, this isn't right. It isn't even wrong.
If you are using text, YOU ARE USING UNICODE. There is NO such thing as plain text. No exception. Just because you don't understand where Unicode comes in between the bytes you naively think of as characters and the text as it appears on your screen, doesn't mean it isn't there. It does mean, however, that you are one of the countless incompetents whose code people like me get to waste hours fixing, so thank you, ÐÏÏÐ½Ó©ÅÐ. :|
Parallel installs even on Windows? by tepples · 2008-10-03 23:14 · Score: 1

I want to become less uninitiated:

For whatever reason, people fail to understand python natively supports parallel installs.
But some popular environments (Windows, Mac, shared web hosting) identify scripts not by their script magic but instead by their file extension. When I used Google to search for python parallel install windows, I got a whole bunch of results about parallel ports and parallel processing. Does a parallel install work in Linux, Solaris, *BSD, and the like, or is there a recommended way to use it with more popular desktop operating systems such as Windows and Mac OS X? And how do parallel installs interact with web hosting?
1. Re:Parallel installs even on Windows? by GooberToo · 2008-10-04 03:45 · Score: 1
  
  It is the way python simply installs. Each python install places its library into a numbered directory (e.g. python2.4, python2.5). The only thing you may have to change is the "python" proper binary, which is copied from or linked to the numbered python binary.
  In other words, each python install should have its own directory structure, which ensures one installation doesn't affect the other. The only other issue is which binary you get when you run "python". Typically "python" proper points to the newest install but that's easy to change too. Simply link/copy "python" back to whatever version you prefer for your default.
  I can't speak for OSX but the above is true for the other platforms. I'd be surprised if it is not true for OSX.
2. Re:Parallel installs even on Windows? by tepples · 2008-10-04 09:34 · Score: 1
  
  The only thing you may have to change is the "python" proper binary, which is copied from or linked to the numbered python binary.
  So, under Windows, how do I force a specific .py file to use C:\python24 or C:\python25 or C:\python26 or C:\python30 upon double-click, without changing behavior of other .py files installed on the same machine? And how can I make mod_python read the #! line before loading a module?
  
  I can't speak for OSX but the above is true for the other platforms.
  Mac OS X should act like FreeBSD. I'm more concerned about 1. Windows, and 2. shared web hosting using mod_python and the like.
3. Re:Parallel installs even on Windows? by GooberToo · 2008-10-05 04:41 · Score: 1
  
  Admittedly, I did forget about the Windows case.
  Create multiple users, each with its own path. Use runas features. Some people use wrapper scripts to set their path. Most people seem to prefer the first option as they typically don't use the command line in the first place. If you are a command line guy, you'll likely prefer the second option.
  A third option is to use cygwin, which does honor the environment's path and magic. Some people hate cygwin. If you are a command line person on Windows, you should seriously consider cygwin as it addresses many of Windows's shortcomings.
4. Re:Parallel installs even on Windows? by brunson · 2008-10-06 04:31 · Score: 1
  
  You have an OS problem, not a python problem. Switch to something that doesn't suck.
  
  --
  09F911029D74E35BD84156C5635688C0
  Jesus loves you, I think you suck
5. Re:Parallel installs even on Windows? by tepples · 2008-10-06 04:41 · Score: 1
  
  Switch to something that doesn't suck.
  How do I tell that to all my customers who do not want to pay an order of magnitude more to upgrade to dedicated hosting?
6. Re:Parallel installs even on Windows? by brunson · 2008-10-06 06:08 · Score: 1
  
  Request a government bailout for being dumbasses and hosting on windows? It seems to be all the rage.
  
  --
  09F911029D74E35BD84156C5635688C0
  Jesus loves you, I think you suck
7. Re:Parallel installs even on Windows? by asretfroodle · 2008-10-08 09:19 · Score: 1
  
  The way I've always done it is to create a batch file to launch the script.
  The python script then uses sys.version to check which interpreter was called. If it's not the right one then it prints usage instructions and exits.
  I don't really have much experience with shared hosting environments - perhaps virtualenv could help?
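  The check at the top of the script might look roughly like this. It is a sketch: REQUIRED and interpreter_ok are hypothetical names, and it compares sys.version_info rather than parsing the sys.version string:

```python
import sys

REQUIRED = (2, 5)   # hypothetical minimum version for this script

def interpreter_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True when the running interpreter is at least `required`."""
    return tuple(version_info[:2]) >= required

if not interpreter_ok():
    sys.stderr.write("usage: run this script with Python %d.%d or newer\n" % REQUIRED)
    sys.exit(1)
```

  The batch file then only has to name the intended interpreter explicitly; the script refuses to run under any other.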
Can't have a future statement in a try block by tepples · 2008-10-03 23:21 · Score: 1
Another common pattern to use for this, as well as for libraries, is the following:
try:
    import one_way_to_do_it
except:
    import more_common_way_to_do_it
But how well does a try block work with things that depend on from __future__ statements that Python 2.5.x doesn't recognize, such as the different print syntax and the different string literal syntax ("8bitchars", u"32bitchars" vs. b"8bitchars", "32bitchars")? From Python 2.5.x's definition of a future statement:
A future statement must appear near the top of the module. The only lines that can appear before a future statement are:
- the module docstring (if any),
- comments,
- blank lines, and
- other future statements.
This appears to exclude try statements.
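The restriction can be checked without a second source file by handing both layouts to compile(). A sketch, where compiles is a hypothetical helper and print_function is a future feature name known to Python 2.6 and later:

```python
# A future statement wrapped in try/except, versus one at the top of the module.
inside_try = (
    "try:\n"
    "    from __future__ import print_function\n"
    "except ImportError:\n"
    "    pass\n"
)
at_top = "from __future__ import print_function\nprint('ok')\n"

def compiles(source):
    """Return True if the source compiles, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(compiles(inside_try), compiles(at_top))
```

The wrapped form is rejected at compile time, so an except clause never gets the chance to run.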
1. Re:Can't have a future statement in a try block by maxume · 2008-10-04 10:57 · Score: 1
  
  The from __future__ statements are for writing forward-compatible code (easing the transition from current versions to future versions).
  If you want code to run on python 2.5 and 2.6 (and thus be backwards compatible), you have to do without the alternative syntax for literals. You could, however, write your own print function and use that instead of the print statement; you just wouldn't be able to name it print. You could name it something silly like print_function so that it would be easy to find and replace once you stop supporting versions of Python that do not have a print function.
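A minimal sketch of such a helper, using only syntax that both 2.5 and 3.x accept. The name print_function comes from the suggestion above; the keyword handling (sep, end, file) is an assumption modeled on the built-in print() of 2.6 and 3.0, not a complete reimplementation.

```python
import sys


def print_function(*args, **kwargs):
    """Write the arguments to a stream, roughly like the print() built-in."""
    sep = kwargs.pop("sep", " ")
    end = kwargs.pop("end", "\n")
    stream = kwargs.pop("file", None)
    if stream is None:
        stream = sys.stdout
    if kwargs:
        # Reject anything other than sep, end and file.
        raise TypeError("unexpected keyword arguments: %r" % sorted(kwargs))
    stream.write(sep.join([str(arg) for arg in args]) + end)
```

Keyword arguments are pulled out of **kwargs because keyword-only parameters are a 3.0 feature and would not parse on 2.5.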
  
  --
  Nerd rage is the funniest rage.
2. Re:Can't have a future statement in a try block by tepples · 2008-10-04 11:57 · Score: 1
  
  If you want code to run on python 2.5 and 2.6 (and thus be backwards compatible)
  So what do I do if I want code to run on shared web hosting plans that use Python 2.5 and shared web hosting plans that use Python 3.0?
3. Re:Can't have a future statement in a try block by maxume · 2008-10-05 00:12 · Score: 1
  
  I think the real answer to that question is that you will never have to do that (decent hosts will offer at least 2.6 for a long time, and it isn't that hard to do a local install). The other answer is that you write code for 2.5 and for the 2to3 tool and then run the translated version on the 3.0 host.
  
  --
  Nerd rage is the funniest rage.
Deploying to end users? by tepples · 2008-10-03 23:26 · Score: 1

So don't use Python 3.0.
That would bring the same problems as the transition from PHP 4 to PHP 5. How would I deploy my product to end users who have installed Python 3.x as the system-wide handler for .py files? Will the Python Software Foundation recommend the use of an extension such as .py2? Conversely, if I do take advantage of Python 3.x, how would I deploy to end users who still use 2.x?
1. Re:Deploying to end users? by maxume · 2008-10-04 00:17 · Score: 1
  
  Py2exe, Py2app and so forth offer one way to work around this (the 2 in there means 'to', not Python 2; the apps bundle scripts with an interpreter instance).
  
  --
  Nerd rage is the funniest rage.
2. Re:Deploying to end users? by Delkster · 2008-10-04 00:26 · Score: 1
  
  There's no reason why you couldn't have Python 2.5, 2.6 and 3.0 installed on the same system. You can even supply your own Python environment with your software package if you really want to.
  I don't know about Windows, but on *nix you can also specify which interpreter to use at the beginning of the script, and the different versions of the Python interpreter will be available under different names so you can differentiate. For example, the default Python version on my system is 2.5, also available as "python2.5". Looks like I also still have 2.4 installed and available as "python2.4".
  So, if your code requires Python 2.5, you could just have "#!/usr/bin/env python2.5" at the beginning of your script, and it would be run using the python2.5 command.
  If you decide to write for 3.0, of course you'll have to specify 3.0 as a requirement. That happens with any new major version of pretty much anything. If you take advantage of new features in Java 1.6, obviously you can't deploy it on Java 1.5.
  You'll need to have your end users install Python 3.0 (unless you provide it yourself), but that doesn't mean they couldn't also keep using 2.x for whatever other needs they may have.
raise UnicodeError by tepples · 2008-10-03 23:32 · Score: 1

The problem with UTF-16 is you cannot losslessly convert a string that *might* be UTF-8 to UTF-16 and then back again. This is because any illegal UTF-8 byte sequences will be lost or altered.
Then set strict conversion, which will raise UnicodeError for any nonconforming byte sequences. My problem with UTF-16 is how it bloats in-memory databases of mostly-ASCII text by a factor of nearly 2 (or 4 if Python is compiled with UTF-32 to handle hieroglyphics and ancient Chinese).
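A small sketch of both points, written for Python 3, where bytes and text are separate types. The helper name is_valid_utf8 and the sample data are made up for illustration. The size comparison measures encoded byte counts as a stand-in for fixed-width in-memory representations; it does not measure what any particular interpreter build actually allocates.

```python
def is_valid_utf8(raw):
    """Return True if raw decodes as UTF-8 under the strict error handler."""
    try:
        raw.decode("utf-8", "strict")
    except UnicodeError:
        # UnicodeDecodeError is a subclass of UnicodeError.
        return False
    return True


# "caf" followed by a lone 0xE9 byte (Latin-1 e-acute), which is not valid UTF-8.
BAD_BYTES = b"caf\xe9"
GOOD_BYTES = b"caf\xc3\xa9"

# Mostly-ASCII text: compare encoded sizes without a byte-order mark.
ASCII_TEXT = "hello world" * 100
SIZE_UTF8 = len(ASCII_TEXT.encode("utf-8"))
SIZE_UTF16 = len(ASCII_TEXT.encode("utf-16-le"))
SIZE_UTF32 = len(ASCII_TEXT.encode("utf-32-le"))
```

For pure ASCII input the three sizes come out in a 1:2:4 ratio, which is the bloat factor being complained about.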
1. Re:raise UnicodeError by spitzak · 2008-10-05 08:14 · Score: 1
  
  Throwing exceptions on bad UTF-8 strings is great if they are strings you control. It is not useful for strings provided by the outside environment. I can assure you that users want that data copied even if it contains errors, and they only want to see an error message when the data is interpreted.
  The best that could be done with exceptions is make some kind of union of the UTF-16 and the bytes (or perhaps convert the bytes by just padding each out to 16 bits), along with a flag indicating if the data converted right. Though it is possible you could save the overhead of repeatedly testing if the conversion works, I suspect most programs will have to leave the data as bytes.
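One way to picture the "bytes plus a flag" idea is the sketch below, written for Python 3. The function name load_external and the dictionary layout are hypothetical. The original bytes are always kept, so copying stays lossless, and the decoded text is only present when the strict conversion succeeded.

```python
def load_external(raw):
    """Keep the original bytes and record whether they decode as UTF-8."""
    try:
        text = raw.decode("utf-8", "strict")
    except UnicodeError:
        # Keep the data anyway; only interpretation should report the error.
        return {"raw": raw, "text": None, "valid": False}
    return {"raw": raw, "text": text, "valid": True}
```

Code that merely copies the data uses record["raw"] and never fails. Code that interprets it checks record["valid"] first and reports the error at that point.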
True by theolein · 2008-10-04 00:05 · Score: 1

There seems to be a massive increase in trolling recently.
Re:Exactly why... by Delkster · 2008-10-04 00:10 · Score: 1

I will say the same thing that plagues python has plagued Java and probably hosts of other platforms.
Java? Some things have been broken occasionally in newer releases if you're doing stuff that's exotic enough, but in general I'd say that Java has been pretty damn backwards-compatible all the way. Save for those odd bugs, things written and compiled for Java 1.3 tend to run pretty niftily on 1.6.
#! on Windows or Apache? by tepples · 2008-10-04 00:43 · Score: 1

So, if your code requires Python 2.5, you could just have "#!/usr/bin/env python2.5" at the beginning of your script, and it would be run using the python2.5 command.
Is there a way to get Windows Explorer or Apache mod_python to follow the #! line instead of the .py extension?
1. Re:#! on Windows or Apache? by Hooded+One · 2008-10-04 09:39 · Score: 1
  
  For Windows, it's theoretically possible to write a dispatch program to associate with .py files that looks for a version-specific shebang, and tries to find the appropriate version of Python on the system. In the case of Apache, I think you're stuck with whichever version your copy of mod_python was compiled against.
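A rough sketch of the lookup such a dispatch program would need, assuming the script's first line names the interpreter as pythonX or pythonX.Y. The function names, the regular expression, and the table of installed interpreters are all hypothetical; actually launching the chosen interpreter and registering the dispatcher for .py files are left out.

```python
import re

# Matches "python2.5", "python3", etc.; a bare "python" carries no version.
_SHEBANG_VERSION = re.compile(r"python(\d+(?:\.\d+)?)")


def requested_version(first_line):
    """Return the version named in a shebang line, or None if there is none."""
    if not first_line.startswith("#!"):
        return None
    match = _SHEBANG_VERSION.search(first_line)
    if match is None:
        return None
    return match.group(1)


def pick_interpreter(first_line, installed, default):
    """Map the requested version to an installed interpreter path."""
    version = requested_version(first_line)
    return installed.get(version, default)
```

The installed argument would be a mapping such as {"2.5": r"C:\Python25\python.exe"}, with default covering scripts that name no version or a version that is not installed.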
2. Re:#! on Windows or Apache? by tepples · 2008-10-04 10:03 · Score: 1
  
  In the case of Apache, I think you're stuck with whichever version your copy of mod_python was compiled against.
  So how should publishers of software that runs on leased shared web hosting work around Python version incompatibilities? In PHP, it's normally done by associating .php4 to PHP 4, .php5 to PHP 5, and .php to one or the other based on the hosting provider's preference (or, if you're really lucky, the user's preference in the hosting control panel, like on Go Daddy). But I haven't seen any mention of extensions like .py2 or .py3.
Sigh... by Junta · 2008-10-04 03:01 · Score: 1

Don't mod something down just because you disagree. When I have mod points, I never downmod things out of disagreement. This is a legitimate concern over the Python strategy. They have benefited from their flexibility (I will grant that the language at any given instant is relatively low on quirks, because quirks get rethought and replaced, whereas Perl is chock-full of quirks that you must learn to live with), but there is a price.
Nothing is perfect. Nothing is without flaws. To achieve one end, something almost always is given up. Don't mod a post down because it points out what was given up to achieve an impressive advantage.

--
XML is like violence. If it doesn't solve the problem, use more.
Re:Exactly why... by Anonymous Coward · 2008-10-04 04:17 · Score: 0

I agree. I had one major project transitioning from 2.4 to 2.5, and modules like SOAP stopped working because someone had decided that some section should appear before some other section. I basically had to sift through the SOAP code and "patch" it. That was the last time I used Python.
Give me a frigging break! by Jane+Q.+Public · 2008-10-04 18:56 · Score: 1

"Troll" for THAT comment? Man, somebody does not know what "troll" means!

Nothing I can do about it but bitch, but my bitch is legitimate.

NEWS, modders: disagreement does NOT automatically equal "troll".
Speak for yourself. by fyngyrz · 2008-10-05 09:17 · Score: 1

No. You can go on all you want about "needed to change" and "autofix" etc., but the bottom line is that this code presently isn't broken, and I am not about to fix code that isn't broken. It makes no sense on any level: financially, time-wise, or strategically. I have better things to do than refactor my code for entirely arbitrary reasons. Perhaps I just place a different value on my time than you do; that's fine. You should, of course, feel free to do whatever you like.

--
I've fallen off your lawn, and I can't get up.
Parallel installs even on Linux shared hosting? by tepples · 2008-10-06 07:51 · Score: 1

Request a government bailout for being dumbasses and hosting on windows?
Shared Linux hosting also has problems with different versions of an interpreter. I didn't see anything in the mod_python manual about the ability to select an interpreter based on the #! line. I've changed the subject line to clarify.