Python 3.0 Released


Libraries by explodymatt · 2008-12-04 01:03 · Score: 5, Interesting

Python 3 being out is great, they've fixed a few things that allow bad programming, but does anyone know how long it will take for the libs to start getting ported? Especially numpy and scipy

Re:Libraries by Yvanhoe · 2008-12-04 01:06 · Score: 5, Informative

Last time I checked (several months ago), backward compatibility was not expected to break very badly. Most of the required modifications should be automatic, so I think a lot of packages that are still maintained will quickly be made compatible with Python 3.

--
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
Re:Libraries by gzipped_tar · 2008-12-04 01:21 · Score: 3, Interesting

IIRC numpy and scipy have dependencies on other libraries that are not 2.6-clean. They also have a lot of issues themselves. Currently it's not a priority for them to migrate.
I can't remember when I read about that... and I'm too lazy to dig it out of their Trac :-P

--
Colorless green Cthulhu waits dreaming furiously.
Re:Libraries by gzipped_tar · 2008-12-04 01:59 · Score: 4, Informative

I just did a Google search site:scipy.org ("2.6" OR "3.0") -"ipython" -"nipy" and there are a lot of results turning up (and of course lots of bogus ones).
It seems things are much better than I thought. Those guys are making progress every day, and my news was old news...
The difficulty with numpy/scipy is they need a great amount of C-level coding. There's 2to3 for Python code but tweaking C code is not that easy...

--
Colorless green Cthulhu waits dreaming furiously.
Re:Libraries by Alomex · 2008-12-04 02:16 · Score: 5, Insightful

Backward compatibility is (i) over-rated and (ii) misunderstood.
It is over-rated in the sense that the number of current users who are inconvenienced is a very small percentage of the total number of users of the language (unless the language is in the tail end of its life, like Fortran and Cobol).
It is misunderstood in that with the use of a simple header or import declaration it is possible to have two different versions co-exist while the transition happens. This is done in HTTP where the first thing that clients exchange is the version of the protocol they'll use. It is also done in LaTeX, where the first declaration informs the compiler which major version is being used (pre-2e or 2e).
Kudos for Python for not being afraid to rock the backwards compatibility boat.
Re:Libraries by gardyloo · 2008-12-04 03:28 · Score: 2, Funny

unless the language is in the tail end of its life, like Fortran and Cobol
Thus the phrases "The looooooooooong tail" and "You're ALL tail, baby".
Re:Libraries by Rostin · 2008-12-04 04:13 · Score: 4, Informative

unless the language is in the tail end of its life, like Fortran...
Fortran will continue to thrive for many years. I don't know numbers, but based on my personal experience, it's the preferred language of most computational scientists and engineers. The most recent revision occurred in 2003. According to this, a new one is being worked on.
Re:Libraries by gfxguy · 2008-12-04 04:26 · Score: 3, Insightful

Python 3 being out is great, they've fixed a few things that allow bad programming
Really? So if I write code in Python 3, it's guaranteed to be "good" programming?
Honestly, I didn't look at the article... have they actually made things MORE rigid?
I use python... I like python... but I can't help but think it was designed by someone who was pissed off that people didn't format their code the way he formatted his code. Since his way was obviously the "right" way, why not write a language that forces you to do it that way? Problem solved! I love the addition of artificial barriers like being able to make your code fail if you mix spaces and tabs in your indents. Nothing anal retentive about that!
Sorry, just ranting... I really do use and like it, just some of it seems so anal retentive to me, and the thought that you can "fix things" that allowed bad programming seems like it's even more so.

--
Stupid sexy Flanders.
Re:Libraries by Alomex · 2008-12-04 04:32 · Score: 4, Insightful

Fortran will continue to thrive for many years.
I agree. The point is that the number of current users is a non-negligible percentage of the universe of future users. It is in that sense that it is "near the tail end".
For languages which are very early in their life cycle, such as Python, the number of users inconvenienced today is negligible compared to the total number of users that it will have and who will benefit from the changes.
Re:Libraries by Anonymous Coward · 2008-12-04 05:07 · Score: 2, Insightful

"Sorry, just ranting... I really do use and like it, just some of it seems so anal retentive to me [forced indentation]..."
I ranted about that too when I first started using Python. But when I noticed how nice it was to read other people's code, I concluded that it was a good thing.
Re:Libraries by blincoln · 2008-12-04 06:10 · Score: 3, Interesting

I can't help but think it was designed by someone who was pissed off that people didn't format their code the way he formatted his code. Since his way was obviously the "right" way, why not write a language that forces you to do it that way? Problem solved!
This is actually the main reason I haven't worked with Python beyond tweaking a few existing scripts. The funny thing is that (unless I'm misremembering the syntax) I already code using that style in other languages. But the idea of forcing that style on everyone annoys me enough to put me off of the language as a whole.
I was really hoping that 3.0 would remove that petty stupidity. Doing so would even retain backwards compatibility with prior versions!

--
"...always new atoms but always doing the same dance, remembering what the dance was yesterday." -Richard Feynman
Re:Libraries by Anonymous Coward · 2008-12-04 06:54 · Score: 5, Insightful

I can't help but think it was designed by someone who was pissed off that people didn't format their code the way he formatted his code. Since his way was obviously the "right" way, why not write a language that forces you to do it that way? Problem solved!
This is actually the main reason I haven't worked with Python beyond tweaking a few existing scripts. The funny thing is that (unless I'm misremembering the syntax) I already code using that style in other languages. But the idea of forcing that style on everyone annoys me enough to put me off of the language as a whole.
I was really hoping that 3.0 would remove that petty stupidity. Doing so would even retain backwards compatibility with prior versions!
I just don't get it when people say that. It's sort of like saying you don't use language X because you have to store numbers as floats or integers instead of char variables.
I honestly like the fact that Python forces a coding format. I hate opening someone else's source and spending the first minutes trying to understand how they lay things out, if at all. And yes, if people were smart it would be easy to pick up anyone's code; sadly, that world doesn't exist.
No, it's not petty stupidity. Not using Python because of your reasons is sadly what I would call petty stupidity.
Re:Libraries by steveha · 2008-12-04 07:03 · Score: 2, Interesting

I wonder if Fortran may eventually be replaced by Python.
A few years ago, when I was first getting into Python, I read an article where a guy from a science research lab talked about his lab's transition from Fortran to Python. Python has some nifty heavy-duty math modules, written in C; and everyone at the lab who tried out the Python stuff strongly preferred it to Fortran.
Since C code is doing all the heavy lifting, it's nice and fast. Since Python is interactive, scientists can use it as a really-powerful desk calculator. And since Python has a clean and friendly syntax, it's easier to write and debug Python programs than Fortran.
I really wish I had saved a copy of that article, or at least its URL. I've tried Google searching for it, and I find many hits on using Python in labs but I haven't found the article.
steveha

--
lf(1): it's like ls(1) but sorts filenames by extension, tersely
Re:Libraries by bnenning · 2008-12-04 07:05 · Score: 5, Informative

But the idea of forcing that style on everyone annoys me enough to put me off of the language as a whole.
I had that exact reaction when I first came across Python. But after giving it a chance (many years later), I realized that it doesn't force a style any more than C forces the "style" of putting braces around blocks. Indentation levels are just syntax elements that happen to correspond to what most developers naturally do. Really, having to indicate blocks to the compiler in one way and to humans in another way is a DRY violation, which Python eliminates.
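
For anyone who wants to see the "indentation is just syntax" point concretely, here is a rough sketch using the standard library's tokenize module. The tokenizer turns indentation changes into INDENT and DEDENT tokens, which play roughly the role that braces play in C. The sample source string and the variable names are made up for illustration.

```python
import io
import tokenize

# Hypothetical snippet, used only to illustrate the point.
source = (
    "if x:\n"
    "    y = 1\n"
    "    if y:\n"
    "        z = 2\n"
    "w = 3\n"
)

# Tokenize the snippet and keep only the block-structure tokens.
tokens = tokenize.generate_tokens(io.StringIO(source).readline)
block_tokens = [
    tokenize.tok_name[tok.type]
    for tok in tokens
    if tok.type in (tokenize.INDENT, tokenize.DEDENT)
]

print(block_tokens)
```

Each nested block opens with one INDENT and closes with one DEDENT, so the parser never has to look at the whitespace itself.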

--
How to solve most of our problems: 1.Lots of nuclear plants. 2.Cure aging.
Re:Libraries by 644bd346996 · 2008-12-04 07:34 · Score: 3, Informative

Python can't replace Fortran, but C can (and to a large extent, is). For most serious scientific computation, the initial software is written in a language like MATLAB or Python, which make use of number crunching libraries written in C or Fortran. When that code needs to be modified to run on a supercomputer instead of a workstation, it is usually converted to pure C or Fortran.
Interpreted and interactive languages like MATLAB and Python make it easy to prototype and test a new algorithm, but C and Fortran are still necessary to make an efficient implementation.
(Disclosure: I am a mathematician, currently using all the above languages for ongoing research, though I am studiously avoiding having to write any Fortran myself.)
Re:Libraries by Vornzog · 2008-12-04 07:40 · Score: 4, Insightful

I wonder if Fortran may eventually be replaced by Python.
Already has been, in my world. I know plenty of people around the chem department who still use Fortran because 'it is the language of scientific computing, dammit!'
Here is the thing. Most of the time, they were so panicked about how long the program would take to run, they lost sight of how long it took them to write it.
I replaced many Fortran programs with Python in my time, because I could write the data IO so much faster, and then just use the C-level numerical libraries to do the analysis. The program would end up running just as fast, and the code could be written in an hour instead of a week.
Some people will die before they change languages. The rest of us just want our results. Hopefully, the switch to py3k goes easy and the community continues to grow.

--
-V-

Who can decide a priori? Nobody.
-Sartre
Re:Libraries by daver00 · 2008-12-04 11:27 · Score: 2, Informative

You don't use tabs in the first place. And in any case Python enforces no standard of block indent, it simply requires that you use the same indent for all blocks. So you can tab+space all you want so long as all of it is the same. The human reader merely requires that you use a unicode font and everything lines up. What exactly is hard about that? The reason to use braces is to speak to the computer, humans still indent to make it readable.
The recommended way to indent in Python is to use 4 spaces, and any half-decent text editor can be set up to do this when you press the 'tab' key. Rather than bitch and moan from the sidelines, why don't you try it? Python kicks ass in so many ways and I haven't met any coder who has tried it and thinks it's a bad language. It has pitfalls and quirks, but all languages do; Python's pros outweigh its cons easily.
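
A small sketch of what the interpreter actually enforces, assuming a Python 3 interpreter. The two source strings are invented for the demo: the first uses different indent widths in two separate blocks, the second mixes a tab and spaces inside one block.

```python
# Two separate blocks with different indent widths: accepted.
different_widths = "if True:\n  a = 1\nif True:\n        b = 2\n"
compile(different_widths, "<demo>", "exec")

# One block whose lines mix a tab and eight spaces: rejected by Python 3.
mixed = "if True:\n\ta = 1\n        b = 2\n"
try:
    compile(mixed, "<demo>", "exec")
    mixed_error = None
except TabError as exc:
    mixed_error = type(exc).__name__

print(mixed_error)
```

So the width is up to you, block by block; what gets rejected is indentation whose meaning would depend on how wide a tab is.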

good by Anonymous Coward · 2008-12-04 01:04 · Score: 4, Funny

Previous releases were incompatible with earlier ones unintentionally.

Release notes by Max+Romantschuk · 2008-12-04 01:05 · Score: 4, Informative

The release notes might interest people:
http://docs.python.org/dev/3.0/whatsnew/3.0.html

Also note that at the end of the release notes there is info on the migration path from Python 2 to 3. I'll leave the rest to people who bother to RTFA... ;)

--
.: Max Romantschuk :: http://max.romantschuk.fi/

You got time machine! by gzipped_tar · 2008-12-04 01:13 · Score: 5, Informative

The cool thing about Python is its "time machine". In Python 2.x you can "from __future__ import " to use features scheduled for future releases. With the release of Python 2.6 there's also a "2to3" tool that will point out revisions needed for 2.x code to be 3.0-compatible, and generate patches for you.

The Python developers were aware of the difficult road of migration long before the release of Python 3, and they did a lot of careful planning and hard work for it. One result is the __future__ module, which has been there for quite a long time for just this reason.
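
As a rough illustration of how the time machine keeps its schedule: each entry in the __future__ module records the release where the feature became available on request and the release where it becomes the default. The choice of three features below is just for the demo, and it should run unchanged on Python 3.

```python
import __future__

# Each feature records the release where it became optional
# and the release where it becomes (or became) the default.
schedule = {}
for name in ("division", "print_function", "unicode_literals"):
    feature = getattr(__future__, name)
    optional = feature.getOptionalRelease()[:2]    # (major, minor)
    mandatory = feature.getMandatoryRelease()[:2]  # (major, minor)
    schedule[name] = (optional, mandatory)
    print(name, "optional in", optional, "default in", mandatory)
```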

As a Python user, my hat is off to them. I heartily wish them success.

BTW: In case you don't know, there's an Easter egg in the time machine: "from __future__ import braces" ;)

--
Colorless green Cthulhu waits dreaming furiously.

Re:You got time machine! by makapuf · 2008-12-04 01:23 · Score: 4, Informative

You can also use the Python 2.6 "-3" option to get warnings about non-future-proof constructs (i.e. things that can't be handled by 2to3).
In fact, there are other Python easter eggs:
import this
import __hello__
and ... a new one in 3.0, related to xkcd.
Re:You got time machine! by Yvanhoe · 2008-12-04 01:32 · Score: 2, Informative

Ironically, the XKCD referring to Python is now out of date: Hello world is no longer

print "Hello world"

but, in 3.0:

print("Hello world")

But I guess the point is still valid.
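
A quick sketch of what the change buys you, assuming Python 3: because print is now an ordinary function, the separator, line ending, and destination are plain keyword arguments. The StringIO buffer is only there so the result can be inspected.

```python
import io

# Capture the output in memory instead of writing to the terminal.
buffer = io.StringIO()
print("Hello", "world", sep=", ", end="!\n", file=buffer)

output = buffer.getvalue()
print(output, end="")
```

In 2.x the same thing needed the special "print >>f, ..." syntax and a trailing comma to suppress the newline.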

--
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
Re:You got time machine! by maxume · 2008-12-04 04:20 · Score: 2, Informative

Sucking it up will not be painful.
If you are using libraries in 2.x and they suddenly decide to only support 3.x, you might have some issues. For the most part, the changes take a few minutes to review (many of the changes are related to removing things that have been replaced (but not yet removed) as of 2.5, so if you pay some attention to how you do things, you won't even notice those).

--
Nerd rage is the funniest rage.

And now to wait by Anonymous Coward · 2008-12-04 01:15 · Score: 2, Interesting

Sounds great! Now to wait a few weeks while smart people find and fix all the security holes, so I can go and safely get version 3.1.

Re:And now to wait by morgan_greywolf · 2008-12-04 01:45 · Score: 5, Funny

Nope. Python 3.11 for Workgroups.

--
My blog
Re:And now to wait by Random+BedHead+Ed · 2008-12-04 04:41 · Score: 4, Funny

This post was reserved for the Python NT 3.5 joke, but it has been postponed until the next release (along with a database-driven filesystem the Python developers swear they're working on).

No mac version yet? by neoform · 2008-12-04 01:18 · Score: 2, Funny

Where's the mac version..?

--
MABASPLOOM!

Re:No mac version yet? by hawk · 2008-12-04 07:14 · Score: 2, Funny

>You download the .tart.gz or .tar.bz2 source packages and build it. \
At last, what the world has been waiting for: a language for bimbos and airheads! :)
hawk
Re:No mac version yet? by geminidomino · 2008-12-04 07:42 · Score: 2, Funny

WTF are you talking about? Visual Basic has been around since 1991!

Unfair headline there, Bubba by Ancient_Hacker · 2008-12-04 01:18 · Score: 3, Interesting

Yes, Python 3.0 is a break.

But in the past and foreseeable future, Python has been exceedingly helpful, much more than most languages, during upgrades.

Usually one has several months to try out new features-- they're in the current version but turned off until you ask for them with "future_builtins".

Plus there's often a backwards feature in the next version to revert back to old behavior.

Not to mention a -3 option to point out the lines in your old program that will need changing for version 3.

But sometimes the changes are so big they can't be encompassed by a compiler switch. Such it is with 3.0.

Re:Unfair headline there, Bubba by makapuf · 2008-12-04 01:31 · Score: 2, Interesting

But sometimes the changes are so big they can't be encompassed by a compiler switch. Such it is with 3.0.
While I agree with your post, here it's not a problem with implementation but with syntax and backward compatibility within a given python version.
The idea is that some needed changes cannot be made backward-compatible (new keywords, ...). So you group them and call that a new version of the language. I doubt you couldn't implement most of it with compiler switches.

Re:woohoo by Anonymous Coward · 2008-12-04 01:31 · Score: 5, Funny

But I just came in here for an argument!

Re:That marks my end of use for Python by neoform · 2008-12-04 01:32 · Score: 4, Informative

I'm fairly certain they got all these non-backward compatibility issues out of the way with this release so they don't have to do this kind of thing again for a long while. My guess is, they won't ever put out another non-backwards compatible release, since those changes were mostly to fix poor coding practices like being able to run certain functions without braces (e.g. print "hi").

--
MABASPLOOM!

RTFA by jopet · 2008-12-04 01:38 · Score: 2, Informative

OK, never mind, I just saw it, there seems to be such a beast: http://docs.python.org/dev/3.0/library/2to3.html#to3-reference

Re:Is it possible to do automatic code migration? by stuntpope · 2008-12-04 01:39 · Score: 3, Informative

Like http://docs.python.org/library/2to3.html, perhaps?

from __future__ import braces by slumberheart · 2008-12-04 01:39 · Score: 5, Funny

SyntaxError: maybe in 3.5

Yay, Unicode! by shutdown+-p+now · 2008-12-04 01:44 · Score: 4, Interesting

Reworked Unicode support is a big deal. It was there before, of course (unlike Ruby - meh), but all those Unicode strings vs 8-bit strings, and the associated comparison issues, complicated things overmuch. Not to mention the ugly u"" syntax for Unicode string literals which was too eerily like C++ in that respect. Good to see it move to doing things the Right Way by clearly separating strings and byte arrays, and standardizing on Unicode for the former.
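
To make the separation concrete, here is a minimal sketch assuming Python 3. The sample word is arbitrary; the point is that str counts code points, bytes counts bytes, and the two no longer mix implicitly.

```python
text = "na\u00efve"            # str: a sequence of Unicode code points
data = text.encode("utf-8")    # bytes: a sequence of integers 0..255

print(type(text).__name__, len(text))
print(type(data).__name__, len(data))

# Mixing the two is an error rather than an implicit conversion.
try:
    text + data
    mixing_error = None
except TypeError as exc:
    mixing_error = type(exc).__name__

print(mixing_error)
```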

Now, if only we could convince Matz that his idea for Unicode support in Ruby 2.0 - where every string is a sequence of bytes with an associated encoding, so every string in the program can have its own encoding (and two arbitrary objects of type "string" may not even be comparable as a result) - is a recipe for disaster, and take hint from Python 3...

Re:Yay, Unicode! by shutdown+-p+now · 2008-12-04 06:08 · Score: 2, Interesting

since methods exist to examine what the encoding of a string is, and to change it, how would there be a disaster unless the coder was sloppy?
Assume a simple case: a function taking two strings as arguments. In Ruby 2.0, you cannot safely concatenate those two strings, or even compare them (because encodings may be incompatible). You cannot properly interpret it, because the set of possible encodings is not closed (the client may pass you a string with an encoding he defined himself). You cannot convert it to some common encoding that is safe to process, because there may not be a common encoding (Ruby intends to support some Japanese encodings that do not have a well-defined Unicode mapping). You cannot even safely pass it on to another library function, because it may not be able to handle a string in arbitrary encoding for the reasons mentioned above. In effect, it means that Ruby 2.0 strings are "arrays of characters", where a "character" is some opaque value from which no meaning can be derived in a general case.
Note that the above means that this Ruby code has a bug of sorts for 1.9.1+:

def foo(str)
  if str == "abc"  # oops! who says str encoding is compatible with ASCII?
  end
end

Cute, eh?
Re:Yay, Unicode! by shutdown+-p+now · 2008-12-04 06:15 · Score: 2, Interesting

If I understand Unicode correctly, the entire point is that Unicode provides a code point space, which defines all the possible characters available.

You understand almost correctly :) The problem here is, what is a "possible character"? It is in many ways a political issue, and apparently some people aren't happy about the way Unicode handled some characters. One particular sore point is that of Han unification - basically, Unicode assigned a single codepoint for every Han glyph, whether it's used in Chinese, Japanese, or Korean. Japanese were particularly unhappy about it.
Re:Yay, Unicode! by shutdown+-p+now · 2008-12-04 08:22 · Score: 2, Interesting

The statement is an error as the types don't match. Quite a few people claimed this in response to my previous posts.

They are correct. "UTF-8 String" is not really a UTF-8 constant, it's just a plain Unicode string now. It makes sense, too, as comparing a byte array with a string is not generally a well-defined operation. And yes, of course, it's a breaking change, and is on the changelog.
Now you can still have byte array literals if you want them, but they are opt-in via "b" prefix (much like Unicode strings were opt-in via "u" in 2.x). So:

if byte_string == b"UTF-8 constant":

works.

It also appears to be impossible to make an unadorned string constant that contains an *invalid* UTF-8 encoding, since the translation is done at compile time, so no changes to the current encoding will help.
Well, if it's invalid, it's no longer UTF-8, right? So not a valid Unicode string anyway - why would you want it to pretend to be one? You can still make a byte array like that (though of course it will fail if you then try to decode it as if it was UTF-8 - because it's not).

In Python 2.0 and in most other languages "\xC2\xA2" is a cent-sign

True for Python 2.0, false for "most other languages". It's not true for most post-Java mainstream and/or generally well-known languages (C#, VB, Haskell, R6RS - to name a few). So Python is simply standardizing on what's already widely accepted. Of course, it also makes most sense when you deal with Unicode strings - forget about bytes, work with codepoints. In-memory representation of the string shouldn't be your concern, anyway.

Also the documentation claims that b"\u00A2" is invalid, but that makes it really difficult to make byte string constants containing arbitrary UTF-8 in a more readable way.
Well, of course it's invalid - it's a byte array, not a string! And why do you think that it would have to be UTF-8 even if it was allowed? Why not UTF-16 or UCS4?
Of course, nothing stops you from using str.encode, e.g.: "\u00A2".encode("utf-8") - which is quite explicit about what's going on, and yet short enough at the same time. By the way, if you omit the argument to encode, it will just use the default system encoding for non-wide-chars, which is usually precisely what you want on Unix.
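
A short sketch of the above, assuming Python 3. Two things worth checking for yourself: in Python 3 str.encode() with no argument defaults to UTF-8, and comparing bytes with str under default interpreter flags simply gives False rather than raising.

```python
cent = "\u00A2"                      # one code point: the cent sign
encoded = cent.encode("utf-8")       # explicit encoding
default_encoded = cent.encode()      # Python 3 defaults to UTF-8 here

same_bytes = encoded == b"\xC2\xA2"

# bytes and str never compare equal, even for pure ASCII content.
mixed_equal = b"abc" == "abc"

print(encoded, default_encoded, same_bytes, mixed_equal)
```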
Re:Yay, Unicode! by spitzak · 2008-12-04 10:59 · Score: 2, Interesting

Reading the changelog, it sure does sound like b"abc"=="abc" will produce an error. I do find this extremely surprising as I would think this would break enormous amounts of software.
It sounds like Python 3.0 will throw an error if you read a file that contains invalid UTF-8, until the program is rewritten to read the file as "bytes". Then it will throw errors when you convert the bytes to "str", until you rewrite the functions reading the files to return bytes instead of str. Then the users will hit this problem in that their code will no longer compile. I can't see this being any good.
Checking the web pages, I am certainly not alone in this worry. A more popular solution however seems to be to stop throwing errors. The conversion to Unicode would instead translate invalid bytes to U+DCxx (ie unpaired UTF-16 lower-half surrogates). This would avoid the exceptions and also make the translation lossless. I have examined this before and it has a big problem in that the translation of (possibly invalid) UTF-16 to UTF-8 is no longer lossless (imagine the UTF-16 had a sequence of these invalid symbols that actually match a valid UTF-8 encoding), which might lead to bad security holes.
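For reference, the U+DCxx idea described here is what later shipped as the "surrogateescape" error handler (PEP 383, Python 3.1 and later, so not part of 3.0 itself). A minimal sketch, with a made-up byte string containing one byte that can never appear in UTF-8:

```python
raw = b"abc\xffdef"  # 0xFF is never valid in UTF-8

# Strict decoding (the default) raises.
try:
    raw.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = type(exc).__name__

# surrogateescape maps each undecodable byte 0xNN to U+DCNN...
escaped = raw.decode("utf-8", errors="surrogateescape")

# ...and encoding with the same handler restores the original bytes.
round_trip = escaped.encode("utf-8", errors="surrogateescape")

print(strict_error, ascii(escaped), round_trip == raw)
```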
if it's invalid, it's no longer UTF-8, right?
You are parroting the same crap used by people who don't like UTF-8 and try to make it more difficult than it really is. It is indeed UTF-8, just because it has errors in it does not make it not be UTF-8, anymore than a misspelled word makes this post not be English.
It's not true for most post-Java mainstream and/or generally well-known languages
You seem to have forgotten languages called "C" and "C++". I heard they were pretty popular...
I think you might also check exactly what some of those languages do, you can't put more than \xff into most of them so they are actually doing exactly what I am saying, except they are assuming ISO-8859-1 as the encoding. If the encoding can be changed to UTF-8 then it would work exactly like I am stating. (if values greater than 0xff are accepted they could ignore the encoding and you would remain compatible).
What you are saying is that there is no difference between \x and \u, which seems pretty stupid to me.
The main reason I want this is so that a string constant can be changed between bytes and unicode by just changing the 'b' to a 'u'. This is also why I want \uXXXX to work in byte strings.
On b"\u00A2": Well, of course it's invalid - it's a byte array, not a string! And why do you think that it would have to be UTF-8 even if it was allowed? Why not UTF-16 or UCS4?
The compiler is already assuming UTF-8 when it parses u"abÂ" so I see no reason it can't assume UTF-8 here as well.
Re:Yay, Unicode! by shutdown+-p+now · 2008-12-04 17:52 · Score: 2, Informative

Unfortunately, they just abandoned some critical byte-string interfaces, which makes it impossible to write non-"toy" programs in Python 3.0. E.g. there's no way to get the original argv[], which is a pretty fundamental omission.

Given that you can always do encode() on the Unicode string to get its byte representation in default encoding of the current locale, what's the problem?
Re:Yay, Unicode! by shutdown+-p+now · 2008-12-04 18:12 · Score: 3, Informative

It sounds like Python 3.0 will throw an error if you read a file that contains invalid UTF-8, until the program is rewritten to read the file as "bytes". Then it will throw errors when you convert the bytes to "str", until you rewrite the functions reading the files to return bytes instead of str. Then the users will hit this problem in that their code will no longer compile. I can't see this being any good.
Why isn't it? If your input file is supposed to be UTF-8 text, and is not, then surely it's an error? As you say yourself, you can always load it as raw bytes if you want to work with it nonetheless. But, of course, as soon as you want to start treating it as an actual string - so that you can say things such as "give me the 10th character" (and not "10th byte") - it has to be valid, otherwise all string-specific operations would simply be undefined.
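
A small sketch of the "10th character" versus "10th byte" distinction, assuming Python 3. The string of twelve cent signs is arbitrary; it just makes every character two bytes long in UTF-8.

```python
text = "\u00a2" * 12          # twelve cent signs
data = text.encode("utf-8")   # two bytes per cent sign

tenth_char = text[9]          # a one-character str
tenth_byte = data[9]          # an int: half of some character

# Cutting the byte string at an odd offset leaves a dangling lead byte,
# so the result can no longer be decoded as UTF-8.
try:
    data[:9].decode("utf-8")
    truncated_error = None
except UnicodeDecodeError as exc:
    truncated_error = type(exc).__name__

print(tenth_char, hex(tenth_byte), truncated_error)
```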

You are parroting the same crap used by people who don't like UTF-8 and try to make it more difficult than it really is. It is indeed UTF-8, just because it has errors in it does not make it not be UTF-8, anymore than a misspelled word makes this post not be English.
I like UTF-8, but UTF-8 with errors in it is clearly not valid UTF-8, no more than XML with a missing closing tag in the middle of the file is valid XML. The problem with such UTF-8, as I've mentioned earlier, is that no string processing function would know what to do with it. If you, say, try to convert it to uppercase, what should it do with invalid sequences? What about the earlier example of indexing by characters, or taking the leftmost or rightmost N characters - how should it count the unterminated sequence?

You seem to have forgotten languages called "C" and "C++". I heard they were pretty popular...
No, I did not. C and C++ actually work in precisely the way Python 3 does. The only difference is that in them, a plain unadorned string literal is a "byte array", and you have to explicitly request wide chars (to simplify, let's assume it always means "Unicode" for now) by prefixing it with "L". Otherwise, it's precisely the same. In particular, L"\xC2\xA2" is not a cent sign in either C or C++. It's a wide string with two characters. Plain "\xC2\xA2" is a non-Unicode (i.e. byte) string of two bytes, which produces a cent sign when treated as UTF-8 - and so is byte string b"\xC2\xA2" in Python.

I think you might also check exactly what some of those languages do, you can't put more than \xff into most of them so they are actually doing exactly what I am saying
It's a legacy of C/C++ - they couldn't extend the "\x" escape sequence to use more than 2 digits without breaking existing string literals, so they left it as is. In C/C++, Java, C# etc, if you want a full-length Unicode escape, you use "\u1234". However, note that it doesn't really change anything - inside a Unicode string literal, in all these languages, "\xFF" is the same as "\u00FF", which is the same as "\U000000FF". None of them allows you to define individual bytes in Unicode string literals.
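
The Python 3 side of this is easy to check. A minimal sketch, using only the literals already discussed in this thread:

```python
as_text = "\xC2\xA2"     # str literal: two code points, U+00C2 and U+00A2
as_bytes = b"\xC2\xA2"   # bytes literal: two bytes

code_points = [hex(ord(ch)) for ch in as_text]

# Only the bytes version becomes a cent sign, and only once decoded as UTF-8.
decoded = as_bytes.decode("utf-8")

# Inside a str literal the three escape forms name the same code point.
same_escape = "\xFF" == "\u00FF" == "\U000000FF"

print(code_points, ascii(decoded), same_escape)
```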

What you are saying is that there is no difference between \x and \u, which seems pretty stupid to me.
Yet that's how it is. Do you want quotes from the respective language specifications?

The compiler is already assuming UTF-8 when it parses u"abÂ" so I see no reason it can't assume UTF-8 here as well.
This decision is made on different levels. The compiler isn't assuming UTF-8, the code which reads the file as a sequence of characters (before lexing, much less parsing, takes place) does that. On the other hand, processing the content of the string literal is (most likely) done by the lexer, including character escapes. Also, keep in mind that non-UTF input files are still legal - should escape sequences in literals suddenly change meaning for them?
Re:Yay, Unicode! by spitzak · 2008-12-05 09:12 · Score: 2, Interesting

If your input file is supposed to be UTF-8 text, and is not, then surely it's an error?
UTF-8 with errors is STILL UTF-8. It just is not "valid UTF-8" which is a mostly uninteresting subset. The set of UTF-8 strings is every single possible byte sequence. The set of "valid UTF-8" strings is a SUBSET that a tiny portion of software (mostly validators) should have to care about.
People are trying to make this far more difficult than it really is by somehow saying that we must restrict ourselves to that subset at a very low level. That is wrong and is the main reason why there is so much confusion about UTF-8. Nobody seems to care that UTF-16 can have illegal sequences (Python handles them without complaint) and nobody cared for 10 years that the Japanese encodings could have illegal sequences. But for some reason UTF-8 brings out this complaint over and over again. I suspect the problem is that people have invested too much effort in UTF-16 and don't want to admit they made a huge mistake, and the only way is to try to make UTF-8 hard.
But, of course, as soon as you want to start treating it as an actual string - so that you can say things such as "give me the 10th character" (and not "10th byte") - it has to be valid, otherwise all string-specific operations would simply be undefined.
Well of course. Therefore THAT function should throw the damn exception! Not every single string manipulation!!!!
Also you amazingly did the same bogus example of "move by 10 characters" I have seen before. Please look at real software and you will see that NOBODY EVER MOVES BY "10 CHARACTERS". 1 maybe. Otherwise the only use EVER of such code is because "10 characters" was previously calculated by another function looking at the EXACT SAME STRING and therefore a byte offset or UTF-16 word offset or whatever will work just as well.
L"\xC2\xA2" is not a cent sign in either C or C++. It's a wide string with two characters.
It is byte values converted using ISO-8859-1 encoding. What I want is the ability to change that encoding.
The compiler isn't assuming UTF-8, the code which reads the file as a sequence of characters (before lexing, much less parsing, takes place) does that.
That is wrong, because it would not be possible to create a byte string containing an invalid UTF-8 sequence. This would break any software that has a string constant with ISO-8859-1 encoding in it (the programmer will still need to put a 'b' in front of it, but that is a lot easier and more readable than going and replacing all the foreign letters with \x sequences).
In any case I don't see any reason why the Lexer should assume a different locale than the parser. That would be pretty confusing.
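
For reference, a minimal sketch of how Python 3 treats the two kinds of literal being argued about here. The variable names are made up for illustration, and this only shows current Python 3 behaviour, not what either poster wants the language to do.

```python
# In a str literal, \xNN names a code point, so this is two characters
# (U+00C2 and U+00A2), not a cent sign.
text = "\xC2\xA2"

# In a bytes literal, \xNN names a byte; decoding those bytes as UTF-8
# yields a single character, the cent sign.
raw = b"\xC2\xA2"
cent = raw.decode("utf-8")

# A byte sequence that is not valid UTF-8 is still a perfectly good bytes
# object. The error only appears when a strict decode is requested.
broken = b"caf\xe9"  # ISO-8859-1 bytes for "cafe" with an acute accent
try:
    broken.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Decoding with the encoding the bytes were actually written in works.
latin1_text = broken.decode("iso-8859-1")
```

So the exception is raised by the decode step, not by merely holding or slicing the bytes.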

Re:That marks my end of use for Python by lahvak · 2008-12-04 01:45 · Score: 4, Funny

So what are you going to do, take all your existing Python applications and rewrite them in a different language, in order to avoid the "significant amount of work to maintain existing functionality with new language version"?

--
AccountKiller

Re:Hey! by drewness · 2008-12-04 01:46 · Score: 4, Funny

As someone mentioned above, try

from __future__ import braces

and see what happens. ;)

As for Ruby, I don't really follow its development or use it, but I was reading just the other day that they're really focused on finishing 1.9, which does byte-compiling and some optimization. The current version (like JS before spidermonkey, V8, and squirrelfish) walks and executes the AST (as I understand it), which is slooow.

Re:That marks my end of use for Python by Pope+Raymond+Lama · 2008-12-04 01:48 · Score: 2, Insightful

Besides teh above remark of well thoguth migration paths - it is importante to remakr that support for python 2.x has not ended in any way.

As far as Iam aware, the recomendation is to keep working with python 2.6 - and use the py2to3 script to regularly to make 3.0 releases if you you can (i.e. if your dependencies have 3.0 versions already).

No need to worry about anything, this will be a smooth, years long, transition. Chances are we will even see a python 2.7 before 2.x is officially deprecated.

--
-><- no .sig is good sig.

Re:Hey! by Constantine+XVI · 2008-12-04 01:54 · Score: 4, Informative

For the lazy (or those who don't have python installed at work):

>>> from __future__ import braces
  File "<stdin>", line 1
SyntaxError: not a chance

--
"I think an etch-a-sketch with an ethernet port would beat IE7 in web standards compliance."

Re:That marks my end of use for Python by shutdown+-p+now · 2008-12-04 01:57 · Score: 5, Informative

It's also cleanup of some stupid syntax that was there for ages. For example, exception handling. Old style:

try:
    ...
except (TypeError, ValueError):  # catch both types of exceptions
    ...
except os.error, e:  # catch exception and store into variable 'e'
    ...

New style:

try:
    ...
except (TypeError, ValueError):  # catch both types of exceptions
    ...
except os.error as e:  # catch exception and store into variable 'e'
    ...

It's fairly obvious that the latter is much clearer.
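
A runnable sketch of the new style, for anyone who wants to try it. The function name describe_failure and the path are hypothetical, and it assumes a current Python 3 where os.error is an alias of OSError.

```python
import os

def describe_failure(path):
    """Return a short description of why os.stat(path) failed, or None."""
    try:
        os.stat(path)
    except (TypeError, ValueError):  # catch both types of exceptions
        return "bad argument"
    except os.error as e:            # bind the exception to 'e'
        return ("os error", e.errno)
    return None

missing = describe_failure("this/path/should/not/exist/python3-demo")
wrong_type = describe_failure(None)  # os.stat(None) raises TypeError
```

The old "except os.error, e" spelling is a SyntaxError in Python 3, so the parenthesised tuple form can no longer be confused with it.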

print function by togofspookware · 2008-12-04 02:03 · Score: 4, Interesting

First thing mentioned on the 'what's new' page (http://docs.python.org/dev/3.0/whatsnew/3.0.html) is that you'll have to change your code from

print x, y, z,

to

print(x, y, z, end="")

I can see the value of making things more consistent, but it seems to me whenever they update things in Python, it's usually to make programming in it a little bit harder.
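
As a small sketch of what the keyword arguments on the new print() do, here is the output captured in an io.StringIO buffer so it can be inspected. The buffer and variable names are only for illustration.

```python
import io

buffer = io.StringIO()

print("x", "y", "z", end="", file=buffer)  # end="" suppresses the trailing newline
print("!", file=buffer)                    # default end is "\n"
print("a", "b", sep="-", file=buffer)      # sep replaces the space between items

captured = buffer.getvalue()
```

After this runs, captured holds "x y z!" on the first line and "a-b" on the second.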

Why not make print a function, but then change the language to not require parentheses for any function call? You'd still have to use them when calling a function with zero arguments, and in sub-expressions, but to not require parens for top-level function calls would, if nothing else, make playing around in interactive mode or with short scripts a lot more pleasant.

Granted, I come from a Ruby background, so I may not know what I'm talking about. My experience with Python is trying to write some scripts on my OLPC, where the craptacular rubber keyboard made typing parentheses all the more agonizing. I finally caved and installed Ruby so I could get some work done. Maybe people who prefer Python really like typing parens. And underscores.

--
Duct tape, XML, democracy: Not doing the job? Use more.

Re:print function by gzipped_tar · 2008-12-04 02:19 · Score: 2, Interesting

The IPython (nothing Apple-related) interactive shell hacked the Python lexer to allow exactly this. You type this at the shell prompt:
foo a, b, c
it will be interpreted as a call foo(a, b, c).
IPython still has some bugs with this feature, though. It can be turned off, but I still prefer it in interactive use just as you've mentioned.
Anyway, I think the current Python syntax is OK.

--
Colorless green Cthulhu waits dreaming furiously.
Re:print function by maxume · 2008-12-04 02:23 · Score: 4, Interesting

I would say that it makes typing python a little bit harder, but I would also argue that it makes programming python easier, not harder (it eliminates print as a statement, but it also eliminates special syntax that existed only for redirecting print output, and makes it trivial to change the default behavior of print within a module (by defining a local print function)).
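
A minimal sketch of the "local print function" idea, assuming Python 3. The "[demo]" prefix and the log buffer are invented for the example; the only real API used is builtins.print and its file keyword.

```python
import builtins
import io

log = io.StringIO()
other = io.StringIO()

def print(*args, **kwargs):
    """Module-local replacement: prefix each message, write to `log` by default."""
    kwargs.setdefault("file", log)
    builtins.print("[demo]", *args, **kwargs)

print("hello", "world")            # no file given, so it goes to log
print("elsewhere", file=other)     # an explicit file= still wins
```

In Python 2 the redirect needed the special "print >>stream, ..." form, and a statement could not be shadowed by a function like this.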

--
Nerd rage is the funniest rage.
Re:print function by moranar · 2008-12-04 03:39 · Score: 2, Funny

You seem to want Perl. You can find it at http://www.perl.org/

--
"I think it would be a good idea!"
Gandhi, about Internet Security
Re:print function by jonaskoelker · 2008-12-04 04:28 · Score: 2, Insightful

Why not make print a function, but then change the language to not require parentheses for any function call?
A good argument is that it would make the grammar ambiguous.
What's the parse tree of "f x, g y"? I think it can be both (tuple (f x) (g y)) and (f x (g y)).
One could of course detect and disallow ambiguous strings, at the expense of the parser having to do a little more work. It may be a little, or it may be a lot. ...
Re:print function by togofspookware · 2008-12-04 07:47 · Score: 2, Insightful

While we're at it, let's stop attacking people's ideas with straw man arguments.

--
Duct tape, XML, democracy: Not doing the job? Use more.

I don't know why this story's flagged "endofdays" by Slartibartfast · 2008-12-04 02:23 · Score: 4, Funny

That'll be when Perl 6.0 ships.

Re:I don't know why this story's flagged "endofday by Anonymous Coward · 2008-12-04 02:35 · Score: 2, Informative

AFAIK Perl 6.0 is already there, in the form of Pugs (which is said to be compatible with all the specs), and it's just the implementation of Perl 6 in perl6 itself that people are waiting for. You can go and write actual Perl 6 code, and run it on Pugs, and it'll work.

Porting? Instantly! by jonaskoelker · 2008-12-04 02:47 · Score: 2, Funny

I heard they're going to use Python 3.0 for the impending from-scratch rewrite of DNF.

Re:woohoo by Hal_Porter · 2008-12-04 02:49 · Score: 5, Funny

No you didn't.

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;

Re:I'll still avoid it by m50d · 2008-12-04 02:51 · Score: 2, Interesting

It does make it a pain in the ass to play around and test with because often cut-n-paste (from random sources) completely fucks up the indention which you then have to fix.

Cut-n-paste is not a good way to learn.

Between Python's extremely verbose syntax (not very script-friendly-like)

It's not extremely verbose; take a look at Java if you want that. If you compare with e.g. perl, yes it's longer, but the difference is because it's using words rather than random characters, which in my book is worth it for the ease of remembering wtf to write. Compare it with Ruby or, *struggles to think of another scripting language* TCL, say, and the verbosity is pretty similar.

and relatively poor performance...

Really? It's not going to win races against C, but performance is very much on a par with say Perl (which yes, has a lot of improvements coming in v6, but that's not here yet), and ahead of other similar languages. Couple that with the fact that it's easier to bind from python than any of the alternatives, and you end up with code that in practice is as fast as you could write anywhere (because you use e.g. NumPy, which just binds to the fastest libraries available for doing what it does).

Of course python does sacrifice some things - but the ease of code writing and most of all maintainability are well worth it in most cases, in my experience.

--
I am trolling

Re:I'll still avoid it by MightyYar · 2008-12-04 02:53 · Score: 3, Insightful

As they didn't fix the stupid forced-indentation thing.

Same reason I don't use C... that stupid forced-curly-brace thing. Why can't the language just know what I want to do?

</sarcasm>

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

Re:Hey! by Ragzouken · 2008-12-04 03:22 · Score: 3, Funny

What whitespace problem?

Re:I'll still avoid it by horza · 2008-12-04 03:27 · Score: 2, Insightful

It's a good thing when you get used to it as it makes source code much clearer. If you find that the forced indentation is bulking up your code too much then you are probably missing a trick... in Python there is always a short-cut and you just have to think more Python-like. For example in C/PHP I would type:
x=1; y=2; z=3;
When you first look at Python you are tempted to write:
x=1
y=2
z=3
Quickly you find you can:
x,y,z = 1,2,3
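
The same unpacking trick goes a little further. A short sketch (the starred form is new in Python 3.0; the names are only illustrative):

```python
x, y, z = 1, 2, 3          # three assignments on one line

x, y = y, x                # swap without a temporary variable

first, *rest = [10, 20, 30]  # starred target collects the remainder as a list
```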

Phillip.

--
Property for sale in Nice, France

Re:Hey! by caldodge · 2008-12-04 03:29 · Score: 3, Insightful

Care to cite a reference for van Rossum's alleged comment? I think "the whitespace problem" is actually one of Python's big advantages, since it greatly enhances program readability.

Re:That marks my end of use for Python by moranar · 2008-12-04 03:37 · Score: 4, Funny

Besides teh above remark of well thoguth migration paths - it is importante to remakr that support for python 2.x has not ended in any way.

As far as Iam aware, the recomendation is to keep working with python 2.6 - and use the py2to3 script to regularly to make 3.0 releases if you you can ...

Are you typing while drunk?

--
"I think it would be a good idea!"
Gandhi, about Internet Security

Re:I'll still avoid it by Andy+Dodd · 2008-12-04 03:40 · Score: 2, Insightful

For a little bit I avoided Python because of the whitespace sensitivity.

At some point I gave it a try, at which point I was already using emacs. It took 5 minutes to get used to the whitespace sensitivity since emacs took care of indentation for me.

--
retrorocket.o not found, launch anyway?

Re:woohoo by Chapter80 · 2008-12-04 04:09 · Score: 4, Funny

GREAT!

Interestingly, it IS backwards compatible in areas that you wouldn't think it should be. For instance, the following program takes the version number, adds one to it, and divides by two. You'd think it'd give a different answer between version 3 and version 2. Glad they kept this program working for me, as it's the secret production code that runs my multi-million dollar business.
import sys
version=int(sys.version[0])
print (version+1)/2

Prints 1 in either version. (on the bright side, 1/2 is now 0.5!)

Re:woohoo by Chapter80 · 2008-12-04 04:10 · Score: 3, Funny

oops I really screwed that joke up... crap. somebody fix it.. you know what I was trying to do!

Re:I'll still avoid it by cleatsupkeep · 2008-12-04 04:14 · Score: 4, Insightful

Well, the big issue I've run into with Python is when you are editing across multiple text editors, where some might use tabs, and some might use spaces. This seems to trip up Python where it wouldn't mess with a brace delimited language or something with an "end" syntax like Ruby.
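
A small sketch of what Python 3 does with this kind of file: mixing a tab-indented line and a space-indented line in the same block is rejected at compile time instead of being silently interpreted. The source string here is a made-up example.

```python
mixed_source = (
    "if True:\n"
    "\tx = 1\n"          # this line is indented with one tab
    "        y = 2\n"    # this line is indented with eight spaces
)

try:
    compile(mixed_source, "<mixed-indent-example>", "exec")
    outcome = "compiled"
except TabError:
    # TabError is a subclass of IndentationError (itself a SyntaxError)
    outcome = "TabError"
```

Python 2 accepted this particular input by default, treating the tab as advancing to column 8.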

ubuntu make fail by rla3rd · 2008-12-04 04:18 · Score: 2, Interesting

too bad it doesn't install from source out of the box, even with libgdbm-dev installed

make
running build
running build_ext

Failed to find the necessary bits to build these modules:
_dbm
To find the necessary bits, look in setup.py in detect_modules() for the module's name.

see bug here. Why they would announce a release that wouldn't build for a major distribution such as ubuntu baffles me.

Re:ubuntu make fail by gzipped_tar · 2008-12-04 16:10 · Score: 2, Insightful

It seems some headers are not installed (BerkeleyDB? Dunno what's the Ubuntu package name for that. It's "db4-devel" here on Fedora). Just check them out and rebuild?
Anyway, I never expect some 3rd party source tarball to be able to "build right out of the box" for me. If you do something outside a distro's package management system, you'll have to manage the dependencies all by yourself.

--
Colorless green Cthulhu waits dreaming furiously.

Re:I don't know why this story's flagged "endofday by Abreu · 2008-12-04 04:27 · Score: 4, Funny

Signs of the apocalypse:

* A black man was elected President of the US - November 4, 2008
* Chinese Democracy was released - November 23, 2008
* Python 3000 is released - December 4, 2008
* ?
* ?
* Large Hadron Collider starts operations - ?
* Duke Nukem Forever is released - ?

--
No sig for the moment.

Re:woohoo by Random+BedHead+Ed · 2008-12-04 04:31 · Score: 5, Funny

I think you should use a few more posts to explain the joke. The more you go on the funnier it gets. :)

Re:That marks my end of use for Python by Random+BedHead+Ed · 2008-12-04 04:45 · Score: 2, Funny

Besides teh above remark of well thoguth migration paths - it is importante to remakr that support for python 2.x has not ended in any way.

As far as Iam aware, the recomendation is to keep working with python 2.6 - and use the py2to3 script to regularly to make 3.0 releases if you you can ...

Are you typing while drunk?

No, he generated that comment with Python 2.6 code but ran it with the new release.

Re:woohoo by ragefan · 2008-12-04 04:46 · Score: 3, Funny

An argument isn't just contradiction!

Re:woohoo by ValuJet · 2008-12-04 04:56 · Score: 5, Funny

Yes it is.

Re:I'll still avoid it by doti · 2008-12-04 05:22 · Score: 2, Insightful

the biggest problem is not copying from external sources, but moving your own code around.

of course the final code should get the right indentation anyway, but it's annoying to force the indentation when you just want to do a quick test.

and I don't write messy code. on the contrary, I'm a perfectionist zealot when it comes to the details of code aesthetics. it's just that forcing it is a bad design decision.

--
factor 966971: 966971

Re:I'll still avoid it by Abcd1234 · 2008-12-04 05:26 · Score: 3, Insightful

Cut-n-paste is not a good way to learn.

Ah, I see, you've never refactored code before. Well, good for you, apparently everything you write is either immediately perfect, or you never have to maintain it!

Here in the real world, however, we *do* have to cut and paste blocks of code occasionally, and Python makes that annoyingly difficult.

Re:woohoo by Chapter80 · 2008-12-04 06:49 · Score: 4, Funny

OK. well, what I was aiming for was:

True Part:
In Python version 2, 1/2 = 1 (integer math)
In Python version 3, 1/2 = 0.5 (floating point math)

Funny part:
You can do some math on the version number and it comes out the same, even though the version number has changed. Because the divide operation changed too.

wait, it's not so funny after all. What was I smoking?

Re:That marks my end of use for Python by Anonymous Coward · 2008-12-04 07:05 · Score: 2, Insightful

those changes were mostly to fix poor coding practices like being able to run certain functions without braces (e.g. print "hi").

Minor quibble: print wasn't a function, it was a statement. That is, it was on the same level as if/then, while, import, etc.

The thought is, though, that print is nowhere near as "central" to the language as if/while/import, and its functionality could just as well be handled with a function. - Which is what they did with 3.0

Re:I'll still avoid it by Abcd1234 · 2008-12-04 08:18 · Score: 2, Insightful

I have this code:

def myfunc():
    if some_thing:
        do_something

        do_something_else

    last_thing

def myfunc2():
    while another_thing:
        myfunc()

    one_other_thing

And I decide I want to collapse those loops, so I copy and paste the code:

def myfunc2():
    while another_thing:
        if some_thing:
            do_something

            do_something_else

        last_thing

    one_other_thing

There is *no way the editor can handle this correctly*. It will always get it wrong somehow. After all, how can it know that the if block *and* last_thing should be indented so it's included in the while statement? Worse, when it gets it wrong, it'll change the semantics of the code. And *you won't know*, because the code will continue to parse correctly.

Of course, this is just one, somewhat contrived example. But I have, on numerous occasions, endured cases where refactoring has been made *much* harder thanks to Python's lack of a block termination marker. If you haven't encountered such cases, I would contend you've never had to maintain a non-trivial Python codebase.
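
To make the "still parses, different meaning" point concrete, here is a hedged sketch with two hypothetical functions that differ only in the indentation of one line. Both are valid Python, and they return different results.

```python
def inside_loop(items):
    total = 0
    for item in items:
        if item > 0:
            total += item
        total += 1        # indented under the for: runs once per item
    return total

def outside_loop(items):
    total = 0
    for item in items:
        if item > 0:
            total += item
    total += 1            # same statement, one level out: runs once overall
    return total

per_item = inside_loop([1, 2, 3])
once = outside_loop([1, 2, 3])
```

Nothing in the parser can flag the second version as a paste mistake; only a test or a careful reader would.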

Re:woohoo by nd · 2008-12-04 09:14 · Score: 2, Informative

True Part:
In Python version 2, 1/2 = 1 (integer math)

1/2 in Python 2.x is actually 0.
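
For the record, a short sketch of the Python 3 behaviour, with the joke's arithmetic included. The variable names are illustrative.

```python
true_quotient = 1 / 2      # true division in Python 3
floor_quotient = 1 // 2    # floor division, what 1/2 gave for ints in Python 2

version = 3
joke_true = (version + 1) / 2    # float result under Python 3
joke_floor = (version + 1) // 2  # int result, version-independent spelling
```

The // operator also exists in Python 2, so code that wants integer division can be written the same way for both.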

Re:woohoo by Capsaicin · 2008-12-04 15:18 · Score: 3, Funny

It's scary to code something while drunk then come back the next day and think "god, whoever wrote this is clever".

I don't even need to be drunk! That happens to me regularly ... ah the ageing process.

--
Better to be despised for too anxious apprehensions, than ruined by too confident a security. --Edmund Burke

Why whitespace really is a problem by walterbyrd · 2008-12-04 15:40 · Score: 2, Insightful

>>Any coder worth his salt is already indenting his code

As long as you are using your own code, and stick to your own conventions, then it's not a problem.

But what about when you are working with code from somebody else? You can not just look at it and tell if the original developer used spaces, or tabs. You have to do a hex dump, or something - what a pain.
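
A hex dump is not strictly needed. As a rough sketch, a few lines of Python can report which indentation characters a file uses; the function name indentation_styles and the sample source are hypothetical. The standard library also ships a tabnanny module for checking ambiguous indentation.

```python
def indentation_styles(source):
    """Return the set of indentation styles found: 'tabs', 'spaces', 'mixed'."""
    styles = set()
    for line in source.splitlines():
        stripped = line.lstrip(" \t")
        indent = line[: len(line) - len(stripped)]
        if not indent or not stripped:
            continue  # skip unindented and whitespace-only lines
        if " " in indent and "\t" in indent:
            styles.add("mixed")
        elif "\t" in indent:
            styles.add("tabs")
        else:
            styles.add("spaces")
    return styles

sample = "def f():\n\tif True:\n        return 1\n"
found = indentation_styles(sample)
```

Here found contains both "tabs" and "spaces", which is the situation that causes trouble.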

And what if you want to cut and paste from a website? Or email code? Or post code to a news group, or whatever? Whitespace can be an issue in any of those cases.

I believe that even Guido has admitted the forced whitespace was a mistake.

Re:I'll still avoid it by Abcd1234 · 2008-12-05 02:35 · Score: 2, Insightful

If you're the one doing the refactoring, then you'll know how far the indentation is wrong, and you can apply the correction.

I *shouldn't have to*. Besides which, the fact that I do introduces a major source of potential error: because indentation is semantically significant in Python, if I screw up during the refactoring process (particularly large scale refactorings), I can actually introduce bugs simply by not getting the indentation right. That's just unacceptable.

So no, I wouldn't think anyone should be even slightly inconvenienced by this when refactoring their own project's code.

Except, of course, I already have, so you're demonstrably wrong.

The markers will all be there though so the editor should be able to get it right, and if not the programmer should.

And those markers would be what? Oh, right... there aren't any, which was my original point. 'course, if the Python devs simply added an 'end' keyword, this entire conversation would be moot.

but in practice it doesn't seem like a big enough issue to avoid using the language.

Given the plethora of competing offerings, I humbly disagree. Why deal with Python's silliness when I could just use, say, Ruby instead? Or Perl (assuming you're not an undisciplined hack)?
