Interviews: Q&A With Guido van Rossum
Guido van Rossum is best known as the creator of Python, and he remains the BDFL (Benevolent Dictator For Life) in the community. The recipient of many awards for his work, and author of numerous books, he left Google in December and started working for Dropbox early this year. A lot has happened in the 12 years since we talked to Guido and he's agreed to answer your questions. As usual, ask as many as you'd like, but please, one question per post.
Hi,
What prompted the move from Google to Dropbox? What did you do at Google, and what are you going to do at Dropbox?
When will you remove the GIL?
What is your stance in the debate about privacy for internet users? Not necessarily regarding the NSA, but also commercial companies, especially because both Google and Dropbox handle huge amounts of their customers' private data. Does this square with the philosophy in which Python was developed?
Does the NSA have access to our Dropbox contents, as is apparently the case with Microsoft Skydrive?
Do you regret the swath of backwards-incompatible changes in version 3 that have led to such slow uptake, or do you feel it was the best move for the language moving forward?
paul reinheimer
Guido
When you interviewed at Google, did they ask you brainteasers or hard algorithmic questions, and if so, what did you think of it?
Cheers!
grisha.org
When is python going to support parallel processing and multiple threads?
Every other modern programming language does.
Python does support threading. The issue is that the CPython implementation has the GIL to contend with. IronPython did away with the GIL, but it's no longer being maintained. Also, http://interviews.slashdot.org/comments.pl?sid=4105821&cid=44608065 beat you to it.
Interfaces, abstract classes, private members, etc... Why did python avoid all this?
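For readers unfamiliar with what Python offers here, a minimal sketch using the standard library's `abc` module. The class names `Shape` and `Square` are hypothetical, and the leading underscore is only a naming convention for "private" members, not an enforced access rule.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """An abstract base class: it cannot be instantiated directly."""

    @abstractmethod
    def area(self):
        """Subclasses must provide an implementation."""


class Square(Shape):
    def __init__(self, side):
        # Leading underscore marks the attribute as private by convention.
        self._side = side

    def area(self):
        return self._side ** 2


print(Square(3).area())

try:
    Shape()  # abstract method not implemented, so this raises TypeError
except TypeError as exc:
    print("cannot instantiate:", exc)
```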
I'm god, but it's a bit of a drag really...
One of the most common complaints about Python is the limitations of its lambdas, namely being one line only without the ability to do assignments. Obviously, Python's whitespace treatment is a major part of that (and, IIRC, I've read comments from you to that effect). I've spent quite a bit of time thinking about possible syntax for a multi-line lambda, and the best I've come up with is trying to shoehorn some unused (or little used) symbol into a C-style curly brace, but that's messy at best. Is there a better way, and do you see this functionality ever being added?
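To illustrate the limitation being described, a small sketch with hypothetical names: a lambda body is a single expression, so logic that needs an assignment is usually written as a named inner function instead.

```python
# A lambda body must be a single expression; statements such as
# assignment are not allowed inside it.
square = lambda x: x * x


# The usual workaround for multi-step logic is a named inner function.
def make_clamped_scaler(factor, limit):
    def scale(x):            # plays the role of a "multi-line lambda"
        result = x * factor  # assignment is fine inside a def
        return min(result, limit)
    return scale


scaler = make_clamped_scaler(3, 10)
print(square(4), scaler(2), scaler(5))
```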
if you want to joke about indentation:
    at least make it syntactically sound
while you're at it:
    follow pep8
`echo $[0x853204FA81]|tr 0-9 ionbsdeaml`@gmail.com
Hate to throw a curveball here, but answer honestly: why is Python such an awesome language?
Compound question here: what part of Python is or became your favorite, and what's the worst thing, something that you would change if you had known then what you know now?
If you could go back to the very start and change one thing about Python, what would it be and why?
Do you see PyPy as the future ? http://pypy.org/
Or do you remain unconvinced, and -- if so -- why ?
Do you take on interns or devs that want to learn by doing while sitting in the same room w/you?
You've stated (http://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html) that you consider tail recursion to be "unpythonic." Could you elaborate on that a little more? What about algorithms that are by their very nature recursive? I've found that a combination of memoization and tail recursion can, under the right circumstances, provide wonderfully clear code without taking too much of a performance hit, and Python's decorator facility always seemed more or less ideal to me for that sort of application.
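As a sketch of the decorator-based memoization mentioned above, assuming a Python 3 interpreter where `functools.lru_cache` is available. The function `fib` is a hypothetical example. Note that CPython does not eliminate tail calls, so recursion depth is still bounded by the interpreter's recursion limit.

```python
import functools


@functools.lru_cache(maxsize=None)
def fib(n):
    """Naively recursive Fibonacci; the cache avoids recomputing subproblems."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


# Without the cache this call would take exponentially many steps.
print(fib(80))
```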
Over the years, there have been several attempts to create a sandboxed version of python that will safely run in a web browser.
Mostly this was because of problems with Javascript.
Now that Javascript works -- and we have nice things like CoffeeScript -- is it time to give up on python in the browser ?
Why doesn't range(['a', 'b', 'c']) equal [0, 1, 2]?
-- Just another Perl hacker
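For comparison, the idiomatic ways to get the indices of a list, as a brief sketch. `range` takes integers rather than a sequence, so the length is passed explicitly, and `enumerate` pairs each index with its item.

```python
letters = ['a', 'b', 'c']

# range() expects integers, so pass the length of the list.
indices = list(range(len(letters)))

# enumerate() yields (index, item) pairs and is usually preferred in loops.
pairs = list(enumerate(letters))

print(indices)
print(pairs)
```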
What is your view on the tone that Linus uses on the LKML? Do you think it actually provides any benefits or just drives away would-be contributors?
Are there any plans for providing/incorporating an interface for python and C interaction that isn't CPython specific?
The main thing that keeps Python from being really useful for my projects is the Global Interpreter Lock (GIL). I would love to write Python for my data-intensive code, but it is impossible to get really good parallelism with Python; the multiprocessing library isn't a magic fix because then I have to move all my data back and forth between processes.
When, if ever, should I expect to be able to use Python to do parallel data processing? What is the priority for this, and what would need to be done to make thread-level parallelism possible?
Did the usage of Go at Google influence your decision to leave Google?
If you could go back in time, what, if anything, would you do differently WRT developing and releasing Python 3?
-73, de n1ywb
www.n1ywb.com
Do you wish you'd named your language after a different type of snake?
P.S. Yes I know it's from "Monty Python". If it'd been my language I would have called it Groucho.
Have the prospects of Python in any way improved since you grew a beard? To what degree does language success correlate to beard length?
I am officially gone from
Are you aware of any attempts by the NSA to add a backdoor in Python ? ;)
Of course, if you did get an NSA letter, you wouldn't be allowed to say.
You are welcome to NOT ANSWER this question.
We will take note of that
Python 3.4 will feature enum types built into the language. Up to now, Python mostly avoided features that promoted explicit type checking (through isinstance). Is the inclusion of enums a sign that we can expect more of such features in the future (say, contracts on function parameters)?
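A short sketch of the kind of code the question has in mind, using the `enum` module that ships in the standard library from Python 3.4. The names `Color` and `describe` are hypothetical.

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2


def describe(color):
    # The explicit isinstance check is the style the question asks about.
    if not isinstance(color, Color):
        raise TypeError("expected a Color member")
    return color.name.lower()


print(describe(Color.RED))
print(Color(2) is Color.GREEN)  # lookup by value returns the same member
print(Color.RED == 1)           # plain Enum members do not equal their values
```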
Let's imagine you had not created Python, but some other programming language. Relying on your judgment, what existing (but not necessarily popular) programming language would be the closest to your mindset / principles / preferences? Or maybe there is none; in such a case, how would it differ from Python?
That question baffles me, as you are a guru to a significant share of developers, and surely there is some language that either follows the good parts of Python or avoids Python's mistakes. Could you share your thoughts on what your choice for a Python replacement would be, if creating Python had not been possible?
P.S. Many people seem to treat Ruby as being very close to Python. If you pick Ruby, could you also pick another language, one that you would pick after Python and Ruby?
Some people claim that Python is, at least partly, a functional language. You disagree, as do I. Simply having a few map and filter type functions does not make for a functional language. As I understand it those functions were added to the libraries by a homesick Lisper, and that several times you've been tempted to eliminate them. In general it seems you're not a fan of functional programming, at least for Python.
Question: do you feel that the functional programming approach is not very useful in general, or simply that it's not appropriate for Python? It would be nice to hear your reasons either way.
What *does* a National Security Letter look like?
Asynchronous programming allows for high performance networking (servers, services...).
gevent vs twisted vs eventlet: do you have an opinion? Where do you think it's headed? What is Dropbox using?
web2py is one of the famous python frameworks and also one of the most criticized for its "different" architecture. What do you think about it?
What are the big features/improvements of python 3 that could/should convince me to make the effort to switch over from python 2.x to python 3.x?
I am not able to do the switch now as I rely on some libraries that have not finished converting to python 3 yet, but having something to look forward to other than the pain of backwards-incompatibility could go a long way in getting me to prepare for the change instead of ignoring the issue.
Do you have any personal tricks or methods to learning new things, and what are the benefits/shortfalls or situations where it didn't work?
Good leaders run toward problems, bad leaders hide from them.
A major reason I recently started to use Python is that there's so much already out there, for free, in terms of packages and support. Was this widespread adoption an original goal, a positive after-effect or logical outcome?
Python, the language definition, supports threads fine. CPython, the reference implementation, supports threads, but while they work fine for I/O-bound workloads, they are poor for CPU-bound workloads. However, CPython supports multiprocessing, which uses multiple processes and shared memory; multiprocessing tends to give looser coupling between parallel code units than threading. Jython and IronPython support threads for both I/O-bound and CPU-bound workloads.
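To make the I/O-bound case concrete, a minimal sketch under stated assumptions: `fake_io` is a hypothetical stand-in for a blocking network or disk call, simulated here with `time.sleep`, which releases the GIL while waiting. For CPU-bound work, `concurrent.futures.ProcessPoolExecutor` offers the same interface backed by processes instead of threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    # Sleeping releases the GIL, much like waiting on a socket or a file.
    time.sleep(delay)
    return delay


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # The four waits overlap instead of running back to back.
    results = list(pool.map(fake_io, [0.2] * 4))
elapsed = time.perf_counter() - start

print("4 waits of 0.2s finished in %.2fs" % elapsed)
```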
Do you foresee adding first-class language support for immutable object variables? Something like Java's "final" for member variables. I understand that something could be done with a @decorator to implement this, but I'm interested in baked-in support.
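For reference, one existing library-level approach (not the baked-in support the question asks for) is `collections.namedtuple`, whose fields cannot be reassigned after construction. The name `Point` is a hypothetical example.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

try:
    p.x = 10  # fields of a namedtuple are read-only
except AttributeError:
    print("cannot reassign p.x")

# "Changing" a field means building a new object.
moved = p._replace(x=10)
print(p, moved)
```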
Why socks with sandals?
Tools like ipython and fabric go a long way toward making python into something that can replace my bash shell in many situations.
The main obstacle to this use-case is python's semantic spacing and lack of braces (or something):
- it is hard to do even a fairly simple if/else or loop in a single line so it will interact nicely with the terminal's history
- it is hard to cut&paste code into the terminal because you have to be wary of leading spaces
Ipython tries to solve some of this with shortcuts to bring up a built-in editor, which is an approach that works but is quite cumbersome.
Do you think convenient usage on the interactive shell is a worthy goal that the language should support? If so, is there any direction the language or libraries could develop to better support it?
Have you ever thought about merging some of the ideas of the Stackless python interpreter into some future version of python to make the whole argument moot?
I've played with stackless and depending on what you're doing it can leverage huge benefits.
Yes Francis, the world has gone crazy.
Guido, if you were to design Python from scratch now (without any constraint and legacy code using the old Python), what would you change and how different would it be from Python 3?
Can you put it on a website that doesn't look like shit?
There seem to be reasonably successful efforts in making javascript run fast and even PHP too.
Why is Python still so much slower? Lack of money? Lack of talent?
How often do you get a chance to write serious code ?
What's your default OS ?
Command shell ?
Version control ?
Editor ?
IDE ?
Web browser ?
IM client ?
email client ?
late nights or early rise ?
Hi Guido, I've been writing software for a long time, and without a doubt the language I most enjoy writing in is Python. As of late, I'm seeing troubling performance comparisons between Python and Go. Sometimes claims of 10x faster with Go. That, by the way, is about the right threshold of performance gain for developers to switch (even if it is begrudgingly). Is there a way for programs written in Python to be 10x faster?
How do you feel about the current state of the migration to Python 3 (Py3k)?
From a user perspective it seems that the conversion of popular libraries has lagged far behind, which has impeded the transition. In my professional capacity, nearly every single system I use lacks an installed 3.x interpreter. In fact, 2.7 is a rarity. I'd like to get your thoughts.
"Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
Did you guys come up with any more honking great ideas like namespaces?
("Namespaces are one honking great idea -- let's do more of those!" is the last line of wisdom printed when importing "this" in Python.)
I'd like to get your thoughts on Python as it compares to some of the newer developments in programming languages. In the past few years, the hyperactive growth of the web in the mobile space and in so-called "cloud computing" has spurred all kinds of new languages from new upstarts and companies like Google (Go, Dart), as well as new features in established languages like C#, Java, C++. How do you think the cutting edge of Python will compare for people who might be lured away by those new toys? Or, in an alternate form, is there anything big you want to add to Python that we have seen emerge recently?
My Slashdot brethren: I realize that many of the concepts that are new to the younger generation were actually hashed out back in the 60's, 70's, and 80's. Don't flame me please.
Perl 6 is an ambitious new programming language in the Perl family.
Have you read the Perl 6 specifications?
What do you think of Perl 6, the language (syntax changes) and its new features (meta-operators, advanced OOP, built-in parallelism, signatures as first class objects, grammars)?
What doesn't Python work for?
Emergent OOism -- that everything is an object, including the variable types -- can provide continual surprises of what is possible, even to veteran programmers in other languages. As you were developing and using Python, Guido, what was your favorite surprise? What was now easily possible using Python that would have been very difficult with another language (at the time, or even nowadays)?
Mine: a dictionary of lambda functions for parsing text, and writing a custom MapReduce capability for AWS in 372 lines.
Favorite
Eh, I can give or take the lambdas. Just give me a halfway decent switch-case statement so I don't have to write huge, rolling if-elif-elif-elif-elif-elif-else-finally blobs. Using built in language reflection to parse a parameter into a function name you call later isn't an answer and has led to many, many python scripts crapping out on me when you actually want to use string data containing '-' or other strange, exotic ASCII characters in them as your key into the switch statement.
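One common substitute for a switch statement that avoids building function names from strings is a dictionary mapping keys directly to callables, sketched below. All names are hypothetical; because the keys are ordinary strings, values containing '-' work without any reflection tricks.

```python
def start():
    return "starting"


def stop():
    return "stopping"


def dry_run():
    return "dry run"


# Keys are plain strings, so characters like '-' are not a problem.
HANDLERS = {
    "start": start,
    "stop": stop,
    "dry-run": dry_run,
}


def dispatch(command):
    handler = HANDLERS.get(command)
    if handler is None:  # plays the role of the "default" branch
        return "unknown command: " + command
    return handler()


print(dispatch("dry-run"))
print(dispatch("reboot"))
```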
Having enthusiastically embraced Python in the last two years, I found I hit a wall. It seems Python lacks a simple, unified GUI environment. While creating simple programs works great, most programmers want to have a real GUI, dialog boxes, images, etc. Do you think leaving this functionality out of the core of the language is hampering Python's potential?
But multiprocessing is heavyweight. It is very expensive to launch new processes. Which leads to my question to Guido:
When will Python support lightweight threads for CPU bound workloads?
just curious
But spawning processes is very slow. And more importantly communication between processes means pickling and unpickling objects which in my experience can be a showstopper due to the performance penalty. I guess this is a consequence of the fact that the multiprocessing module is very general and can run on several nodes. So my question is:
Will Python get a fast parallelization module for CPU bound problems on shared memory architectures?
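To show where the cost described above comes from, a small sketch: objects passed between processes are serialized with `pickle` on one side and rebuilt on the other, so the receiver works on a full copy. The list `data` is a hypothetical payload; actual timings depend on the machine and are not shown.

```python
import pickle

data = list(range(100000))

# Roughly what happens to an argument sent to a worker process:
payload = pickle.dumps(data)     # serialize in the sending process
restored = pickle.loads(payload) # rebuild in the receiving process

print(len(payload), "bytes on the wire")
print(restored == data, restored is data)  # equal contents, separate object
```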
I've commented on python and whitespace enough times, that it became more practical to create a small web page about it: Read it here.
Much as I love python, I see nothing to refute my objections to python's use of semantic spaces and lack of braces or other equivalent construct.
- It makes python ineffective as a shell-replacement language, because you can't easily do a complex one-liner and then retrieve it from history
- It makes cutting and pasting code difficult, which makes refactoring unnecessarily painful
- It makes it impossible to fully auto-format code: all formatters I have seen for python do not trust themselves to touch leading spaces, nor should they. This leads to ugly, inconsistent-looking code.
It's not just about the obvious problem of tripping up anyone who is not paying attention to this peculiarity of the language. If python had semantic line ends (but not semantic leading spaces) with optional semi-colons (for when you don't use a line end) plus braces, the amount of extra typing and visual clutter would be minimal, while these problems would go away, making python a more useful language.
Yes, this is very true, the overhead of process spawning is vast. Especially when you factor in the garbage collection if you are passing large objects to your spawned processes. You need to be looking at processes that last minutes to make multiprocessing beneficial. TBH what I'm working on (engineering analysis) does fit this case, but it's still surprising just how big the process needs to be before you see any benefit.
"Our opponent is an alien starship packed with atomic bombs," I said. "we have a protractor"
...do you really feel you have a handle on all of it? It seems to an outsider to be as gargantuan a task as Linus has with the kernel.
Hello,
The Git distributed version control system is today the most widely used for OSS projects. This was not the case back in the day when the Python project selected Mercurial to store its source code. After all, at that time the number of users of a specific SCM was not an important parameter for the decision, since the whole new generation of SCMs was relatively new. Now, several years later, the Git audience is several orders of magnitude bigger than the Mercurial audience. It has also proved to be appropriate for a lot of projects the size of Python. When will the Python source code migrate to Git?
I have often wondered if it is possible to create a program that would automatically generate bindings to C/C++ libraries. The binding issue is a huge problem for all languages. It seems doable to me, but I fear I am missing something because it has not been done. There is SWIG, but it needs an IDL-like description file. I frequently read comments like, "oh, it is so easy to call a 'C' function." However, when you look at something like the Windows API with over a million functions and a massive number of struct types, and millions of constants, writing an IDL file becomes prohibitive. I know of no language with a very thorough binding to the Windows API. As of right now PySide needs a binding to Qt5. The first problem is just getting any kind of binding, but one would prefer a Pythonic binding. For example, in 'C' an array is often followed by a size parameter. A good binding would understand how to take a list and make the underlying call with the array and the size parameter. So this involves an IDL file again. However, I envision a tool that could highlight possible ambiguities like this and provide some sort of web interface to crowdsource the answers to the ambiguities. I guess my question is, is this possible? Or what makes this so hard that it has not been done? I can see C++ definitely being harder, but dealing with a 'C' header file doesn't seem so complicated.
The Python functions map, filter and reduce provided me a gentle introduction to the ideas of functional programming. I find the arguments for functional programming very compelling. I frequently use Python in a functional style. However, the reality is, Python is not really the best vehicle for this style of programming, and it appears that Python is going to remain at heart an imperative language. I really like the Python standard library though, and this is one of the main things keeping me from switching to a functional language. What are your thoughts on porting a functional language to the Python virtual machine? Is there too much of an impedance mismatch to interface with the libraries? Historically the libraries often took lists (dynamic arrays), whereas functional programming usually works with linked lists. On the other hand, Python is moving towards iterators for everything, which would seem to solve this mismatch.
Do you miss the Netherlands and / or the CWI?
Can you put it on a website that doesn't look like shit?
Like this perhaps?
http://preview.python.org/
I have been using python for over 10 years, but recently find that python has not progressed much beyond a 'glue' language. One of my biggest issues is that python currently cannot be statically typed, even though a lot of progress has been made in this direction by other languages.
90% of my errors could have been found by a F#-like type checker while typing.
Unit tests are no replacement for static type checking, because you have to run the code, which might take a very long time, interact with the environment (break something if the program controls hardware), and generally only test a very, very small space of possibilities.
Further the intellisense-like editor support is severely lacking, because so little information is available.
Do you think we will see optional static typing for python in the future?
Any chance to get custom operators?
Would be extremely useful for mathematics, Matlab for example provides .* ./ .\ .^ for elementwise operations. The python code is very terse by comparison.
It would be great if new operators could be defined just like functions.
I want to learn one of the best OOP languages. Is python a perfect match for that? If not, which one do you recommend?
Though I've since grown accustomed to it, when PEP 308 was first resolved the grammar of the conditional expression ("X if C else Y"), placing the conditional as it does in between the choices, struck me (and not me alone) as particularly odd. Does this reflect your mischievous sense of humour, or am I missing something because I'm not Dutch?
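For readers who have not met the construct, a brief sketch of the conditional expression from PEP 308. The function names are hypothetical. The condition in the middle is evaluated first, and only the selected branch is evaluated afterwards.

```python
def sign_label(n):
    # Reads as: "non-negative" if the condition holds, otherwise "negative".
    return "non-negative" if n >= 0 else "negative"


def safe_ratio(a, b):
    # The division is only evaluated when b is non-zero.
    return a / b if b != 0 else None


print(sign_label(3), sign_label(-1))
print(safe_ratio(1, 4), safe_ratio(1, 0))
```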
Better to be despised for too anxious apprehensions, than ruined by too confident a security. --Edmund Burke
GIL stands for Global Interpreter Lock, see: https://en.wikipedia.org/wiki/Global_Interpreter_Lock
Nae king! Nae laird! Nae yurrupiean pressedent! We willna be fooled again!
Would you settle for PHP "type hints" (see http://php.net/manual/en/language.oop5.typehinting.php ) ? They add legible safeguards to function calls without introducing needless clutter to the language.
If you want to avoid the overhead of spawning new processes you might want to look into IPython Parallel:
http://ipython.org/ipython-doc/dev/parallel/
If you use that you can keep your "engines" (= processes) running to avoid the overhead of spawning processes. But the inter-process communication will still be slow (I believe they also use pickling) unless you use MPI for communication (which limits the datatypes that you can transfer and adds some extra programming overhead).
When I started with Python, over a decade ago, it was relatively simple. However, as new features have been added, it has become increasingly complex, then complicated. For example, it now includes gems such as "weak references", and the book "Programming Python" now uses more than 1600 pages to explain it. Meanwhile, C, the language Python is implemented in, is nearly unchanged. When, if ever, will Python be complete?
Python sucks a lot.
Are there any plans to replace the channel between you and Tim Peters with a chunnel?
It makes python ineffective as a shell-replacement language, because you can't easily do a complex one-liner and then retrieve it from history
Obviously, it's not intended as a shell replacement language. We already have Perl and, well, the actual shells for that.
Something like this, completely independent of the core language. One should be able to strip all type annotations and run the program dynamically.
What I don't like about the type hints is that they don't compose and don't allow for basic types and complex types.
You can't represent something like a function that takes two integers and a float and returns a string, which could be written easily like this (int -> int -> float) -> str.
https://github.com/kennknowles/python-rightarrow
Syntactically, all of this can already be done with decorators and function calls, as well as function annotations, which I don't like, because they don't compose either.
@type_hint(...) ...
def f(a, b, c):
    # type_hint is identity in first arg
    x = type_hint(a, int)
    fun = pickle.load(...)
    type_hint(fun, '(int -> int -> float) -> str.')
For expressions that can't be typed at compile time (like the pickled function), python can now create unit tests and runtime checks.
It is important to have type annotations that can be composed, as a variable can be a member of more than one type. An example would be a value that is of type float, and of type physical unit kg.
Being able to check for physical units would be a huge selling point in the scientific community.
Do you see adding some type of decimal math option to properly support business applications? I have, and know of, several business applications that require decimal math, or at least a decimal floating point type. Currently all these applications are running under one or more emulators, with some of the code going back to the 70's.
No, binary floating point WILL NOT WORK. I've looked into it. Double-precision binary fails to maintain enough accuracy within the first few calculations, and even quad-precision binary fails fairly quickly. When there are millions of calculations and data items involved, binary just does not work. Long integers do not work when you may have calculations where unit pricing is required to be accurate to less than 1/10,000 of a cent for quantities of over 1 million units.
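For context, the standard library already includes a `decimal` module providing decimal floating point with configurable precision. Whether it meets the requirements described above is for the poster to judge; the sketch below only shows the basic behaviour, with hypothetical price and quantity values.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so small errors appear.
print(0.1 + 0.2 == 0.3)

# Decimal values built from strings are exact decimal quantities.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# A unit price far smaller than a cent, multiplied by a large quantity.
unit_price = Decimal("0.0000123")
quantity = 1500000
total = unit_price * quantity
print(total)
```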
Have you considered including PyPy alongside CPython in the standard distribution of Python 2.7 and 3.x (if PyPy were to be pushed forward to 3.x)?
Mainly to gain exposure, since many think PyPy is the future of Python. But shipped as officially unsupported until the day the switchover comes.
http://www.accountkiller.com/en/delete-slashdot-account Stop visiting Slashdot.
Has taking Python, the language semantics, and placing it on top of another language (as Jython/IronPython do) been considered as a way forward for a future Python version?
That would leave the CPython version intact and developed by those who wish to do so, and move those who are interested onto the 'Jython' or 'PyPy' version (Python4000)?
Numeric/Numpy is one of the most useful libraries; it is by far superior to Python's native arrays. Why not incorporate it into the language?
Interesting tip. I'll check that out, thanks!
Re: your talk on the proposed standard async IO and tulip. I still don't understand why callbacks are bad, and how coroutines are better? They seem quite obscure to me (although I have to admit I have very modest experience with Python), and difficult to reason about. More specific example: my current project is I/O heavy, I use a dedicated IO thread with libcurl/pycurl multi interface, simple event loop & timers - i.e., pretty much all logic is event-driven / implemented in callbacks. Why would I want to use tulip (or anything coroutine-based) instead?
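As a rough illustration of the coroutine style being asked about, here is a sketch using `asyncio`, the standard library module that grew out of tulip, written with the later `async`/`await` syntax (this assumes Python 3.7 or newer for `asyncio.run`). The `fetch` coroutine is a hypothetical stand-in for a network request. The point of comparison is that sequential logic reads top to bottom rather than being split across callbacks.

```python
import asyncio


async def fetch(name, delay):
    # asyncio.sleep stands in for waiting on a network response.
    await asyncio.sleep(delay)
    return name + " done"


async def main():
    # Sequential steps read like ordinary code, with no callback chain.
    first = await fetch("a", 0.01)
    second = await fetch("b", 0.01)

    # Independent requests can still run concurrently on the event loop.
    both = await asyncio.gather(fetch("c", 0.01), fetch("d", 0.01))
    return [first, second] + list(both)


results = asyncio.run(main())
print(results)
```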
Three questions:
1.) What is your name?
2.) What is your quest?
3.) What is your favorite color?
And three Python questions:
1.) When are you going to respond to all these questions?!?!
2.) What is your favorite editor?
3.) this has been asked prolly, Python in the browser, do you think it will ever happen?
*** optional
4.) What is your favorite sci-fi movie?
Thanks!