Slashdot Mirror


Interview With Python Creator Guido Van Rossum (techrocket.com)

The online programming school Tech Rocket just published a new interview with Guido van Rossum, the creator of Python. "Looking back I don't think I ever really doubted Python, and I always had fun," he tells the site. "I had a lot of doubts about myself, but Python's ever-increasing success, and encouragement from people to whom I looked up (even Larry Wall!), made me forget that."

He describes what it's like being Python's Benevolent Dictator for Life, and says that the most astonishing thing he's seen built with Python is "probably the Dropbox server. Two million lines of code and counting, and it serves hundreds of millions of users." And he leaves aspiring programmers with this advice. "Don't do something you don't enjoy just because it looks lucrative -- that's where the competition will be fiercest, and because you don't enjoy it, you'll lose out to others who are more motivated."

19 of 222 comments

  1. Re: Why in the heck should a file server need 2M l by Anonymous Coward · · Score: 5, Insightful

    Because he understands the problem Dropbox is trying to solve better than you do.

  2. Re:Interpreted languages should cease by Anonymous Coward · · Score: 3, Funny

    You had me up till you said Java.

  3. Re: Only One Question by __aaclcg7560 · · Score: 4, Informative

    Works as long as what you make in that language won't get too big.

    Two million lines of code for DropBox is pretty impressive for a script kiddie language.

  4. Re: Only One Question by Anonymous Coward · · Score: 5, Interesting

    Works as long as what you make in that language won't get too big.

    I think that's true of all languages.

    My impression (having written O(10k) line Python programs for fun) is that it's quite capable of implementing the kind of O(1m) line COBOL, PL/1 & C systems I worked on in the 80s and 90s - probably in a tenth of the size too. It's also infinitely better for "Rapid Application Development" (is RAD still a thing?) than the 4GLs and BASIC variants we used back then.

    Python is one of the few bright spots in the evolution of programming languages over the past 30 years. Too bad van Rossum fucked up the transition to version 3.

  5. Re:Interpreted languages should cease by __aaclcg7560 · · Score: 4, Informative

    Learn how to use a compiled language like Java.

    If you have a need for speed, compile Python code to C binary with Cython.

    http://cython.org/

  6. Re: Why in the heck should a file server need 2M by slazzy · · Score: 3, Informative

    "seen built" didn't say he built it himself.

    --
    Website Just Down For Me? Find out
  7. Re:Interpreted languages should cease by __aaclcg7560 · · Score: 3, Interesting

    Like the java virtual machine, for which you compile to BYTE CODE!!!1!

    Serpent, maybe?

    Serpent is a real-time scripting language inspired by Python but completely reimplemented to support real-time garbage collection and multiple instances of the virtual machines running on independent threads.

    https://sourceforge.net/projects/serpent/

  8. Re:Why in the heck should a file server need 2M li by 93+Escort+Wagon · · Score: 4, Insightful

    He meant to say 2 million functions in one line of code.

    We're talking about python, not perl.

    --
    #DeleteChrome
  9. Re: Only One Question by speedplane · · Score: 3, Interesting

    If you're still running Python 2 code, you should look into the mirror to see where the problem lies.

    The problem lies far beyond the mirror. It lies in the dozens of libraries that I rely on that have not upgraded to Python 3 (despite pleas to the original authors). It lies in the hundreds of hours of porting, testing, and double checking that I will need to do to move my code over to Python 3. The improvements in Python 3 are real, but they are not worth the enormous burden imposed on everyone to get their code to be Python 3 compliant.

    --
    Fast Federal Court and I.T.C. updates
  10. He Didn't Ask the Big Question by speedplane · · Score: 5, Interesting

    I know this article was focused on Guido's softer side, but I would have liked them to mention the elephant in the room: the move from Python 2 to 3. This has been a huge resource drain on the entire community and many (including me) remain unconvinced that it was the right decision. It would have been nice if the topic had been broached.

    --
    Fast Federal Court and I.T.C. updates
  11. Re: Only One Question by __aaclcg7560 · · Score: 5, Informative

    Here's a list. Only 30 out of the top 360 packages aren't ready for Python 3.

    http://py3readiness.org/

  12. Re:Python community so much nicer than Rust's? by im_thatoneguy · · Score: 5, Insightful

    What I'm most curious about is why the Python community is so much nicer to deal with than the Rust programming language's community.

    In my limited experience in the VFX world it's because people using Python are focused on actually creating usable products that solve people's problems. And I use the word "people" not "developers" in this instance because a lot of them are "non-programmers" solving problems that they themselves face. Python is a tool to them to make their lives better.

  13. Funnily by Greyfox · · Score: 3, Insightful

    I was just talking to an old co-worker from a C++ company I worked at a few years back. He asked "So what are you doing lately" and I told him I'm working on my thesis, which is titled "Ruby's a Terrible Programming Language, And You're A Terrible Programmer For Liking It". Then I cited a number of my complaints -- being able to add arbitrary functions to a live object, never knowing where to look for the interface definition of parameter objects, the need to extensively test all execution paths of production code (which no one ever does), odd syntactic quirks and changes in syntax between language versions. He laughed and said he had exactly the same complaints about Python. You see, Object Oriented Programming was invented to reduce maintenance costs for completed projects, because that's where 90% of your expenses with the project will be. Ruby, at least, and apparently Python as well according to my friend's complaints, were invented to make the cheap part of the development process "easier", while at the same time letting the language fanboys pat themselves on the back about what clever programmers they are. This is exactly the opposite of software "engineering".

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

  14. Comment removed by account_deleted · · Score: 4, Insightful

    Comment removed based on user account deletion

  15. Re: Why in the heck should a file server need 2M l by silentcoder · · Score: 3, Informative

    I can easily see half that change coming just from gaining access to Jinja2. I've worked extensively with a huge array of templating languages in recent years, and while they all have basically the same syntax none of them are anywhere near the sheer unadulterated power that Jinja2 offers.
    It took me a while to figure out just what it was that Jinja2 had that they all lacked and which made such a massive difference. I finally nailed it down. All of them have, right at the beginning of the manual, guides on how to write helper functions to extend the language. Jinja2 barely mentions that. The reason is simple: it hardly ever needs them. This is because Jinja2 exposes any data you pass to it as the original python object - with all that object's methods available inside the template.
    That's extremely useful. The problem with the "helper methods" approach is that you're screwed whenever the thing running the template is not code you have access to (or can afford to alter). In a corporate environment you may be using a tool like consul-template which you really don't want to maintain a local derivative of, so you can't use any functions in your templates that their derivative of golang-template did not already include. With Jinja2 even with somebody else's code you still have access to every method the data type object exposes.
    So if you have to get the overlap of two lists in your template from two disparate upstream data sources you don't have to hope the template language includes the right list operators, you just use the built-in methods of the list objects directly.
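
    A minimal sketch of that idea, assuming the Jinja2 package is installed and that the two upstream sources arrive as Python sets (the host names are made up for illustration). The template calls the set's own intersection method directly; no helper function is registered:

```python
from jinja2 import Template

# Hypothetical host names standing in for two upstream data sources.
inventory_hosts = {"web1", "db1", "cache1"}
monitored_hosts = {"db1", "cache1", "queue1"}

# set.intersection is an ordinary Python method, called from inside the
# template; Jinja2's built-in sort filter only orders the output.
template = Template(
    "{% for host in inventory.intersection(monitored) | sort %}"
    "{{ host }}\n"
    "{% endfor %}"
)

overlap_report = template.render(
    inventory=inventory_hosts, monitored=monitored_hosts
)
print(overlap_report)
```

    The same call would not work on plain lists, which have no intersection method, so whoever feeds the template would need to pass sets (or some other object that offers the operation).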

    Have you seen what a list-overlap seeker written in pure golang-template looks like? It's a hellish maze of deeply nested loops and if statements, with the output enmeshed inside all that gunk, because golang-template lacks a sufficient set method to create new variables that would let you pre-construct your new list before looping over it.

    I had to do just that recently, the golang template was well over 500 lines of unreadable and barely navigable junk - I did the same thing in less than 5 lines of Jinja2.

    So I can easily imagine that for a web app that's highly template driven you could end up with a huge amount of code that exists within the templates (or as helper-functions to the template library) which you can throw away if you redo it in python.

    --
    Unicode killed the ASCII-art *
  16. Re: Why in the heck should a file server need 2M l by silentcoder · · Score: 5, Interesting

    I suspect a lot of that code lies in redundancy and load-sharing systems. Python is notoriously bad at multithreading (so much so that most python coders prefer a library called multiprocessing that fakes multithreading by spawning entirely new python processes) - so load-balancing, load-sharing and redundancy under heavy use are problems python is particularly bad at (not the language so much as the architecture of the implementation to be fair).
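
    For readers who haven't used it, here is a rough sketch of the process-based approach with the standard library's multiprocessing module. The prime-counting task and the function names are invented for illustration; this is not Dropbox's code:

```python
import multiprocessing


def count_primes(limit):
    """CPU-bound toy task: count the primes below `limit` by trial division."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def count_primes_in_parallel(limits, workers=2):
    """Run count_primes for each limit in separate worker processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # Each worker is a separate interpreter process, so CPython's global
    # interpreter lock in one process does not serialize the others.
    print(count_primes_in_parallel([10_000, 20_000, 30_000]))
```

    The price is that arguments and results have to be pickled and copied between processes, which is part of why this is heavier than real threads.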

    So it's likely a great deal of the code is dedicated to solutions to those difficult problems. It also rather depends on how you count it. Other things that could contribute large amounts of code:
    - They likely use a custom application server (in order to implement all that redundancy and load-sharing)
    - There's likely a significant amount of debug logs in there, and extensive logging throughout (you need that if you're going to keep something like this maintained and find/fix problems quickly).

    Finally - the comparison is not actually very fair. The Kernel is written in C - a language designed for brevity, while python is much more verbose. Python is not a language that encourages lots of one-liners, except where they can be used to avoid deep nesting (which is actively discouraged) or in purely functional calls (which are available but only encouraged for specific use-case determined algorithms where the non-functional version would actually be harder to read).

    So the exact same algorithm is likely to use more lines in python than in C - and be a lot easier to read. I was not surprised to learn from this interview that a lot of his own early work was in Pascal and Algol - you can see a lot of the Wirth philosophy in Python. Python in a very real sense struck a perfect balance between readable verbosity and cruft. Java is too far to the other side, you need about 100 lines just to show a simple "exit on click" button on the screen and almost none of them have anything to do with the task at hand. The same thing in TCL/TK will take one line, in python it's about 3 lines (depending what GUI library you use).
    I recently developed a fairly comprehensive GUI library in python. I was building an RPG in Pygame (need to pick that up again sometime) and pygame has no gui elements and none of the libraries are maintained, so I wrote my own. Python made this ridiculously easy for the most part. Hell at one point I actually wrote a recursive object (that is to say - a class in which one of the methods would instantiate another instance of the same class it's a method of). The class was a box containing a list of items you can select (used for things like the inventory) but when you need to add something to a box like that (say in the game editor - adding treasures into a chest) it instantiates another instance of itself, with different parameters to show you the list of possible things you can add.
    That took no clever trickery whatsoever, the method just instantiated the class and operated on the object, python handled all the recursion magic entirely transparently.
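
    Roughly what that pattern looks like, stripped of all the Pygame drawing code. The class and method names here are hypothetical, not the ones from the actual library:

```python
class SelectionBox:
    """A box listing items the player can select (hypothetical names)."""

    def __init__(self, title, items, catalog=None):
        self.title = title
        self.items = list(items)
        # Everything that could be added to this box, e.g. all known treasures.
        self.catalog = list(catalog or [])

    def open_add_dialog(self):
        """Return a second SelectionBox listing what can still be added."""
        candidates = [item for item in self.catalog if item not in self.items]
        return SelectionBox("Add to " + self.title, candidates)

    def add(self, item):
        self.items.append(item)


chest = SelectionBox("Chest", ["gold"], catalog=["gold", "potion", "sword"])
dialog = chest.open_add_dialog()  # a new instance of the same class
chest.add(dialog.items[0])        # pick the first offered item
print(dialog.title, dialog.items, chest.items)
```

    The method can refer to SelectionBox by name because the name is only looked up when the method runs, by which time the class exists.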

    That gave me (as a long time python developer) an entirely new respect for just how powerful the language really is. I also strongly suspect that in most languages that activity would have been far more complex, many languages don't even allow recursive classes after all.

    --
    Unicode killed the ASCII-art *
  17. "syntax error: go to hell and hunt for it" by Pseudonymous+Powers · · Score: 5, Funny

    Interview With Python Creator Guido Van Rossum

    Well, I tried to read it, since I'm a huge fan of Python. But one of the paragraphs was indented in a slightly different way than all the others, so I couldn't.

  18. Re: Why in the heck should a file server need 2M l by dbrueck · · Score: 3, Insightful

    The Kernel is written in C - a language designed for brevity, while python is much more verbose. ... So the exact same algorithm is likely to use more lines in python than in C

    I dunno, YMMV, but to me the opposite seems to pretty much always be the case - for any non-trivial chunk of code, the Python version tends to require far fewer lines than the C equivalent. At several different companies we've ported various C modules to Python and it's common for the Python version to have only 20% (or fewer) LOC vs the C original. The reason is just the usual stuff: Python, being a much higher level language, introduces a lot of overhead but in exchange you get powerful built-in data types and have to do basically zero manual memory management.

    This tends to show up in not just large libraries or apps, but even in small chunks of code. For example, below is a function from one of our network monitoring agents; as background, basically there are a bunch of different server clusters and a job monitor spits out an hourly file that lists on each line the IP address of each server and the number of errors it encountered (this is part of some legacy thing we're hoping to replace, it's kind of goofy). Anyway, those files get aggregated to a central monitor, that in turn looks for various conditions and alarms if e.g. the error rates are too high.

    Anyway, here's a function that reads those files and tells you which servers are seeing the most errors (it returns a list of server IP address and number of errors encountered, in descending order of number of errors):

    def ServerErrorRates(reportFiles):
        counts = {} # ip addr --> total errors
        for filename in reportFiles:
            for line in open(filename):
                ip, errors = line.strip().split()
                counts[ip] = counts.get(ip, 0) + int(errors)
        return sorted(counts.items(), key=lambda x:x[1], reverse=True)

    Nothing fancy, but doing that in straight C is likely going to take far more than 7 code statements. There's just no way to perform the same work in C in anything close to 7 statements. It's not a knock on C, the two tools are just optimized for different things.
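
    For what it's worth, a variant of the same function using collections.Counter from the standard library, plus a small usage example. The file names and sample data are made up, and this is only a sketch of one alternative, not what the monitoring agent actually runs:

```python
import os
import tempfile
from collections import Counter


def server_error_rates(report_files):
    """Return (ip, total_errors) pairs, highest error count first."""
    counts = Counter()
    for filename in report_files:
        with open(filename) as report:  # closes the file when done
            for line in report:
                if not line.strip():
                    continue  # skip blank lines
                ip, errors = line.split()
                counts[ip] += int(errors)
    return counts.most_common()


# Made-up sample data: two hourly reports in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, body in [("h1.txt", "10.0.0.1 3\n10.0.0.2 7\n"),
                       ("h2.txt", "10.0.0.1 1\n10.0.0.3 2\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(body)
        paths.append(path)
    rates = server_error_rates(paths)

print(rates)
```

    Counter.most_common() does the descending sort, and the with block makes sure each report file is closed even if a line fails to parse.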

    As another example, I just took a peek at our HTTP server library. The whole thing is a single file of less than 800 LOC, and that handles all HTTP request/response handling including header reading/writing, file uploads, cookies, websocket support, request routing, etc., etc. without using any of the HTTP stuff from the Python standard library. The C equivalent would certainly be several times as many lines of code.

    I think you could maybe argue that individual statements in Python are more verbose than C, but it's common for each Python statement to be the equivalent of several C statements (and/or many statements in C are simply not required in Python), so on the whole Python programs end up being way more concise. Occasionally I'll see an exception, but it's just that - an exception.

  19. Python, 2to3 and retesting by fyngyrz · · Score: 3, Informative

    There's probably working, and there's actually working. When you have a complex system that a business actually depends on, running 2to3 absolutely requires that everything be re-tested. The larger the system is -- and that can mean interactions with libraries, databases, other languages, etc. -- the larger that testing job is, and it gets larger in a nonlinear fashion with the amount of code as interactions multiply.

    However, Python 2.x isn't going to go anywhere. If you have a substantial system or systems written in it, and it's doing what it's supposed to be doing, there's actually no reason to move it over. If you want to, you can write new stuff in 3.x; no reason they both can't exist on the same machine(s), either. Either one can call the other. Nothing to it.
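
    One way "either one can call the other" might look in practice is a subprocess exchanging JSON over stdin/stdout. This is an assumption about the mechanism, and OTHER_PYTHON is a placeholder: in a real setup it would be the path to the other interpreter, while here it points at the running one so the sketch works anywhere:

```python
import json
import subprocess
import sys

# Assumption: OTHER_PYTHON would be the path of the other interpreter
# (e.g. a Python 2 binary). sys.executable is used here only so the
# sketch runs anywhere.
OTHER_PYTHON = sys.executable

# A tiny script for the other interpreter; it reads JSON from stdin and
# writes JSON to stdout, using syntax valid in both 2.x and 3.x.
CHILD_SCRIPT = (
    "import json, sys\n"
    "numbers = json.load(sys.stdin)\n"
    "sys.stdout.write(json.dumps({'total': sum(numbers)}))\n"
)


def call_other_python(numbers):
    """Send `numbers` to the other interpreter and return its JSON reply."""
    process = subprocess.Popen(
        [OTHER_PYTHON, "-c", CHILD_SCRIPT],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    reply, _ = process.communicate(json.dumps(numbers).encode("utf-8"))
    return json.loads(reply.decode("utf-8"))


result = call_other_python([1, 2, 3])
print(result)
```

    Passing plain JSON text between the two sides avoids sharing pickled objects, which do not always load cleanly across 2.x and 3.x.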

    Personally, so far at least, I have no specific need for 3.x, and so haven't bothered with it for anything serious, but I'm not against using it if some reason arises that makes it more useful than 2.x. I can't say I really object to 2.x becoming stable because development is going towards 3.x, either. Again, it reduces the need to re-test, and it keeps the unit testing mechanisms stable as well.

    --
    I've fallen off your lawn, and I can't get up.