Slashdot Mirror


Sophistication in Web Applications?

whit537 asks: "Anyone who uses Gmail for 5 minutes can see that it's a pretty dern sophisticated web application. But just how sophisticated? Well, first of all, the UI is composed of no less than nine iframes (try turning off the page styles in Firefox with View > Page Style). But then consider that these iframes are generated and controlled by a 1149 line javascript. This script includes no less than 1001 functions, and 998 of these functions have one- or two-letter names! They're obviously not maintaining this script by hand in that form. So do they write human-readable javascripts and then run them all together, or have they developed some sort of web app UI toolkit in Python? Does Gmail need to be this complex or is the obfuscation a deliberate attempt to prevent reverse-engineering? And are there any other web apps that are this sophisticated?"

24 of 197 comments

  1. not simply obfuscation by Glog · · Score: 5, Insightful

    I don't believe that the naming of the functions and variables is simply an effort to obfuscate the code. There is that, of course, but the main reason is probably to save money on bandwidth. When you have millions of people hitting your servers you can scrape quite a few bucks by removing white space and reducing the size of your files the way they have.

    1. Re:not simply obfuscation by eggstasy · · Score: 3, Informative

      Google serves ads on every page. Assuming they are paid a fixed fee per page, then minimizing their per-page costs is the only way they can increase their revenue. Offering more free services draws in more people, who are served more ads. If they optimize those pages as well, they will earn more profit there.
      BTW... ever noticed how google uses text ads? Do you think the only reason they do that is because it's less intrusive? Wrong again - it also saves a lot of bandwidth compared to an image ad :)
      When you serve billions and billions of pages, shaving off a single byte on each page saves you GIGABYTES of traffic.
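
To put rough numbers on the "single byte" claim, here is a back-of-the-envelope sketch. The page-view count is a made-up assumption for illustration, not a Google figure.

```python
# Hypothetical traffic figure, chosen only for illustration.
PAGE_VIEWS = 2_000_000_000   # "billions and billions" of pages
BYTES_SAVED_PER_PAGE = 1     # shaving a single byte off each page


def bytes_saved(page_views, saved_per_page):
    """Total number of bytes that no longer have to be sent."""
    return page_views * saved_per_page


total = bytes_saved(PAGE_VIEWS, BYTES_SAVED_PER_PAGE)
print(f"{total / 1e9:.1f} GB saved")
```

One byte per page only adds up to gigabytes once the page count is itself in the billions; at smaller scales the saving is proportionally smaller.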

  2. Re:Huh? by Cecil · · Score: 3, Informative

    I think he may have derived "in Python" from the fact that Google has been hiring many Python programmers in the past couple years.

    However, it was completely uncalled for speculation that had no place in a Slashdot article. ... just like the rest of this article.

    I'm with you, "huh?"

  3. Gmail not that impressive by Anonymous Coward · · Score: 2, Insightful

    1200 lines of Javascript might seem like an enormous amount to a dreamweaver monkey, but it's hardly any code for anyone who does real programming*. Check out the scripts for Outlook Web Access for example. Or any other intranet/portal type application.

    (*Well, many 'real programmers' are loath to do rich client stuff in JS, preferring their server side frameworks instead. But once you get the hang of it, it's pretty nice.)

    1. Re:Gmail not that impressive by PurpleFloyd · · Score: 2, Insightful
      I don't think it's accurate to say that any arbitrary amount of code defines the line between code grinders and real programmers. After all, there's a world of difference between 1200 verbose, well-commented lines and 1200 highly optimized lines full of obscure tricks to squeeze every bit of performance possible out of the hardware. The first could be cranked out in a few hours by almost anyone with some programming knowledge, and the second might take weeks or even months to get right.

      Looking at the Google code, we can see that while it appears to be machine generated, it definitely tends towards the latter; Google has obviously tuned the code to save bytes and run as quickly as possible. Bandwidth and processor power aren't so important in a corporate environment where everyone has a LAN connection to the server and decent machines to work on, but when you have to deal with customers on dialup links and using old machines, every bit and every instruction counts. In that scenario, 1200 lines can be an enormous amount of code.

      --

      That's it. I'm no longer part of Team Sanity.
  4. MSN's Web Messenger Is Impressive by bdash · · Score: 2, Interesting

    MSN's http://webmessenger.msn.com/ is a web-based MSN client implemented using a combination of HTML and Javascript. The source for the javascript is available here. I was looking into how it worked the other day and tidied the source into a more readable form. At least MSN had the decency to leave human-readable function names... this fact alone makes the code reasonably understandable.

  5. Re:If they are smart, and they are, by T-Ranger · · Score: 3, Insightful

    "pretty clever little language" is the understatement of the year. The entire UI of the browser that I am using now is done in javascript.

  6. Re:If they are smart, and they are, by I_Love_Pocky! · · Score: 3, Informative

    Both Perl and Javascript can be maintainable if the programmer designs their code with that goal in mind. Besides, Gmail has slightly more than 1000 lines of code. That really isn't a maintenance nightmare.

    Java is an object oriented language, but I could certainly write Java code that would be a major headache to maintain if I chose to do so. I think most maintenance problems come from poor coding habits, and not the language itself.

  7. Re:Huh? by sethadam1 · · Score: 2, Insightful

    Actually, if you browse around Google and Gmail, you'll find tons of links like this one - the file has a .py extension.

    Google writes A LOT in Javascript. It would not surprise me, although I have no evidence of this, if they wrote the code in their choice editor and then ran a python app that condensed the code to remove space, renamed the functions, and replaced all function references. At 1000+ functions, if the function names had just 5 letters each (not much if you're not being terse), that would be an extra 3000 characters (3k) PER PAGE LOAD. Multiply that times thousands (tens of thousands after general release?), and you'll see A LOT of extra bandwidth.
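
The kind of Python "condenser" speculated about above could look roughly like the sketch below. Nothing here reflects Google's actual tooling; `short_names`, `minify` and `SAMPLE` are hypothetical names, and the approach is deliberately naive: it works on regular expressions rather than a real JavaScript parser, so it would also rewrite matching words inside strings, mangle `//` inside URLs, and could hand out a short name that collides with a keyword or an existing identifier.

```python
import itertools
import re
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... as replacement identifiers."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def minify(source):
    """Naive sketch: rename declared functions, drop // comments and padding."""
    # Collect names declared as `function name(`, in order of appearance.
    declared = re.findall(r"\bfunction\s+([A-Za-z_$][\w$]*)\s*\(", source)
    names = short_names()
    mapping = {}
    for name in declared:
        if name not in mapping:
            mapping[name] = next(names)

    # Rewrite every identifier in a single pass, so a freshly assigned
    # short name is never renamed a second time.
    source = re.sub(
        r"[A-Za-z_$][\w$]*",
        lambda match: mapping.get(match.group(0), match.group(0)),
        source,
    )

    # Strip line comments and surrounding whitespace, then drop empty lines.
    lines = (re.sub(r"//.*", "", line).strip() for line in source.splitlines())
    return "\n".join(line for line in lines if line)


SAMPLE = """
// Compute the total price.
function computeTotal(price, count) {
    return price * count;
}

function showTotal() {
    alert(computeTotal(3, 4));
}
"""

small = minify(SAMPLE)
print(small)
print(len(SAMPLE), "->", len(small), "characters")
```

Renaming pays off at the definition and again at every call site, which is why the saving grows with the number of references and not just the number of functions.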

  8. why Python if you have JavaScript? by Anonymous Coward · · Score: 2, Interesting

    developed some sort of web app UI toolkit in Python?

    This is why I call Python "Java of the open source world".. so many people think all programming begins and ends in Python.

    JavaScript is *already* a sophisticated, object-oriented language. In fact the design of the language is somewhat cleaner than Python. Why do you think they would write it again in Python somehow?

  9. Re:Just quick and easy by ComputerSlicer23 · · Score: 3, Interesting
    Ironically, one of the best tips from one of the best programmers I've ever known is to use the form "iii" for all incremented variables. Why? Because if you use English-language descriptions, "iii" should never occur when searching except in the case of your variable. However, if you use i, j, k (the standard loop variables), you constantly have to search and search again, because you find those letters embedded in function names or keywords.

    Kirby
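
A small sketch of why the tip works with a plain substring search. The loop text and the helper name `plain_hits` are made up for illustration.

```python
# The same loop written twice, once with `i` and once with `iii`.
LOOP_I = "for (i = 0; i < items.length; i++) {\n    print(items[i]);\n}"
LOOP_III = "for (iii = 0; iii < items.length; iii++) {\n    print(items[iii]);\n}"

REAL_USES = 4  # the loop variable appears four times in either version


def plain_hits(text, needle):
    """Count non-overlapping substring matches, like a basic editor search."""
    return text.count(needle)


hits_i = plain_hits(LOOP_I, "i")
hits_iii = plain_hits(LOOP_III, "iii")

print("searching for i:  ", hits_i, "hits,", hits_i - REAL_USES, "false positives")
print("searching for iii:", hits_iii, "hits,", hits_iii - REAL_USES, "false positives")
```

The extra hits for `i` come from the letter appearing inside `items` and `print`; `iii` does not occur inside any other word in this snippet.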

  10. Re:Just quick and easy by Karora · · Score: 3, Funny


    For the number of times "ii" occurs in english you could save yourself a character, right there.

    Now, I'm off for a bit of skiing...

    --

    ...heellpppp! I've been captured by little green penguins!
  11. It's all a matter of perception by sakusha · · Score: 2, Interesting

    People can be fooled into thinking things are sophisticated apps when they're really not. I'm reminded of a famous anecdote from Danny Hillis. He was trying to sell his Connection Machine with WAIS software to a CEO for enterprise-level data mining. He gave the CEO a demo at his shop, the CM1 did its thing, and the CEO was totally unimpressed, and said, "hell, my IBM PC back at the office can do that!" Hillis couldn't believe a 286 could do something that requires a CM1 supercomputer, so he asked the CEO to take him back to his office and show him.
    So they get back to the CEO's office, and he uses his PC to dial up Dow Jones News Retrieval service and runs a monster WAIS search.. which used a CM1 that Hillis sold to Dow Jones.

  12. Re:Just quick and easy by JabberWokky · · Score: 2, Interesting
    Most editors have a search that takes into account the various word breaks. Most programmers' editors are smart enough to automatically identify the language and know where "words" break in variables (i.e., car->door is one word, while truck>car is two words split by an operator... although I hate spaceless operators myself). Very helpful when you're using word by word selectors to cut and paste.
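
For comparison, a whole-word search can be approximated with a regular expression. This sketch uses the `\b` word boundary from Python's `re` module, which only knows about letters, digits and underscores; it is not language-aware the way a programmer's editor can be. The sample loop and the name `whole_word_hits` are assumptions for illustration.

```python
import re

LOOP = "for (i = 0; i < items.length; i++) {\n    print(items[i]);\n}"


def whole_word_hits(text, name):
    """Count matches of `name` that are not part of a longer identifier."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    return len(pattern.findall(text))


print("substring hits: ", LOOP.count("i"))
print("whole-word hits:", whole_word_hits(LOOP, "i"))
```

With word boundaries the `i` inside `items` and `print` no longer matches, so a single-letter loop variable becomes searchable without renaming it.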

    Aside: I generally use c for my incrementing variables, and foo for my unknown type variables (common in returns in un- or loosely typed languages).

    OnTopic: Looking at the code, Google likely has a more verbose code base and runs it through a stripper that minimizes the code and removes comments and excess whitespace. Since they use Python internally, especially in their GMail site, it would be a likely choice for such an app.

    While the one and two letter variable names are not at all unmaintainable, I'd imagine that comments and most importantly, indentation are maintained in the original.

    --
    Evan

    --
    "$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
  13. Yah, good for Javascript! by QuantumG · · Score: 2, Interesting
    Every single language you've mentioned there is NOT maintainable. Why? Cause they're all interpreted dynamic languages. It's fun and all to write in these languages and get stuff done with them but as soon as you spot a bug you have a hell of a time:
    • reproducing the bug
    • debugging the problem without changing it
    • convincing yourself you've fixed it
    • looking for the same class of bug elsewhere

    Dynamic languages are kick ass, I really really like them, but they're for prototyping, not writing maintainable production systems. Static, type checked languages are currently the best way to write maintainable code. In the future we'll have even more formal methods to play with and we can make even more maintainable code.. that doesn't mean we'll leave dynamic languages behind, it just means it will be even more evident that they're not suitable for production systems.

    --
    How we know is more important than what we know.
    1. Re:Yah, good for Javascript! by Earlybird · · Score: 3, Interesting
      • Every single language you've mentioned there is NOT maintainable. Why? Cause they're all interpreted dynamic languages. It's fun and all to write in these languages and get stuff done with them but as soon as you spot a bug you have a hell of a time ... blah blah ... not suitable for production systems.

      This is a myth, and has been proven false countless times, such as by these guys, or these guys, or even these guys, or, God forbid, you may have heard of these guys.

      First, the term "interpreted dynamic language" is vague and misleading. Interpretation has nothing to do with code maintainability. (You can interpret C, and you can compile putatively interpreted languages such as Java and Python to native code; indeed Java has been natively compiled for years, and the fact that it is just-in-time compilation is irrelevant).

      And what does "dynamic" mean? Do you mean a dynamically, as opposed to statically, typed language? Do you mean runtime introspection? Self-modification and metaprogramming? Runtime name resolution? What? I suspect you mean a combination of these. Python, Perl, Ruby, JavaScript, PHP, Haskell, Lisp and OCaml have these features. C++ can be considered a "dynamic" language, as can Java, C#, etc. So why do you claim that these languages are not maintainable?

      These newfangled languages are more rapid to develop in than lower-level languages. Maintenance is simpler because the languages are simpler, higher-level and more easily maintained. For example, the absence of a separate compile/link cycle means I can get from changing a source line to testing the source line quicker.

      In many cases, reproducing or debugging a bug is simpler in, say, Python than in C, because the infrastructure itself is simpler. Pure Python, for example, does not have memory access violation errors; there's no way your Python code can read or write an invalid pointer, write beyond the end of a buffer and so on; a whole class of pointer errors, most of which have security repercussions, are annihilated by this feature. Similarly, Python uses exceptions, so nobody can forget to check and propagate a function's error return value.
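
A minimal sketch of the two points above, using made-up helper names (`write_past_end`, `read_setting`, `load_port`): an out-of-range write in pure Python raises an exception instead of touching neighbouring memory, and an error raised deep in a call chain propagates without anyone having to check a return code.

```python
buffer = [0] * 4


def write_past_end(buf):
    """Try to write one slot beyond the end of the list."""
    try:
        buf[len(buf)] = 99
    except IndexError as exc:
        return f"caught: {exc}"
    return "no error"


def read_setting(settings, key):
    # A missing key raises KeyError; there is no error code to forget.
    return settings[key]


def load_port(settings):
    # No explicit check here, yet a failure in read_setting still
    # reaches whoever called load_port.
    return int(read_setting(settings, "port"))


print(write_past_end(buffer))
print(buffer)  # unchanged: the bad write never happened

try:
    load_port({})
except KeyError as exc:
    print("propagated:", repr(exc))
```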

      More often than not, errors that surface in these languages are high-level problems, which is good, because those are simpler than the ones involving someone forgetting to call free() on an allocated buffer or accounting for overflow when shifting a bit mask.

      The uncertainty involved in the dynamic typing/late binding model of such languages is compensated for through unit testing.

      Oh, and JavaScript, a "dynamic language", is being used by Google in a production system, and Google is known to use Python and Ruby in their systems. I suggest you call them up and tell them their languages aren't suitable.
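
As a rough illustration of unit testing standing in for a compile-time type check, here is a sketch using Python's standard `unittest` module. The function `total_length` and the test names are hypothetical.

```python
import unittest


def total_length(words):
    """Sum of the lengths of every item in an iterable of strings."""
    return sum(len(word) for word in words)


class TotalLengthTest(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(total_length(["ab", "cde"]), 5)

    def test_wrong_type_fails_loudly(self):
        # Nothing stops a caller from passing numbers, so the test pins
        # down that this surfaces as a TypeError, not a wrong answer.
        with self.assertRaises(TypeError):
            total_length([1, 2, 3])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TotalLengthTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all passed:", result.wasSuccessful())
```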

    2. Re:Yah, good for Javascript! by Earlybird · · Score: 2, Interesting
      • So the fact that there are absolutely no static type checking tools for Javascript has no effect on its maintainability?
      Yes. JavaScript is a poor language, but for other reasons than a lack of a static type system.

      • Get a grip sunshine.

      You're obviously trolling. Present us with arguments supporting the proposition that dynamic typing decreases maintainability, and we'll have a discussion. Until then, you're just spouting FUD.

      I have already given ample explanations for my view, but here's a counter-argument to the specific case of typing: Static typing, unless implemented from the bottom up with Ocaml-style type inference (which leads to other interesting problems), adds more metadata to the program, which adds to the amount of text that must be typed, read, digested. Compare:

      int a = 1;
      with
      a = 1
      The intent, in both cases, is absolutely clear. The second syntax is simpler, shorter, and more readable.

      Readability is a huge maintenance consideration. Ease of refactoring is another; if a should be changed to be short, many uses of the variable need to be updated to accommodate this change -- expressions, function calls, function prototypes and so on. This is why nascent IDEs such as Eclipse focus so much on automatic refactoring, because refactoring is a pain to do in statically-typed languages.

    3. Re:Yah, good for Javascript! by QuantumG · · Score: 2, Interesting
      The reason I'm not taking you seriously is because you don't even seem to be aware of the arguments for and against dynamically typed languages. I mean, you haven't even mentioned unit testing yet. People who know what they are talking about understand that you can't do serious programming in a dynamically typed language without a strong unit test framework and then you live and die by that framework. The unit testing consists almost 100% of manual type checking and verification of invariants, pre-conditions and post-conditions. The argument for dynamically typed languages is that the lack of a static type checker forces you to write the unit tests which then encourages you to add this verification of the program to them. Whereas people who use exclusively statically typed languages often don't feel the need for unit tests and therefore let more bugs slip into production.

      That's a good argument for dynamically typed languages which honestly addresses its shortcomings and suggests ways to avoid them. You haven't said anything similar, so I have to assume that you're not aware of the shortcomings of dynamically typed languages and have no idea how to avoid them.

      Now that I've actually made a sensible argument for you, I'll indulge myself with a retort. Unit testing is indeed important and when used with a statically typed language even more effective than a dynamically typed language - because at least half of the bulk of the unit tests can be dropped because it is done with a type checker. As more formal methods become available the bulk of unit tests will become even smaller for statically typed languages but will remain the same for dynamically typed languages. This has already happened for languages like Eiffel where verification of object contracts is now automated. These methods are becoming available for Java too.

      --
      How we know is more important than what we know.
    4. Re:Yah, good for Javascript! by Earlybird · · Score: 2, Interesting
      • The reason I'm not taking you seriously is because you don't even seem to be aware of the arguments for and against dynamically typed languages. I mean, you haven't even mentioned unit testing yet.

      Please. I mentioned unit testing in my first reply, where I wrote:

      • The uncertainty involved in the dynamic typing/late binding model of such languages is compensated for through unit testing.

      I also linked to an interview with Guido van Rossum where he talks about this very topic, so if you think I'm ignorant of the issues involved, you must be purposely ignoring what I'm writing. Thanks for trolling again.

      • This has already happened for languages like Eiffel where verification of object contracts is now automated. These methods are becoming available for Java too.

      Sorry, but interface contracts have very little to do with static type checking.

      A pre-condition is typically something like whether a value is within a range, or that an argument is not null, or an array has a certain length, or that an instance is of a certain class at runtime; not whether it's an integer or a string.

      Design by contract, being unrelated to static type checking, is therefore a concept that is equally applicable to both statically- and dynamically-typed languages, the main difference being that in a dynamically typed language, the checks may only occur at runtime.
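
A sketch of what runtime contract checks might look like in a dynamically typed language. The `require` decorator and the `withdraw` function are hypothetical, written for this example; they are not part of Python or of any particular contract library.

```python
import functools


def require(predicate, message):
    """Decorator: check a pre-condition on the arguments before the call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not predicate(*args, **kwargs):
                raise ValueError(f"pre-condition failed: {message}")
            return func(*args, **kwargs)
        return wrapper
    return decorate


@require(lambda balance, amount: 0 < amount <= balance, "0 < amount <= balance")
def withdraw(balance, amount):
    remaining = balance - amount
    # Post-condition: the balance never goes negative.
    assert remaining >= 0, "post-condition failed"
    return remaining


print(withdraw(100, 30))

try:
    withdraw(100, 500)
except ValueError as exc:
    print(exc)
```

Note that the pre-condition is a range check on values, not a statement about types, and that it only fires when the call actually happens at runtime.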

      There is nothing preventing dynamically typed languages from doing automated type checking. This and this make a good start. The latter is similar to Java 5.0's annotation system.

      As for unit tests consisting of type checks, you will probably find that the overlap is larger than you think. Even if a method in Java returns a StringBuffer object, the Java interface can never explain what the contract is: whether it's allowed to return null, whether it always returns a different instance, what that object is supposed to contain.

      You will find that in Python, for example, the checks are more or less the same; if the method returned Fnarg instead of the expected object, your test will fail -- unless Fnarg happens to behave like what you wanted, in which case everything is all right. As for input, regardless of the type of language, throwing garbage at functions is always useful; with dynamically typed languages, you might just end up throwing a little more garbage.
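
The "Fnarg happens to behave like what you wanted" case can be sketched as follows. The `Fnarg` class and `emit_report` are invented for illustration; only `io.StringIO` comes from the standard library.

```python
import io


class Fnarg:
    """Hypothetical object that is not a file but has a compatible write()."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)
        return len(text)


def emit_report(out):
    # Only relies on `out` having a write() method, not on its class.
    out.write("total: 3\n")


real = io.StringIO()
emit_report(real)

fake = Fnarg()
emit_report(fake)

print(real.getvalue() == "".join(fake.chunks))

try:
    emit_report(object())  # garbage input: no write() method at all
except AttributeError as exc:
    print("rejected:", exc)
```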

      Unit tests are supposed to be simple and quick to write. Languages like Python are known to support much more rapid development. Even if you add 20% more checks, you'll still come out on top. Ever worked with lists and maps in Java? They're not first-class objects, so they're a nightmare to manipulate. Such structures are extremely common in tests. (I have been writing Java and Python unit tests every day for four years, so I know where the differences are.)

      (As an aside, if in your mention of Eiffel you're referring to design by contract, the concept was invented for Eiffel, so saying it's "now automated" is like saying the Eiffel tower is now made out of steel.)

  14. Re:Just quick and easy by Pig+Hogger · · Score: 3, Funny
    For the number of times "ii" occurs in english you could save yourself a character, right there.
    Now, I'm off for a bit of skiing...
    He was working for a ski resort, you insensitive clod!!!
  15. Re:Huh? by Anonymous+Custard · · Score: 2, Insightful

    that would be an extra 3000 characters (3k) PER PAGE LOAD

    Well, that js file would be cached by the browser, hopefully, not reloaded with every single page load.

  16. Python Programmers by bill_mcgonigle · · Score: 3, Funny

    When you sign up to be a Python Programmer you have to promise to evangelize rabidly. At least that's been my experience. Evangelism must include suggesting that python is the best language for every job, trashing every other scripting language, and suggesting that Python is what makes the Holy Grail work. ...and to mod-down any Slashdot post that doesn't beatify Python, I'm sure. There goes some karma...

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  17. Re:Just quick and easy by fiftyfly · · Score: 2, Funny

    *correction - a water skiing resort... in hawaii... working on an ascii front end for a diiodide metering device... Just a thought:

    /usr/share/dict $ grep -ic ii words
    419

    --
    "Sanity is not statistical", George Orwell, "1984"
  18. Re:Just quick and easy by ComputerSlicer23 · · Score: 3, Informative
    Well, I'll point out that "static is evil" is flat out wrong and a statement with no context. At the time I was well aware of the 2 types of things that static can be applied to, and the two different contexts in which you can use it on variables. That's not even to discuss the meanings it has in C++.

    In regular C, for a function, it's actually highly useful, and extremely desirable. Go look in any extremely large body of C code. You'll find out just how desirable C static functions are. They are used all the time inside of the Linux kernel. They guarantee that the linker won't make that symbol available externally. Which is great for avoiding two different functions with the same symbol.

    In regular C, using static on a global variable also makes it have no external linkage. It also moves where the memory is actually set aside. Which changes when it gets initialized, and how large your executables are. I believe it's also a very good way to ensure your variables are safe to use in signal handler context. This is common; go look at the Linux kernel. Happens all the time. Highly useful application of static.

    In regular C, using static on a variable in function scope can be useful, if it is also a constant. In that case, you can move the space off the stack and into the BSS section (at least under UNIX, I forget the equivalent under a Win32 platform). This again is used all the time in the Linux kernel. It shrinks the stack usage and saves space. As I recall, you want to declare all strings that are constants as:

    static const char foo[] = "foobar";

    It saves space in the kernel. I forget exactly why it does right off hand, but it has to do with the assembly that GCC outputs. I might have that wrong, you might want to do "*foo" instead of "foo[]", but you get the idea.

    Now, in regular C, using static on a variable in function scope to store state between function calls is an excellent way to introduce a race condition. So your blanket statement that "static is evil" is blatantly wrong. Whoever or wherever you learned it from was mimicking what they'd heard before without any understanding. Next you'll be telling me there's no good use of "goto's" either. Which isn't the case, they are few and far between, but they do exist. I've never come across one, but I am aware of when they would be useful. I could make use of them, but generally performance, cache coherency, and "the fast path" aren't things I generally have to worry about. Those are the types of problems we just throw more hardware at and keep the code highly maintainable.

    I learned about a lot of things from my old boss, but the "iii" was one of the truly unique things, I've never seen anywhere else. Any good text on C programming will explain what I just did. The small useful things that you just deal with all the time because you never stop to think of a better solution. Paul didn't have too many of those. He recognized anytime something was difficult, and pondered the problem until it was easy. Just like, he knew when static was a good idea. Although I actually picked that up at school, during the explanation of Ada. You can do something named modules in Ada, that you can somewhat duplicate in C with static and separate translation units.

    The truly most useful thing Paul ever taught me was just assume all of your code leaks memory. Assume your code will segfault because there is a bug in it. Assume you'll get crappy data that will lead to pathological cases. Design your code so that as much as possible that doesn't make any difference in terms of your ability to process data that doesn't cause a memory leak, a segfault, or isn't malformed. Never get stuck when there is more data to try. If some data elements blow up your code, timestamp when you tried them and don't retry them for a while so you can work on other data. Any time you are doing batch processing, you should always have a parent that is deathly simple that will spawn children that do the real work.
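
One way to sketch that parent/child arrangement, using Python's standard `subprocess` module. This is not Paul's code: the worker script, the record names, `run_batch` and the 60-second `retry_after` default are all assumptions made for the example, and a clean non-zero exit stands in for a real crash.

```python
import subprocess
import sys
import time

# Hypothetical worker: a child Python process that fails on one bad record.
WORKER = (
    "import sys\n"
    "record = sys.argv[1]\n"
    "if record == 'bad':\n"
    "    sys.exit(1)  # stand-in for a segfault or unhandled bug\n"
    "print(record.upper())\n"
)


def run_batch(records, retry_after=60.0, failed_at=None):
    """Process each record in its own child; never stop on a failure."""
    failed_at = {} if failed_at is None else failed_at
    done = {}
    for record in records:
        last_failure = failed_at.get(record)
        if last_failure is not None and time.time() - last_failure < retry_after:
            continue  # blew up recently: skip it and work on other data
        child = subprocess.run(
            [sys.executable, "-c", WORKER, record],
            capture_output=True,
            text=True,
        )
        if child.returncode == 0:
            done[record] = child.stdout.strip()
            failed_at.pop(record, None)
        else:
            failed_at[record] = time.time()  # timestamp the failed attempt
    return done, failed_at


done, failed_at = run_batch(["alpha", "bad", "beta"])
print(done)
print(sorted(failed_at))
```

Because every record runs in a fresh process, whatever memory a child leaks is returned to the operating system when it exits, and a crash takes down only that child while the parent loop carries on.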