Slashdot Mirror


Web Programming by printf()

An anonymous reader writes "Art & Logic has posted an article titled 'Why CGI is Evil'. CGI might be an obvious way to create a simple web application, but this article provides some excellent high-level reasons why CGI rarely makes long-term sense. A nice review especially for new web programmers."

22 of 104 comments

  1. CGI still has uses by LordNimon · · Score: 2, Insightful

    What if your CGI programs need to get data from libraries that only have C and C++ interfaces? On the embedded system I'm working on, everything that talks to hardware is written in C or C++, and in many cases the only way to get the data my CGI programs need is to call a C++ class library. No one would take me seriously if I proposed writing this stuff in PHP or Perl.

    --
    And the men who hold high places must be the ones who start
    To mold a new reality... closer to the heart
    1. Re:CGI still has uses by highcaffeine · · Score: 3, Informative

      Without much hassle, you can use XS to allow your Perl scripts to interface with C/C++ libraries in a natural way. In fact, many Perl modules are written in C/C++ and wrapped with XS. This allows you to get the best of both worlds in some situations -- streamlined C code to offload intensive operations into separate libraries, then rapidly prototypable Perl front ends to those libraries (whether it be mod_perl for web apps, Perl/GTK for GUI, or any other type of front end you need).

      It's not always the perfect solution, but your post makes it seem as if you don't think there's any way to get C and Perl to talk. Few things are further from the truth. And this isn't just some nifty Perl thing -- many of the "scripting" languages (I can't speak for PHP) offer similar ways of accessing C/C++/etc. I'm just most familiar with Perl's way.

  2. Misuse of an acronym? by Masa · · Score: 2, Interesting
    I thought that CGI stands for Common Gateway Interface and is a standard method of transferring information between a server and a client. The method doesn't say anything about the language used. The article gives the impression that CGI is equal to C programming. Isn't CGI the basis for almost all WWW browser-based client-server communication? What other methods are available? RMI? Bare TCP/UDP sockets? How does .NET transfer data between the browser and the server? How do Java Server Pages do it?

    As you can see, I'm not a web programmer and the article actually made things more unclear for me. Did I misunderstand something? Can someone clear things up for me, please?

    1. Re:Misuse of an acronym? by aridhol · · Score: 4, Informative
      No, CGI is between the server and its subprocesses. The connection between your browser and the server is HTTP, and the connection between the server and scripts is CGI.

      I think CGI also refers only to those programs that the server calls directly passing headers in a certain format. While the scripts may be written in any language, they tend to be written in a language that is primarily a programming language (Perl, C, Python, etc) rather than a content-oriented language (ASP, JSP, PHP, etc).

      JSP usually runs as a separate server that may be contacted by the web server, but is also capable of answering web requests by itself. ASP and PHP are run in the same process space as the web server, and thus can spare the overhead of spawning a new script interpreter for every request. All of these languages are content-oriented, meaning that the scripting code is embedded in the document, rather than having the document embedded in the code.

      Hopefully I haven't confused matters even more ;)

      --
      I can't say that I don't give a fuck. I've just run out of fuck to give.
    2. Re:Misuse of an acronym? by Sentry21 · · Score: 4, Informative

      I think CGI also refers only to those programs that the server calls directly passing headers in a certain format.

      CGI refers to the method of executing external programs, passing parameters to them as environment variables, and then getting the result back from standard output. CGI tends to be external binaries, and thus PHP can be essentially CGI if you use it in its CGI binary form (needlessly complex, and even worse of a performance hit, but it works).

      JSP usually runs as a separate server that may be contacted by the web server, but is also capable of answering web requests by itself.

      Most of the servers I've seen tend to glob everything together, so I'd be inclined to disagree with the 'usually' here. I've usually run it as the 'tomcat' process, which is its own thing.

      You neglected to mention servlets, which are a nice cross between CGI and markup languages like ASP/JSP/PHP. Servlets are small Java programs that are mapped to a directory on the website (like /administration/). They're compiled and placed into the classpath, and then the server executes them in the JVM when a request is made. The benefit is that the code is precompiled and the JVM is preloaded, so the execution time once loaded is almost as fast as a CGI applet would be once loaded, but the load time is less because it doesn't have to fork a separate process.

      That's my input for the day. If Aridhol hasn't confused matters enough, this hopefully will.

      --Dan

    3. Re:Misuse of an acronym? by aridhol · · Score: 4, Insightful
      Servlets are small Java programs that are mapped to a directory on the website (like /administration/). They're compiled and placed into the classpath, and then the server executes them in the JVM when a request is made.
      Yes, servlets exist, but I feel that they are closer to CGI than to template- or content-based languages. In order to create any output, you still have to use a print()-like function (haven't done much with Servlets, so I can't remember the actual method used here).

      However, a JSP is automatically preprocessed into a servlet before it's compiled into bytecode, so it actually is a halfway point ;)

      --
      I can't say that I don't give a fuck. I've just run out of fuck to give.
    4. Re:Misuse of an acronym? by jrumney · · Score: 2, Insightful
      Unfortunately, your rant is wrong. CGI is not an interface between a server and a browser at all. That interface is called HTTP.

      CGI is an interface between a webserver and external subprocesses that has been in decline since faster alternatives started appearing in about 1995. While it is technically possible to write CGI programs in Java, PHP or Perl, it is not common. Servlets, PHP, mod_perl and other modern ways of generating dynamic content do not use the CGI interface; instead, they have their own more efficient interfaces.

    5. Re:Misuse of an acronym? by jrumney · · Score: 2, Informative
      You can believe all you want; the HTTP protocol is what defines GET and POST methods, and URL encoding.

      CGI on the other hand defines how that information can be passed to external programs by the server (using environment variables and standard in), and how the external program passes information back to the server to be sent to the browser (via standard out).

      A full specification is available from NCSA
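
      A minimal sketch of the program side of that contract, in C. REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH are standard CGI variable names; the function name cgi_respond, the 256-byte body limit and the plain-text reply are assumptions made for illustration, not anything from the article.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of an environment variable, or a fallback if unset. */
static const char *env_or(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}

/*
 * Read one request the CGI way and write the reply to `out`.
 * `in` is normally stdin and `out` is normally stdout.
 * Returns 0 on success, -1 if writing failed.
 */
int cgi_respond(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);

    const char *method = env_or("REQUEST_METHOD", "GET");
    const char *query  = env_or("QUERY_STRING", "");
    char body[256] = "";

    /* For POST, the request body arrives on stdin; CONTENT_LENGTH gives its size. */
    if (strcmp(method, "POST") == 0) {
        long length = strtol(env_or("CONTENT_LENGTH", "0"), NULL, 10);
        if (length < 0)
            length = 0;
        if (length > (long)sizeof body - 1)
            length = (long)sizeof body - 1;   /* truncate oversized bodies */
        size_t got = fread(body, 1, (size_t)length, in);
        body[got] = '\0';
    }

    /* Header block first, then a blank line, then the document itself. */
    if (fprintf(out, "Content-Type: text/plain\r\n\r\n") < 0)
        return -1;
    if (fprintf(out, "method=%s\nquery=%s\nbody=%s\n", method, query, body) < 0)
        return -1;
    return fflush(out) == 0 ? 0 : -1;
}
```

      A real CGI binary would simply call cgi_respond(stdin, stdout) from main(); the server supplies the environment and reads whatever is printed.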

    6. Re:Misuse of an acronym? by cant_get_a_good_nick · · Score: 4, Informative

      CGI is pretty much the oldest method of a web server interacting with outside code, and is kind of the only standard way. The server fork/exec's a process and has the CGI process's stdin, stdout, and stderr pointing back to the server. The web server passes information either through environment variables (for a GET request) or additionally through the process's stdin (for POST and PUT requests).
      Advantages: very clean, the process goes away after the request, so resource leaks aren't a problem. Very simple interaction, if your programming language understands stdin, stdout, and environment variables (hard to find one that doesn't) you can do CGI with it (though some are better than others obviously).
      Disadvantages: fork/exec for every request. That has some overhead; sometimes more important is that the process can't keep persistent state. Resources have to be acquired on every request. Anything with a great deal of overhead, say opening a database connection, has a HUGE impact on the server.
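
      The server side of that fork/exec can be sketched roughly as follows, assuming a POSIX system. The name run_cgi and its signature are hypothetical, only the GET case is shown, and a real server would also wire up stdin and stderr, set many more variables, and enforce timeouts.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `program` as a CGI-style child: pass the query string in the
 * environment and capture the child's stdout into `out`.
 * Returns the number of bytes captured, or -1 on error.
 */
ssize_t run_cgi(const char *program, char *const argv[],
                const char *query_string, char *out, size_t cap)
{
    assert(program != NULL && argv != NULL && out != NULL && cap > 0);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: point stdout at the pipe, then become the CGI program. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(fds[1]);
        setenv("REQUEST_METHOD", "GET", 1);
        setenv("QUERY_STRING", query_string, 1);
        execv(program, argv);
        _exit(127);                 /* only reached if execv failed */
    }

    /* Parent (the "server"): read everything the child prints. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 &&
           (n = read(fds[0], out + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);

    /* Reap the child; the process and all its resources are gone after this. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (ssize_t)used;
}
```

      Every call pays for a pipe, a fork, an exec and a wait, which is the per-request overhead described above; nothing the child acquired survives to the next request.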

      FastCGI is a pseudo-standard in that it has multiple people implementing it, but it never got approved by, say, the IETF. It's a client server kind of thing, where the first instantiation starts the server up and initializes it. Subsequent requests get sent to the CGI server and get returned. The CGI server never goes away. The cool thing is that there are no startup costs, and you can keep something like a DB pool. Never really used it, so can't comment much.

      Some things just get embedded into the server, like Apache modules. You can write C code and it becomes part of the server itself.
      Advantages: wicked fast. Full access to anything in the server.
      Disadvantages: if your module has a problem, it can bring down your server. The APIs can be pretty arcane. Also, it's a different API for each web server (NSAPI vs. ISAPI vs....) and even between versions (it's wildly different between Apache 1.3 and Apache 2.0).

      mod_perl is kind of like above, it's a perl interpreter bound into Apache. Gives a perl interface to pretty much everything apache has to offer. Not only requests, but configurations.
      Advantages: perl is very flexible, can do a lot of things and use perl modules from CPAN, including templating systems and such. You're also insulated a bit from stray pointers and the like. mod_perl also precompiles all of the perl code, so you don't have the compiler overhead on every request.
      Disadvantages: mod_perl can be huge sometimes, and if you have several mod_perl instances running around, you can eat memory pretty quick.

      Everything else is kind of "do what you want". Tomcat itself has several protocols to talk to Apache; some are just old, some are supported on certain platforms only, yadda yadda.

      That said, the article did confuse a bunch of things: CGI, C, and content management systems. He was pushing an agenda, hoping that people couldn't see through it.

    7. Re:Misuse of an acronym? by Vengeance · · Score: 2, Informative

      Servlets and .JSPs become the same thing behind the scenes, but they are conceptually two different things.

      (The following assumes that developers are using a reasonable architecture, rather than ad-hoc throwing together of technologies)

      To take an MVC view of this, servlets are primarily used for the controller portion of an application. As straight Java code written to a specific interface, they provide a natural linking point between .JSP view components and JavaBean model components. Typical use is one or a few controller servlets, which receive requests from .JSP forms and invoke logical operations in JavaBeans or Enterprise Java Beans. You want to avoid putting VERY much logic into the servlets, instead deferring processing to the business object layer.

      A framework like Struts enforces this by providing you with a pre-packaged controller servlet, which you extend with a combination of classes that are servlet-like and XML to bind 'em all together.

      --
      It was a joke! When you give me that look it was a joke.
  3. hmm... by gyratedotorg · · Score: 5, Insightful

    instead of "why cgi is evil," maybe this article should have been named "why you should buy our product."

    --
    Gyrate Dot Org - "Where high-tech meets low-life"
  4. This is a press release by legLess · · Score: 5, Insightful

    Notice how the author compares CGI unfavorably with something he calls DMF? Here it is, and it looks like one of the flagship products of this company. Imagine that.

    He's setting up a straw man, then claiming that his own (proprietary, for-profit ... not that there's anything wrong with that) solution is better. When he says "CGI" he's talking about something that few people use for anything but toys. Slashdot (e.g.) uses the Perl CGI module, but runs it under mod_perl, thus obviating most of his arguments (CGI is slow, must be compiled at run-time, and has no access to the web server internals). Slashdot, again, uses a templating system, thus taking care of his second argument (programmers must copy-paste HTML into their code).

    Both these problems have been solved for over 5 years, yet he's trying to make it sound like his beautiful DMF is the first to even discover them. *Yawn* - another press release day on /.

    --
    This isn't as much "normalization" as it is "don't take so many drugs when you're designing tables."
  5. Bunk! by HRbnjR · · Score: 4, Insightful

    This is bunk! Pure FUD!

    CGI is slower?? I write CGI in C++. Compiled C++ is fast. It's on par with interpreted Perl or PHP execution speed at the very least... more likely vastly faster. And this myth about process spawn time is starting to bother me. Linux can spawn processes VERY fast. Threads on Linux have traditionally been implemented as a special form of process anyhow. And even if this was a problem...the author mentions the solution, FastCGI - but seems to somehow ignore that it obviates his whole point!

    What people really like to ignore when developing on Windows is COM, Apartment Threading, and the whole process model. When you call a COM object in an ASP page or whatever, you are crossing process boundaries - all arguments are marshalled through COM by VALUE. All these ASP programmers that create a COM object to use, for example, MSXML have obviously never tried creating a LARGE DOM tree. Let me tell you, it does NOT scale. Compiling all my code into a single CGI allows me to keep everything in the same process space, and vastly improves performance when things get large.

    And who the hell debugs using printf? For one, I like CGI because it's easy to launch one directly from GDB! Ever tried attaching a debugger to a thread for your process inside a web server? HA! GDB lets me easily script the piping of a file to stdin of my CGI. If you are still using printf, you have more problems in learning about programming than will be solved by not using CGI.

    Now, if your application is heavily template based, then yes, PHP definitely makes more sense than CGI!!! The author has a point in that you shouldn't be embedding HTML in your C code. Which brings me to my last beef...

    Their product is about using pre written features rather than writing them yourself as you need to with CGI? Uh, DUH!! There are like umpteen billion CGI LIBRARIES out there!!! I happen to like GNU CGICC. It does everything, form uploading (mime and file uploads too), cookie handling, templating, etc - and it works with FastCGI too! Write it yourself??? As if! And I have no problem linking in libraries for database access, and everything else under the sun (Boost, etc) into my CGI, just like I would link them into any other program. Who the hell writes software without using any libraries?

    This article is basically a bunch of FUD just to sell their product. You can safely ignore the whole thing!

    1. Re:Bunk! by Slorf · · Score: 2, Interesting

      Agreed, totally bunk. If you're actually doing what they suggest, cutting and pasting blocks of someone else's html into your code and/or mixing static pages and code sections, then yeah, you might be inefficient. However, most intelligent CGI programmers will write functions/methods to output the majority of the content, and then supply the data to those functions. If you write your presentation code in this manner, you can put it in a library and call that function again and again from different programs to get a consistent look and feel.

      Of course, as the previous poster implies, there are a lot of pre-written libraries out there as well, and not only in C but in Perl, Python, Ruby, and every other language that someone has used for building CGI programs. Templating is fine for those who really need to separate the presentation from the business logic, but if you're doing both parts it is far easier to maintain one pure CGI code base that handles the presentation through well thought out functions.
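
      A rough sketch of the "presentation function in a library" idea, in C. The name render_page and the markup it emits are hypothetical; the point is only that every program calling it gets the same look and feel around its own data.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a complete HTML page into `buf`, wrapping the caller's title
 * and body text in shared markup.
 * Returns the page length, or -1 if `buf` is too small.
 * Note: no HTML escaping is done here; callers must pass safe strings.
 */
int render_page(char *buf, size_t cap, const char *title, const char *body)
{
    assert(buf != NULL && title != NULL && body != NULL);

    int n = snprintf(buf, cap,
                     "<html><head><title>%s</title></head>\n"
                     "<body><h1>%s</h1>\n"
                     "<p>%s</p>\n"
                     "</body></html>\n",
                     title, title, body);
    if (n < 0 || (size_t)n >= cap)
        return -1;      /* encoding error or output truncated */
    return n;
}
```

      Changing the site's layout then means editing this one function and relinking, rather than hunting for pasted HTML in every program.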

  6. Hey man ... by torpor · · Score: 5, Funny

    ... back off printf();

    I'm with you on everything else though bro'!

    --
    ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
  7. CGI is not slow, and mySQL DB connections are fast by daviddennis · · Score: 2, Interesting

    I've been using C-based CGIs for years - once I started doing it and got used to coding everything in C, and developing my own libraries and what-not, it was very difficult to change.

    Seems to me the overhead's pretty low compared to the competition. People argue for things like JSP as more efficient and cleaner than CGI, but I've never seen a JSP web site that's not dog slow.

    When I started reading that opening a database connection had enormous overhead, I got worried that I was really doing something dumb. So I wrote a program to open and close a mySQL database 1000 times and it took a total of 2 seconds. That's 0.002 seconds per open/close combination.

    Then I stopped worrying and went back to work. I'd rather be right in practice and wrong in theory than vice versa.
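
    A measurement like that can be sketched as a small timing harness, assuming a POSIX clock. This is not the poster's program: average_seconds is a hypothetical helper, and fake_open_close is a stand-in where a real test would open and close a database connection. With the figures quoted above, 2 seconds over 1000 iterations gives 2 / 1000 = 0.002 seconds per open/close pair.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Run `op` `iterations` times and return the average seconds per call. */
double average_seconds(void (*op)(void), int iterations)
{
    assert(op != NULL && iterations > 0);

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++)
        op();
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (double)(end.tv_sec - start.tv_sec)
                   + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return elapsed / iterations;
}

/* Hypothetical stand-in: a real test would open and close a connection here. */
static int calls = 0;

void fake_open_close(void)
{
    calls++;
}
```

    Calling average_seconds(fake_open_close, 1000) returns the per-call cost; swapping in a function that really connects to the database is what makes the number meaningful.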

    D

  8. Re:First argument doesn't stand up by darkpurpleblob · · Score: 2, Interesting
    Fair enough, I didn't reread the title when I viewed the article - would have helped if the full title, "Why CGI is Evil (in the context of embedded systems)" was given in the /. posting.

    I know nothing about programming embedded systems, but surely you could devise and use a very basic/simple templating system in embedded devices. Could anyone out there with experience in programming embedded devices fill me in?
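
    For what it's worth, a very basic templating routine does fit in a few dozen lines of C with no dynamic allocation. The sketch below is only an illustration: the {{name}} placeholder syntax, struct binding and expand_template are all invented here, and unknown names simply expand to nothing.

```c
#include <assert.h>
#include <string.h>

/* One name/value pair available to the template. */
struct binding {
    const char *key;
    const char *value;
};

/* Look up `key` (of length `len`) in the table; returns "" if absent. */
static const char *lookup(const struct binding *vars, size_t count,
                          const char *key, size_t len)
{
    for (size_t i = 0; i < count; i++)
        if (strlen(vars[i].key) == len && memcmp(vars[i].key, key, len) == 0)
            return vars[i].value;
    return "";
}

/*
 * Copy `tmpl` to `out`, replacing each {{name}} with its bound value.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int expand_template(const char *tmpl, const struct binding *vars, size_t count,
                    char *out, size_t cap)
{
    assert(tmpl != NULL && out != NULL && cap > 0);

    size_t used = 0;
    while (*tmpl != '\0') {
        const char *end;
        if (tmpl[0] == '{' && tmpl[1] == '{' &&
            (end = strstr(tmpl + 2, "}}")) != NULL) {
            /* Placeholder: substitute the bound value. */
            const char *value = lookup(vars, count, tmpl + 2,
                                       (size_t)(end - (tmpl + 2)));
            size_t len = strlen(value);
            if (used + len >= cap)
                return -1;
            memcpy(out + used, value, len);
            used += len;
            tmpl = end + 2;
        } else {
            /* Ordinary character: copy it through unchanged. */
            if (used + 1 >= cap)
                return -1;
            out[used++] = *tmpl++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

    The HTML stays in a string (or a file in flash) that a designer can edit, and the C code only supplies the values.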

  9. Re:cgi underrated by WoTG · · Score: 2, Interesting

    True, there is a penalty to interpreted code, but a lot of this penalty can be mitigated (in PHP at least). The folks at Zend offer their PHP accelerator, and I use the ion cube (free as in beer) on my puny home server.

    Both of these compile the scripts, and parse, and validate _once_ (per reboot). For every subsequent call to the script, the server just runs the cached copy of the compiled code. So, you get all of the benefits of the scripted language, i.e. speed of coding, and fewer of the downsides.

  10. Debugging with printfs by p3d0 · · Score: 4, Insightful
    Actually, sometimes debugging with printfs is much better than gdb. Two situations occur to me:
    • The problem is nondeterministic. If the failure occurs only once in 100 runs, you probably won't see it when you run it in gdb, especially if it's timing-dependent (in which case gdb may stop it from occurring at all). However, with a proper debug trace, you can just grab the log file when the crash does occur, and do a lot of diagnosis from that (plus possibly the core file if you get one of those).
    • The program state and state transitions are too complex. For instance, try to debug a compiler without a good debug trace facility. It's just not very practical to walk complex IR data structures manually after various optimizations have been performed.
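
    A debug trace facility of the kind described can be as small as the sketch below. The names trace_at and TRACE are hypothetical; the one design choice that matters for the crash case is flushing after every line so the log is complete when the process dies.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Append one trace line to `logfile`, tagged with the source location,
 * and flush immediately so the line survives a later crash.
 */
void trace_at(FILE *logfile, const char *file, int line, const char *fmt, ...)
{
    assert(logfile != NULL && file != NULL && fmt != NULL);

    va_list args;
    fprintf(logfile, "%s:%d: ", file, line);
    va_start(args, fmt);
    vfprintf(logfile, fmt, args);
    va_end(args);
    fputc('\n', logfile);
    fflush(logfile);
}

/* Convenience wrapper that fills in the call site automatically. */
#define TRACE(logfile, ...) \
    trace_at((logfile), __FILE__, __LINE__, __VA_ARGS__)
```

    Usage looks like TRACE(logfile, "state=%d", state); at each state transition, and the resulting file can be read after the one-in-100 failure without re-running anything.
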
    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  11. Re:But why does the Linux kernel use so many 'goto by aridhol · · Score: 3, Insightful
    When you're initializing a driver, it's generally easier to do something like this:
    int driver_init() {
        if (!init_step_1()) goto error_1;
        if (!init_step_2()) goto error_2;
        if (!init_step_3()) goto error_3;
        return success;
    error_3:
        cleanup_step_2();
    error_2:
        cleanup_step_1();
    error_1:
        return failure;
    }
    Because you can't just exit a failure with half-initialized resources that won't be freed automatically on exit ('cause it won't exit until you shut down).

    --
    I can't say that I don't give a fuck. I've just run out of fuck to give.
  12. Internal Server Error by mikey504 · · Score: 2, Funny

    Premature End of Script Headers

  13. Re:But why does the Linux kernel use so many 'goto by Mr. Slippery · · Score: 3, Insightful
    Because you can't just exit a failure with half-initialized resources that won't be freed automatically on exit ('cause it won't exit until you shut down).
    Right. Basically, well-used C gotos provide the functionality that a try-catch block would in C++. Like many C constructs, they can be dangerous in the hands of the ignorant and elegant in the hands of the wise.
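
    A self-contained version of the same pattern, using real resources so it can be compiled and run. The struct, the function names and the fail_second switch (which forces the error path for demonstration) are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct device {
    char *rx_buffer;
    char *tx_buffer;
};

/*
 * Acquire two buffers. If the second allocation fails (or `fail_second`
 * forces that path), release the first before returning.
 * Returns 0 on success, -1 on failure.
 */
int device_init(struct device *dev, int fail_second)
{
    assert(dev != NULL);

    dev->rx_buffer = NULL;
    dev->tx_buffer = NULL;

    dev->rx_buffer = malloc(64);
    if (dev->rx_buffer == NULL)
        goto error_rx;

    dev->tx_buffer = fail_second ? NULL : malloc(64);
    if (dev->tx_buffer == NULL)
        goto error_tx;

    return 0;

    /* Unwind in reverse order of acquisition, like nested catch blocks. */
error_tx:
    free(dev->rx_buffer);
    dev->rx_buffer = NULL;
error_rx:
    return -1;
}

/* Release everything acquired by a successful device_init(). */
void device_shutdown(struct device *dev)
{
    assert(dev != NULL);
    free(dev->tx_buffer);
    free(dev->rx_buffer);
    dev->tx_buffer = NULL;
    dev->rx_buffer = NULL;
}
```

    Each label undoes exactly the steps that had succeeded before the jump, so a failure at any point leaves nothing half-initialized.
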
    --
    Tom Swiss | the infamous tms | my blog
    You cannot wash away blood with blood