Slashdot Mirror


Building Scalable Web Sites

briandon writes "It's not a step-by-step guide (and doesn't claim to be one), but Building Scalable Web Sites is the closest thing available to a nuts-and-bolts look at managing the technical aspects of doing a Web-based startup. There's lots of code inside, but the book isn't built around a single, extremely contrived case study like an online wine store. Instead, most of the chapters follow a general pattern: a topic (like bottlenecks in your application and platform, scaling, or monitoring) is addressed, and rules of thumb describing how the author feels things should be done are set forth and explained, with lots of very specific hints and factoids mixed in along the way. Tools for other languages (in most cases, Perl) are mentioned in passing, but nearly all of the code snippets are in PHP. MySQL 4.1 is the basis for most of the database-centered material." Read the rest of Brian's review.

Building Scalable Web Sites: Building, Scaling, and Optimizing the Next Generation of Web Applications
author: Cal Henderson
pages: 330
publisher: O'Reilly Media, Inc.
rating: 9/10
reviewer: Brian Donovan
ISBN: 0596102356
summary: If you've been kicking around the idea of doing a Web startup, then you should definitely give this book a read.

Henderson's resume, which can be found on his personal website, indicates that he joined Ludicorp about a year before they shut down GNE, their Web-based roleplaying game, to focus on Flickr (which had originally begun as an offshoot of the game). It's his role as web development lead at Ludicorp that led to the inclusion of the "The Flickr Way" sub-subtitle that runs diagonally across the upper right corner of the book's front cover.

The five-page-long first chapter sets the stage for the rest of the book with section headings that are all questions: "What is a Web application?", "How do you build Web applications?", "What is architecture?", and "How do I get started?".

Chapter two, "Web Application Architecture", begins with Henderson drawing an analogy between a web app and a type of multi-tiered dessert known as a trifle - the sponge cake at the bottom of the dish is the database, the next layer up, Jell-O, is the business logic, and so on. The black and white image in the text is identical to the color image included in a slide from an eight-hour workshop that the author gave in San Francisco titled "How We Built Flickr". Having read the book and some reviews of his workshops and looked at the list of talks on Henderson's site (some with PowerPoint decks for download), it seems likely that a lot of the ideas expressed in the book were developed over an extended period of time through repeated presentations.

Next up are the considerations around development environments, beginning with a 3-point list of guidelines for building small-scale web apps up into big ones: use source control, have a one-step build process (literally, if possible, a single button), and track bugs (as well as non-bug items like features and support requests). Readers get to feast their eyes on a cropped screenshot of Flickr's build control panel (two buttons, "perform staging" and "perform deployment", to match the last two steps in the release sequence in an HTML form). For small teams, the author is in favor of allowing multiple developers to trigger releases and he suggests several ways of trying to keep that workable. In version control, Subversion gets the nod and, though no bugtracking tool is singled out as the best, FogBugz garners the highest praise ("extremely effective") and has the shortest list of "cons". The author never comes out and says what the Flickr / Flickr-Yahoo team uses in either area, however.

Chapter four is the most readable introduction to internationalization, localization, and Unicode that I've seen up to this point. MySQL's currently incomplete implementation of UTF-8, sarcastically referred to by some as "UTF-7½" (Google for it), is mentioned in enough detail that a reader can decide whether or not it's likely to be an issue for their app. The book as a whole is packed with little nuggets of information like that - things you might not have otherwise been even peripherally aware of until they bit you.

Input filtering and strategies for avoiding building cross-site-scripting and SQL injection vulnerabilities into your app are addressed in a chapter on data integrity and security that shows the same attention to detail as the rest of the book. The section on UTF-8 filtering, for example, features a three-way benchmark of UTF-8 validation techniques (using regular expressions, iconv, and ord()) and the merits of each approach are considered.
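
As a rough illustration of what such a comparison involves (this is not the book's code, which is PHP), here is a Python sketch of two of the approaches: a hand-written byte walk, roughly analogous to an ord()-based check, and a check that simply hands the bytes to the language's built-in decoder, loosely comparable to leaning on iconv. The function names are made up for this example.

```python
def is_valid_utf8_bytewise(data: bytes) -> bool:
    """Walk the bytes by hand, roughly what an ord()-based check does."""
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:  # plain ASCII byte
            i += 1
            continue
        # For each lead byte: number of continuation bytes needed and the
        # allowed range of the FIRST continuation byte.
        if 0xC2 <= lead <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif lead == 0xE0:
            need, lo, hi = 2, 0xA0, 0xBF  # rejects overlong 3-byte forms
        elif lead == 0xED:
            need, lo, hi = 2, 0x80, 0x9F  # rejects UTF-16 surrogates
        elif 0xE1 <= lead <= 0xEF:
            need, lo, hi = 2, 0x80, 0xBF
        elif lead == 0xF0:
            need, lo, hi = 3, 0x90, 0xBF  # rejects overlong 4-byte forms
        elif 0xF1 <= lead <= 0xF3:
            need, lo, hi = 3, 0x80, 0xBF
        elif lead == 0xF4:
            need, lo, hi = 3, 0x80, 0x8F  # rejects values above U+10FFFF
        else:
            return False  # 0x80-0xC1 and 0xF5-0xFF can never start a sequence
        if i + need >= n:
            return False  # sequence is cut off at the end of the input
        if not lo <= data[i + 1] <= hi:
            return False
        for j in range(i + 2, i + need + 1):
            if not 0x80 <= data[j] <= 0xBF:
                return False
        i += need + 1
    return True


def is_valid_utf8_builtin(data: bytes) -> bool:
    """Let the standard library's strict decoder make the decision."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

The byte walk is long but makes the rejected cases (overlong forms, surrogates, truncated sequences) explicit, while the decoder-based check is short and pushes the work into library code. Which one is faster in a given runtime is exactly the kind of thing a benchmark like the book's is meant to settle.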

The coverage of handling emails programmatically in chapter six is also quite good. Henderson does the basics and then delves into a number of possible pitfalls in considerable detail. The salient aspects of the TNEF (media type application/ms-tnef) format, used by MS Outlook for attachments and metadata, for instance, are explained and pointers are given to open source TNEF parser implementations. I also got a lot out of the section on dealing with email from wireless devices like mobile phones, titled "Wireless Carriers Hate You" (there's that dry British wit again).

The second half of the book (chapters seven through eleven) focuses more on scalability. It's also where you'll find the most material on using MySQL, including but not limited to query profiling and optimization, a discussion of the merits of denormalizing once you begin to reach a certain scale, and a comparison of the different MySQL backends. There's an entire chapter devoted to finding and dealing with bottlenecks - how to determine whether your app is CPU-bound, I/O-bound, or context-switching-bound and what to do about it. The chapter on scaling begins by debunking the "scaling myth" (but he actually tackles several misconceptions at once - namely that scalability is synonymous with speed, that scalability is a byproduct of having written your app in Java, etc.) before getting into vertical vs. horizontal scaling (buying more powerful and expensive servers vs. adding more cheap servers), load balancing, and more. Monitoring (both of web stats and your application itself) and APIs (RSS/RDF/Atom feeds, mobile content delivery formats like WAP and XHTML Mobile, and REST, XML-RPC, and SOAP Web services) both get chapters of their own.

Henderson's sense of humor is evident throughout the book, but not in the annoying, overly cutesy way that made me want to toss "Extreme Programming Installed" into the circular filing drawer. In the section on software interface design (where he means the interfaces between the layers of the trifle), for example, there's a "Web Application Scale of Stupidity" that places "sanity" in the center and OGF (one giant function) and OOP at the extremes. The process of separating web app logic from presentation is broken down into 3 steps: separating logic code from markup, splitting the markup into per-page files, and moving to a templating system. He closes out the chapter with a breakdown of the hosting, hardware, and networking issues involved in serving up web apps.
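
To make the logic/presentation split concrete, here is a minimal Python sketch of where those three steps end up. It is not taken from the book (whose examples are PHP); the function names, the sample data, and the choice of the standard library's string.Template as the "templating system" are all assumptions made for illustration.

```python
from html import escape
from string import Template

# Presentation: markup lives in templates, with no application logic in it.
# In a real app each template would probably sit in its own per-page file.
PAGE_TEMPLATE = Template("<h1>$title</h1>\n<ul>\n$items\n</ul>")
ITEM_TEMPLATE = Template("  <li>$name</li>")


def load_photo_titles():
    """Logic layer (hypothetical): would normally query the database."""
    return ["Sunset", "Cats & dogs"]


def render_page(title, names):
    """Glue: escape the data, then hand it to the templates."""
    items = "\n".join(ITEM_TEMPLATE.substitute(name=escape(n)) for n in names)
    return PAGE_TEMPLATE.substitute(title=escape(title), items=items)


if __name__ == "__main__":
    print(render_page("Photos", load_photo_titles()))
```

The point of the split is that the markup can change without touching load_photo_titles(), and the data source can change without touching the templates.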

Technically, I think that Building Scalable Web Sites is 100%. There were just a few niggling flaws. Two dates given (both on page 155), 1990 for the creation of libxml and 1995 for the design of XML-RPC, are incorrect and I spotted a handful of grammatical mistakes (probably proportionately fewer than in this review) that I've already submitted, along with the date mistakes, as errata through the form linked from the O'Reilly catalog page for the book.

Additionally, though the cover does say "The Flickr Way", you won't find many sentences that begin "At Flickr, we [...]". Aside from the "Rolling Your Own" section in chapter seven describing some custom middleware and a protocol that they whipped up for moving files around within their system, there aren't a lot of explicit details about the way that Flickr operates in the book. You'll actually get more insider info from Tim O'Reilly's "Database War Stories" entry regarding Flickr, which is based on Henderson's answers to questions posed by O'Reilly, than from this book.

If you'd like to get a feel for Henderson's style, chapter five ("Data Integrity and Security") is available as a PDF on the O'Reilly catalog page for the book and Henderson has also put some articles online (all PDFs, not much overlap with the material in BSWS) at his website.

You can purchase Building Scalable Web Sites : Building, Scaling, and Optimizing the Next Generation of Web Applications from bn.com.

124 comments

  1. It's not a real... by _PimpDaddy7_ · · Score: 5, Funny

    It's not a real book on "Building Scalable Web Sites" unless there's a chapter titled :
    "Preparing for Slashdotting: Burn, Baby, Burn" ;)

    1. Re:It's not a real... by rtconner · · Score: 1

      How about they name it "Building Non-Scalpable Web Sites"

      --
      023AD01("Child", "Evil");
    2. Re:It's not a real... by hotdiggitydawg · · Score: 4, Funny

      It's not a real book on "Building Scalable Web Sites" unless there's a chapter titled :
      "Preparing for Slashdotting: Burn, Baby, Burn" ;)


      Cisco inferno?

    3. Re:It's not a real... by NcF · · Score: 1

      No, that would be a five (5) volume set entitled "Preparing Your Website For Hell: The Slashdot Effect"

    4. Re:It's not a real... by PHPfanboy · · Score: 2, Funny

      More likelihood of Dell freezing over

      --
      29 mpg. YMMV.
    5. Re:It's not a real... by Andrew+Kismet · · Score: 2, Funny

      The Slashdot Effect

      To mass flames, yes! One hundred servers high
      Nerds gettin' loose y'all gettin' down on the root - Do you hear?
      (the geeks are flaming) Users were screamin' - out of control
      It was so entertainin' - when the server started to explode
      I heard somebody say

      Burn, baby burn! - Cisco inferno!
      Burn baby burn! - Burn that motherboard
      Burn, baby burn! - Cisco inferno!
      Burn baby burn! - Burn that motherboard
      Burnin'!

  2. Re:I know they get kickbacks, but... by truthsearch · · Score: 2, Insightful

    Maybe /. doesn't want to support Amazon due to their stance on patents. Both Amazon and B&N have affiliate programs. B&N has fewer (if any) software patents.

  3. Third rule of scalable, reliable, websites: by Anonymous Coward · · Score: 5, Informative

    The underlying architecture doesn't matter as much as most people think it does.

    I can write you a scalable, reliable website using MS-Access. It will be slower and require a lot more code, but it will scale and remain up.

    The whole "You use platform X, so your app won't scale" has been proven wrong by many large companies running large apps for almost every platform.

    To reply to your flames, I'm currently finishing up an educational web app using PHP 5 and MySQL/Cluster 5. Redundant servers in the datacenter with load balancing, backup datacenter with a valid dataset always within a minute of the primary (specs allowed this), and it is *very* scalable. We hammered this with all manner of stress tools, very rarely had a problem. Added another server to the cluster, and went 5x beyond our max projected usage.

    I prefer PHP/MySQL, have done ColdFusion, ASP, JSP, Postgres, MSSQL, and Oracle. Each has a cost/benefit that needs to be evaluated. Most projects, though, the platform just doesn't matter so much. PHP/MySQL examples are generally easy to read by everyone and work well for examples in this book.

    1. Re:Third rule of scalable, reliable, websites: by Lao-Tzu · · Score: 3, Informative
      I can write you a scalable, reliable website using MS-Access. It will be slower and require a lot more code, but it will scale and remain up.

      You're absolutely correct. However, the easiest way to build a scalable website with MS-Access would be to start caching the entire contents of the database in the application layer. At this point, you might as well be using a flat text file for a database, since the database engine is not solving any problems for you. It's not contributing to developing your application. It's not useful.

      Any good discussion of building scalable websites should start with discussing tools. You can make do with a heavy rock, but when you have the choice to use a hammer instead...

      Just as an example, let's take your educational web application. Kudos for getting it working well. Would you like to port it to Access, with the same availability requirements? If there was an ideal DatabaseDbSQLServer software that could meet the same requirements with half the hardware, would you consider using it?

      The point I'm trying to make is that the underlying architecture does matter. It won't matter to the user when the application is done, but it does matter to people who are Building Scalable Web Sites, the title of this book.

    2. Re:Third rule of scalable, reliable, websites: by Anonymous Coward · · Score: 0

      (Posted anon, well, because.)

      You bring back painful memories. At one point I was doing consulting work for a ... large resource extraction company which shall remain nameless. Their main corporate memo distribution facility was transitioning from a fax tree to a web site (categories for various business units, subscription profiles so you'd get auto-emailed with headline summaries if something changed, etc.), with a projected daily user load in the million-ish range (give or take a couple hundred kilo-users). ASP/nt4/iis4/MS ACCESS DATA FILE. :(

      Oh, and it had to be a proto content management system too because the content-injecting users wanted the ability to modify things on the fly, edit past posts, etc. In other words, it had to not only handle a lot of reads from the RO users but concurrent, transactional RW activity from the higher-privs set of users.

      Calling it ugly would be a stunning understatement, but we got the bastard to fly somehow. I can only thank god in retrospect that they didn't mandate vbscript ASP, at least we got to use something else for the server-side language.

    3. Re:Third rule of scalable, reliable, websites: by wdr1 · · Score: 1

      I can write you a scalable, reliable website using MS-Access. It will be slower and require a lot more code, but it will scale and remain up.

      Is that true? I ask not to be snarky, but because of my ignorance of Access. To scale past one machine, it would have to have replication, which I thought was the point at which MS wanted you to move to SQL Server?

      -Bill

      --
      SlashSig Karma: Excellent (mostly affected by moderatio
    4. Re:Third rule of scalable, reliable, websites: by ben+there... · · Score: 1

      He's pretty much full of it. I've built small databases (100,000 recs, 100-400 peak users) in Access for internal use and after 50 or so concurrent connections you get extreme sluggishness. Moving the backend from Access to SQL Server brought the search engine's speed from 5-20 secs to 0.02 secs, without any other changes.

    5. Re:Third rule of scalable, reliable, websites: by ultranova · · Score: 1

      To scale past one machine, it would have to have replication, which I thought was the point at which MS wanted you to move to SQL Server?

      Not necessarily. Not all solutions require a single coherent database where all changes are instantly visible to everyone. An online shop, for example, could have multiple database machines serving the catalog and simply have the catalog manager propagate any changes to them all. This means that some of the machines are serving the new catalog a few minutes earlier than others, but it won't make any difference. Similarly, the order system could pick an "order processing server" at random. In order to keep the inventory coherent, make it have one server which allocates a certain number of inventory items to each order processing server, and when that server has used them up (or a certain time has passed, or the inventory server asks for it), it reports to the inventory server to get more.

      Of course this is really just programming replication / caching by yourself, instead of using a database feature for it, but it is possible.
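
      A minimal single-process sketch of that inventory scheme, in Python. The class names, the block size, and the policy of asking for a whole block at a time are assumptions for illustration; in a real deployment each allocate() call would be a network request to the inventory server.

```python
class InventoryServer:
    """The single authority on how much stock is still unallocated."""

    def __init__(self, stock):
        self.stock = stock

    def allocate(self, requested):
        """Grant up to `requested` items and return how many were granted."""
        granted = min(requested, self.stock)
        self.stock -= granted
        return granted

    def give_back(self, count):
        """Accept unused items returned by an order processing server."""
        self.stock += count


class OrderServer:
    """Sells from a local allocation and refills it in blocks."""

    def __init__(self, inventory, block_size):
        self.inventory = inventory
        self.block_size = block_size
        self.local = 0  # items this server may sell without asking anyone

    def place_order(self, quantity):
        if self.local < quantity:
            # Ask for a whole block, or more if this one order needs it.
            wanted = max(self.block_size, quantity - self.local)
            self.local += self.inventory.allocate(wanted)
        if self.local < quantity:
            return False  # not enough stock reachable from this server
        self.local -= quantity
        return True

    def release(self):
        """Return the unused allocation, e.g. after a timeout."""
        self.inventory.give_back(self.local)
        self.local = 0
```

      Note the trade-off this makes visible: an order can be refused on one server while another server is still sitting on unsold items, which is why the time-based return path mentioned above matters.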

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  4. PHP and Industry by crnbrdeater · · Score: 4, Insightful

    All my personal sites plus all my side contracting sites run on LAMP.

    I really enjoy working with PHP but...do a search on any tech job board and you will find all two job openings for people with LAMP experience. Embarrassed to say it but I went out and learned ASP.Net/C# so I could make a living.

    I realize there are VERY large PHP/MySQL sites out there but I haven't had that many opportunities to scale a PHP app in a commercial environment. I wonder how many full time PHP developers there are out there and how many of those work on enterprise level websites. Can't be that many, can it?

    (Perhaps we never see these types of openings(LAMP) because developers are so happy with their job that new positions rarely open - heh)

    --
    ~CrnbrdEater
    1. Re:PHP and Industry by suggsjc · · Score: 5, Insightful

      Or...could it be because the LAMP sites don't need to continually add new developers?

      One of the reasons that you don't find openings specifically looking for LAMP experience is probably "the right tool for the right job": large-scale sites aren't going to use strictly LAMP or any specific architecture, but instead a mix of tools. Also, large-scale sites will probably want people for specific tasks (each aspect of LAMP individually) instead of a jack of all trades.

      --
      When I have a kid, I want to put him in one of those strollers for twins and then run around the mall looking frantic.
    2. Re:PHP and Industry by crnbrdeater · · Score: 1

      OK, forget LAMP. Do the same search for jobs that require PHP as the primary language. Again just a handful. Normally when you see PHP, it is at the end of a long list of random technologies. You know, "Must be an expert in Java, C++, ASP.Net, COBOL, COM. Also useful to have experience with PHP and PERL."

      --
      ~CrnbrdEater
    3. Re:PHP and Industry by truthsearch · · Score: 4, Informative

      *raises hand*

      My company creates very large sites with LAMP. We also do Python, Flash, etc. so it's not because we only know PHP. If you don't see the job postings it's for one of 2 reasons: you're looking in the wrong places; or the jobs are largely found by networking and other methods.

      Large financial companies will find developers through job posting sites and head hunters. These companies usually develop on commercial platforms (.NET, websphere, etc.). But large web sites are usually owned by relatively small companies who use more networking and direct contact with open source developers.

      PHP and MySQL are quite capable of running large web sites. They were not created with large scale in mind, however, so there are special considerations you need to keep in mind. I don't recommend it for every large site, but in the right situations it works.

    4. Re:PHP and Industry by cartel · · Score: 1

      I may be flaming here...but IMHO ASP/C# (i.e., Microsoft) is not the way to go for developing web sites - same thing with JSP. Object-oriented is just not necessary for most things on the web. Especially when you use a front-end to develop your HTML/XHTML...that produces some real crappy HTML - and it relies on Javascript/JScript for functionality, which is BAD.

      And the Microsoft stuff is not made to be portable. Of course you have things like Mono, but PHP, Perl, and Python are designed with portability.

    5. Re:PHP and Industry by drinkypoo · · Score: 4, Insightful
      Object-oriented is just not necessary for most things on the web.

      you do realize that for most people, non-OO vs. OO largely boils down to replace(mystring, "foo", "bar") vs. mystring.replace("foo", "bar"). (Whether that's actually correct syntax in any language is another discussion.) You don't have to program in an OO manner just because you're using an OO-capable language.

      With that said, why wouldn't you want to do OO? It's highly useful in a web environment, especially since we tend to think in terms of objects on pages anyway, even if we mean something slightly different by "object".

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    6. Re:PHP and Industry by crnbrdeater · · Score: 2, Insightful

      Depends. Code-behind files really help separate the logic from the display. If you use the built-in data readers and such, you do rely on ASP.Net's built-in JavaScript functions, but that is no way to build high-demand sites. We end up not using many of ASP.Net's built-in front-end features for performance reasons.

      While your comment about OO for front-end stuff is true most of our business is back-end processing. The website itself is just a pretty front for all the work that is being done behind the scenes. With 100+ developers it is helpful to use OO.

      Right or not, most large companies prefer MS over OSS.

      --
      ~CrnbrdEater
    7. Re:PHP and Industry by ukpyr · · Score: 3, Interesting

      To add emphasis to the "best for the job" approach

      We are about to move from a LAMP environment (which we are happy with, I made it!) to a Java environment. Why? Because we are about to start developing a VERY complex product that would be unpleasant to manage in PHP. There is nothing wrong with PHP, there is nothing wrong with Perl. Heck, there is nothing wrong with running BASIC programs and naming them with a .cgi extension, if all that's happening is that you're serving up some vacation photos randomly.

      That's kind of the cool thing about web services and the whole "SOA" thing: fast turnaround, one-shot development can achieve ROI more easily with tools like PHP. Using the service ideology, PHP can fit very well into a larger, vastly complex framework. AKA best of both worlds. Just don't go and build a massive framework and hope to have an easy time of it. (Not saying it can't be done)

      It's all got to do with context.

    8. Re:PHP and Industry by cartel · · Score: 1

      One of the things I really don't like about Microsoft is that their way of doing things is not following standards (i.e., ones not set by them). What Microsoft likes to do is corrupt good things by permeating them with their own extensions, then people get used to them and it corrupts the original way of doing things. For example, I heard that Microsoft is going to be providing MySQL functionality in the future (in Office or Visual Studio).

    9. Re:PHP and Industry by cartel · · Score: 2, Insightful
      you do realize that for most people, non-OO vs. OO largely boils down to replace(mystring, "foo", "bar") vs. mystring.replace("foo", "bar"). (Whether that's actually correct syntax in any language is another discussion.) You don't have to program in an OO manner just because you're using an OO-capable language.

      I forgot about that. Maybe it's good for small things like that, but with web sites (unless you decide to use Javascript and make it interactive) you generally don't have the need for objects (i.e., people, cars, etc.). You're more or less just spitting out content, and then you're done. And you also have no event handling going on which requires the use of objects.

      I'm curious though. Besides working with strings, can you give me example of some objects (OO objects, that is) you might have in a normal web page?

    10. Re:PHP and Industry by Anonymous Coward · · Score: 0

      Business Logic? Order, OrderItem, etc??

    11. Re:PHP and Industry by thevoice99 · · Score: 1

      The problem of scalability is language independent. You end up finding that your bottleneck is your database, where most of the complex logic is happening. The main things that make sites scalable are caching what's on disk in memory and having lots of disks for when you do hit them. These are the real bottlenecks and always will be, because the slowest part of a PC is the hard drive.

    12. Re:PHP and Industry by ramunas · · Score: 1

      well in my website engine I have the Core object, that does the work of parsing the request, the Page object, that is used as a parent for most of the other objects. Database is another object (so that I can use same code for pages that are based on different DBs with only small changes in SQL). Template is an object. Then there are the objects that inherit from page - eg. a Poll object handles everything poll related, a news object handles posting/displaying of news (and comments). A gallery object, a Forum object. You name it. Web and OOP mix quite well if you do it correctly, that is :)

      --
      ./R My blog
    13. Re:PHP and Industry by cartel · · Score: 1

      I guess I should look into that. What language do you use when you do OO?

    14. Re:PHP and Industry by wzzzzrd · · Score: 1

      I'm curious though. Besides working with strings, can you give me example of some objects (OO objects, that is) you might have in a normal web page?

      session objects for example. used to store things across pages. request objects, used to retrieve properties from the client (ie user agent) and for retrieving parameters. it all boils down to: what is your preferred kind of abstraction? for only dynamic pages, the paradigm does not really matter, but when it comes to real server applications where the html spat out is just a frontend, a kind of view, the oo abstraction is far more useful, at least for me.

      btw, the object oriented objects, that is... i like that one...

      --
      On second thought, let's not go to Camelot. It is a silly place.
    15. Re:PHP and Industry by ramunas · · Score: 1

      This model has been implemented in Java (using servlets) and PHP5

      --
      ./R My blog
    16. Re:PHP and Industry by jbplou · · Score: 1

      Maybe it's good for small things like that, but with web sites (unless you decide to use Javascript and make it interactive) you generally don't have the need for objects

      This is the opposite of the truth. OO gains more and more value for larger sites because you benefit from reusable code more and more. All "business logic" is best encased in objects.

    17. Re:PHP and Industry by Kent+Recal · · Score: 1
      Right or not, most large companies prefer MS over OSS.


      huh?!
      flickr, slashdot, google, myspace, amazon, delicious, youtube, ebay... all unix.
      who uses MS? microsoft.com?
    18. Re:PHP and Industry by drinkypoo · · Score: 1

      Just FYI all this is quite normal... For instance Microsoft's ASP (which supports any language you can plug in as ISAPI, but I'm using Jscript/ECMAscript) provides you with session, database connection, response, and other objects. When I got here, the website was already running on IIS, and getting anything changed would likely have been a nightmare, so I just went ahead and started learning ASP and Jscript, neither one is that hard except that the ASP documentation is poop (especially for ADO, which you need for database access) and that the Jscript debugger is complete and unmitigated crap. However, I find using the crap debugger less masochistic than using vbscript. w3schools has been my friend.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    19. Re:PHP and Industry by cartel · · Score: 1

      Point taken.

      What about using plain-old functions for the business logic and putting those functions in their appropriate library files? How is OO on a web site any better than this, besides possible naming conflicts?

      For example: security-specific functions could go into a "security" library file, generic functions could go into a "general" library file, page-specific functions could go into a "pages" library file. Then you could do the same with business logic functions that do all your calculations.

      Do you think this is pretty much the same as using an object-oriented approach, except that the functions are encapsulated in files rather than object classes?

    20. Re:PHP and Industry by ArgieNomad · · Score: 1

      Every airline, every bank, every government agency, every healthcare org, every big company you can think of uses proprietary software and languages instead of OSS.

      (yes, it is a generalization, just like the one in the parent assuming 8 web companies comprise "most large companies")

      --
      I just read /. for the sigs
    21. Re:PHP and Industry by Doctor+Memory · · Score: 1

      In the page, I can't think of much, unless you're doing some AJAX-y stuff and want to keep the DOM in memory so you can manipulate/transform it. If you're trying to be scalable, though, you're probably not doing much client-side, the bulk of your work is going to be done on the server, and that's where OOP really helps out. I think Java actually suffers from a surfeit of web frameworks (I've seen "analysis paralysis" set in when teams try to evaluate a few to figure out which is "the best one"), but they're pretty handy once you know one. Common objects would be the HTTP request/response, a hash table for session variables, you'll probably have some kind of form object (assuming you're writing a web app and not a web site), and if you're serious about scalability you'll have some kind of template class for handling your boilerplate and generating the actual web page you return to the user.

      --
      Just junk food for thought...
    22. Re:PHP and Industry by cartel · · Score: 1

      Well that's some good info. I will keep those things in mind. I bought a JSP book a couple months ago that I haven't looked at too much yet, so I'll look at that.

      How long, might I ask, does it take on average for you - using JSP in the way you said - to develop a basic web site, say one like this one where, in addition to the normal company content, it lets you fill out an order form or send emails via a form (it has about 10 - 12 pages)?

    23. Re:PHP and Industry by Decaff · · Score: 1

      you do realize that for most people, non-OO vs. OO largely boils down to replace(mystring, "foo", "bar") vs. mystring.replace("foo", "bar").

      If that is the case, it is a depressing statement about the current quality of developers.

      The advantages of OOP have been very well researched and understood for nearly 30 years. It is unquestionably a major advance in software development, with relevance to all areas, including websites, allowing re-use, encapsulation, isolation and testing of code (such as, for example, the use of mock objects).

      To read about OOP being labelled as an 'extreme' in any context in 2006 is truly shocking.

      (And also, there is a big difference between a "scalable" website and a "reliable" website).

    24. Re:PHP and Industry by hazah · · Score: 1

      No, this is not the same. First thing's first, objects are based on classes. What this means is that you can have several objects running in your program, sharing the same code. Second, the internal state of the objects can be very complex, however, you do not need to worry about it when using the object, since you are presented with a simple interface. Third, since having an internal state means it is completely isolated from the rest of the program, management and maintenance of complicated abstraction is a lot simpler.

      HOWEVER, if the objects are not properly designed, you can, and probably will, end up with very bad code. Arguably worse than bad procedural code, but I've never had that experience. That is, I did (sometimes still do) write bad OO code, but I hardly ever wrote any procedural code, at all, for serious work.
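
      A small Python sketch of the three points above, borrowing the Poll idea mentioned earlier in the thread. The class and method names are invented for this example; it only illustrates shared code, independent internal state, and a simple interface.

```python
class Poll:
    """One class (shared code), many objects, each with private state."""

    def __init__(self, question, options):
        self.question = question
        # Internal state: callers use vote()/results() instead of this dict.
        self._votes = {option: 0 for option in options}

    def vote(self, option):
        if option not in self._votes:
            raise ValueError(f"unknown option: {option!r}")
        self._votes[option] += 1

    def results(self):
        # Return a copy so callers cannot modify the internal tallies.
        return dict(self._votes)

    def winner(self):
        return max(self._votes, key=self._votes.get)


if __name__ == "__main__":
    language_poll = Poll("Favourite language?", ["PHP", "Java"])
    editor_poll = Poll("Favourite editor?", ["vi", "emacs"])
    language_poll.vote("PHP")
    editor_poll.vote("emacs")
    editor_poll.vote("emacs")
    print(language_poll.results(), editor_poll.winner())
```

      With plain functions in a library file, the vote tallies would have to live somewhere the caller manages (a global, or an array passed into every call); here each object carries its own.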

    25. Re:PHP and Industry by Kent+Recal · · Score: 1
      Every airline, every bank, every government agency, every healthcare org, every big company you can think of uses proprietary software and languages instead of OSS.

      (yes, it is a generalization, just like the one in the parent assuming 8 web companies comprise "most large companies")


      Well, most large corps do indeed use Windows in some areas. Usually on desktops.
      But "every big company you can think of uses proprietary software and languages instead of OSS." is just wrong.
      Hardly anyone uses Windows for critical services, for obvious reasons. Your ATM frontend may be running Windows.
      The backend machine it talks to will more likely be some mainframe or Unix derivative.
    26. Re:PHP and Industry by yerfatma · · Score: 1

      This may be the greatest troll of all time. Here's an object you have in most sites: User.

    27. Re:PHP and Industry by cartel · · Score: 1

      But even there (I assume this stores either permissions or preferences) you could have session variables that are associative arrays rather than objects.

    28. Re:PHP and Industry by Doctor+Memory · · Score: 2, Insightful

      If you're up to speed with a basic web framework (e.g. Struts), doing the basic development for a site like that would take a day or so (less if you have a previous project you can cannibalize). That's all the programming; fighting with the designer to lock down the design and codify the CSS and make it all look pretty will take another week (at least!). Seriously, though, if you have a good designer who can mock stuff up in HTML and write the CSS, a couple of weeks should be plenty of time. That doesn't include round-trips with the client when they have new ideas based on what you show them originally, and it doesn't include end-to-end testing and integration with any of their existing systems (inventory / product catalog / sales).

      From what I've seen, professional web site development proceeds at the client's pace, regardless of the language used. Yeah, you can bang out a prototype faster in some technologies, but 80% of the time is spent reworking the site to comply with changing specifications, and it's really a wash whether it's faster to add or move a form field with PHP, Perl, Ruby or Java. If you're doing "proper" development (separating presentation and logic, using CSS to define look-and-feel, writing modular code), change costs are largely invariant across technologies for simple sites (like the one you pointed out).

      --
      Just junk food for thought...
    29. Re:PHP and Industry by kchrist · · Score: 1

      Not to nitpick, but MySpace very famously runs Windows. It was developed in ColdFusion and was later ported to some sort of CF/.NET hybrid.

    30. Re:PHP and Industry by ArgieNomad · · Score: 1

      Mainframe or Unix derivative, that's why I wrote "proprietary software" and not just Windows.

      --
      I just read /. for the sigs
    31. Re:PHP and Industry by Bogtha · · Score: 1

      First thing's first, objects are based on classes.

      Not necessarily. Object-oriented languages do not necessarily have any concept of "class". For example, you can use JScript for server-side scripting with ASP. JScript is an ECMAScript (JavaScript) implementation, and as such, is a prototype-based object-oriented language not a class-based object-oriented language.

      --
      Bogtha Bogtha Bogtha
    32. Re:PHP and Industry by Kent+Recal · · Score: 1

      Okay you have a point. I read too much "microsoft" into your post. ;-)

    33. Re:PHP and Industry by Kent+Recal · · Score: 1

      Thanks for correcting me, I didn't know that. But it makes sense anyhow...

    34. Re:PHP and Industry by Pete · · Score: 1
      The advantages of OOP have been very well researched and understood for nearly 30 years.

      Well, the so-called "advantages" of OOP have at least been well propagandised for at least some of the last 30 years. :)

      It is unquestionably a major advance in software development, with relevance to all areas, including websites, allowing re-use, encapsulation, isolation and testing of code [...]

      None of those concepts you mention have anything in particular to do with OOP. And you shouldn't really be terribly surprised that more than a few programmers find OOP evangelism more than a little extreme. See Alexander Stepanov and Paul Graham for a couple of good examples.

  5. Alright, I know this may be flamebait... by confusednoise · · Score: 4, Insightful

    ...but I really can't take any book seriously titled "Building Scalable Web Sites" that explains itself using PHP and mySQL. I know PHP/mySQL have their place but I just don't think of them as industrial strength.

    No doubt there will be multiple posts following to tell me how wrong I am, but that's how I see it.

    1. Re:Alright, I know this may be flamebait... by Anonymous+Crowhead · · Score: 3, Insightful

      Another thing to note: If you are in charge of "building a scalable website", and you do not know how to "build a scalable website" and thus resort to reading a book entitled "building scalable websites", then you should probably not be "building a scalable website."

    2. Re:Alright, I know this may be flamebait... by telbij · · Score: 1
      ...but I really can't take any book seriously titled "Building Scalable Web Sites" that explains itself using PHP and mySQL. I know PHP/mySQL have their place but I just don't think of them as industrial strength.


      Look, you're either going to use PHP or you're not. If you are going to use it then this book probably will come in handy. Hey, it's not like the author is talking out of his ass on this one. Flickr is bigger than probably anything you or I will ever work on. My biggest problem with PHP is the absolute shit code that is found in the open source community, thereby teaching new generations of programmers how NOT to do it. Beyond that, I consider the language a little watered down, but if you can use PHP 5 then it's really a fairly capable language, especially if your benchmark is Java.
    3. Re:Alright, I know this may be flamebait... by Anonymous Coward · · Score: 3, Interesting

      ...clearly an ignorant statement. There are a LOT of people, myself included, who are the sole web developer (or a member of a small team) for a company or organization and who all of a sudden find themselves having to prepare a scalable site. In my case it's a university lab, and we now have more hardware to serve the data analysis applications that are being developed in-house. One server works fine as long as you only have a small number of people accessing data, but once the project begins to grow there needs to be some amount of scaling built into the site architecture, no matter how small. Sure, I'm never going to have to worry about high-traffic sites like Slashdot or Digg, but that doesn't negate the fact that scalability is needed.

    4. Re:Alright, I know this may be flamebait... by PornMaster · · Score: 3, Informative

      Having read (and enjoyed) the book... despite using PHP for the examples, there's relatively little dependent on PHP in the text. This isn't a "write really fast PHP code" book. It's about designing systems and process instead of just a web site. It's about setting things up in a way that they'll be maintainable, and you won't have hogtied yourself by putting the logic and the HTML together. It mentions the importance of defining a coding style, whatever that is, so when you have a bunch of developers, there will be consistency... and that the choice of style isn't as important as defining one.

      There's lots missing still... and the long focus on unicode, localization, etc is a bit tedious to get through... but overall, it's a book that I wish that people at $WORK were forced to read.
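
      A rough sketch of what "not putting the logic and the HTML together" can look like, in Python rather than the book's PHP. The template text, `load_users` and `render` are all made up for illustration; the point is only that the markup lives in one place and the data-fetching code in another.

```python
from html import escape
from string import Template

# Presentation: the markup lives in one place, with no application logic in it.
PAGE = Template("<h1>$title</h1>\n<ul>\n$rows\n</ul>")
ROW = Template("  <li>$name: $count photos</li>")


def load_users():
    """Logic layer (hypothetical): would normally query the database."""
    return [{"name": "cal", "count": 12}, {"name": "<eve>", "count": 3}]


def render(users):
    """Glue: escape the data and hand it to the templates."""
    rows = "\n".join(
        ROW.substitute(name=escape(u["name"]), count=u["count"]) for u in users
    )
    return PAGE.substitute(title="Users", rows=rows)


print(render(load_users()))
```

      A designer can change `PAGE` and `ROW` without reading `load_users`, and the query code can change without touching any markup.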

    5. Re:Alright, I know this may be flamebait... by budgenator · · Score: 2, Insightful

      If what you are really after is an iron-clad, explosion-proof, enterprise-scale, industrial-strength web application where any downtime results in a horrendous loss of revenue, you're not going to grab PHP/MySQL off the shelf and run with it any more than you are going to grab anything off the shelf and run with it. The bottom line will always be that a well-designed and well-coded application connecting to a mid-level database on commodity hardware will outperform the best database running on the best hardware when the connecting application is poorly designed and coded. When the stakes are high, every component needs to be critically analysed and tested; nothing gets a free pass based on he-said she-said or I think.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    6. Re:Alright, I know this may be flamebait... by drinkypoo · · Score: 1
      Beyond that, I consider the language a little watered down

      My problem with it is that it has no reason to exist. It's basically a fucked over version of perl, down to some of the syntax, but since it's an exceptionally pathetic replacement for perl, I have to wonder what they were thinking. Instead of writing the Zend script engine, they could have been using perl.

      It's not that perl is the ultimate language or anything, but websites are text, and perl is good at mangling text. Plus, CPAN makes PEAR look more like the stem.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    7. Re:Alright, I know this may be flamebait... by dubl-u · · Score: 1

      but I really can't take any book seriously titled "Building Scalable Web Sites" that explains itself using PHP and mySQL. I know PHP/mySQL have their place but I just don't think of them as industrial strength.

      Flickr is the 41st most used website in the world, so scoff at your peril. And I say that as a heavy OO guy who does a lot of Java work.

      I've seen Cal's day-long presentation that appears to have been turned into this book. He's very smart, and has done great work making the technologies suit their needs. He has proven that you can scale these technologies up to handle massive traffic while building a lot of neat stuff along the way. If the book is half as good as his presentation, it's well worth it.

      My main scaling concern with this approach is instead that I don't see how they'll scale up the number of developers. Database-centric apps end up mainly being about the care and feeding of your database servers, and Flickr's no exception to that. PHP seems to be just fine for page rendering, but I haven't seen anything to convince me that it's a good general-purpose language like Java, Ruby, or Python. So with databases screwing up your abstractions on one end and PHP limiting them on the other, I'd worry that the code base is on the long-term path to being both rigid and fragile.

      So with Flickr, my real measure for them will be whether they can keep innovating like they did in their first couple of years. If not, I'll suspect that the two-tier PHP/MySQL architecture is at fault.

    8. Re:Alright, I know this may be flamebait... by Kent+Recal · · Score: 1

      And perl is even faster than PHP in most situations - if only by a small margin.

      I guess the main reason everybody uses PHP is because, well, everybody else uses PHP.

      Perl just doesn't have the ease of "just drop your .php file into docroot" nor an established, ready to go
      "build a website in 3h" framework like ruby-on-rails.
      To get decent results with perl you need not only some intimate knowledge of the perl-language itself
      but also a pretty good idea of how a webapp (or even "servlet engine") smells from
      the inside because, basically, you'll be writing one from scratch.

      Anyways, as I was taught recently, real men write their webapp in LUA. ;-)

  6. Re:I know they get kickbacks, but... by rtconner · · Score: 1

    and overstock is cheaper still

    --
    023AD01("Child", "Evil");
  7. Re:I know they get kickbacks, but... by Anonymous Coward · · Score: 1, Interesting

    The Free Software community decided a long time back that Amazon is not threatening anyone with its patents. GNU ended its boycott four years ago.

  8. Well, you're not alone by Dekortage · · Score: 2, Insightful

    Your opinion is shared by many... but see this other post on Slashdot for my response.

    --
    $nice = $webHosting + $domainNames + $sslCerts
  9. Apache's mod_cache by tcopeland · · Score: 1

    This was mentioned in Rich Bowen's excellent lightning talk and is listed as "experimental" in the Apache 2.0 docs but as an "extension" in the Apache 2.2 docs. Anyone have experience with this? Seems very tweakable...

    1. Re:Apache's mod_cache by Anonymous Coward · · Score: 0

      We've only just started using it, but it's handy for keeping the load off the database for when you get Slashdotted or whatever. Even with dynamic content you can usually cache for a few seconds without serving stale responses, and if you do that, then when a single page gets hammered, it'll all be coming out of main memory instead of the database, reducing the hits to your database to one every few seconds.

      The disk cache is also handy for when you are generating relatively expensive content that changes infrequently - you can get rid of your application-specific caching and all you have to worry about is setting the HTTP headers correctly.
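
      To make the "cache for a few seconds" idea concrete, here is a small application-level sketch in Python. It is not mod_cache configuration (mod_cache does this at the HTTP layer); `ShortTTLCache`, `render_front_page` and the 5-second TTL are assumptions picked for the example.

```python
import time


class ShortTTLCache:
    """Caches one rendered page per key for a few seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key, build):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]            # served from memory, no database work
        value = build()              # the expensive part: queries + rendering
        self._store[key] = (now + self.ttl, value)
        return value


db_hits = 0


def render_front_page():
    global db_hits
    db_hits += 1                     # stands in for the real database queries
    return "<html>front page</html>"


cache = ShortTTLCache(ttl_seconds=5)
for _ in range(1000):                # a burst of requests for the same page
    cache.get("/", render_front_page)
print(db_hits)
```

      However many requests arrive inside the TTL window, the page is rebuilt at most once per window, which is the "one hit every few seconds" effect described above.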

    2. Re:Apache's mod_cache by tcopeland · · Score: 1

      > Even with dynamic content you can usually cache for a few seconds without serving stale responses

      That's an excellent point... I was thinking about the front page of RubyForge, and caching that for 30 seconds or so shouldn't inconvenience anyone. Maybe I'll give that a whirl. Thanks for posting!

  10. Nothing about Caching? by zigamorph · · Score: 3, Insightful

    Without any caching the M in LAMP quickly becomes a bottleneck.

    1. Re:Nothing about Caching? by Haeleth · · Score: 1

      Without any caching the M in LAMP quickly becomes a bottleneck.

      Don't be silly. Everyone knows LAMP stands for "Linux, Apache, Machine code, and PostgreSQL". You aren't going to tell me my hand-assembled Apache module is a bottleneck, are you?

    2. Re:Nothing about Caching? by ultranova · · Score: 1

      Don't be silly. Everyone knows LAMP stands for "Linux, Apache, Machine code, and PostgreSQL". You aren't going to tell me my hand-assembled Apache module is a bottleneck, are you?

      In all likelihood, yes it is. Doing assembly by hand means that you're forced to spend your time on low-level optimizations, and therefore don't have as much time for high-level ones as you would if you used a higher-level language.

      Apart from this, a hand-assembled module must be re-optimized manually for each new processor type, while a C one can simply be recompiled. This means that upgrading your machine doesn't necessarily get you as much of a performance benefit as you'd get otherwise. There might be issues with upgrading Apache and PostgreSQL (and its libs) as well.

      Finally, in a typical browser -> Apache -> database application the module doesn't do any heavy computing; the majority of the load comes from the database, interprocess communication between Apache and PostgreSQL, and HTTP parsing by Apache. Therefore, if you feel like doing low-level optimization, you might want to concentrate on the OS kernel - the I/O scheduler is absolutely critical for database performance, after all. Also, PostgreSQL seems to be quite picky about the conditions under which it uses indexes, so checking your SQL (with PostgreSQL's EXPLAIN command) and optimizing it is likely to yield far greater performance benefits than hand-optimizing the middle module.

      And yes, I understand that you were joking, but there's always people who think that using lower-level languages is the solution to all performance problems; just look at any story mentioning Java.
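
      A quick way to see the "check whether your query uses an index" step in action. This sketch uses SQLite's `EXPLAIN QUERY PLAN` from Python's standard library as a stand-in for PostgreSQL's EXPLAIN, so the plan wording differs from what PostgreSQL prints; the `photos` table and index name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, owner TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO photos (owner, title) VALUES (?, ?)",
    [(f"user{i % 50}", f"photo {i}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


query = "SELECT title FROM photos WHERE owner = ?"
before = plan(query, ("user7",))       # no index yet: expect a full table scan
conn.execute("CREATE INDEX idx_photos_owner ON photos (owner)")
after = plan(query, ("user7",))        # expect a search using the new index
print(before)
print(after)
```

      The first plan should report a scan of the whole table and the second a search that names `idx_photos_owner`; the exact text depends on the SQLite version.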

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  11. Re:Second rule of scalable, reliable, websites: by iamdrscience · · Score: 3, Funny

    You're both wrong, the first and second rules of scalable, reliable websites are "Do not talk about scalable, reliable websites".

  12. Related Titles? by Anonymous Coward · · Score: 0

    This text seems to balance general philosophy with LAMP implementation. Anyone got suggestions for other related books, preferably with a .NET or Java specific bent?

  13. Here's an article on actual scalability by Anonymous Coward · · Score: 1, Insightful
    1. Re:Here's an article on actual scalability by Anonymous Coward · · Score: 0

      You didn't just quote an article that was written in January 2003 did you? You must be new here.

    2. Re:Here's an article on actual scalability by Tumbleweed · · Score: 1

      Did you happen to notice the year that was published? Is this going to be relevant 3.5 years later?

  14. It's amazing how many people break these rules by Reverend528 · · Score: 2, Insightful

    As we all know, MySQL databases and the PHP (Personal Home Page) language can't be used for building robust enterprise apps (there's not even an enterprise version of PHP). It baffles me to see people building large reliable websites with these technologies. I'm very tempted to e-mail their webmasters and ask them how they are defying logic.

    1. Re:It's amazing how many people break these rules by ukpyr · · Score: 2, Insightful

      ideal != workable

      I work in Perl (mod_perl), PHP, and Java on a daily basis. For simple one-shot applications or very narrowly focused projects (see your examples), PHP works fine and is a fairly speedy tool to use. When you introduce an environment like an intranet, or interacting with a non-MySQL database, complex processes, or interacting with (SHOCK!) non-web applications, Java has a huge advantage due to its rigid structure. For complex, larger-team applications or groups of applications, PHP falls short very quickly.

      Best tool for the job rules the day.

    2. Re:It's amazing how many people break these rules by crnbrdeater · · Score: 1

      <RecursiveLoop>
      PHP(PHP(PHP(PHP(PHP...: Hypertext Preprocessor): Hypertext Preprocessor): Hypertext Preprocessor): Hypertext Preprocessor)
      [infinity]
      </RecursiveLoop>

      --
      ~CrnbrdEater
    3. Re:It's amazing how many people break these rules by Anonymous Coward · · Score: 0

      It baffles me to see people building large reliable websites with these technologies.

      The straw house in "The Three Little Pigs" was a perfectly functional house.

    4. Re:It's amazing how many people break these rules by Reverend528 · · Score: 1
      The straw house in "The Three Little Pigs" was a perfectly functional house.

      Well, the wolves have yet to blow down wikipedia.

    5. Re:It's amazing how many people break these rules by Anonymous Coward · · Score: 0
      | The straw house in "The Three Little Pigs" was a perfectly functional house.


      Well, the wolves have yet to blow down wikipedia.


      Ah, I see... it hasn't happened yet, therefore it never will.

    6. Re:It's amazing how many people break these rules by taosystems · · Score: 2, Insightful

      You're confusing webapp data with commercial data. Sites like wikipedia.org and friendster.com don't need to keep multi-indexes on multi-tables; they only need to update ONE record at a time, with little interaction involved with other data tables. And most of the data stored is static, which, combined with simpler db fetches, makes for a faster site than you think. And let's not forget about caching the compiled PHP code itself. It may not be for the purist, but for the job, it works well.

    7. Re:It's amazing how many people break these rules by Reverend528 · · Score: 1

      Security is not inherent to a platform or language. A brick house is no safer than a straw one if the door locks don't work.

    8. Re:It's amazing how many people break these rules by WWWWolf · · Score: 1
      Well, the wolves have yet to blow down wikipedia.

      And here I am, speedy-deleting stuff every day! Thanks, I was kind of wondering why I was compelled to do that... =)

    9. Re:It's amazing how many people break these rules by acb · · Score: 1

      Wasn't it originally Perl Hypertext Processor or Perl Home Page, being based on/forked from an early version of Perl (hence the similar syntax)?

    10. Re:It's amazing how many people break these rules by crnbrdeater · · Score: 1

      yeah. PHP started as a collection of Perl scripts called "Personal Home Page Tools" or something. Of course, this was just a precursor to PHP as its own scripting language.

      --
      ~CrnbrdEater
  15. Re:I know they get kickbacks, but... by BionicPimp · · Score: 1
  16. depends on where you look Re:PHP and Industry by StandardDeviant · · Score: 2, Interesting

    Heh. Funny that you should say that, given that my last two jobs involved (among other things) building Really Big web apps using LAMP for places like Dell (internal but vast amounts of data and traffic) and Dun and Bradstreet (publicly facing information service that was frequently in the top 500 busiest sites according to Alexa, for what their stats are worth). I know we weren't alone in those projects, either. It really depends on where you are looking; different cities have vastly different characters. If you're in a place with lots of startup/R&D/academia, you'll see a higher % of listings using open source toolchains. If you're in a place where most of the openings are from "traditional"/old-line businesses (say shipping or insurance) you'll see a much higher % listing of things like AS/400, MSFT, etc. Java's the one toolchain that seems to do a reasonable job of thriving in both of those environments (Tomcat/JBoss for the startups, WebLogic for the megacorps, etc. etc.).

    So fwiw if you're looking to do large corp/site LAMP work all I can say is look around a bit more. You may end up moving to a different metropolis, but sometimes that's a big win from the career/job pool standpoint anyway.

  17. Save $6.80 by buying the book here! by Anonymous Coward · · Score: 0

    Save yourself $6.80 by buying the book here: Building Scalable Web Sites. And if you use the "secret" A9.com discount, you can save an extra 1.57%! That's a total savings of $7.20, or 22.83%!

  18. Here's an article ACTUALLY MENTIONING PHP by warith · · Score: 5, Informative

    "PHP just can't cut it"?

    Um, care to explain just what in the hell that statement is based on, since the article you linked doesn't even mention PHP? It compares different webservers and cache settings. Differences in programming languages don't even enter into it.

    Here's an article on scalability that's actually relevant to PHP, a case study about Digg.

    Conclusion:

    "It turns out that it really is fast and cheap to develop applications in PHP. Most scaling and performance challenges are almost always related to the data layer, and are common across all language platforms. [...] There is simply no truth to the idea that Java is better than scripting languages at writing scalable web applications. [...] it just isn't true to say that PHP doesn't scale, and with the rise of Web 2.0, sites like Digg, Flickr, and even Jobby are proving that large scale applications can be rapidly built and maintained on-the-cheap, by one or two developers."

    1. Re:Here's an article ACTUALLY MENTIONING PHP by Anonymous Coward · · Score: 1, Informative

      So, if PHP is scalable, why is Digg so painfully slow? Seriously, if I open Slashdot (Perl) and Digg (PHP) side by side, I can read about five Slashdot stories and all the 5-rated comments in the time it takes Digg just to start displaying the first page.

      You can't blame the hardware. Scalability is about not needing to keep throwing hardware at the problem...

    2. Re:Here's an article ACTUALLY MENTIONING PHP by warith · · Score: 2, Insightful

      "So, if PHP is scalable, why is Digg so painfully slow?"

      Well, as clearly identified in both the article and the quote I provided, all the scalability issues they encountered were related to their DATABASE LAYER. So my first guess (based on this case study) would be that Digg's database architecture is still inferior to Slashdot's, instead of a knee-jerk condemnation of PHP. YMMV.

      "Scalability is about not needing to keep throwing hardware at the problem..."

      Wrong... scalability is precisely about the ability to meet increased demand gracefully with modest increases in resources (usually hardware).

      Scalability does NOT mean, "the ability to handle increasing load with the same resources".

    3. Re:Here's an article ACTUALLY MENTIONING PHP by mrobinso · · Score: 1

      There are so many factors that determine the scalability of any given
      platform. Making a roundhouse remark like that is, well, painful.

      I've had no trouble at all tweaking an Apache/PHP install so that 450 requests
      per second, including a db handle to mysql for each, doesn't even pop top over
      20% overhead.

      I've seen tons of LAMP installs where PHP was compiled with every bloody extension
      under the sun when only a handful were needed. Slow 7200 rpm IDE disks. Lack of
      RAM. Poorly tweaked Apache installs. Literally, all kinds of reasons.

      PHP scales quite well, as does MySQL and Apache.

      The only REAL scalability issue I've seen is dumb admins making stupid,
      very avoidable mistakes.

      --
      -- Karma whore? You betcha. --
    4. Re:Here's an article ACTUALLY MENTIONING PHP by warith · · Score: 1
      There are so many factors that determine the scalability of any given
      platform. [...] I've seen tons of LAMP installs where PHP was compiled with every bloody extension
      under the sun when only a handful were needed. Slow 7200 rpm ide disks. Lack of
      ram. Poorly tweaked apache installs. Literally, all kinds of reasons.

      PHP scales quite well, as does MySQL and Apache.


      Exactly... PHP itself is rarely, if ever, any sort of bottleneck. I've never really seen a webapp that wasn't I/O bound in some form or another, unless it's doing something wickedly fun like image processing.

      There are plenty of valid criticisms against PHP, but I've yet to see one that clearly demonstrates poor scalability at the language level. (Which is actually why I was interested in the OP in the first place, only to be disappointed that his link didn't have anything to do with PHP or language scalability whatsoever)
    5. Re:Here's an article ACTUALLY MENTIONING PHP by kashani · · Score: 1

      Slashdot was slow when it was launched back in the day. I remember when the first story or two got over 100 comments and all hell broke loose on the site.

      kashani

      --
      - Why is the ninja... so deadly?
  19. Prepare for massive PHP bashing in 3, 2, 1, ... by Qbertino · · Score: 3, Interesting

    Knock it off already.
    I've had enough of the eternal dimwits constantly bashing this or that with "MySQL not scalable", "PHP not scalable", blablabla.
    PHP has arrived in the enterprise market. That's a fact. Yes, I know, Java has been there for 8 years, PHP is messy and quirky (so is Perl), MySQL isn't a DB, we've heard it all before.
    In case you haven't noticed: PHP 5 is out. It's a full blown, mature PL and arguably the 400 pound gorilla of SSI solutions with a long history. MySQL 5 is out as well. It's a full blown DB and comes with tons of free x-platform admin and design tools that make building the outline of a large webapp a walk in the park and thus scares the living daylights out of Oracle and IBM. You may have noticed IBM virtually giving their DB2 away for free (beer) since just a few months ago. Guess how that happened.
    Imagine someone would come along and tell you that large-scale webapps in Perl are a pipedream. Not too far-fetched in this context, no? And what about slashdot and kuro5hin?

    PHP is as good a technology as any other in use when it comes to building large webapps (case in point: www.rubyonrails.org/index.php/ ). Industry strength PHP frameworks are popping up left, right and center and other places like mushrooms after the rain. And as for MySQL "not being ready for large, scalable apps" - you're being silly.

    --
    We suffer more in our imagination than in reality. - Seneca
    1. Re:Prepare for massive PHP bashing in 3, 2, 1, ... by Anonymous Coward · · Score: 0

      PHP may be popular (hey, so is Java), but that doesn't make it the best tool. That just makes it the tool preferred by the masses. Perl is still the best tool. Sure, perhaps even more than other languages, Perl seems to invite poor coding practices from naive programmers. But in capable hands, it is the most effective and flexible tool available. Good, modern, best-practices Perl web coding always uses some form of templating to enforce a separation of code and HTML, whereas PHP seems to be designed to mix the two (with ugly results).

      And as for MySQL - it definitely got better with v5, but it's still a toy compared to the less-popular PostgreSQL.

      The masses don't choose the best solution, they choose the solution with the lowest barrier to entry combined with the best marketing.

    2. Re:Prepare for massive PHP bashing in 3, 2, 1, ... by nuzak · · Score: 1

      > PHP has arrived in the enterprise market. That's a fact. Yes, I know, Java has been there for 8 years

      And the two of them together are even better.

      --
      Done with slashdot, done with nerds, getting a life.
    3. Re:Prepare for massive PHP bashing in 3, 2, 1, ... by drew · · Score: 2, Informative

      MySQL 5 is out as well. It's a full blown DB and comes with tons of free x-platform admin and design tools that make building the outline of a large webapp a walk in the park and thus scares the living daylights out of Oracle and IBM.

      That may be, but when they release MySQL 12, I still won't be using it if it's still written by the same developers that claimed for years that adding referential integrity to a database just slows it down and programmers should be handling that in their application code, implemented "transactions without atomicity", insert statements that return the value of the "inserted" ID whether the insert succeeded or not, and have otherwise generally demonstrated for at least the last 8 years that they know nothing at all about designing a reliable relational database. (Or maybe I'm just scarred for life from having to truncate corrupted tables once a week the last time I was responsible for maintaining a MySQL database.)
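
      For anyone unsure what "referential integrity in the database" buys you over checks in application code, here is a small sketch. It uses SQLite through Python's standard library, not MySQL, and the `users`/`photos` schema is invented for the example; the point is only that the database itself rejects a row pointing at a user that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE photos ("
    " id INTEGER PRIMARY KEY,"
    " owner_id INTEGER NOT NULL REFERENCES users(id),"
    " title TEXT)"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'cal')")
conn.execute("INSERT INTO photos (owner_id, title) VALUES (1, 'trifle')")  # fine

try:
    # No user 42 exists, so the database itself refuses the row.
    conn.execute("INSERT INTO photos (owner_id, title) VALUES (42, 'orphan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM photos").fetchone()[0]
print(rejected, count)
```

      With the constraint in the schema, every client of the database gets the same guarantee, including scripts and tools that never run the application's own validation code.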

      PHP is a decent platform on the other hand (when combined with a competent database), although I find the language itself to be rather quirky. Maybe they've improved this in PHP5, which I haven't used very much. Personally, I prefer JavaScript as a server side scripting language, but the only platform I know that has that as an option is ASP. It would be nice if PHP would go the ASP route as far as separating out the scripting language from the rest of the platform. Then you could still use PHP's strengths but not be tied to one language. For example, people who like Ruby but just don't get the hype behind Ruby on Rails could still have a really great web development platform.

      --
      If I don't put anything here, will anyone recognize me anymore?
    4. Re:Prepare for massive PHP bashing in 3, 2, 1, ... by Anonymous Coward · · Score: 0

      That may be, but when they release MySQL 12, I still won't be using it if it's still written by the same developers that claimed for years that...

      I've got exactly the same reservations about MySQL. I posted a similar comment a while back, and one of the MySQL developers admitted that their attitude was all fucked up.

    5. Re:Prepare for massive PHP bashing in 3, 2, 1, ... by SecondHand · · Score: 1

      And what about Slashdot...

      They get /.-ed all the time.

    6. Re:Prepare for massive PHP bashing in 3, 2, 1, ... by BlueYoshi · · Score: 1
      You may want to check this Wikipedia article: http://en.wikipedia.org/wiki/Server-side_JavaScript

      By the way, I love JavaScript a lot, and I've played a little with Rhino in a servlet. You can do great stuff:

      versatility of JavaScript + libraries and frameworks of Java

      --
      "Use cases are fairy tales..." I. S. 2005
  20. You are stupid. by Anonymous Coward · · Score: 0

    Yes, people can make high traffic sites in PHP. Notice how those sites are incredibly simple, served almost entirely from cache, and could run off of a single machine instead of several dozen had it been written by people with brains.

    1. Re:You are stupid. by Reverend528 · · Score: 1
      Yes, people can make high traffic sites in PHP. Notice how those sites are incredibly simple, served almost entirely from cache, and could run off of a single machine instead of several dozen that would be required if it was written in Java.

      If people can make high-traffic sites that run on a single machine, then why on earth would they want to add the complexity of making it run off of several machines? I fail to see how simplicity and ease of use are design flaws.

    2. Re:You are stupid. by Kent+Recal · · Score: 1
      If people can make high-traffic sites that run on a single machine, then why on earth would they want to add the complexity of making it run off of several machines? I fail to see how simplicity and ease of use are design flaws.


      Because otherwise your game will end once your load exceeds the capacity of the biggest single machine that you can afford.
      Get a quote from Sun/IBM for one of their bigbabies and you'll know what I mean.
    3. Re:You are stupid. by Anonymous Coward · · Score: 0

      Simple answer, redundancy.

      If our website goes down, we lose hundreds to thousands of dollars per minute. If we manage to avoid even a few minutes per year of downtime by having more machines, those machines have paid for themselves.

      A side bonus is that having multiple machines makes upgrades easy, just take half the machines out of rotation, roll out new code, restart the webserver, swap who is online, and repeat on the other half of your servers.
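
A minimal sketch of the half-and-half upgrade described above, in Python. The `deploy` and `restart` callbacks and the host names are hypothetical placeholders, not any real load balancer's API; the point is only the ordering: pull one half out of rotation, upgrade it, put it back, then repeat on the other half.

```python
def rolling_upgrade(servers, deploy, restart):
    """Upgrade a pool in two batches so that one half keeps serving traffic.

    Returns, for each batch, the sorted list of hosts that stayed in rotation
    while that batch was being upgraded.
    """
    in_rotation = set(servers)
    half = len(servers) // 2
    serving_during_batch = []
    for batch in (servers[:half], servers[half:]):
        in_rotation -= set(batch)              # take this half out of rotation
        serving_during_batch.append(sorted(in_rotation))
        for host in batch:
            deploy(host)                       # roll out new code (hypothetical callback)
            restart(host)                      # restart the webserver (hypothetical callback)
        in_rotation |= set(batch)              # swap the upgraded half back in
    return serving_during_batch


if __name__ == "__main__":
    upgraded = []
    snapshots = rolling_upgrade(
        ["web1", "web2", "web3", "web4"],
        deploy=upgraded.append,
        restart=lambda host: None,
    )
    print(snapshots)
    print(upgraded)
```

Note that between the two batches both code versions are briefly in rotation at once, so the new code has to tolerate running alongside the old.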

    4. Re:You are stupid. by Reverend528 · · Score: 1

      Well, just for the record, Wikimedia is very open about their server configurations, and (big surprise) their LAMP solution is multiplexed across many machines. So, before you claim that a technology won't scale, you should make sure there's not a widely known example of its scalability.

    5. Re:You are stupid. by Kent+Recal · · Score: 1

      You either replied to the wrong post or you misread mine...

  21. Re:Wait, FLICKR? by hotfireball · · Score: 1
    > The website that takes FIFTEEN SECONDS to display a photograph

    Better throw away your 14.4k analog modem...

  22. And yet you shouldn't. by Ayanami+Rei · · Score: 1

    1) Mod_perl
    2) FastCGI
    3) FastCGI or a daemon process using Apache2::* to integrate with Apache as an in-any-capacity servlet engine.

    You use these to create idioms for how your cgis handle requests.

    Then you move on to your object persistence, session handling... (may I suggest memcached?)
    And you have choices there.

    I guess that's what makes Perl nice in this sense... you can pick and choose from all different parts and put it together how you feel comfortable.
    You can use HTML::Inline or Mason or SimpleTemplates or XSLTs or straight friggin prints for the View portion and they are all on equal footing.

    OTOH, if you are going to be cutting and pasting code from the web, then by all means, use Zend or Rails or some vertically integrated system. (Well, I shouldn't lump Rails in there so much, but the ActiveRecord thing rubs me the wrong way.)

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  23. It's not the best in the world but ..... by crivens · · Score: 1

    It's not the best in the world but I still enjoy tinkering with http://wsframework.sf.net/. The theme is BAD but I'm working on it!

  24. Netspeed by aersixb9 · · Score: 1

    I'm a noob at this, but isn't the only factor in website speed the speed of the website's internet connection? I'm sure dinkySQL, MySQL, and MS SQL can all handle 10 simultaneous DSL clients connecting at 768k...and 768k * 10 is about 7.7Mbits...those above programs can probably handle 100 or 1000 users, although this increases the bandwidth usage to about 77Mbits and 770Mbits/second...for the price of those connections, can the website be written in any language with any database, and the speed difference could simply be accounted for with a $10k quad-processor RAID-5 10k rpm system? Or is that even necessary, since most websites have close to zero processing required and can probably cache all the resources in RAM, and a P2 would be fast enough to host across a 1.5Mbit T-1? Or is this book aimed at people with 10+Mbit connections, who may or may not need 3 or 4 computers to send their web pages fast enough across the internet connection bottleneck?
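
The back-of-the-envelope bandwidth numbers in this question can be checked with a few lines of Python. This is only a sketch: it assumes every client pulls the full 768 kbit/s at the same moment and takes a T-1 as 1544 kbit/s, and it says nothing about CPU or database load, which is what the replies go on to argue about.

```python
DSL_KBITS = 768      # per-client downstream rate used in the comment
T1_KBITS = 1544      # nominal T-1 line rate (assumption for this sketch)


def aggregate_mbits(clients, per_client_kbits=DSL_KBITS):
    """Total bandwidth in Mbit/s if every client downloads at full rate at once."""
    return clients * per_client_kbits / 1000


def clients_to_saturate(link_kbits, per_client_kbits=DSL_KBITS):
    """How many full-rate clients fit on a link before it is saturated."""
    return link_kbits // per_client_kbits


if __name__ == "__main__":
    for clients in (10, 100, 1000):
        print(clients, "clients:", aggregate_mbits(clients), "Mbit/s")
    print("full-rate DSL clients on one T-1:", clients_to_saturate(T1_KBITS))
```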

    1. Re:Netspeed by Anonymous Coward · · Score: 0

      You're right.

      I should tell the folks up at corporate that we've clearly spent too much money on webservers, load balancers, databases, etc.

      (Note: I work at eBay.)

    2. Re:Netspeed by Decaff · · Score: 1

      I'm a noob at this, but isn't the only factor in website speed the speed of the website's internet connection?

      No, it isn't. From personal experience, many commercial websites require pretty complex queries for each customer, which may require significant calculations and searching of many thousands (or hundreds of thousands) of records (all within a single transaction). If this is the case, and you have many thousand active users, the database is the slowest factor. Really high-load websites often use complex data caching techniques, which work across clustered machines, in order to reduce to a minimum the need to drop down to the database. This is way beyond the capabilities of PHP.

    3. Re:Netspeed by Anonymous Coward · · Score: 0

      You're right about the way that complex sites use databases and caching.

      You're wrong that this is beyond the capabilities of PHP.

      However it is beyond the capabilities of the average PHP "programmer".

    4. Re:Netspeed by Anonymous Coward · · Score: 0
      Really high-load websites often use complex data caching techniques, which work across clustered machines, in order to reduce to a minimum the need to drop down to the database. This is way beyond the capabilities of PHP.


      I wouldn't hold it against PHP. Replication is the database server's job.
    5. Re:Netspeed by Decaff · · Score: 1

      You're wrong that this is beyond the capabilities of PHP.

      No, I am not. Pure PHP does not have application-scope caching. You can't do this kind of stuff simply with session scope.

      However it is beyond the capabilities of the average PHP "programmer".

      Why should it be? This is trivial in other web development languages, such as Cold Fusion.

    6. Re:Netspeed by Decaff · · Score: 1

      I wouldn't hold it against PHP. Replication is the database server's job.

      No it isn't, and if you assume it is you are going to end up with a slow system if you want to really deal with very high traffic. All serious modern high-end web development assumes that you should only drop down to the database when absolutely necessary, and cache data in the middle tier. Take a look at approaches like the very widely used Tangosol Coherence to see what I mean.
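
For readers who haven't seen the pattern, here is a minimal sketch of "cache in the middle tier, drop down to the database only when necessary". It is a plain in-process cache-aside with a time-to-live, written in Python; it is not Coherence and not clustered, and `load_from_db` is a hypothetical stand-in for whatever query the application would run.

```python
import time


class CacheAside:
    """In-process cache that falls back to the database only on a miss or expiry."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # hypothetical callable: key -> value
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # fresh hit: no database round trip
        value = self._load(key)        # miss or stale: go to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key after a write so the next read reloads it."""
        self._entries.pop(key, None)
```

A clustered cache additionally has to keep copies on different machines consistent, which is the hard part and is not attempted here.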

    7. Re:Netspeed by Anonymous Coward · · Score: 0

      We may be talking past each other.

      First of all, making something cached is (or should be) trivial. So my comment about PHP programmers was meant as a (probably fairly well deserved) insult to the average person who calls themselves a PHP programmer.

      However the truth is that knowing when to cache, and where to cache, is not so trivial. Caching isn't magic, and I've seen sites go down because of a poorly thought out cache.

      Now you're talking about application scope caching. I'm not entirely sure what you mean by that. At Rent.com we cache different things including at all of the following levels:

      - On the browser
      - (Very temporarily) in lightweight Apache children in a reverse proxy configuration
      - Per Apache child
      - Per webserver in shared memory backed by disk (I strongly suspect this is equivalent to what Cold Fusion provides)
      - In memory at Apache startup (so most of it is in shared memory for each webserver on Linux)
      - On webserver disk with mirroring to keep the webservers in sync
      - On an application server that all webservers connect to.
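
One way to picture several of these levels working together is a lookup that tries the closest cache first and backfills the closer levels on a hit further out. The Python sketch below is an illustration under that assumption, not a description of Rent.com's actual code; the tiers are plain dicts standing in for per-process memory, shared memory, and so on, and `load_from_source` is a hypothetical loader.

```python
def tiered_get(key, tiers, load_from_source):
    """Look a key up in caches ordered from closest/fastest to furthest/slowest.

    On a hit, copy the value into every closer tier. On a complete miss,
    load it from the source and populate all tiers.
    """
    for depth, tier in enumerate(tiers):
        if key in tier:
            value = tier[key]
            for closer in tiers[:depth]:
                closer[key] = value        # backfill the faster levels
            return value
    value = load_from_source(key)          # nothing cached anywhere
    for tier in tiers:
        tier[key] = value
    return value


if __name__ == "__main__":
    per_process = {}
    shared_memory = {"listing:42": "cached page"}
    print(tiered_get("listing:42", [per_process, shared_memory], lambda k: "fresh " + k))
    print(per_process)
```

The sketch leaves out expiry and invalidation entirely, and those are where a poorly thought out cache tends to go wrong.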

      We have not yet gone to the step of having a read-only mirror of our database which is queried for some things, but we have begun thinking about the day when we may need to do that.

      Figuring out which kind of caching is useful where and when is not so easy. It depends on many things.

      Guess what? None of these caching mechanisms are built into our development language (Perl), and most are not built into our development environment (Mason). I don't know how many of these are built into PHP, but I am confident that if I was using PHP for anything serious I'd be able to figure out how to do all of these, whether it is built in or not. I am also confident that the work it would take to make this easy would be minor compared to the effort of building and maintaining the application that needed it. I'd also be willing to bet money that all of these have already been done in PHP. For instance I can't imagine that a company like Yahoo could use PHP as much as they do without encountering the need for these.

    8. Re:Netspeed by Anonymous Coward · · Score: 0

      I'm still sceptical about your claims.

      However, you do seem to know your stuff so I shall investigate facts instead of reiterating the-world-according-to-me. Thanks for the pointer on that J2EE solution.

  25. Also AJAX needs to be used ... by Anonymous Coward · · Score: 0

    Source - http://www.kaneva.com/channel/channelPage.aspx?communityId=12834&pageId=13293 [kaneva.com] ajaxWrite is a web-based word processor that can read and write Microsoft Word and other standard document formats. Anytime you need to open, read or write a word processor file, simply point your Firefox browser to www.ajaxwrite.com and in seconds a full-featured program will be available for you to open, edit, print and save. ajaxWrite has been designed to look like Microsoft Word, making it easy for anyone to start using it without needing to learn a new program. ajaxWrite also handles all the popular document formats so it's easy to share your files and collaborate with your co-workers and friends. Once finished with your document, you can easily save your work right to your hard drive. This keeps you organized and works in the same way that you're already accustomed to.

  26. PHP and MySQL by C_Kode · · Score: 1

    I see all these people bash PHP claiming this that and the other.

    Security: PHP isn't the problem, poor implementation is the problem (the coder). PHP's only hand in that is easily giving you the ability to do it. All languages are dangerous to the security ignorant.

    Slow: PHP can be slow. So can straight C. If you know what you're doing, PHP can be blazing fast. There is a reason that so many large companies are picking it up. IBM, Oracle, Yahoo, etc...

    MySQL gets the same treatment sometimes.

    Both of these technologies are tremendous in what they provide. It's said "Knowledge is Power". If you think they're not powerful tools, maybe it's you that lack the knowledge and therefore the power.

    In the end, it's still the best tool for the job. If you rule out PHP and MySQL without looking at them, it's your loss.

  27. Well, that was that then by bytesex · · Score: 1

    "Tools for other languages (in most cases, Perl) are mentioned in passing, nearly all of the code snippets are in PHP. MySQL 4.1 is the basis for most of the database-centered material."

    I somehow thought that this was about building serious webapps for serious companies (i.e. the ones with the money to create scalable infrastructure). The semi-last sentence in the write-up killed all that. I have a title for other editions in the same series: 'Using legos to build skyscrapers', or 'Building scalable rockets using play-doh'.

    Seriously though - such books mustn't focus on one technology or the other. I'm no java-fanboy myself (per se, that is; I like java). But scalability is much better documented using abstraction, divided into all of its relevant parts: networking, hardware, operating systems (lots of it should be here) and software. And not even mentioning php or mysql. Describing scalability in this way immediately erases all your claims of being a professional.

    --
    Religion is what happens when nature strikes and groupthink goes wrong.