Taco Bell Programming
theodp writes "Think outside the box? Nah, think outside the bun. Ted Dziuba argues there's a programming lesson to be learned from observing how Taco Bell manages to pull down $1.9 billion by mixing-and-matching roughly eight ingredients: 'The more I write code and design systems, the more I understand that many times, you can achieve the desired functionality simply with clever reconfigurations of the basic Unix tool set. After all, functionality is an asset, but code is a liability. This is the opposite of a trend of nonsense called DevOps, where system administrators start writing unit tests and other things to help the developers warm up to them — Taco Bell Programming is about developers knowing enough about Ops (and Unix in general) so that they don't overthink things, and arrive at simple, scalable solutions.'"
A big problem with shell programming is that the error information coming back is so limited. You get back a numeric status code, if you're lucky, or maybe a "broken pipe" signal. It's difficult to handle errors gracefully. This is a killer in production applications.
Here's an example. The original article talks about reading a million pages with "wget". I doubt the author of the article has actually done that. Our sitetruth.com system does in fact read a million web pages or so a month. Blindly getting them with "wget" won't work. All of the following situations come up routinely:
That's just reading the page text. More things can go wrong in parsing.
Even routine reading of some known data page requires some effort to get it right. We read PhishTank's entire XML list of phishing sites every three hours. Doing this reliably is non-trivial. PhishTank just overwrites their file when they update, rather than replacing it with a new one. (This is one of the design errors of UNIX, as Stallman once pointed out. Yes, there are workarounds they could do.) So we have to read the file twice, a minute apart, and wait until we get two identical copies. Then we have to check for 1) an empty file, 2) a file with proper XML structure but no data records, and 3) an improperly terminated XML file, all of which we've encountered. Then we pump the data into a MySQL database, prepared to roll back the changes if some error is detected.
The clowns who try to do stuff like this with shell scripts and cron jobs spend a big fraction of their time dealing manually with the failures. If you do it right, it just keeps working. One of my other sites, "downside.com", has been updating itself daily from SEC filings for over a decade now. About once a month, something goes wrong with the nightly update, and it's corrected automatically the next night.
8 ingredients, no. i've worked at a taco bell, there's a few more then that. this is most Hot Line: beef, chicken, steak, beans, rice, potatoes, red sauce, nacho cheese sauce, green sauce (only used by request), cold line: lettuce, tomatos, cheddar cheese, 3 cheese blend, onions, fiesta salsa (pico de gallo, the same tomatos and onions mixed with a sauce), sour cream, gaucamole, baja sauce, creamy jalapeno sauce. plus 5 kinds/sizes of tortillas (3 sizes of regular, 2 sizes of die cut) nacho chips, etc etc here's an interesting fact, those Cinnamon Twists you may or may not love? they're made of deep fried rotini (a type of pasta, usually boiled)
nobody's perfect