Common Lisp: Inside Sabre

Posted by chrisd on Tuesday January 15, 2002 @09:07PM from the little-inventive-scheme-programmers dept.

bugbear writes "I just got permission from the author (Carl de Marcken of ITA Software) to publish this email, which describes the inner workings of Sabre, the flight search software that the airlines and travel agencies use. It is a case study in cheap Linux/Intel, NT/Intel and HP-UX boxes replacing mainframes, and also the use of Lisp and other languages in a server-based app." Update: 01/16 13:45 GMT by H: RawDigits writes "Common Lisp: Inside Sabre - correction. The Lisp engine is used by Orbitz, and not Sabre. Sabre still maintains mainframe systems for their booking. I should know, I am sitting in the Orbitz NOC right now ;)"

Re:Lisp without GC! by sarcast · 2002-01-15 21:57 · Score: 4, Informative

Why is GC too slow in Lisp when there are years of experience behind it?
It is not that garbage collection is too slow in Lisp; the reason he gave was that the amount of data it had to go through was very large. The point of the system was to be as speedy as possible, and garbage collection would slow that down no matter how much or how little data you gave it to process. If you look at real-time processing projects, none of them (to my knowledge) employ a garbage collector, because that would take up valuable resources.
They made a wise decision to keep the garbage collection to a minimum so that the actual searching process would be all that was running on the boxes.
Look at the big boys. by digitalunity · 2002-01-15 22:32 · Score: 3, Informative

The Cray T3E weighs in at up to 3 TFLOPS, depending on the number of processors. Of course, this machine costs over $10,000,000.

For something a little more practical and realistic, the extremely fast yet value-priced Compaq AlphaServer rings in at 47 GFLOPS.

Granted, FLOPS aren't a very good judge of speed for this application, but they are easy stats to find. If you really want a standardized test, take a look at the TPC-C stats for the fastest cluster machines in the world. These more accurately reflect the kind of performance stats you're looking for in relation to this article.

--
You can't legislate goodness. Let each to his own destiny, by will of his freely made choices.
This is talking about Orbitz, not SABRE itself by Ryu2 · 2002-01-15 22:54 · Score: 5, Informative

SABRE != Orbitz. The author's company, ITA Software, writes software that the ORBITZ site uses to answer queries against the same flight/fare dataset, provided by the airlines, that SABRE and the other CRSes use.

Think of the various systems, SABRE, etc. as just different systems that are using more or less the same amalgamation of airline-provided data.

SABRE and the other CRSes themselves are still running the big iron mainframe stuff, not LISP or Linux, and will likely remain so for a long time.

--
There's 10 types of people in this world, those who understand binary and those who don't.
Re:I hadn't realized... by entrox · 2002-01-15 22:57 · Score: 5, Informative

Lisp doesn't need to be slow at all. You're thinking of the old 70's Lisp, which was usually interpreted and ran slowly. Today's Lisp implementations can be compiled in addition to being interpreted, which results in a big performance boost (lagging only slightly behind C, but faster than Java). Commercial Lisps capable of compiling are, for example, Allegro CL and LispWorks.
This isn't limited to the commercial ones: CMUCL and SBCL also compile to native code. The compilers are optimizing (you can choose between varying degrees of Speed, Safety, Debuggability and Compile Speed) and you can even enter Assembler code or disassemble single expressions.

--
-- The plural of 'anecdote' is not 'data'.
Re:Power by digitalunity · 2002-01-15 23:41 · Score: 3, Informative

It depends on what exactly the machine is for. The Cray I linked to has a peak system bandwidth of 136 GB/sec! Of course, this is for the entire system. Bandwidth between adjacent system boards is 600+ MB/sec. You cannot do this with commodity PCs even when clustered with Gigabit NICs. These are definitely for people with serious needs. You also get the benefits of a single system image and linearly addressed memory for the entire system. However, if you have an obviously parallel need, like that stated in this article, a cluster is an ideal solution. You can spend less money and get the same result.

--
You can't legislate goodness. Let each to his own destiny, by will of his freely made choices.
Re:Impressed, but... by jea6 · 2002-01-16 01:58 · Score: 3, Informative

From the FAA (http://www.faa.gov/aircodeinfo.htm#multiple):

Metropolitan Areas with Multiple Airports

These codes don't specify single airports but whole areas where more than one airport is situated.

BER Berlin, Germany
BGT Baghdad, Iraq
BHZ Belo Horizonte, MG, Brazil
BUE Buenos Aires, Argentina
BUH Bucharest, Romania
CHI Chicago, IL, USA
DTT Detroit, MI, USA
JKT Jakarta, Indonesia
KCK Kansas City, KS
LON London, United Kingdom
MFW Miami, Ft. Lauderdale and West Palm Beach, FL, USA
MIL Milano, Italy
MOW Moskva, Russia
NRW airports in Nordrhein-Westfalen, Germany
NYC New York, NY, USA
PAR Paris, France
OSA Osaka, Japan
OSL Oslo, Norway
QDV Denver, CO, USA
QLA Los Angeles, CA, USA
QSF San Francisco, CA, USA
RIO Rio de Janeiro, RJ, Brazil
ROM Roma, Italy
SAO Sao Paulo, SP, Brazil
STO Stockholm, Sweden
TYO Tokyo, Japan
WAS Washington, DC, USA
YEA Edmonton, AB, Canada
YMQ Montreal, QC, Canada
YTO Toronto, ON, Canada

--

sarchasm: The gulf between the author of sarcastic wit and the person who doesn't get it.
Re:Rule 1 of Efficient Lisp: Lisp is not functiona by redhog · 2002-01-16 02:21 · Score: 3, Informative

The choice between functional and imperative programming has nothing to do with efficiency. At least not run-time efficiency.

For an example of a very basic optimization (done by most LISP implementations, and even partly by some C compilers, e.g. gcc), see "tail recursion optimization" in your nearest LISP-implementation documentation.
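A rough sketch of what that optimization amounts to, written in Python purely for illustration (CPython itself does not perform it, and the function names are made up for this example). The first function makes its recursive call in tail position; the second is the loop that a tail-call-optimizing compiler would effectively turn it into:

```python
def sum_to_recursive(n, acc=0):
    """Sum 1..n with the recursive call in tail position."""
    if n == 0:
        return acc
    # Tail call: nothing remains to be done after it returns.
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n, acc=0):
    """The same computation as a loop that reuses one stack frame,
    which is roughly what tail-call optimization produces."""
    while n != 0:
        n, acc = n - 1, acc + n
    return acc
```

The recursive version is written functionally, yet once the call is turned into a jump it costs no more stack than the imperative loop, which is the point being made about style versus run-time efficiency.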

As you state, some problems are "inherently" imperative, and some are functional, by their nature, but that has more to do with how easily their solution is formalised in either formalism, not how well that solution is then executed.

But I think one should not emphasize that property as much as the fact that LISP does have a garbage collector, which C does not have. It does not _allow_ you to have either memory leaks or crashes due to multiple free()'s of the same location. And it doesn't have any sprintf() that can produce buffer overruns. But OK, then you could use Java :) Java is just LISP with objects and fancier syntax...

Also, as you state, lambdas and higher-order-functions are central to what's good about LISP.

--
--The knowledge that you are an idiot, is what distinguishes you from one.
Some "semi-official" comments by cracauer · 2002-01-16 05:36 · Score: 5, Informative

Hi,

I work for ITA and would like to comment on some issues brought up here:

1) As said, we talk Orbitz here, and not SABRE. Currently, Orbitz
uses our software for domestic US flights, not for international.

2) Our engine does not use a functional programming style, rather the
opposite. Still, we found that Lisp is a great advantage. While
each hacker here has his/her own reasons for liking Lisp, the key
elements (as I see them) are:

2a) macros, especially macros that allow us to define new iteration
constructs. C programmers can think of being able to write their own
for/while/if as seems appropriate for the task at hand. Especially
with-[whatever] constructs, but also nice tricks with
destructuring-bind.

2b) scope, working anonymous functions with static scope. Kind of like
Java's inner classes but in 1/10 of the lines of code.

2c) said destructuring-bind, which frees you from a lot of the boring and
error-prone work of tree parsing in a snap.

2d) compile-time computing, a key element in making our software fast
without cluttering it up by expanding manually written source code by
a factor of 100 or by inventing ad-hoc code generators which need to
be debugged after they broke your system for weeks. Macros that can
use the full language at compile time and macros that can "walk" their
argument when passed at compile-time to find interesting things to do
with them. Also see define-compiler-macro to get an idea what makes
Lisp code fast while maintaining elegance (use with care, though).

2e) safety. A language without optional overflow checking of integers
is a toy at best and dangerous at worst.

2f) debugging and testing with the read-eval-print-loop (REPL). Like
the gdb prompt for evaluating code, but you can use the native
language and you have the full language. Or better, like a shell where
things aren't echoed in ASCII and need to be re-parsed, but you get
the real objects you can play with (send messages as defined in your
system). The debuggers in Allegro and CMUCL are rather crappy, IMHO,
but the REPL and ultra-fast re-compilation and loading of single
functions (standard feature of every Lisp), used for debugging print
statements, more than make up for that.
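For readers who have not met destructuring-bind (point 2c above), a loose analogy is nested unpacking in Python. The sketch below is an illustration only, not ITA's code: the record layout, the `describe` function, and the sample values are all invented for the example.

```python
# A made-up nested record shaped like a small parse tree:
# (carrier, (origin, destination), [fare, tax, tax, ...])
segment = ("XX", ("BOS", "LAX"), [199.0, 12.5, 4.5])


def describe(record):
    # One nested pattern names every piece at once, instead of a chain
    # of index lookups; a shape mismatch raises ValueError immediately.
    carrier, (origin, destination), (fare, *taxes) = record
    return f"{carrier} {origin}-{destination}: {fare + sum(taxes):.2f}"


summary = describe(segment)
```

Lisp's destructuring-bind goes further (it understands optional and keyword parts of a lambda list), so treat this only as a way to picture the idea.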

Keep in mind that every one of our Lisp hackers can contribute a list
of similar length, this is just what *I* like.

For the record, I like C++, but I couldn't absorb all the application-specific knowledge I need while spending my day figuring out C++ specialities and keeping them swapped in. C++ is for full-time C++ coders only.
Re:Rule 1 of Efficient Lisp: Lisp is not functiona by ToLu+the+Happy+Furby · 2002-01-16 06:10 · Score: 5, Informative

I'd like to see someone post a couple of brief examples of things that were well-suited to Lisp (and would be much more difficult in C) - anyone have anything handy?

If you're interested in LISP, you should take a look at Paul Graham's excellent ANSI Common LISP, a wonderfully written introduction to LISP which is nonetheless a decent reference and can almost replace the much heftier Steele. If you're not sure you want to spend the cash, the first couple chapters are online.

In this very small, very chatty book for beginners with not too much code, Graham nonetheless manages to include examples such as a ray tracer (90 lines of code); a program to dynamically generate HTML pages (119 lines of code; this program (very much expanded, but without a single rewrite) now powers Yahoo! Stores); and a complete, separate object-oriented language with multiple inheritance (89 lines; but a much more powerful OO language, CLOS, is already included with Common LISP). The last two in particular would be impossible to do as quickly or easily in C.

A much bigger LISP book I happen to have at the moment is Peter Norvig's Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp, which includes a whole lot of impressive and/or historically interesting examples, including ELIZA, STUDENT (solves algebraic word problems), MACSYMA (symbolic integration a la Mathematica), a Prolog interpreter and compiler, a Scheme interpreter, an optimizing LISP compiler, a natural language grammar parser, and a couple other things. I just finished (well, turned in...) a project which extended Norvig's code to play the game Othello, also from this book, to use trained neural nets (which unfortunately didn't train all that well). The coding part of this was made darn easy by the fact that Norvig's Othello function takes as inputs two functions which provide the move-selection strategies for black and white respectively--something that can't be done in a language without functional closures.
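To make the "game function takes two strategy functions" idea concrete, here is a minimal sketch in Python rather than Lisp. It is not Norvig's code: a toy game of Nim stands in for Othello, and every name in it (`play_nim`, `make_fixed_strategy`, `optimal_strategy`) is hypothetical.

```python
def play_nim(first_strategy, second_strategy, stones=10):
    """Players alternately take 1-3 stones; whoever takes the last one wins.
    Each strategy is a function from stones left to stones to take."""
    players = [("first", first_strategy), ("second", second_strategy)]
    turn = 0
    while True:
        name, strategy = players[turn % 2]
        take = max(1, min(3, strategy(stones), stones))  # clamp to a legal move
        stones -= take
        if stones == 0:
            return name
        turn += 1


def make_fixed_strategy(k):
    """Return a closure that always tries to take k stones."""
    def strategy(stones_left):
        return k
    return strategy


def optimal_strategy(stones_left):
    """Leave the opponent a multiple of 4 whenever possible."""
    return stones_left % 4 or 1


winner = play_nim(optimal_strategy, make_fixed_strategy(1), stones=10)
```

Swapping in a different player, such as one backed by a trained model, means passing a different function; the game loop itself never changes.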

I certainly wouldn't want to do any of these in C; although all of them could be done that way, it would only be at the cost of a good deal of length, functionality and elegance.

In general, LISP is great for anything involving GOFAI (good old fashioned AI, i.e. non-stochastic), anything that needs to generate hierarchically nested text (e.g. HTML, XML, or LISP programs), anything that needs to be written quickly (or LISP can be used as a rapid-prototyping language), any sort of interpreter, or for any time you wished you could modify the available programming languages to build one that really suits your problem. LISP is also great for extending existing programs, which is why almost every user-extensible application uses a dialect of LISP to do the job. (e.g. emacs, AutoCAD, etc. No, VB macros for Word don't count, although it is noteworthy that LISP is useful over such a wide range of programming tasks as to be a replacement for VB and C.)

What is LISP bad at? Well, its libraries can be rather weak and nonstandard (although ANSI Common LISP itself comes with a large array of useful functions); GUI stuff, multithreading, and networking all fit in this category and are often implementation specific. (Of course, this has nothing to do with the language itself but just with what tools are available.) Its use for really low level bit-twiddling stuff is somewhat awkward. Iteration in LISP suffers somewhat from being only a little bit more powerful than iteration in C; the upside is you can still combine it with all the other great stuff in LISP, but the downside is that the parenthesis-style syntax, which is so much better for writing macros and functional code, only clutters up iterative code.

And, certain of the most powerful features of LISP, like macros and closures-as-first-level-objects, take a bit of experience to wrap your mind around, as does the functional programming paradigm. (LISP does not in any way require functional programming; it's just that while there are other languages as good as LISP at iterative code and arguably as good as LISP at OO code, there is nothing as good for functional code.) This is usually taken to mean that LISP is only suitable for CS students and AI researchers, because ordinary programmers are too dumb to get this stuff. I'm just a CS student, and I haven't had much experience with how dumb ordinary programmers are or aren't, but intuitively I think this argument is bunk.

Personally I think these techniques are just new things to learn; subtle and powerful, sure, but so is simple recursion the first time you learn it and every programmer knows how to use that. Indeed, once you understand recursion well, functional programming and function closures are not very large conceptual leaps at all. Sometimes the mechanics of lambda closures can be slightly tricky, but no more so than referencing and dereferencing pointers in C, and with a lot greater payoff. Hell, the most complicated uses of functions as objects in LISP are a lot easier to get right IMO than even simple uses of templates in C++, and "templates" (i.e. generic functions) come for free in LISP, due to runtime type checking. (Of course, this is why no one uses C++ templates, but whatever.)

Macros are difficult to write. But then again, they are incredibly powerful, and not "necessary" very often. And it's usually *extremely* easy to understand someone else's macro code, which is all a novice would have to do anyways.

Plus there are lots of features of LISP which make it incredibly easy for beginners. Debugging in LISP is ridiculously easy, at least for programs which don't use too many functional closures or complex objects. Instead of the C paradigm where you only have one big executable main(), LISP programs are made up of lots of little functions, all of which are callable (and thus extraordinarily easily debuggable) from the top-level evaluator. There's no write-save-compile-test-debug loop; it's all together, and all very fast. Immediate feedback means more willingness to take chances, try out things, and make mistakes.

Plus, because there's no main(), your programs are always extensible. If you want to, once you're done with a function it's trivial to make a larger function which calls the other function, takes it as an input, etc.

There is no need to manage memory, no need to futz around with pointers, and no way to cause a segfault until you start optimizing. Buffer overflows are impossible. You can start with a skeleton of a program, gradually add functionality, and only add optimizations at the end when you have tested your code; and you can test every new function or optimization, so you know exactly what goes wrong when something does.

And it's fast: once you put in proper optimizations, compiled LISP is nearly as fast as C. Of course this wasn't always the case, and it's not the case for LISP before you put in type declarations. And a compiled LISP file will probably be bigger than compiled C code, especially when you add the LISP top-level eval to it. On the other hand, C is usually not as fast or small as well optimized assembly code, but there is a good reason very few people program in that anymore: because programming in C makes your code less buggy and much faster to develop. Similarly, programming LISP will almost always make your code less buggy and much faster to develop than using C. Now that compiler technology and computer hardware have made those differences almost moot, it probably makes much more sense to use LISP than C.

Of course, the result of this change has not been to drive more people to LISP, but instead to drive LISP's features into other languages. Thus we have C++ with attempts at generic functions; Java with decent OO and automatic garbage collection; Python showing the usefulness of an interactive top-level. Nowadays Perl and Python are getting functional closures and the list datastructure, although their functions are not quite first-level objects and so not quite as powerful. Plus it will probably take another prefix-syntax language for macros to be copied properly.

Whether the world will realize that LISP already exists (and indeed has since the late 50's) or continue to reinvent it, I dunno. Probably the latter so long as LISP remains short of libraries that tie it down to modern computers. (Again, GUIs, multithreading, networking.) Still, it's probably worth learning LISP just so that when the same ideas come out in more "mainstream" languages years from now you'll already know and understand them.
Re:Lisp without GC! by cracauer · 2002-01-16 09:28 · Score: 3, Informative

[I work for ITA]

A real-time GC is not in the least what we need. A real-time GC makes the pauses go away. We don't care about the pauses, we're not interactive. But the real-time GC creates an even bigger overall CPU- and run-time overhead. While the visible pauses go away, the application is slower.

That is the basic problem about many of these discussions: there is no magic bullet. There are GC schemes more or less suitable for some tasks, but not for others. It is tradeoffs, guys.

BTW, at my former employer I had a system written in C with parts using the Boehm GC. It was faster than the malloc/free variant (the application's part could switch at compile time). The Boehm GC would screw ITA's system royally, though.

We could very well write a GC specially coded for our application, but that is a lot of work. And since we can do without system-visible memory allocation when answering search requests, we prefer to do that, and we hide the data bulk from the GC when cleaning up the system in between activities.
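A loose analogy for "hiding the data bulk from the GC", sketched in Python (3.7 or later) rather than Lisp and not a description of ITA's actual mechanism: CPython's `gc.freeze()` moves every object currently tracked by the cyclic collector into a permanent generation that later collections ignore. The data loader and query function below are hypothetical stand-ins.

```python
import gc


def load_bulk_data(n):
    """Stand-in for loading a large, long-lived dataset (made-up rows)."""
    return [{"id": i, "fare": 100 + i % 50} for i in range(n)]


def answer_query(data, max_fare):
    """Stand-in for a search request over the bulk data."""
    return sum(1 for row in data if row["fare"] <= max_fare)


data = load_bulk_data(100_000)

gc.collect()                      # tidy up leftovers from loading
gc.freeze()                       # later collections will skip the bulk data
frozen = gc.get_freeze_count()

gc.disable()                      # no automatic collections during a request
try:
    hits = answer_query(data, 120)
finally:
    gc.enable()
    gc.collect()                  # clean up between requests; frozen objects are ignored
```

CPython's reference counting still runs throughout, so this only controls the cyclic collector; the point is the shape of the tradeoff (collect between requests, never walk the big dataset), not the numbers.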

GC is a hard problem, just as manual memory management is. For neither of these can you buy or download a perfect solution.