Slashdot Mirror


Open Source Search Engine Benchmarks

Sean Fargo writes "This article has benchmarks for the latest versions of Lucene, Xapian, zettair, sqlite, and sphinx. It tests them by indexing Twitter and Medical Journals, providing comparative system stats and relevancy scores. All the benchmark code is open source."

3 of 62 comments

  1. Re:k by eldavojohn · · Score: 5, Insightful

    Nothing else to say, really

    Really? Am I the only person that found it interesting that Lucene, the only non C/C++ implementation, gave some pretty impressive stats? I mean, it's written in Java and although it has a slower index time its search time, index size and relevancy are impressive.

    I may have to poke around in the Lucene code after work tonight to figure out what kind of strange majick those Apache developers employ. Hopefully I'll walk away with some extra spells in my bag.

    --
    My work here is dung.
  2. Re:k by Lord Grey · · Score: 5, Informative

    Really? Am I the only person that found it interesting that Lucene, the only non C/C++ implementation, gave some pretty impressive stats? I mean, it's written in Java and although it has a slower index time its search time, index size and relevancy are impressive.

    Lucene is a great search tool. As TFA pointed out, however, if you're looking for a "search solution" rather than "search engine" then you should check out Solr instead. Lucene is a toolkit that you build on top of, not something you really want to deploy by itself. Solr is that thing built on top of Lucene.

    Be aware that while Lucene/Solr has made terrific progress, it is not quite in the "enterprise search" category. For superscale implementations you'll still likely need to look at a high-priced product like FAST.

    --
    // Beyond Here Lie Dragons
  3. CLucene by drac667 · · Score: 5, Insightful

    All the other search engines except Lucene are written in C/C++. Why didn't Vik Singh also test CLucene (http://sourceforge.net/projects/clucene/)?

    Here is CLucene's description on SourceForge: "CLucene is a C++ port of Lucene: the high-performance, full-featured text search engine written in Java. CLucene is faster than lucene as it is written in C++."