Slashdot Mirror


Building a Fast Wikipedia Offline Reader

ttsiod writes "An internet connection is not always at hand. I wanted to install Wikipedia on my laptop to be able to carry it along with me on business trips. After trying and rejecting the normal (MySQL-based) procedure, I quickly hacked a much better one over the weekend, using open source tools. Highlights: (1) Very fast searching. (2) Keyword (actually, title words) based searching. (3) Search produces multiple possible articles, sorted by probability (you choose amongst them). (4) LaTeX based rendering for mathematical equations. (5) Hard disk usage is minimal: space for the original .bz2 file plus the index built through Xapian. (6) Orders of magnitude faster to install (a matter of hours) compared to loading the 'dump' into MySQL — which, if you want to enable keyword searching, takes days."
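To illustrate highlights (2) and (3), here is a minimal sketch of title-word search that returns several candidate articles in ranked order. It is not the submitter's code: it swaps in a plain in-memory Python dictionary where the summary uses a Xapian index, and the names `tokenize`, `build_title_index`, and `search`, as well as the scoring formula, are assumptions made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_title_index(titles):
    """Map each title word to the set of article ids whose title contains it."""
    index = defaultdict(set)
    for doc_id, title in enumerate(titles):
        for word in tokenize(title):
            index[word].add(doc_id)
    return index


def search(index, titles, query, limit=10):
    """Return up to `limit` (score, title) pairs, best match first.

    Hypothetical scoring: (fraction of query words found in the title)
    times (fraction of the title's words covered by the query).
    """
    words = set(tokenize(query))
    if not words:
        return []
    # Count how many distinct query words hit each title.
    hits = defaultdict(int)
    for word in words:
        for doc_id in index.get(word, ()):
            hits[doc_id] += 1
    scored = []
    for doc_id, matched in hits.items():
        title_words = set(tokenize(titles[doc_id]))
        score = (matched / len(words)) * (matched / len(title_words))
        scored.append((score, titles[doc_id]))
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:limit]


if __name__ == "__main__":
    sample_titles = [
        "Fourier transform",
        "Fast Fourier transform",
        "Joseph Fourier",
        "Laplace transform",
    ]
    idx = build_title_index(sample_titles)
    for score, title in search(idx, sample_titles, "fourier transform"):
        print(f"{score:.2f}  {title}")
```

A real setup would keep the index on disk and map each hit to the location of the article inside the compressed dump, so that only the .bz2 file and the index need to be stored, as highlight (5) describes.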

2 of 208 comments

  1. Re:Wow! by Anonymous Coward · · Score: -1, Troll

    Yeah right. You and the KKK my ass.

  2. Just hope you don't get an effed image. by Spazntwich · · Score: -1, Troll

    Given the sheer number of petty edit wars and defacements that constantly plague Wikipedia, you would likely be better off just reading an encyclopedia when you want some knowledge and an internet connection isn't available.

    Seriously, I know Wikipedia is the darling of open source, but the more I learn about it, the more I realize it's pure garbage.

    Why? Educate yourself.

    http://www.guardian.co.uk/technology/2005/oct/24/comment.newmedia
    http://www.theregister.co.uk/2005/10/24/wikipedia_letters/
    http://homepage.univie.ac.at/horst.prillinger/blog/archives/2004/06/000623.html
    http://www.kapitalism.net/thoughts/wikipedia.htm

    And there's more, but you get the idea. Collusion to ruin people's lives when they run afoul of admins, corrupt editors trading favors with the head honcho himself, pet pages that end up with incorrect information, speculation, or specious reasoning, and a general air of arrogance and groupthink reinforcing an internal belief that they can do no wrong.

    Why bother, seriously?