Slashdot Mirror


Building a Fast Wikipedia Offline Reader

ttsiod writes "An internet connection is not always at hand. I wanted to install Wikipedia on my laptop to be able to carry it along with me on business trips. After trying and rejecting the normal (MySQL-based) procedure, I quickly hacked a much better one over the weekend, using open source tools. Highlights: (1) Very fast searching. (2) Keyword (actually, title words) based searching. (3) Search produces multiple possible articles, sorted by probability (you choose amongst them). (4) LaTeX based rendering for mathematical equations. (5) Hard disk usage is minimal: space for the original .bz2 file plus the index built through Xapian. (6) Orders of magnitude faster to install (a matter of hours) compared to loading the 'dump' into MySQL — which, if you want to enable keyword searching, takes days."
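
A minimal sketch of the title-word search described in the summary, assuming a dump with one `<title>` element per line. It is not the submitter's code: a plain in-memory dictionary stands in for the Xapian index, and the names `iter_titles`, `build_index`, and `search` are hypothetical. It streams titles out of the .bz2 data without unpacking it to disk, then ranks candidate articles by how many query words their titles contain.

```python
import bz2
import io
import re
from collections import defaultdict

TITLE_RE = re.compile(r"<title>(.*?)</title>")
WORD_RE = re.compile(r"\w+")


def iter_titles(stream):
    """Yield article titles from a bz2-compressed XML dump, line by line."""
    # bz2.open accepts a file name or a binary file object.
    with bz2.open(stream, mode="rt", encoding="utf-8") as dump:
        for line in dump:
            match = TITLE_RE.search(line)
            if match:
                yield match.group(1)


def build_index(titles):
    """Map each lowercased title word to the set of titles containing it."""
    index = defaultdict(set)
    for title in titles:
        for word in WORD_RE.findall(title.lower()):
            index[word].add(title)
    return index


def search(index, query, limit=10):
    """Return titles ranked by the number of distinct query words they match."""
    hits = defaultdict(int)
    for word in set(WORD_RE.findall(query.lower())):
        for title in index.get(word, ()):
            hits[title] += 1
    # Most matches first; ties broken by shorter title, then alphabetically.
    ranked = sorted(hits.items(), key=lambda item: (-item[1], len(item[0]), item[0]))
    return [title for title, _ in ranked[:limit]]


if __name__ == "__main__":
    # Tiny stand-in for a real dump, compressed in memory for the demo.
    sample = (
        "<page>\n<title>Coffee</title>\n</page>\n"
        "<page>\n<title>History of coffee</title>\n</page>\n"
        "<page>\n<title>Tea</title>\n</page>\n"
    )
    compressed = io.BytesIO(bz2.compress(sample.encode("utf-8")))
    index = build_index(iter_titles(compressed))
    print(search(index, "coffee history"))
```

A real setup would keep the index on disk (the summary uses Xapian for this) instead of holding every title in memory, and would also record where each article sits in the compressed file so it can be fetched without decompressing the whole dump.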

14 of 208 comments

  1. Re:Why? by rabblerabble · · Score: 5, Funny

    I'll bite... Unfortunately, I don't have a basement, so there are times when I am required to venture into the outer realm that happens to be heated by the big ball of gas known as Sol, as opposed to a pump ;P Seriously though, this is exactly what I have been looking for. What better way to show up your friends when they cry "You're wrong, Google it!" knowing that there is no connection possible within twenty miles? Next time I'm drunk at the beach and someone wants to pretend to know the history of coffee harvesting, it's on.

  2. Ho-Hum ... by jabberwock · · Score: 5, Funny
    What, no auto update? No User Agreement? No disabled features that are enabled by a mammoth key? No product registration?


    Let us know when you're ready for prime time ... ;-)

  3. Take that, Mr Obviously A. Troll! by ampathee · · Score: 5, Funny

    Programmers shouldn't be wasting time on these trivial, pointless projects. We need their work in other more important projects!
    Hah! I'm going to start work on (let's see..) a random lolcat generator now, just to piss you off.
  4. Re:Uh.... by dhwebb · · Score: 5, Interesting

    Programming something new to some people is like playing a video game. I love programming useless things just for the challenge. People who don't understand that have never had a true love for programming.

    --
    Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.
  5. Hitchhiker's guide here we come! by Brietech · · Score: 5, Funny

    Combine this and one of the new E-ink ebook readers, make it pretty rugged, slap a solar panel on the back and man. . . you have something really close to a genuine hitchhiker's guide to the galaxy. Ah, I love where technology is heading =)

    --
    I'm perfect in every way, except for my humility.
    1. Re:Hitchhiker's guide here we come! by RandomWhiteMan · · Score: 5, Funny

      You laugh now, but just wait until you're stranded in the middle of Blackheath England, needing a ride from a conservative British History Scholar who has his son with him playing Pokemon Gold. Won't be so smug then, will you. I bet you won't even have your towel on you when this all goes down.

    2. Re:Hitchhiker's guide here we come! by Gromius · · Score: 5, Funny

      Yes, it's a perfect fit. Particularly as Wikipedia has now supplanted the Encyclopedia Britannica in many places as the standard repository of all knowledge and wisdom. Although it has many omissions and contains much that is apocryphal, or at least wildly inaccurate, it scores over the older, more pedestrian work in two important ways.

      1. It is slightly cheaper.
      2. It has the words "You can copy and edit me for free" inscribed in large friendly letters in the license.

      Also, like the Guide, although it cannot hope to be useful or informative on all matters, it does make the reassuring claim that where it is inaccurate, it is at least definitively inaccurate :)

    3. Re:Hitchhiker's guide here we come! by nstlgc · · Score: 5, Funny

      Just so we're clear, you can make Pikachu evolve into Raichu by using the Thunder Stone (which makes sense, since they're Electric Pokémon). However, due to the emotional value Pikachu has to trainers, most of them choose not to evolve him. Some Pokémon games even plain don't allow this. I hope this was helpful.

      --
      I'm Rocco. I'm the +5 Funny man.
  6. Re:I hope by Anonymous Coward · · Score: 5, Funny

    You mean before someone makes it inaccurate again?

    Oh, nevermind, I see the problem:

    George W Bush

    Is a dick head!!!!11

    should be

    George W Bush

    Is a dick head!!!!!!

    Man, those out to mess with the content are getting more and more subtle...

  7. Re:2X by Brian Gordon · · Score: 5, Informative

    Ahaha, 2.9GB? That's the text alone. Images will add more than 200GB on top of that. And yes, you do need a LAMP/WAMP stack and a working MediaWiki, but it wouldn't take 'days'; it would take a few hours, max. Also, is this guy aware that Wikipedia is available on DVD already?

  8. Re:Why? by thePsychologist · · Score: 5, Insightful

    Realize that some of the greatest things done by humankind were from doing "pointless projects" as you call them. Prime numbers for instance were studied by mathematicians just for fun, and now look, they're used for cryptography. Try doing your banking without them.

    Complex numbers originated from something "useless" like trying to solve the cubic polynomial in radicals... try building a bridge without them. In fact, all of science is built upon people going off on random tangents, doing things they enjoy and discovering seemingly "useless facts", but most of it becomes useful *and* gives us an idea of the universe in which we live.

    Only working on immediate practical problems is very shortsighted, and if mandated throughout the academic community, would mean the death of innovation and most discoveries.

    --
    "What lies behind us, and what lies before us are tiny matters compared to what lies within us." Ralph Waldo Emerson
  9. Re:2X by TubeSteak · · Score: 5, Informative

    Also, is this guy aware that Wikipedia is available on DVD already?
    Are you aware that the link you pointed to (1) is not the same thing as the link (2) the author pointed to?
    (1) http://schools-wikipedia.org/
    (2) http://download.wikimedia.org/enwiki/latest/

    (1) is 4,625 articles hand-picked for school-age children, hence the website name.
    (2) is a straight dump of Wikipedia.

    Just imagine my surprise when the schools-wikipedia website didn't have the wiki article on Goatse!
    --
    [Fuck Beta]
    o0t!
  10. I know the feeling by aepervius · · Score: 5, Insightful

    They tell you that their hobby is painting/music/walking/repairing old cars/gardening/building scale models, etc., and they seem to think that their hobbies are perfectly acceptable. But as soon as you say you like to program stuff, they don't understand how this could be a hobby. They mostly fail to recognize that every one of us has something in common: the joy of the act of creation. The fact that our hobby entails creating something immaterial and full of "logic" does not matter. It is still a joy.

    --
    C. Sagan : A demon haunted world:
    http://www.amazon.com/gp/product/0345409469/
    visit randi.org
  11. What?? by icydog · · Score: 5, Funny

    TFA is:

    1. Not a thinly-veiled attempt to advertise a crappy product
    2. Not bashing Microsoft
    3. Not about somebody who is trolling open-source (i.e. SCO)
    4. Not about Bush taking away all our rights and ending freedom
    5. Not about voting fraud and the end of democracy/America/the world
    6. Not decrying Vista DRM and its ties to the MAFIAA
    7. Posted on Slashdot

    Furthermore, TFA is interesting and informative.

    Am I in heaven?