Squeezing a Wikipedia Snapshot Onto an 8GB iPhone
blackbearnh writes with this excerpt from O'Reilly Radar: "Think about Wikipedia, what some consider the most complete general survey of human knowledge we have at the moment. Now imagine squeezing it down to fit comfortably on an 8GB iPhone. Sound daunting? Well, that's just what Patrick Collison's Encyclopedia iPhone application does. App Store purchasers of Collison's open source application can browse and search the full text of Wikipedia when stuck in a plane, or trapped in the middle of nowhere (or, as defined by AT&T coverage...)"
There. Fixed that for you.
Well, I could if I had an iPhone. It sounds like an impressive achievement, though. How much space do you have left over once it's installed?
Laughter is the best medicine, except if you have a broken rib.
This is easily doable.
Once you trim the earth reference down to "Mostly harmless".
Corporation, n. An ingenious device for obtaining individual profit without individual responsibility. - Ambrose Bierce
1. Goes to foreign country - one that he has never visited before
2. Doesn't have wireless access.
3. Instead of wandering about the country he spends most of his time programming ("Then basically, I spent a significant fraction of my time there in Japan, again, in 2007 writing those applications") an application so he can look up stuff about the country he isn't spending much time actually visiting.
I bow before you sir. Awesome.
Faster! Faster! Faster would be better!
xkcd comic reference
Yeah, you're pretty much turning your iPhone into a hitchhiker's guide to Earth, or at least America and Europe, if you can manage to squeeze wiki-travel onto it.
Is it sad that I am more likely to recognize you and your posts by your sig than your name or UID?
Seen that, done that been Got the t-shirt in 1978
That problem has recently been solved. With the recent addition of sms-sharing, you could use any iPhone remotely.
When the policeman of the tie, rule you violate, hello punishment of the kitty?
Are you crazy? Imagine the frustration when you find this horrible typo and you can't fix it!
This is nothing new. Wikipedia has been available for several years now in MDict format: http://www.octopus-studio.com/product.en.htm
And for those preferring accuracy and editorial responsibility:
http://www.ipodnn.com/articles/08/02/27/britannica.on.iphone/
FTFA:
Already done.
However, I'm not sure that this iPhone app is precisely what I want. It strips out references, and from the sound of things also the discussion pages. For about half of the articles I read, I check the discussion pages to see what's really going on. He also says he strips a lot of the metadata, and obviously images, none of which are things I'd want to give up (some of the metadata might be superfluous, but if I'm copying Wikipedia onto my computer, I want to copy Wikipedia onto my computer.)
I understand there are licensing issues with images, but even so, the SVG ought to be safe. And that wouldn't add as much of a disk space hit as the gifs, etc.
One of the other issues is the timing of Wikipedia dumps. They only do text-only dumps, and according to the article they only happen once every few months. It would be nice to implement an image review policy, and figure out a way to allow for mirrors (or just some increased bandwidth at Wikipedia HQ) so that we can actually have the entire English Wikipedia, regularly snapshotted and compressed, available for download. And really, for that kind of thing a 3-month or even yearly turnaround would be well worth the wait.
Then get an iPod touch. Apple is giving them away with every new Mac, if you are eligible for an education discount.
http://store.apple.com/us/browse/campaigns/back_to_school?cid=WWW-NAUS-BTS20090507-00032
GENERATION 25: The first time you see this, copy it into your sig on any forum and add 1 to the generation.
... so clearly this app will never make it through Apple's review process.
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
No. The Kindle supports online access to Wikipedia, but this requires a network connection. The iPhone supports the same. A while ago someone created a cut-down version of Wikipedia which you could browse completely offline on the iLiad. It sounds like someone has ported this to the iPhone, and because it's now on the iPhone it's news.
Putting Wikipedia snapshots on portable devices is interesting. I don't really see why you'd do it with an iPhone; the iLiad takes CF cards, so you can just keep a 16GB CF card for Wikipedia and not fill up space you'd otherwise use for something else, but the iPhone's storage isn't expandable so it's a strange thing to want to do. The text of Wikipedia is not that big. A complete (uncompressed) copy is 200GB, but that includes all revision history and user pages. The current version of the English Wikipedia is around 4GB of text. This leaves another 4GB for filling up with images.
I am TheRaven on Soylent News
It's cool but not $10 cool. I use 2 free apps that let me access wikipedia. Nothing really new or radical about this app unless wikipedia is really much larger and the author managed to cull 2gb from it.
It would be nice if he shared/donated some of the profits from this to Wikipedia, seeing as he's getting the database for free. There didn't seem to be a mention of it in the article or his personal site.
I dreamed of Freud: What does this mean?
Isn't that what the wireless access is for???
The filesize of this app, as-is, is 2GB. I wonder how many images are on wikipedia.
Ezekiel 23:20
So, I'm reading here that they convert the XML into proprietary metadata and compress that.
Why not use EXI (Efficient XML Interchange) http://www.w3.org/XML/EXI/ which has been tested as more efficient than gzip and requires less memory to parse? Especially since the XML processing can remain the same, as the nodeset is the same.
Fragile? I have dropped my iPhone onto concrete three times from 5 feet up. I carry it all day at work and don't use any of the protectors. I treat it as I do any other phone. Fragile isn't a word I would use to describe it.
i thought once I was found, but it was only a dream.
http://www.instructables.com/id/SBK1NAUFF78M26B/
I found these instructions in May 2008 and created a reasonably current snapshot of wikipedia that is still rather compact on a Psion 5MX. Not quite the same "curb appeal" as an iPhone, but a lot more functional.
Best,
Bah, that's nothing. I made an offline Wikipedia midlet! Unfortunately J2ME is unpleasant to say the least, and my phone only supports 2 GB SD cards so it only has some of the articles and without text.
I've been using this app for quite a while on my 1st gen iPod Touch, and it works and works well. It's amazing just how many articles it has. Other than some cosmetic and minor feature issues, the only real limitation is that Apple limits data file size to 2GB, so there is an obvious limit as to how much can go into the file. But it is amazingly complete. No images, no fancy tables--just text articles at your fingertips.
If you Jailbreak your iPhone/iPod Touch, then an excellent alternative is the Wiki2Touch app. Unfortunately, it seems that it's been pretty much abandoned in development, so it may be hit-or-miss if it works on OS v3.x. This implementation was REALLY slick. It provided a 4GB data file (that was much more complete) and a small Web server. You enabled the Web server, fired up Safari, and pointed it to a local URL. The app presented quick and very readable articles. And if you went to the trouble to download and process, you could also add about 4GB of image files to make things more complete (on a larger-capacity device, of course.)
Here's a review that I posted for both apps just over a year ago on my iPod Touch Tips site:
http://jimstips.com/ipod-touch-tips/ipod-touch-review-wikpedia-on-your-ipod-touch.html
In both cases, the main complaint is updating. In order to update the data file, you have to re-download the data, and depending on the app, you are typically at the mercy of the developer to provide an update. Otherwise, you had to download, index, and install the HUGE files yourself.
If you absolutely HAVE to have updated, offline data, check out the Wikipanion app. It's a nice compromise.
My mom always said, "Jim, you're 1 in a million." Given the current population, there are 7000 of me. God help us all!
[citation needed]
I'm not really kidding. Your anti-Wikipedia rant is entertaining, but it doesn't provide any substance. Speaking for myself, when I go to Wikipedia for a refresher on something I already know about, I'm generally pleased with the quality of the results, which makes me think that the articles on subjects I don't know much about are likely to be pretty good too.
Your line about "political correctness and facts washed out of existence by human insecurities" provides a clue as to what really bothers you about Wikipedia: reality's well-known liberal bias. Unless you can provide specific examples, with citations, it's reasonable to assume that the Wikipedia groupmind knows more about the way things really work than some random dude on /.
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
To be honest, what would probably be easier would be just hacking the emulator that already comes with the development tools to run any app at any time.
Sounds a bit small.
---- Booth was a patriot ----
I use 2 free apps that let me access wikipedia. Nothing really new or radical about this app
Except it works when you're away from a hotspot, even if you paid only $220 (not $100 + $80/mo * 24 months) for your device.
Hell, I was flipping through an encyclopedia from the 40's, and under "Dynamite", it had detailed instructions on how to MAKE it yourself
Wikipedia doesn't have how-to guides. If you want that, use Wikibooks.
I think you could probably get answers to your questions by visiting public libraries, and talking to people. Maybe the "talking to people" bit might not get you definitive answers (though probably as good as a lot of Wikipedia content) but you might have found out a whole lot more. Also the public libraries probably had a lot of this info if you were looking for solid facts.
I appreciate a portable + off the net wikipedia would be a cool tool as well but nothing beats chatting to the locals.
On the other hand, one of my colleagues dropped his onto the floor from his desk. Now the screen is full of cracks.
Anecdotal evidence goes both ways.
"It's such a fine line between stupid and clever" -- David St. Hubbins, Spinal Tap
Actually the app is 0.1MB from the app store, then it downloads the database from a non-apple server. And like the other poster said, you can install it over WiFi.
You should study how mistakes are made a bit more. When typing fast, I often mistype one word for another. Even though I know very well the rules for "its" and "it's" (a mistake you made in your post), "their" and "there", "than" and "then", sometimes my fingers just decide to type one thing even though my brain is thinking another. Sometimes I type "you" instead of "your", too. Or "to" instead of "too". None of this has anything to do with not knowing the rules. It's partially about not proofreading (c'mon, it's just a slashdot comment), but even proofreading can miss them because the brain sometimes reads what it means and not what it sees.
Moderators: Before moderating a comment Insightful/Informative, check to see if a child post has already refuted it.
App Store purchasers of Collison's open source application can browse and search the full text of Wikipedia when stuck in a plane
This page is not recommended when you're stuck in a plane...
I bought this application 6 months ago and there are 3 major problems with it:
1) The search function is broken because you need to type the exact word (prefix)
2) This is plain text: no pictures and no tables so most articles with "list" are useless
3) No update mechanism so the dump used will be outdated soon.
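On point 1, here is a minimal sketch of why a prefix-only search behaves that way. It assumes the titles are kept in a sorted list and looked up by binary search; that is a guess about how such an index might work, not a description of the app's actual code, and the function names and sample titles are made up for illustration.

```python
import bisect

def build_index(titles):
    """Lowercase and sort the titles so a prefix can be located by binary search."""
    return sorted(t.lower() for t in titles)

def prefix_search(index, prefix, limit=10):
    """Return up to `limit` titles from the sorted index that start with `prefix`."""
    prefix = prefix.lower()
    # All titles sharing a prefix sit next to each other in sorted order,
    # so we jump to the first candidate and scan forward.
    start = bisect.bisect_left(index, prefix)
    results = []
    for title in index[start:start + limit]:
        if not title.startswith(prefix):
            break
        results.append(title)
    return results

# Hypothetical sample titles, not taken from any real dump.
titles = ["Tokyo", "Tokyo Tower", "Kyoto", "Tokugawa shogunate", "Osaka"]
index = build_index(titles)
print(prefix_search(index, "tok"))    # ['tokugawa shogunate', 'tokyo', 'tokyo tower']
print(prefix_search(index, "tower"))  # []
```

The second lookup comes back empty because "tower" never appears at the start of a title. Matching words in the middle of a title would need a different structure, such as an index keyed on every word.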
"reality's well-known liberal bias"
[citation needed]
Knowledge is power. Knowledge shared is power lost.
Is there a version of this that will run in a web browser? Anyone have a link?
Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
"reality's well-known liberal bias"
[citation needed]
Hmmm, good point. Here's an edit:
"reality's well-known liberal bias [January 20, 2001 - January 20, 2009]"
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
I've had this for ages on Windows Mobile with the Tomeraider3 wikipedia database. Instead of using bzip2, it's more efficient and elegant to store the table in sqlite and use its excellent sqlite3 compression.
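For anyone who wants to try the sqlite route, here is a rough sketch. As far as I know the stock SQLite library does not compress rows by itself, so this assumes the application compresses each article with zlib before storing it as a BLOB. The table layout and function names are invented for the example.

```python
import sqlite3
import zlib

def open_store(path=":memory:"):
    """Open (or create) a database with one row per article."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "title TEXT PRIMARY KEY, body BLOB NOT NULL)"
    )
    return conn

def put_article(conn, title, text):
    """Compress the article text with zlib and store it under its title."""
    blob = zlib.compress(text.encode("utf-8"), 9)
    conn.execute(
        "INSERT OR REPLACE INTO articles (title, body) VALUES (?, ?)",
        (title, blob),
    )

def get_article(conn, title):
    """Fetch and decompress one article, or return None if it is missing."""
    row = conn.execute(
        "SELECT body FROM articles WHERE title = ?", (title,)
    ).fetchone()
    if row is None:
        return None
    return zlib.decompress(row[0]).decode("utf-8")

# Hypothetical sample data; an in-memory database keeps the example self-contained.
conn = open_store()
put_article(conn, "Psion 5MX", "The Psion 5MX is a PDA. " * 50)
conn.commit()
print(get_article(conn, "Psion 5MX")[:24])
```

Compressing per article gives cheap random access by title, at the cost of a worse ratio than compressing many articles together in one big block.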
Not sure when iLiad came around, but this has been done on iPod Linux since at least 2003. Nothing beats a massive hard drive.
Never made them - a PDA with a broken screen is nothing special. Got four of them somewhere in a drawer, three of them the same model (HTC Universal). That doesn't mean, though, that this device has weak points - I had an HTC Universal which I dropped from the cockpit of an Actros lorry straight down onto asphalt, and except for a scratch on the case nothing happened.
As I said, anecdotal evidence goes both ways.
"It's such a fine line between stupid and clever" -- David St. Hubbins, Spinal Tap
You can download the database in a variety of formats. You'll need to trim it down quite a bit; the download of just the article text (no images, no history, no user pages, no discussion) is over 4GB compressed. You can probably compress this better with something a bit more domain-specific than gzip, and you can probably eliminate a lot of articles that are just stubs.
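A quick sketch of the trimming idea: drop anything below a length cutoff, then see what different compressors make of the remainder. The cutoff value and the sample articles are made up, and the general-purpose codecs from the Python standard library stand in for a domain-specific compressor, which this does not attempt.

```python
import bz2
import gzip
import lzma

MIN_ARTICLE_CHARS = 200  # hypothetical cutoff for what counts as a stub

def drop_stubs(articles, min_chars=MIN_ARTICLE_CHARS):
    """Keep only articles whose text is at least `min_chars` characters long."""
    return {title: text for title, text in articles.items() if len(text) >= min_chars}

def compressed_sizes(text):
    """Return the size in bytes of the text, raw and under three stdlib codecs."""
    raw = text.encode("utf-8")
    return {
        "raw": len(raw),
        "gzip": len(gzip.compress(raw)),
        "bz2": len(bz2.compress(raw)),
        "lzma": len(lzma.compress(raw)),
    }

# Hypothetical sample data standing in for a parsed dump.
articles = {
    "Kyoto": "Kyoto is a city in Japan. " * 40,
    "Some stub": "This article is a stub.",
}
kept = drop_stubs(articles)
corpus = "\n".join(kept.values())
for name, size in compressed_sizes(corpus).items():
    print(f"{name}: {size} bytes")
```

The numbers from a toy corpus like this say nothing about a real dump. The point is only that the comparison is cheap to run on a sample before committing to a format.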
I am TheRaven on Soylent News
Is this an ad?
Why don't you just use your encyclopedia from the 40s and don't think about wikipedia, if that really seems more useful to you.
Thank-you. I already do that. I also tap a multitude of other information sources. It's called "research". It's what you have to do when, "the most complete general survey of human knowledge we have at the moment," fails to deliver.
So... now you compare WP to an imaginary guide? Didn't you just accuse the rest of humanity of trying to live in a dream state? It would be more effective if you showed us how bad wikipedia is by comparing it to something, you know, existing.
When so many others have equated the two and found themselves happily pleased with the results, it is entirely fair for me to subject that same comparison to more objective lighting. If you have trouble with that, then you're not looking at the whole picture, but are simply picking out disposable items to put in your fish barrel for easy target practice. But otherwise, congratulations on a devastating argument.
-FL
try this link from your mobile phone:
http://wapedia.mobi/en/
That way you get the whole thing, up-to-date, and with no trouble or major memory usage.
I'm not really kidding. Your anti-Wikipedia rant is entertaining, but it doesn't provide any substance. Speaking for myself, when I go to Wikipedia for a refresher on something I already know about, I'm generally pleased with the quality of the results, which makes me think that the articles on subjects I don't know much about are likely to be pretty good too.
So... it is good at reinforcing your current belief structure, and when exploring new areas of knowledge, you trust it to maintain the calming, authoritative feeling of security. How nice that must be for you. You must feel right with the world when you close your browser.
You want examples? You mean you want me to work for your benefit when you are a smarmy ass whose belief system I really don't give two figs about and who would work hard to reject anything offered which doesn't fit with his comfortable dream-version of reality? Can you even take a small personal criticism without becoming angry and defensive? Most people cannot, so how the heck would you manage with having firmly held beliefs examined? Why on earth would I want to waste such enormous amounts of energy on such an individual? Answer: "I don't."
Your line about "political correctness and facts washed out of existence by human insecurities" provides a clue as to what really bothers you about Wikipedia: reality's well-known liberal bias. Unless you can provide specific examples, with citations, it's reasonable to assume that the Wikipedia groupmind knows more about the way things really work than some random dude on /.
When you say the "Wikipedia groupmind" I assume you mean "Human groupmind", and pardon me for coughing somewhat at that. Our world is a giant ridiculous mess and it was put there thanks to the Human groupmind. Reality's well-known liberal bias? -???- Aside from the fact that I'm probably more genuinely left than your average hippie, bent politics are a given in Wikipedia, and the war for Left and Right rages on while I don't actually care much since that combat zone is very small and very deliberately provoked and contained, and thus, simply Does Not Matter.
Rather, it's the "Learning Channel" style of pompous assumption and tone affected by every entry where it might matter. Look up the entry for James Randi, for instance. The dream version of that reality is very well cited and documented on WP, and so the illusion remains comforting and intact while the other side of the story remains ignored. And that other side? Sorry. You're an ass, so I'm not going to give you anything which you could find out for yourself.
-Some Random Guy on Slashdot.
I know you're being funny, but my first idea for how to implement this would be to
I use a browser for viewing /usr/share/doc/**/*html. Not all uses of a browser have to leave 127.0.0.1.
If you want sterling examples of angry defensiveness and pomposity, take a look at your own post. Clearly you'd rather just fling poo than engage in anything like a constructive debate. Just be aware that you're not likely to influence anyone's opinion with the attitude -- if you're happy with just ranting for no real reason, then hey, have fun with that.
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
With North Korea's nuclear weapons threats looming over the whole world every day, one must wonder what would happen if, one day, nuclear missiles are fired, plunging the world into a post-apocalyptic darkness.
The progress of humanity could be lost with the destruction of the Internet, libraries, etc. Luckily, you can now carry the history of the world and beyond - on your iPhone! Combine that with a power generator, and you'll still hold the history of the world!
TiniWiki also does this. I haven't done a detailed comparison with the one in the article, but last time I looked TinyWiki was pretty good. They had two advantages over some other similar products: a) they had more of Wikipedia, not just a cut-down or old selection, and b) they could do incremental updates.
If you want sterling examples of angry defensiveness and pomposity, take a look at your own post. Clearly you'd rather just fling poo than engage in anything like a constructive debate. Just be aware that you're not likely to influence anyone's opinion with the attitude -- if you're happy with just ranting for no real reason, then hey, have fun with that.
Fair enough. But I also happen to be right.
The interesting part is that my tone with these last couple of posts is about par for the course for me, but on this particular subject, people seem to get very upset. I was actually surprised to see my 'score' plummet as it did. I honestly half expected it to shoot in the other direction. I clearly don't have my finger entirely tuned to the pulse of popular human awareness, which is probably why I'm still here. Learning how the rest of the monkeys think.
The intensity of the knee-jerk always says a lot about how deeply and tightly tied the emotional knots are. The subject of human knowledge, what we do and do not know, is a very prickly one. The ego is intimately linked to it, particularly with geeks.
-FL