Have 100GB Free? Host Your Own Copy of Wikipedia, With Images
First time accepted submitter gnosygnu writes "Want your own copy of English Wikipedia with images? Got 100 GB of disk space? Then open-source app XOWA may be of interest to you. The project released torrents yesterday for the 2013-11-04 version of English Wikipedia. There's 100 GB of sqlite databases containing 13.9 million pages, and 3.7 million images — readable from any Windows, Linux, or Mac OS X system. Image downloads for other wikis are building, but you can still use XOWA to read the text-only version for other wikis like Wiktionary, Wikisource, Wikiquote and 660 more. Next time you find yourself stranded without the internet, you can pull out your own copy of Wikipedia for use."
It comes with software that automatically reverts your edits and insults you.
Finally I can have my own version of Wikipedia so I can correct all those changes I haven't been allowed to enter into the official version!
Does it include the seasonal donation nag banners?
Holidays are coming! Holidays are coming!
Alas, the terms and conditions will forbid you running a server to do this. They'll want you to use one of their cloud servers to do it (that kinda makes more sense to put something like that further upstream).
Waiting for an amusing sig.
You are right. That's a silly summary they put on. They should say something like 'No Internet connection required while browsing/searching through the wiki' (one of their features).
Navigate between offline wikis. Click on "Look up this word in Wiktionary" and instantly view the page in Wiktionary.
I'd have put en.wikipedia at a couple of terabytes at least. 100GB is not inconceivably large; with some housecleaning I could actually get that much free.
Facts do not cease to exist because they are ignored. - Aldous Huxley
Rats. It won't QUITE fit on a microSD card...
Just exclude the Star Trek / Star Wars related entries; that should pare it down. And besides, we all have it all committed to memory anyway, right? :p
That's a good thing. The more we use torrents for the distribution of legitimate content, the more such distribution methods will become legitimized.
...yet. But I guess most phones won't easily read sqlite databases yet, either. I suppose it won't kill me to lug around a full-sized SD card.
Still looking forward to the library-of-Congress-on-a-card from Rainbows End.
Most phones _won't_? Four out of five smartphones today have sqlite preinstalled and ready for use: http://developer.android.com/reference/android/database/sqlite/package-summary.html
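Not Android-specific, but the same point (these are ordinary SQLite files that stock libraries can open) can be sketched with Python's built-in sqlite3 module. The `page` table and its columns here are hypothetical stand-ins; the real XOWA schema isn't described anywhere in this thread.

```python
import sqlite3


def list_tables(conn):
    """Return the names of the tables in an open SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical stand-in database held in memory; NOT the actual XOWA layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (title TEXT PRIMARY KEY, body TEXT)")
conn.execute(
    "INSERT INTO page VALUES (?, ?)",
    ("ISO 8601", "placeholder article text"),
)
conn.commit()

# Discover what is in the file, then look up one page by title.
print(list_tables(conn))
row = conn.execute(
    "SELECT body FROM page WHERE title = ?", ("ISO 8601",)
).fetchone()
print(row[0])
```

For a real file you would pass its path to `sqlite3.connect()` instead of `":memory:"` and start with `list_tables()` to see what is actually in there.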
And yet you commented only 16 minutes after the AC...
You do not have a moral or legal right to do absolutely anything you want.
As a long long time editor...
Look at the quality of information.
I agree, you did a terrible job. Please, quit editing!
`echo $[0x853204FA81]|tr 0-9 ionbsdeaml`@gmail.com
YYYY-MM-DD is the only date scheme where filenames sort ASCIIbetically. Kinda useful if you have a lot of copies of something.
I prefer ZModem myself.
But if you don't have that you can probably use XModem.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Actually, ISO 8601 dates (YYYY-MM-DD) are unambiguous: far better than the ambiguous AA/BB/YYYY notation, since Americans interpret it as MM/DD/YYYY but in some other countries it's regarded as DD/MM/YYYY.
As an added plus, a lexical sorting of YYYY-MM-DD dates is also a temporal sorting. Not so with either of the other two formats.
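A quick way to convince yourself of the sorting claim, as a minimal Python sketch. The three dates are just illustrative picks (2013-11-04 is the dump date from the summary), and `lexical_matches_temporal` is a made-up helper name.

```python
from datetime import date

# Illustrative dates only, deliberately out of order.
DATES = [date(2013, 11, 4), date(2012, 12, 25), date(2013, 2, 1)]

FORMATS = {
    "YYYY-MM-DD": "%Y-%m-%d",
    "MM/DD/YYYY": "%m/%d/%Y",
    "DD/MM/YYYY": "%d/%m/%Y",
}


def lexical_matches_temporal(dates, fmt):
    """True if sorting the formatted strings gives the same order as sorting the dates."""
    by_string = sorted(d.strftime(fmt) for d in dates)
    by_date = [d.strftime(fmt) for d in sorted(dates)]
    return by_string == by_date


if __name__ == "__main__":
    for name, fmt in FORMATS.items():
        print(name, lexical_matches_temporal(DATES, fmt))
```

Only the year-first format comes out True for these dates, because the most significant field is leftmost and every field is zero-padded to a fixed width.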
http://en.wikipedia.org/wiki/ISO_8601
Koans and fables for the software engineer
I hope it's when the previous pope (Ben #16) was pictured as Master Yoda in Wikipedia... missed that :-)
Slashdot, fix the reply notifications... You won't get away with it...
Yeah, I was misremembering the line:
"The British Museum and Library, as digitized and databased by the Chinese Informagical Coalition. The haptics and artifact data are lo-res, to make it all fit on one data card. But the library section is twenty times as big as what Max Huertas sucked out of UCSD. Leaving aside things that never got into a library, that's essentially the record of humanity up through 2000. The whole premodern world."
128PB, 97% in use.
Next year or so 100GB phones will be commonplace...and you will have your Hitchhiker's Guide.
Truly amazing times we live in.
Great warrior...hrmph! Wars not make one great.
Presumably Wikipedia is under revision control.
Does this give you the whole thing so that you can forever after sync with the master?
Or just the most recent versions of the articles?
Should there be a BitTorrent for syncing huge revision control databases?
Just pulled the most recent English-language Wikipedia dump and ran Elasticsearch (via the Wikipedia river plugin) over it. 13.9 million entries now on a small server, with response times on the order of a couple of milliseconds. Elasticsearch rocks!
Religious speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace