Slashdot Mirror


Wikipedia Planning a DVD Version

daria42 writes "The Wikipedia Foundation hopes to sell an English version of Wikipedia on CD-ROM and DVD before the end of the year. A boxed set of the German language version of Wikipedia has been available since last year. An updated version of the German Wikipedia was launched on Amazon.de this week, and the e-commerce site has received 8,000 pre-orders, according to Wikipedia Foundation president Jimmy Wales. Wales said it was easier to put the German version of Wikipedia onto CD as there are significantly fewer pages than there are for the English language version. He said that English Wikipedia would 'barely fit on 2 DVDs.'"

55 of 310 comments

  1. Another reason the German version fits on CD by Anonymous Coward · · Score: 5, Funny

    Frequent mentions of David Hasselhoff compress really well.

    1. Re:Another reason the German version fits on CD by Stevyn · · Score: 5, Funny

      AndbecauseGermansdonotwastebytesonspaces

    2. Re:Another reason the German version fits on CD by the pickle · · Score: 5, Funny

      I believe what you meant was "Onderspacen Der Deutschennobegewastenderbytesen."

      p

    3. Re:Another reason the German version fits on CD by nbert · · Score: 4, Informative

      well, we just hate redundant information :D

      There is no way to turn AndbecauseGermansdonotwastebytesonspaces into one single German word btw.

      It is true that the German language allows combinations of nouns of arbitrary length, but in the English language constructs like this exist as well (e.g. railway consists of two nouns). The only difference is that those speaking English are not free to make up new ones.

      And as a general rule of thumb most combinations in English are limited to two words. While it wouldn't make sense to combine more than 5 words, because it would get too hard to read and understand the term, there are rare examples in German which consist of 3 or even 4 words.

    4. Re:Another reason the German version fits on CD by arodland · · Score: 5, Informative

      You could if you studied it for a minute. The rules are actually relatively simple. For example, let's look at everyone's favorite word: "Aktiengesellschaft", meaning a public corporation, and usually abbreviated "AG".

      Semantically, it can be broken into Aktien / Gesell / -schaft; taken at this decomposed level it means something a little bit like "stock fellowship".

      Orthographically, well, you're in luck; as usual (always?), it breaks down along the same lines, according to the rules. What are the rules? I couldn't tell you exactly, but they're simple, and they're similar to Latin's. Anyway, it breaks down to Ak/tien/ge/sell/schaft. Breaks occur between consonants that don't form clusters, between vowels that don't form diphthongs, and otherwise before consonants.
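
      The three break rules above can be sketched in a few lines of Python. This is only a rough illustration, not real German hyphenation: the DIPHTHONGS and CLUSTERS lists are assumptions picked for this example, and the function names are made up here.

```python
# Rough sketch of the break rules described above. The letter groups below
# are illustrative assumptions, not a complete description of German.
VOWELS = set("aeiouäöüy")
DIPHTHONGS = ("ie", "ei", "au", "eu", "äu", "ai", "aa", "ee", "oo")
CLUSTERS = ("sch", "ch", "ck", "ph")


def tokenize(word):
    """Split a lowercase word into ('V', text) nuclei and ('C', text) consonant units."""
    units = []
    i = 0
    while i < len(word):
        rest = word[i:]
        if word[i] in VOWELS:
            # Vowels that form a diphthong stay together as one nucleus.
            match = next((d for d in DIPHTHONGS if rest.startswith(d)), word[i])
            units.append(("V", match))
        else:
            # Consonants that form a cluster stay together as one unit.
            match = next((c for c in CLUSTERS if rest.startswith(c)), word[i])
            units.append(("C", match))
        i += len(match)
    return units


def syllabify(word):
    """Return the word with '/' at each break point."""
    units = tokenize(word.lower())
    nuclei = [k for k, (kind, _) in enumerate(units) if kind == "V"]
    breaks = set()
    for left, right in zip(nuclei, nuclei[1:]):
        # Break before the last consonant unit between two nuclei;
        # if the nuclei are adjacent, break directly between them.
        breaks.add(right - 1 if right - left > 1 else right)
    parts, current = [], ""
    for k, (_, text) in enumerate(units):
        if k in breaks:
            parts.append(current)
            current = ""
        current += text
    parts.append(current)
    return "/".join(parts)


print(syllabify("Aktiengesellschaft"))  # ak/tien/ge/sell/schaft
```

      Treating "sch" as one unit is what keeps "sell/schaft" from coming out as "sells/chaft".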

    5. Re:Another reason the German version fits on CD by Anonymous Coward · · Score: 3, Funny

      > You could if you studied it for a minute. The rules are actually relatively simple.
      > For example, let's look at everyone's favorite word: "Aktiengesellschaft", meaning a public corporation, and usually abbreviated "AG".
      >
      >... Anyway, it breaks down to Ak/tien/ge/sell/schaft.

      It's even easier if you break it down into:
      A/k/t/i/e/n/g/e/s/e/l/l/s/c/h/a/f/t

    6. Re:Another reason the German version fits on CD by Just Some Guy · · Score: 5, Funny
      > in the English language constructs like this exist as well (e.g. railway consists of two nouns). The only difference is that those speaking English are not free to make up new ones.

      That's right. "Railway", for example, derives from "railuswayus" and was not concatenated by English speakers. We also didn't coin "email", "Internet", or "loudspeaker" - we borrowed those from Swahili.

      I'd go into more detail, but I'm off to check my voicemail and weblog (Spanish).

      --
      Dewey, what part of this looks like authorities should be involved?
  2. Whaaa? by Zone-MR · · Score: 5, Interesting

    Last time I checked, the current English Wikipedia dump was around 585MB. It should comfortably fit on one CD. Where did this figure of two DVDs come from?

    1. Re:Whaaa? by amliebsch · · Score: 5, Funny
      > Where did this figure of two DVDs come from?

      Well, when you add in the theatrical trailers, "making of" featurette, production stills, and commentary tracks... What I want to know is, will it be in Dolby Digital 7.1?

      --
      If you don't know where you are going, you will wind up somewhere else.
    2. Re:Whaaa? by Servants · · Score: 5, Informative
      I followed your link:
      Last dump made: 2005-03-09 (30 days ago)
      Total size 50503MB (1460MB for just current revisions)

      These are SQL dumps of the current and old article revision databases for each wiki. They can be read into a local database and directly used with the MediaWiki software (MySQL, PHP, Apache required).

      These dumps are not suitable for viewing in a web browser or text editor unless you do a little preprocessing on them first.
    3. Re:Whaaa? by Zone-MR · · Score: 4, Interesting

      Yeah, I know, but...

      1.5GB for current revisions would still fit on one DVD.

      Also, that 1.5GB is for all languages. The English version only uses 0.5GB of that.
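
      For anyone who wants to check the arithmetic, here is a small sketch. The dump sizes are the ones quoted in this thread; the disc capacities are the nominal decimal figures (usable space is a little lower), and none of this counts images or other media.

```python
import math

# Nominal capacities in decimal megabytes; real usable space is slightly lower.
CAPACITY_MB = {
    "CD": 700,
    "DVD-5 (single layer)": 4700,
    "DVD-9 (dual layer)": 8500,
}

# Sizes quoted in the thread for the 2005-03-09 dump (text only, no media).
DUMP_SIZES_MB = {"current revisions": 1460, "all revisions": 50503}


def discs_needed(size_mb, capacity_mb):
    """Number of discs required to hold size_mb, rounding up."""
    return math.ceil(size_mb / capacity_mb)


for label, size in DUMP_SIZES_MB.items():
    for medium, capacity in CAPACITY_MB.items():
        print(f"{label}: {discs_needed(size, capacity)} x {medium}")
```

      On these numbers the current revisions fit on a single DVD either way; it is only the full history that runs into double digits.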

    4. Re:Whaaa? by remahl · · Score: 4, Insightful

      Not counting images and other media, yeah.

    5. Re:Whaaa? by Knightmare · · Score: 4, Insightful

      I bet people would like to be able to read it or even search it off of the DVDs, which means storing it in bz2 format on the DVD is probably a BAD idea... So yes it's only 585 megs when bzip2'd but that isn't a very friendly format to deal with.
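
      To be fair, a bzip2 file can at least be scanned as a stream without unpacking it to disk first. A minimal sketch using Python's standard bz2 module is below; the sample text is made up for the example and is not the real dump format, and search_compressed is just a name chosen here. It is still a linear scan, so it does not give you the indexed lookup a reader application would want.

```python
import bz2
import io


def search_compressed(raw_bytes, needle):
    """Yield decoded lines containing `needle` from bzip2-compressed bytes."""
    # bz2.open accepts a file object, so the data is decompressed as it is read.
    with bz2.open(io.BytesIO(raw_bytes), mode="rt", encoding="utf-8") as stream:
        for line in stream:
            if needle in line:
                yield line.rstrip("\n")


# Made-up stand-in for a dump: one "article" per line.
sample_text = (
    "Railway: a track of rails.\n"
    "Loudspeaker: a transducer.\n"
    "Voicemail: stored audio.\n"
)
sample_dump = bz2.compress(sample_text.encode("utf-8"))

hits = list(search_compressed(sample_dump, "rail"))
print(hits)
```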

    6. Re:Whaaa? by jacksonj04 · · Score: 4, Insightful

      Nope. Wikipedia is available over HTTP in a much more up-to-date, interactive and dynamic format than DVDs. The whole purpose of the DVD sets is... I don't know. I really don't. But why BitTorrent it when you can just point your browser at wikipedia.com?

      --
      How many people can read hex if only you and dead people can read hex?
    7. Re:Whaaa? by saforrest · · Score: 4, Insightful

      > Wikipedia is available over HTTP in a much more up-to-date, interactive and dynamic format than DVDs.

      Well, yes, if you want to read it you're probably not going to download the entire bloody encyclopedia to your local machine via bittorrent.

      But some people would have valid reasons for wanting this. A lot of places resyndicate Wikipedia content, e.g. www.thefreedictionary.com or answers.com; I'm not exactly sure why these sites do it, but I can think of many valid reasons.

      Maybe data miners or researchers want to run scripts on Wikipedia and make all kinds of conclusions (such things are entirely legal and above board, since the content is free).

      > The whole purpose of the DVD sets is... I don't know. I really don't.

      Well, not all of us are connected to the Internet 24/7. Some of us have laptops without wireless Internet, and even computers without network cards at all.

      Lastly, there are many places in the world where you can't get a reliable net connection at all (e.g. various places in Africa, Asia).

    8. Re:Whaaa? by isny · · Score: 3, Funny

      Everybody knows that Wikipedia is best read in the original Klingon.

    9. Re:Whaaa? by Anonymous Coward · · Score: 5, Funny

      > Well, not all of us are connected to the Internet 24/7.

      What the hell's wrong with you?

    10. Re:Whaaa? by nc_yori · · Score: 4, Funny

      The best part will be that the 7.1 sound will be put together from contributions by users just like you and me from all over the world!

      The levels will be mostly ok, except for the sections where people have entries for themselves in which the dB level will be upped by 10 +-5. Also, the encoding will be completely and totally correct, except for a very small flaw which will cause the center right speaker to output everything in Latin.

    11. Re:Whaaa? by Raul654 · · Score: 4, Interesting

      Exactly right - the media take up BY FAR the largest amount of space. Being that I do a lot of work putting full length songs onto Wikipedia (and I'm pretty much the only one who does), I've put well over 2 gigabytes onto commons in the last 6 weeks alone. See the list of songs I've put up :)

      --


      To make laws that man cannot, and will not obey, serves to bring all law into contempt.
      --E.C. Stanton
  3. But... by over_exposed · · Score: 4, Funny

    How will the trolls deface a read-only version of it?

    --
    "The object of war is not to die for your country, but to make the other bastard die for his." - Patton
  4. Neat idea, but... by kyle90 · · Score: 5, Insightful

    I think it's a good idea to have wikipedia available in other formats than just online, but isn't the whole point of it that anyone can come and edit the articles to make them more correct? You couldn't do that with a DVD version. And unless someone is going to go through every article before putting it on a disc, you'd run the risk of buying an encyclopedia with some things blatantly wrong. I could envision pranksters trying to sneak in false information just before the DVD release...

    --
    Real_men_don't_need_spacebars.
  5. humm.. by thundercatslair · · Score: 3, Interesting

    I thought the whole idea behind wikipedia was that it is constantly changing. Will updated dvds be sold? And if so, will previous buyers get a discount?

  6. Mad dash to make "corrections" before it goes gold by FunWithHeadlines · · Score: 5, Insightful

    You know how controversial subjects in the Wikipedia get fights over entries. Back and forth it goes, with one person putting their "truth" and then the opposite side removing or replacing it with their version of the "truth." Now, just picture it: The deadline for the gold master version to be put on disc is announced, and like people pouncing on an eBay auction at the last second, the warring factions will rapidly replace each other's versions of an article, hoping that their version is the one to be immortalized on disc.

  7. Dead-tree version coming soon? by rice_burners_suck · · Score: 4, Interesting
    With Wikipedia taking up so much space on DVD, I certainly hope they compress the text. It should actually compress quite nicely, I think.

    I wonder... does this 2-DVD set include all articles from Wikipedia? (As opposed to some just selected somehow...) Also, I wonder if the DVD version will include all the version changes to the articles. If not, then perhaps the best version was picked out somehow?

    Hmmm... This is what I think needs to happen: Wait a few more years for Wikipedia to gain even more information, and then put some kind of button on pages that allows users to "vote" for that page to be included in a dead-tree encyclopedia version of Wikipedia. The idea is to put only those articles that have the highest votes into a traditional-style encyclopedia that can rival the likes of commercially made ones. Of course, there would need to be ways to cite sources, to make the encyclopedia worthy of academic research and the like, and preferably there should also be a way for people who want to do other stuff than write articles to submit photographs or whatever kind of artwork, of their own creation and released under the free license of Wikipedia, for inclusion in the articles. For the print version, people might be able to vote for the "best" photographs and artwork for inclusion. At that point, it should be a matter of running some perl script or something to typeset the whole darn thing. This might find its way into libraries and into people's homes. Imagine that!

  8. How fluid is Wikipedia? by spagthorpe · · Score: 4, Interesting

    How often do existing pages change? Maybe in a case where people catch errors.

    I have a spare 20GB lying around that I would install this on, if there was some way to sync it with the current state and have it download new pages and update current ones.

    --

    WWJD -- What Would Jimi Do?
    (Smash amp, burn guitar, take home the groupies)

  9. There are... by NumbThumb · · Score: 3, Informative

    ...no images in the dump. Just text. And no reader software.

    Also, the current dump is about 800 MB, gzipped. Enjoy.

    --
    I have discovered a truly remarkable sig which this 120 chars is too small to contain.
  10. School usage by under_R_run · · Score: 3, Insightful

    This would be great for schools. They could buy the DVD set and set up a local "mirror" of Wikipedia to increase access speed and decrease Wikipedia bandwidth usage.

  11. I think I speak for all the nerds here... by Phexro · · Score: 5, Interesting

    ...when I say, "two single-layer DVDs, or dual-layer?"

  12. Is this legal? by nebaz · · Score: 3, Interesting

    In order to publish and SELL this information on CD/DVD, does the Wikipedia Foundation have to get the permission of all the article writers, or is there, perhaps, a clause on the website that says something like 'we own all the stuff put on here'. What would happen if Slashdot sold versions of article comments on DVD?

    --
    Rhymes that keep their secrets will unfold behind the clouds.There upon the rainbow is the answer to a neverending story
    1. Re:Is this legal? by teslatug · · Score: 5, Informative

      Have a read.

    2. Re:Is this legal? by remahl · · Score: 5, Informative

      Text content contributed to Wikipedia must be GFDL, so the foundation can sell it as long as they respect the authors' copyright and the terms of the license. Although the Wikimedia Foundation is not-for-profit, even commercial distribution would have been acceptable under the terms of the GFDL. But the content copyrights still belong to those who created it.

      On the other hand, it happens that people contribute material copyrighted by other people, without their consent. According to U.S. law, Wikipedia cannot be held responsible for that, as long as they act quickly to remove infringing material. When physical media is distributed, that protection is no longer valid.

    3. Re:Is this legal? by the pickle · · Score: 3, Insightful

      Wow, what a karma whore.

      On the bottom of every single Wikipedia page, right there in plain sight, is a link to the GNU Free Documentation License, which governs everything submitted to Wikipedia.

      p

  13. wiki is going to get sued for this by Anonymous Coward · · Score: 3, Informative

    A lot of vandals copy/paste text from copyrighted websites onto Wikipedia. Usually they get found and deleted, but some are missed. If they sell copies of Wikipedia then they are going to get tons of copyright infringement lawsuits.

  14. Re:Why? by pmazer · · Score: 3, Interesting

    It makes sense for laptops which aren't always online, e.g. if you're writing a paper on your laptop and want to look something up, but can't easily get to a hotspot.

  15. Re:Why? by Chemical · · Score: 3, Interesting

    Because their site is slow, and the search engine always seems to be disabled for "performance reasons". I would consider it if the DVD included an enhanced search feature.

  16. Re:Why? by sinclair44 · · Score: 3, Informative

    Wikipedia's servers are often overloaded. My net connection can go offline sometimes. It's 100% positively available for a research paper, and will 100% be around to back you up. You can run complex searches on an offline version much better/nicer/faster than an online version (if you can run it online at all). You can show it off to friends. Or a multitude of other reasons.

    --
    Omnes stulti sunt.
  17. Wikipedia Magazine... I'd pay for it! by rice_burners_suck · · Score: 4, Interesting
    Here's an idea I just dreamed up... It shouldn't be too hard or costly to do, but it might make the Wikipedia folks quite a lot of money, if it works:

    On each Wikipedia article, there should be a button where users can vote an article as being "worthy" for academic research and the like. Articles that receive high votes would actually get published in a monthly (or even bi-weekly) magazine... So, for example, each month, subscribers would receive the magazine in the mail, and it would contain, in addition to paid advertising like any other magazine, something like ten or fifteen articles randomly chosen from Wikipedia. These would cover a broad range of topics. One month, you might receive a magazine with articles about Argentina, transaxles, grep, electromagnetism, George Washington, the Berlin wall, Apollo 9, goldfish, ballpoint pens, and cow manure. Some subscribers will already be familiar with some of the topics; others might not be interested in some of the topics; but chances are that if you pick up this magazine and read it, even for a few minutes a month, you'll learn some interesting new facts here and there, usually about topics that you'd never consider reading about in any serious manner, but which you're reading because the Wikipedia Magazine happens to be there.

    Links at the bottom of articles would direct the reader to the article online. This would serve an additional purpose: People who find something missing or something that could be improved in an article would perhaps be more likely to find out about it and then go online and fix it, thereby improving the quality of the entire Wikipedia.

    Money from subscriptions; money from advertisers in all fields (not just technical, and perhaps based on the content of that month's magazine) would finance the magazine and help finance Wikipedia. I see this as an opportunity to make quite a profit on something that is free, while mainly benefiting the community by doing so.

    1. Re:Wikipedia Magazine... I'd pay for it! by Raelus · · Score: 3, Informative

      Want to learn random stuff for free?
      http://en.wikipedia.org/wiki/Special:Randompage

      --
      "It is the stillest words which bring the storm. Thoughts that come with doves' footsteps guide the world."
  18. Re:What's their point? by MarthaStewart32 · · Score: 3, Insightful

    German people don't necessarily speak English and vice versa. And two DVDs is a lot of space. And 4 CDs isn't even a DVD. And just because other people use multiple disks doesn't mean it's a good idea. I remember playing Riven and having to switch disks way, way too often. And for an encyclopedia there would be a 50% chance that you would have to switch disks every time you looked something up. That would be rather annoying when trying to do any research.

  19. Where is the Great Publishing House of Ursa Minor? by AeonOfReason · · Score: 5, Interesting

    You know, Wikipedia is ripe for a Hitchhiker's Guide to the Galaxy treatment.

    Put it in a little handheld, stick an iPod hard drive in it, give it a USB port so it can grab updates, and presto.

    As for Wiki itself, "At least where it is inaccurate, it is definitively inaccurate." -Douglas Adams

  20. The fine print by Bifurcati · · Score: 3, Informative
    Just so we're clear, the article says that the majority of the price is going towards production costs and paying amazon. But if you're cheap, and really want a DVD set, then you can just download the images off "various websites", presumably to burn at your leisure.

    It's hard to get a more friendly distribution method than that!

  21. Re:Where is the Great Publishing House of Ursa Min by Raelus · · Score: 3, Funny

    And instead of a "DON'T PANIC" sticker, they'll put on a "DON'T EDIT" one.

    --
    "It is the stillest words which bring the storm. Thoughts that come with doves' footsteps guide the world."
  22. Vandals by Bifurcati · · Score: 3, Insightful
    Although vandals are rare, it's not inconceivable that across their entire page set there would be at least one vandalised page. Kind of unfortunate if that gets included in the DVDs!

    Anyone know if they have any way of stopping this?

  23. Re:Why? by amliebsch · · Score: 5, Interesting

    First, to "lock in" decent versions of controversial articles. But second and more importantly, to be able to produce a stable, constant "edition" that can be referenced and cited. How do you cite Wikipedia, when the content is always changing? Now you could write a paper and cite something like Person, Random, "Wikipedia Article," Wikipedia 2d ed. (2006). Very, very important if WP is to become a legitimate source of information.

    --
    If you don't know where you are going, you will wind up somewhere else.
  24. Another good thing about this... by bombadier_beetle · · Score: 3, Interesting

    ... maybe the zealots who use Wikipedia as their ideological battleground (e.g. this, this, or this) can host their own wikipediae, with their own versions of The Truth, and thus the revision wars on the original Wikipedia will stop.

    Or not.

    --

    If you mod me down, I shall become more powerful than you can possibly imagine.
  25. A bit of history on this by Raul654 · · Score: 5, Informative

    I first heard about this back in July of 2004. The people at Mandrake had already approached some of our people, and told us they wanted to put Wikipedia on DVD. The stumbling block was, of course, copyright issues. We launched a copyright tagging project in August - basically, they did an SQL dump of the list of all uploaded files that had no copyright tag and tagged them. In January, Angela sent them an email, telling them it was done, and that's when the DVD project actually started.

    --


    To make laws that man cannot, and will not obey, serves to bring all law into contempt.
    --E.C. Stanton
  26. Stupid Idea by MSTCrow5429 · · Score: 4, Insightful

    What's the point? Wikipedia is an inherently online medium. The articles change daily, new ones are created, etc. This cannot be reasonably placed on a static medium.

    --
    Slashdot: Playing Favorites Since 1997
  27. The German version is smaller because... by Just Some Guy · · Score: 4, Funny
    Deutsch has an amazing built-in fractal encoding scheme. For example, the German version may say:
    Gerflugenichterschweitzenbaggen.
    whereas the English version has to write out:
    Shortly after September 11, 2001, the United States attempted to rally its allies for a strike against the presumed Al-Qaeda stronghold in Afghanistan.
    Unfortunately, the RAR algorithm averages a 3% compression ratio on German text, in comparison to 82% for English and 94% for French - it's like bzipping a .gz file. On the other hand, there are significant savings due to the lack of entries on "sweet nothings", "pillow talk", and "Bavarian romantic verse".
    --
    Dewey, what part of this looks like authorities should be involved?
  28. Make it an appliance by andrew71 · · Score: 3, Funny


    self upgrading... and of course, based on GNU/Linux :)

    --
    13-4=54/6
  29. Re:Why? by ikkonoishi · · Score: 4, Informative

    And for those who care...

    An MLA/APA auto formatter for references.

    Every teacher at my school has recommended it to me. (Although I myself have not yet gotten a chance to try it.)

  30. Good, Bad, it is what it is. by cbreaker · · Score: 4, Informative

    Although I think actually USING the DVD set for normal use when you have broadband kinda defeats the purpose, I can think of a few reasons why it could be a good thing.

    A) Archival. Average users will be able to get a working, usable snapshot of Wikipedia, with media.

    B) Preservation. If Wikipedia were to shut down, you'd have a copy of it.

    C) Faster access. If you have a slow connection, you can still access Wikipedia at fast speeds. This benefit dwindles over time as articles are updated.

    D) Offline access. If you're on the road with no net connection, you can still access Wikipedia. This benefit also dwindles over time as articles are updated.

    E) Although backed by Google now which helps with the financials, if it brings in some cash to help support itself it's likely to stay around for much longer.

    --
    - It's not the Macs I hate. It's Digg users. -
  31. Wiki* in Plucker handheld formats by hacker · · Score: 4, Informative
    I've been working on the Wikipedia, Wikiquote, Wiktionary and other similar works to convert them to Palm handheld formats (primarily Plucker format, but now iSilo for those users as well, with less functionality in iSilo, of course). I did a lot of work to the core MediaWiki software that drives it, to make it more usable on handheld devices.

    You can see my work so far at the following links:

    Wikipedia in Plucker format
    Wikiquote in Plucker format
    Wiktionary in Plucker format

    ..and of course, my beautiful anti-alias fonts for Plucker, made with PalmFontConv by Alexander Pruss.

    I've also converted the Creating XPCOM Components book by Doug Turner and Ian Oeschger to Plucker format as well as the FreeBSD Handbook.

    I have literally hundreds of similar-quality works I'll be releasing over the next few months to the community on an ongoing basis.

    If there's something you'd like to see, just let me know.

  32. I hope they are careful about rights by blonde rser · · Score: 4, Interesting

    I hope they take the history of MathWorld as a warning of what can happen in the publishing world.

  33. Re:Here's a question: by Scrameustache · · Score: 4, Funny

    > How are they going to get a Snapshot of Wikipedia in which there is no vandalism in any of the articles?

    It's called "editing".
    You've been reading slashdot too much.

    --

    You can't take the sky from me...

  34. Wikimedia Foundation by dolmen.fr · · Score: 3, Informative

    This is about the Wikimedia Foundation, not the Wikipedia Foundation, which doesn't exist.
    Both the article and the /. post are wrong.