German Wikipedia To Be Published As a Book
David Gerard writes "Bertelsmann is to publish a single-volume book of the German Wikipedia in cooperation with Wikimedia Deutschland. It will cost 20 Euros, and 1 Euro from each copy will go to Wikimedia. They're editing down the most popular 50,000 articles for the 1,000-page book, to be released in September. Because of the open-source origin of the material, the publisher cannot claim copyright in the book." The German-language Wikipedia is second in size only to the English version, which has 2.3 million articles.
When I was working at IMDb.com (the Internet Movie Database), I asked Col Needham (the founder and managing director) why they never released it as a book. His answer was that the database was constantly changing. With the lead time you had to give for the actual printing, by the time any book hit the shelves, it would be months out of date.
I think Wikipedia falls victim to the same problem. It might be a very good book and they might select the most stable entries, but like IMDb, Wikipedia is a living, breathing thing that grows and changes on a regular basis. In fact, that's part of its appeal. A book is basically just freezing a snapshot of selected articles in time, but how much does something lose from being frozen like that when part of its value lies in its dynamic nature?
- Greg
Start a happiness pandemic
I didn't see a reference [in linked article] to percentage of sale paid to Wikimedia, but found one here. My kneejerk reaction is that if only 5% of the sale price ends up in the pockets of Wikimedia, that sounds a little thin to me. The article does note that a staff of ten was required to edit the articles for content and length, but it still sounds like the publisher is profiting perhaps a bit more than normal off of the work of others. And knowing that many people will likely purchase the reference to support Wikipedia, it would be nice to see around 10-15% of the gross sale price returned to the author (or, in this case, to Wikipedia).
My ballpark of "10-15% of gross" comes from the fact that although I am not in the literary world, I do work in entertainment (aka: cinema), and it's common for DVD producers to receive between $1.50 and $4 on each sold copy. On two of my films I receive around $3.50 after each wholesale transaction (when a chain retailer buys copies at $12/each wholesale to sell for $19.99 on their shelves). The second film in question was offered distribution to WalMart, and because of the bulk they buy in, the deal with them was closer to $1.50. (In the end, for artistic reasons that had to do with creating a specially "WalMart-friendly" edited version, we passed on the WalMart deal). I wonder if someone in book publishing can speak to whether the numbers I'm used to from video publishing are generally commensurate? I don't know what the cost-of-goods-sold for books is, so perhaps it's substantially high enough that it pushes authors' margins to a fraction of what they are in video publishing, but my kneejerk reaction is that 5% is too low.
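For anyone who wants to check the percentages in this comment, here is a minimal sketch. It uses only the figures quoted above (1 euro on a 20 euro book, $3.50 on a $12 wholesale DVD that retails at $19.99). The function name is made up for illustration, and comparing euro and dollar shares side by side is a simplification.

```python
def share(payout: float, price: float) -> float:
    """Return payout as a percentage of price."""
    return 100.0 * payout / price

# Figures taken from the story and the comment above.
book_share = share(1.0, 20.0)        # 1 EUR to Wikimedia per 20 EUR book
dvd_wholesale = share(3.50, 12.0)    # $3.50 per copy against the $12 wholesale price
dvd_retail = share(3.50, 19.99)      # same payout against the $19.99 shelf price

# What the suggested 10-15% of gross would mean on a 20 EUR cover price.
low, high = 0.10 * 20.0, 0.15 * 20.0

print(f"Book: {book_share:.1f}% of cover price")
print(f"DVD:  {dvd_wholesale:.1f}% of wholesale, {dvd_retail:.1f}% of retail")
print(f"10-15% of 20 EUR would be {low:.2f} to {high:.2f} EUR per copy")
```

Whether wholesale or retail is the fair denominator is exactly the open question for someone from book publishing.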
I am Jack's complete lack of surprise.
So does this mean you can cite Wikipedia as a valid source since it's in print? (Yes, I'm joking.)
------
"And may your days be long upon the earth."
Who are they going to fact check against? Wikipedia?
If they go by popularity in terms of the number of visits, I'm guessing that the entries like 'breast' and 'lesbianism in erotica' are very likely to make the final cut. But will it include all the pictures?
Apprehensions about Jimmy Wales' character aside, my main gripe with Wikipedia is that I am suspicious of everything I read there. Mostly this stems from the fact that in any topic on which I am an expert, I can generally stumble across several very glaring errors. Of course, reading topics on which I am not an expert, I find myself to be generally entertained and educated-- provided that I don't think about the likelihood of errors in those articles. I will grant that the errors usually don't take away from the overall education that a novice would receive.
With a staff editing the articles for content, fixing some of the more glaring errors, and selecting the more stable articles, I think a Wikipedia tome will nicely bridge the gap between meatspace and cyberspace. Keep in mind, not everyone has an Internet connection at all times, nor is Wikipedia guaranteed to be functioning 100% of the time. DNS errors, routing problems, etc. all occur. The last couple of years have begun an interesting transition, merging various forms of entertainment and education. It's no longer divided into books (paper), TV/radio (static electronic entertainment), and Internet (chatting, web forums, other forms of dynamic entertainment). You have TV shows producing extra content for the web, you have individual content publishers using YouTube and other outlets to publish stuff that would never otherwise have an audience, you have radio shows (NPR, etc.) offering podcast downloads, you have paper books also being published electronically (Kindle, Google Books, etc.), and now you have an electronic encyclopedia, almost ironically, making the jump to a paper edition.
Call me an old fashioned geek, but I like paper, and given the chance, I'd buy a Wikipedia print edition.
I am Jack's complete lack of surprise.
So anyone can publish the same (or similar, or improved, or lighter, or more sustainable - recycled paper?) book, charge a slightly lower price, sell it in the same market, and profit!
...
Given the general community behind the content, it would seem appropriate to print on recycled paper, or do whatever else passes as green publishing these days. Wait, isn't that what publishing online is about, saving trees? Hmm.
How on earth is that going to work, cramming 50,000 articles into 1000 pages? They could edit each article down to a single paragraph and you'd still need a magnifying glass to read it.
Visual IRC: Fast. Powerful. Free.
It's going to be self referential! By the time the 50k articles get picked out, there will be an article on the book and hopefully the book will contain the article on itself! Sweet!
My kneejerk reaction is that if nothing is required to be contributed back to Wikimedia, then 5% is awesome!
Remember wikipedia's content is licensed under the GNU FDL, which states:
There are shills on slashdot. Apparently, I'm one of them.
Rather than publish the X "most popular articles," I think a more fun compilation would be a collection of the most unique, un-Encyclopaedia Britannica articles on Wikipedia. Things that would never have made it into a real encyclopedia before the web, but that have flourished on Wikipedia. Or, along the same line, anything that showcases it as not just another encyclopedia would be cool. I'm sure there are some other cool ideas out there. (P.S. - My first ever Slashdot post!)
A book that contains 50,000 poorly cited articles about David Hasselhoff.
The words 'Don't Panic' should be printed on the cover. Hey, it's a start.
...for some troll edit to end up getting into the book. I hope they edit it really well and carefully read through it all.
"Rammstein is a German band that was formed in kyle is a big fag, Germany. They..."
Random Thoughts From A Diseased Mind (Not For Dummies)
So where does the other 19 euros go? Unless they're planning to print on gold paper, publishing costs can't be that much.
I can't vouch for the validity of these article stats, but they do appear to be legitimate.
Based on these top viewed pages, any book published using "popular" articles as a reference would be banal, amusing, and surreal. All at once.
You've got the all-time favourite internet searches "sex" and "naruto" along with recent political events, blockbuster movies and games, internet sensations and memes (2g1c, for example).
Wikipedia w/o hyperlinks? No thanks. Or does it come with a box of bookmarks?
The German Wikipedia is currently ranked 2nd according to wikindex.com, but the fascinating part is what other popular wikis are out there: the World of Warcraft wiki is huge, beating many European-language wikipediae; TV show wikis are big, as are online games and sexual collections.
I guess my point is that I agree with you: the interesting thing about wikis is the non-standard collection of ideas, no matter how "non-important" or esoteric they seem to the general public.
davejenkins.com
Anyone can edit their volumes with the included white-out and ball point pen.
You want fun, go home and buy a monkey!
Not to mention the age old argument "you can't grep dead trees."
A fool and his lamb are worth two in the bush.
To keep the spirit of wiki alive in this tome, it'll be printed in pencil and be sold with an eraser and a pencil for readers to edit the articles as they wish.
... why a hardcopy? One of the greatest appeals of Wikipedia is its searchability and linking. You can take a snapshot of Wikipedia and put it on a CD or DVD - save a tree or two and have a more useful version of the information. And still accessible to those without Internet connections or when Wikipedia is down.
"The agriculture ministry is not in charge of Gundam" - Japanese ministry official.
When I want to edit, do I have to cite?
I guess my point is that I agree with you: the interesting thing about wikis is the non-standard collection of ideas, no matter how "non-important" or esoteric they seem to the general public. Bingo!
One "side-wiki" that I frequent is the Lostpedia. Package that with the season DVD box set and you've got a whole new kind of product.
There is no tree shortage. Trees aren't being hunted to extinction like whales, but instead are being farm-raised, like carrots. There are way more trees in the US than there were at the beginning of the 1900s because of these techniques.
Should I stop eating carrots because of the looming carrot shortage?
No comment
I won't bore you with a detailed explanation of German defamation laws, but they are far more restrictive than the laws in the USA.
While online websites sometimes avoid defamation by quickly changing defamatory comments before they cause much damage, a published book does not have the same ability to be wiped clean in an instant.
What is to stop someone maliciously creating a defamatory article about themselves, waiting for Wikipedia to be published, then suing the company that produced the book?
I think it would be a brave publisher who would cede control to the millions of Wikipedia contributors.
If the pattern goes 9am, 10am, 11am, why isn't noon 12am?
Earth: Mostly Harmless
Tm
Support TBI Research: http://www.raisinhope.org
...En að Besta Sem Guð Hefur Skapað Er Nýr Dagur
Doesn't that kind of defeat the purpose of peer review? Now people can look at the obviously fallacious claims and do nothing about them except complain that Wikipedia is even more untrustworthy.
The subtitle will be [citation needed] ;-)
NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
I've seen snapshots of Wikipedia being sold on DVD in German supermarkets (e.g. Aldi). But I guess this is the first time they sell one as a book.
Give a man a match: warm him for an instant. Douse him in petrol and set him aflame: warm him for the rest of his life.
I'll just wait for the eBook to come out.
You mean like redicilously detailed descriptions of every episode of Arrested Development? In the main Wikipedia namespace, like this episode. I forget what episode it was, but pretty recently, while searching for something, I ended up on the Wikipedia page for a freaking single episode of a TV series with a name similar to what I was searching for.
Now I love Wikipedia precisely because of this kind of obscure information. I remember reading an article about "which is the most powerful character in the Dragonball Z universe" and it pisses me off when some obscure article that I found gets deleted because it's not important enough. But for every single episode?
...as long as they don't include any articles about Harry Potter ;)
So, will this give the deletionists an excuse to go on a rampage, deleting articles they deem unworthy of being included in a dead-tree book?
"This article is unnotable because it doesn't happen to interest me. Wikipedia is a real encyclopedia, not a collection of random facts, and we can't endanger our chances of getting published by including anything that Encyclopedia Britannica wouldn't. Besides, I'm in a bad mood and a little power trip might cheer me up."
Mod me troll if you will, but it's still true. The Deletionist Scourge will use any excuse. That's why I don't contribute to Wikipedia anymore: there's no point when the most likely result is to have said contributions deleted because Joe Powertrip hasn't heard of the subject previously.
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
In nice big friendly letters
I love Wikipedia. I believe it's a great boon to research and general knowledge -- if used correctly (i.e.: used as a kind of filtered Google-search, and as an initiator for further research -- but always to be questioned for the integrity of its data).
However, what really puts me off de.wikipedia is the tone (or: style) most of the articles are written in. Sophomoric gushing, sentences without end, wanton cruelty to the common comma (thanks T.P.)... these (and many more) insults to language in general, and scientific language in particular are to be found in almost every article.
You say that en.wikipedia is just as bad? Well, I cannot speak for everyone, but to this near-native reader there still is that blissful transcendence of tackiness by translation: even the most ludicrous or cheesy constructs of the English language will sound sweeter than their actual meaning when translated into any old-world (roman) language while reading.*
A.C.
P.S.: Felle free to mok any grammatticcal or speling errors in this post -- I most definately deserve that! Inability to apply language, however, does not prevent one from noticing errors in other people's application of language (at least that's what I've experienced so far). Besides how gut is your German? Mock me again, when it's as bad as mine (und ich spreche die blöde Sprache seit ca. 2 Jahren nach meiner Geburt).
* Don't know about Indo-Chinese languages though... my guess is anything becomes even sillier, but no one cares, and they like to party with words. And who could blame them?
It's a good thing for Wikipedia. A lot of people are media-conservative in the sense that they don't take Web content seriously, particularly an encyclopaedia that is written by volunteers. Example: I wanted to prove a point to my dad a while ago using a Wikipedia article, and his reply was essentially "that article has no value and cannot be trusted as it was written by people hanging around on the Web". A printed book made by a real, large and well-known publisher might change this attitude, especially of those people who think Web content is worth less than printed content.
Also, I'd expect it to push Wikipedia contributions and the overall article quality. If people may expect to see their work in a printed book hopefully sold in large numbers, it will motivate them to contribute higher-quality content to Wikipedia. You can go to a book store and tell your friend: hey, look, I wrote some of the stuff in this article!
On the downside, I agree with those who wonder how they will fit 50K articles into a 1000 page book. 50 articles per page will mean one short paragraph per article on average. It's not possible to represent the nature of Wikipedia content in a space that small. Most articles will have to be edited down to the kind of content you would expect in any conventional (printed) encyclopaedia.
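The squeeze described above can be put into rough numbers. A minimal sketch, assuming a purely hypothetical 600 words per printed page (that figure is not from the article; only the 50,000 articles and 1,000 pages are):

```python
def words_per_article(articles: int, pages: int, words_per_page: int) -> float:
    """Average words available per article if page space is shared evenly."""
    articles_per_page = articles / pages
    return words_per_page / articles_per_page

ARTICLES = 50_000              # from the announcement
PAGES = 1_000                  # from the announcement
ASSUMED_WORDS_PER_PAGE = 600   # assumption for illustration only

articles_per_page = ARTICLES / PAGES
budget = words_per_article(ARTICLES, PAGES, ASSUMED_WORDS_PER_PAGE)

print(f"{articles_per_page:.0f} articles per page, "
      f"about {budget:.0f} words each at {ASSUMED_WORDS_PER_PAGE} words per page")
```

Plug in a different words-per-page guess to see how little the conclusion changes; any images or index pages would cut into the budget further.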
Also, I wonder how much Bertelsmann will benefit from this deal. 1 EUR per book for Wikimedia is not exactly generous. On the other hand, we can expect to see this book prominently on display in most every book store. If they sell 100K copies, Wikimedia will get 100K EUR, which means a lot to them.
I don't need a paper version, and I donated exactly the price of the book to Wikimedia earlier this year.
Sounds like you'd probably like Wikipedia's list of unusual articles. A print version of that would be awesome.
apterous.org
Go for it - keep it GFDL and make a few bucks :-D
http://rocknerd.co.uk
You'd be surprised how close your idea is to what you're likely to end up with. The 50,000 articles they're working from aren't the most popular by number of edits, but by page views. The indices will include a list of the most frequently used search terms and the article to look at. In short, this isn't an encyclopedia, it is a sample of the instant, a Zeitgeist.
There will likely be at least one image on every single page, probably more. However this is going to be a hardbound edition measuring 17 x 24 cm - a pretty big book. Something a lot of people would refer to as a coffee table book.
The initial print run will be 20,000 copies and I somehow don't think they'll have much trouble shifting all of them.
The words are longer, but you don't need as many of them. See, that would be one word in German.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Anyone else have this impression? Whenever I edit an article it is just a matter of hours before some registered troll undoes my changes - and this seems to apply to many people, if the discussions are any indication.
I was looking at converting HTML into PDF and found Prince XML. Some authors of a professional book on CSS wrote it in HTML and used Prince XML to generate the master PDF document they sent to the printing press. This page shows what a PDF version of Wikipedia would look like. You can change the look by just changing the CSS.
This is:
"Bertelsmann is to publish a single-volume book of the German Wikipedia, in cooperation with Wikimedia Deutschland. 20 euros a copy, 1 euro from each copy to go to Wikimedia. They're taking the intro section from 25-50,000 articles for the 1000-page book, to be released in September. Who says open source writing can't work?"
http://rocknerd.co.uk
You mean like Red Iculous? You seem to love that one.
Justice is the sheep getting arrested while an impartial judge declares the vote void.
That is going to be a seriously small font!
Now I only know about the UK, but I'd be interested to hear a judgement on the compatibility between the GFDL (or similar) and the UK classification of "typographical arrangements".
Basically, a typographical arrangement (TA) is a collection of multiple works into a single volume. A TA has copyright protection for 25 years from the end of the year of first publication.
The idea is that I can research, for example, 18th century hymns and gather them into a single book. The hymns themselves aren't under copyright, so it would be no great work for someone else to replicate my hymnal, right down to hymn numbers and page numbers, undercut me and devalue my life's work.
So TA protection came along to protect my work. You can make something just as good if you want, but you can't make exactly the same thing (or even something substantially similar). So there.
I'm sure other countries must have similar laws regarding collections, compilations, albums or similar TAs.
Let's have a look at clause 7 of the GFDL:
7. AGGREGATION WITH INDEPENDENT WORKS
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation's users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.
Does each Wikipedia article constitute a "separate and independent document"? If so, the GFDL allows copyright protection to subsist on the compilation ("aggregate" in the GFDL's terms), even though every scrap of text is individually GFDLed....
HAL.
Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
Do we get a book of pixelated cell phone camera pics of the back of the subject's head just like the real, post-Photo Nazi Wikipedia?
...it is written in pencil, and comes with an eraser and a pencil so that I can treat it like the real Wikipedia.
Andrew Borntreger
Champion of cinematic disasters
No they can't copyright the articles, but does the copyright still belong to the original author and can they revoke the right to have their article printed? Also they say 1 euro from each will go to Wikimedia, but where does the rest of the money go? IOW, is someone making a profit off of this and is that ok?
Arne Klempert, a spokesman for Wikipedia Germany, said the definitions would only be short summaries of the Wikipedia articles and there was no breach of the rights of Wikipedia contributors.
Commercial republication was allowed under the Wikipedia rules accepted by the site's users.
I can't believe Random House would have suggested this project without feeling they were going to make some money off of it. Their costs will consist of editing and publishing, but they won't have to worry about future writer royalties. I wonder if the writers have possibly given up their copyrights? As a writer, I might not have a problem contributing an article to something--say a special interest group's newsletter I was involved in--but I would want to retain all copyright claims to it, including the right to send it to a magazine at a later date and get a paycheck for it. I would NOT want to find that someone took my article from those newsletters and then published it (even in edited form) in a book. It seems to me that, even though Random doesn't hold the rights to the articles now, neither do the authors. Unless they (the book's editors) make the contributions so watered down that their value toward an encyclopedia of popular culture is negated.
I'll be interested to see if some of the contributors start to object.
If you've never been modded as "flamebait" or "troll," you've never tried to argue a minority viewpoint here!
It'll be even lower than 5% once they publish this and I reprint a knockoff version for sale at half the price!
Freedom isn't free; its price is the well-being of others.
You're missing the point: clause 7 says that an "aggregate" work (a compilation of various documents) does not constitute a derivative work. The GFDL applies to each individual document separately.
The question is what constitutes a single document. Would the law uphold that a Wikipedia article is an independent document, or would it classify the whole of Wikipedia as a single document? While the use of hyperlinks may suggest the latter, if we were to extend this argument to its logical conclusion, the whole internet could be described as one document. But maybe that's a bit of a strawman.
Anyway, I would argue that each article is an individual, self-contained document; thus the book is an aggregate work not governed as a single entity by the GFDL. As such, the German publisher may be able to legally block others from bringing out a substantially similar competing book.
HAL.
Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
Why bother going through the trouble of collecting the best articles and proofing/editing them for accuracy? Don't you just end up with an encyclopedia? Do they even make those anymore? Hold on, I'm going to go and check over at Wikipedia...
Sigs are for suckers.
I must confess I wonder what would happen if an accident happened to the various chip fabs, how far that would set us back technologically. I hope that they aren't all in East Asia.
Perhaps what is necessary is a book containing the bare minimum to enable us to recover digital information. I suspect that flash would have remarkable longevity if you only write once, and store it in an enclosed nitrogen atmosphere with a desiccant sachet. Sandisk looks to be bringing out some sort of 100 year archival quality flash. It's not 1000 years, but it's a step closer.
If we had a collection of books detailing exactly how to go from stone-age to creating keyboards, monitors, computers (even a primitive computer), and then everything else is stored on flash, I suspect that could be useful. An alternative might be storing a few LCD screens, a few keyboards, a few completely solid state computers and a few solar panels in your basement for just such an occasion, and put ALL the info onto digital. That way if TSHTF you just put up your solar panels. Maybe you'd need a book detailing how to make a generator for when the solar panels eventually fail, which you could then hook up to a water mill.
A major concern with the printed word is density. Libraries are large and vulnerable. It's rare to discover a forgotten library as such. It's much more common to uncover a scroll or book. I'd suggest that my "civilization in a suitcase" would be much more useful, fairly inexpensive and likely to survive barbarians if there was enough redundancy. I think the cost would not be much more than an order of magnitude though. An Eee PC, an airtight bottle, a few 16GB flash disks, some desiccant, a generator construction manual, some solar panels... probably not more than $600. You'd probably have to resolder solid-state caps on the board though.
http://www.engadget.com/2007/02/27/sandisk-secretly-concocting-read-only-memory-for-archival-use/
If I have seen further it is by stealing the Intellectual Property of giants.
Um... what about GFDL?
http://en.wikipedia.org/wiki/GNU_Free_Documentation_License#Burdens_when_printing
"The GNU FDL requires that licensees, when printing a document covered by the license, must also include "this License, the copyright notices, and the license notice saying this License applies to the Document". This means that if a licensee prints out a copy of an article whose text is covered under the GNU FDL, he or she must also include a copyright notice and a physical printout of the GNU FDL, which is a significantly large document in itself. Worse, the same is required for the standalone use of just one (for example, Wikipedia) image."