Distributed Proofreaders Posts 5,000th E-book

Posted by timothy on Tuesday August 24, 2004 @06:41PM from the error-checking-and-correcting dept.

bbc writes "Distributed Proofreaders has posted its 5,000th ebook to Project Gutenberg. The book, A Short Biographical Dictionary of English Literature by John W. Cousin, was proofed for this special occasion by over 500 volunteers. Distributed Proofreaders is a project that distributes the otherwise gargantuan task of correcting scanning and recognition errors in an OCR'ed text. The project has thousands of volunteers, of whom many hundreds are active on any given day. It is currently the main supplier of etexts for Project Gutenberg."

Hm! by martingunnarsson · 2004-08-24 18:46 · Score: 4, Interesting

They should offer their services to authors and magazines, and raise some money from what they do. It wouldn't be enough to split between the proofreaders involved, I guess, but the project itself could get some money to buy...well, whatever they might need. Perhaps they already do this, I'm too lazy to find out :-)

--
Martin
1. Re:Hm! by FlipmodePlaya · 2004-08-24 18:54 · Score: 2, Interesting
  
  Sounds like a cool idea, and I'm not sure if they've done this either. I know that if I were sending a magazine out to a ~million readers, I would place great stock in my editing. The Distributed Proofreaders project probably wouldn't want to be held liable for the mistakes of volunteers, especially with the possibility of trolls.
2. Re:Hm! by baegucb_18706 · 2004-08-24 23:47 · Score: 2, Interesting
  
  Australia has somewhat more favorable copyright laws. Take a look at http://gutenberg.net.au/ which has some texts you can't download in the USA *wink*
500 people read it? by tod_miller · 2004-08-24 18:52 · Score: 3, Interesting

The book, a Short Biographical Dictionary of English Literature, by John W. Cousin, was proofed for this special occasion by over 500 volunteers.

Hardly un-put-downable... I suppose the fact that it is a Biography (shouldn't that be bibliography? *chuckle*) of English literature is kinda symbolic.

I guess this more than doubles the total number of people who have read this book though!

I like Gutenberg. I hope they start a system where you can download copyrighted books for a micropayment; I would pay good money for text ebooks.

Let's hope ebooks don't go the way of music: keep the costs low, with no DRM fluffing up the download. If you can click 3 times and start reading a new book, and it costs you euros, then you would prefer that to d/l'ing gigs of warez.

Anyone who illegally downloads lots of books tends to be the person who doesn't read them much anyway. (Someone boasted to me that they had 300 O'Reilly books, squirming under the desire to tell me that they were eBooks, off irc, oh lawks, what a riot, I wish I was your friend, go away)

--
#hostfile 0.0.0.0 primidi.com 0.0.0.0 www.primidi.com 0.0.0.0 radio.weblogs.com
1. Re:500 people read it? by tod_miller · 2004-08-24 20:26 · Score: 2, Interesting
  
  Would that 'honesty book shop' appease authors? I meant buying new releases and older copyrighted works (even out-of-print copyrighted works that Gutenberg won't touch). If I want to re-read some Orwell, Asimov, Steinbeck or others, where do I go? (pah, library...)
  
  Who cares what publishers think, they are wondering how they can be a middle man in a digital age. We will start with good bi-format books, all available in eBook, all 100% well formatted. Then some will move more over into eBooks.
  
  Then every internet whore will inflict their putrid poetry onto the world. Tum tee tum.
  
  --
  #hostfile 0.0.0.0 primidi.com 0.0.0.0 www.primidi.com 0.0.0.0 radio.weblogs.com
Who picks this stuff? by Animats · 2004-08-24 19:16 · Score: 3, Interesting

"Final Report of the Louisiana Purchase Exposition Commission"?
Still, I look forward to the day when someone starts digitizing the Mechanics' Institute Library in San Francisco. It's a beautiful private library one can join. The books are in excellent condition, and there are century-old original editions on the shelves.
But it's the magazine collection that's stunning. They have Popular Mechanics in bound volumes, all the way back to the beginning, when it was a serious scientific journal. All the major railroad magazines from the heyday of railroading. Every issue of Electric Railway Journal (the trade magazine of streetcars). Few other libraries kept that stuff.
law of averages? by Anonymous Coward · 2004-08-24 19:21 · Score: 2, Interesting

All in all, I have to say that I think this project is better than nothing at all. I am sure that the proofreading is better than what was there before.

However, I am curious as to just how accurate the proofreading is. I think that they try to improve accuracy by having many different volunteers; accuracy in numbers and all that. However, just because many people think in a certain way does not mean that what they think is accurate. Just look at standardized tests. They are specifically designed to make use of common mistakes, so that the majority (the swell of the bell curve) all get the wrong answer together. Only a slim minority will get all the questions correct. Considering how many people (even educated people) score around average on even the verbal and English sections of such tests as the SAT, GRE, etc., I wonder if certain passages in books will be incorrectly edited on a mass scale. This would especially be true for older or more complex works.
Re:good books? by adolfojp · 2004-08-24 19:44 · Score: 3, Interesting

http://www.icarusindie.com/Literature/Library/

That site has a couple of good ones. You should first read "The Lost Continent". The book was written during or shortly after WWI and follows a hypothetical development of the world if the new world and the old world had lost communication until 200 years later. The most interesting thing about those old science fiction books is to contrast their world view with ours and to see what futuristic devices they expected to exist by now.

Cheers,

Adolfo
Re:I need a new job by jonathan_ingram · 2004-08-24 20:07 · Score: 5, Interesting

Luckily, you do not need either grammar or spelling skills -- just the ability to match text against a source image. Indeed, it may even be an *advantage* to not be a great linguist! One of the key things we emphasise is that we want an exact copy of the source material -- we do not want people 'correcting' or 'updating' the originals to bring them into line with the way the language is written today.

--
-- Help Digitise the Public Domain at DP.
Re:Make them renew each year by iamdrscience · 2004-08-24 20:07 · Score: 3, Interesting

Lawrence Lessig proposes a similar scheme in "The Future of Ideas". I doubt he was the first, but that's just what you made me think of. It's a good book, even though it can get kind of dry at times (it is, at least in some capacity, a book about law after all).

As far as your scheme though, I would really like a hard extension limit and I think 25 years for a default term is really too much (I mean, to use your example of Apple II games, many of those games wouldn't even quite be out of term yet). I think 5 or 10 would be much better.
Re:because by jonathan_ingram · 2004-08-24 20:09 · Score: 4, Interesting

because Playboy hasn't lapsed into the public domain yet...

Very true, although several of us do keep talking about searching for some Victorian Porn to put through the site :). There's actually quite a lot of public domain 'erotica' (anything written and published before 1923, for example) -- we just need people to scan it and contribute it to the site! We've had a couple of 'racy' books, and not surprisingly they tend to be proofed very quickly.

--
-- Help Digitise the Public Domain at DP.
Re:Make them renew each year by RAMMS+EIN · 2004-08-24 20:40 · Score: 2, Interesting

``You get automatic copyright for 25 years. After that, you must pay $1 per year to keep something in copyright. If you can't be bothered to keep track of your stuff and pay the $1, it lapses into the public domain.''

I would even go a bit further. Why even have a default term at all? (and 25 years is a LONG time) And $1 is arguably a bit little. If you really care, you can pay a bit more. Maybe we can even have different levels of protection - pay nothing if you allow modifications, pay more to retain exclusive rights to distribution, etc.

I think this is an interesting idea worth investigating. Thank you for publishing it!

Oh, and BTW, I will be using your idea as if it were mine, unless you pay your $1, of course. ;-)

--
Please correct me if I got my facts wrong.
formatting by golgotha007 · 2004-08-24 21:16 · Score: 3, Interesting

I think Project Gutenberg is a terrific idea!

My only complaint is with the formatting. Project Gutenberg uses hard formatting (fixed line breaks) within the text. I think that's an extremely stupid idea.

There should be zero formatting within the text (other than paragraph breaks). Whatever client you're using should provide the formatting for you.

Let the client handle the presentation!!
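
To illustrate the point about letting the client handle presentation, here is a minimal sketch of how a reader program might undo hard line breaks and re-wrap a plain-text ebook to whatever width it wants. The function name `reflow` and the sample text are hypothetical, and it assumes paragraphs are separated by a blank line; this is not a description of any tool Project Gutenberg actually provides.

```python
import textwrap


def reflow(hard_wrapped: str, width: int) -> str:
    """Join hard-wrapped lines into paragraphs, then re-wrap them to `width`.

    Assumption: paragraphs are separated by a blank line.
    """
    rewrapped = []
    for para in hard_wrapped.split("\n\n"):
        # Collapse the fixed line breaks inside one paragraph into spaces.
        joined = " ".join(line.strip() for line in para.splitlines() if line.strip())
        if joined:
            rewrapped.append(textwrap.fill(joined, width=width))
    return "\n\n".join(rewrapped)


# Hypothetical sample text, hard-wrapped at a narrow fixed width.
sample = (
    "It was the best of times,\n"
    "it was the worst of times,\n"
    "it was the age of wisdom.\n"
    "\n"
    "A second paragraph\n"
    "follows here."
)

print(reflow(sample, width=40))
```

A naive reflow like this would mangle poetry, tables, and indented quotations, which is presumably one reason a plain-text archive might keep fixed line breaks.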
Re:How strange by bbc · 2004-08-24 23:21 · Score: 2, Interesting

This is all my fault! :-(

I got a bit carried away. This 5000th project was organized so that as many proofreaders as possible would work on it. (Although any book going through DP runs a chance of being proofread by many separate people, usually proofreaders stick with a certain book for a while, so that the work has only been seen by 50 or so.) I was so glad we pulled it off that I sent a story to Slashdot without thinking.
Request for MATH experts by jhutch2000 · 2004-08-25 00:52 · Score: 5, Interesting

Right now, we've got plenty of old math-intensive books ready to move through the DP system. Because of ASCII's terrible ability to handle equation formatting, we use TeX layout. The average DPer doesn't know TeX, and it has a rather steep learning curve to get started on. So, since Slashdot is full of self-professed geeks...all you TeX geeks should join up and help with the TeX-formatted MATH texts. I've got plenty of books scanned and ready to go, so don't think you'll run us out of 'em any time soon!
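
For anyone who has not seen TeX, a small illustration of the difference: in plain ASCII the quadratic formula has to be flattened into something like `x = (-b +- sqrt(b^2 - 4ac)) / 2a`, while TeX markup records the actual structure. The formula is only an example chosen here, not taken from any book in the DP queue, and DP's own markup conventions may differ from this sketch.

```latex
% Quadratic formula: the fraction, the plus-or-minus sign and the
% square root are all marked up explicitly instead of being implied.
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]
```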

JHutch
Accuracy by jefu · 2004-08-25 01:45 · Score: 3, Interesting

I have worked on the distributed proofing of a couple of texts and found that the accuracy of a page after the second proofing was often close to perfect.
One of the books I worked on was the "Anatomy of Melancholy" and I (conveniently) have a copy myself. There were often more differences between the scanned image of the page and my copy than between the scanned image and the proofread text.
Don't underestimate the amount of work people put into this, too - for "Anatomy of Melancholy" it often took 30 minutes to proof a single page because the page often had Latin and very small footnotes.
Re:because by jhutch2000 · 2004-08-25 02:58 · Score: 2, Interesting

Yeah! I'm one of the "several" that Jon's referring to. I got a real kick out of a recent book that was posted by us to PG...

Sane Sex Life and Sane Sex Living
For a turn-of-the-century study of sex (published 1919), this guy was amazingly (IMHO) progressive! A very fun read! JHutch
Re:Shocking by Dizzle · 2004-08-25 05:15 · Score: 2, Interesting

Some of those are legit too. Professional/Professor reading gets shortened to profreading. The other mistakes are mostly users.

--
-Dizzle
"I most likely AM so interested in myself."