25000 Books Proofread By Project Gutenberg Distributed Proofreaders
New submitter fritsd writes "Project Gutenberg Distributed Proofreaders, a volunteer site which helps provide public domain books to Project Gutenberg, announced that their 100 000+ volunteers have reached the milestone of 25 000 books scanned, OCRed, and then meticulously proofread."
The 25000th title is The Art and Practice of Silver Printing by Capt. Abney and H. P. Robinson.
Many thanks to Project Gutenberg and their volunteers. There is a lot of great public domain material out there, and I've especially enjoyed Dickens, Wilkie Collins and Trollope. Also Jules Verne's work is pretty good for French learners.
If I'm not mistaken, they mean meticulously proofread by us in reCAPTCHAs.
When I was proofreading on DP, all rounds of proofreading involved examining the scanned images, comparing them to the OCR text, and making corrections. The later rounds of proofreading involved increasing attention to various details of correctness and formatting. All of this was done directly in the DP web interface. I didn't see any mention of the use of captchas in the OCR process.
I'm glad Mr. Guttenberg is doing something with his time and money with such a noble project as this. I guess it makes up for the Police Academy movies he did.
Now if I only read books.
Yep, multiple rounds, and multiple levels of proofers and formatters who have to earn the right to access those higher rounds by completing hundreds of pages and passing a few tests.
proof that socialism is a failure.
Proof that people can in fact be decent, generous, and caring.
Proof that people can in fact be decent, generous, and caring.
Or bored. Many years ago I had a temp job staffing a front desk: really quite little traffic and the occasional call, collecting the mail and various other small duties, but a lot of downtime and no interest in training me for more, since it was a rather short contract. Project Gutenberg seemed like a good way to pass the time, and they were cool with it as long as I tended to my other duties when they needed tending. Seemed like a better use of my time than playing solitaire.
Live today, because you never know what tomorrow brings
Proof that you are a moron: you don't know the difference between 'whose' and 'who is'...
That's why he's pissed at PD. They didn't like his work product.
Oh, I'm sorry sir, I thought you were referring to me, Mr. Wensleydale.
I signed up and proofread a few pages when I saw someone mention this site in the comments a few weeks ago. It's pretty interesting stuff and is mostly intuitive, but there are some tricky corner cases, e.g. hyphenated words that span two lines. Back in the day, publishers were pretty inconsistent about which words were hyphenated (e.g. to-day), and Project Gutenberg is (rightly) adamant that the text maintain the original spelling and hyphenation.
The only thing I completely missed was that I didn't put an extra newline at the top of the page when the first line was the start of a new paragraph. Those instances were found and corrected by the second-round proofreader. There is a third round of proofing, two rounds of formatting, two rounds of post-processing, and then an optional "Smooth Reading" round that anyone can do. I've checked out a few of the finished products, and they are much, much better than the naked OCR'd texts of old.
There is a lot of great public domain material out there
So what happens once Project Gutenberg has finished releasing all notable books in the English language that were first published in or before 1922?
I have read quite a few of their books and have found them all to be high quality edits.
I would like to thank everyone who has worked on the project for the excellent job they are doing.
(In contrast, I recently purchased a Kindle copy of Paul Theroux's The Happy Isles of Oceania, which is about 20 years old. They obviously produced the electronic copy by OCR and, from the looks of it, did little or no proofreading. There were obvious typos on every page. It's irritating that a publisher who actually gets paid to do this work can't be bothered to do even cursory proofreading.)
Makes you appreciate the fine work the Gutenberg people are doing.
I don't read your sig. Why are you reading mine?
Many thanks to Project Gutenberg and their volunteers.
Also many thanks to Michael Hart, the founder, heart, and soul of Project Gutenberg. Michael passed away in 2011. Although I never met him face-to-face, we exchanged many emails and even spoke on the phone a few times. He was a generous and selfless man, and somewhat eccentric (but in a good way). We love you, Michael, and we miss you. You made the world a better place.