Why Project Gutenberg Isn't There Yet
option8 writes "This Wired article ('Any Text. Anytime. Anywhere. (Any Volunteers?)') goes into good detail on why Project Gutenberg and similar efforts are far from creating a complete, free electronic library. A quote: 'The mechanics of a universal library are simple. The tricky part: harnessing the free labor.' Though it doesn't go into technology much, I expect there's a lot of potential in mass OCR tech and good speech recognition (faster to read a book aloud than to transcribe it correctly)."
Umm, Project Gutenberg can only legally use public domain works. If you know of any 100+ year old novels typeset in TeX, let's hear about it. Even if a modern reprint was done recently, do you think the publisher would really want to give away all that hard work so that everyone can get it for free instead of buying their spiffy new edition?
1. Get manuscript from author.
   This could be either handwritten or typed. If typed, it's likely to be in either plain text or Word format, but with a lot of errors.
2. If the manuscript's handwritten, farm it out to a typist.
   We used to pay 0.5 yen a letter for English, 1 yen a character for Japanese.
3. Once it's data, edit.
   I used to do my editing on a Mac with BBEdit, but this varies a lot between editors - some do it on (shudder) Word, where all the formatting gets in the way.
4. Reformat it to pass it to the DTP firm.
   When I say 'reformat', I don't mean making things bold or italic - I mean cleaning it up so it's easy to do the next step, which is...
5. Print out and insert format directions.
   The manuscript is printed out, and you go through it one line at a time adding things like "Line break here" and "Use larger font for this".
6. Proofs arrive from the DTP firm.
   You go through the proofs, making corrections by hand (i.e., "Move this down one line", etc.)
7. The DTP firm passes you back the formatted data.
   QuarkXPress is king here. You get the data in a finished form and pass it to the printers.
8. The printer produces the final proofs.
   You can still make corrections, but these have to be done by the DTP firm, who then give you the updated data.
9. Last-minute corrections are made.
   This depends on the printer, but quite often these are done by pasting the changes over the top of the printer film (i.e., they're not reflected in the data).
10. The book is printed.
    Corrections after printing are usually done as described above (pasting changes over the film).
The problem with this is that the text data held by the editor is now out-of-date in all sorts of ways:
- It doesn't have the corrections made by the DTP firm.
- It doesn't have the corrections made by the printer.
- It doesn't have any formatting.
QuarkXPress can output the data in other forms, but it's still missing the last-minute changes and after-printing changes. Quite frankly, once the book is on the market, most publishing companies aren't interested in reworking the data to keep it as text for the next 90 years so it can be released into the public domain.
Mickey Mouse will never be public domain because MICKEY MOUSE IS A TRADEMARK/LOGO. That would be like forcing IBM to give up their IBM logo/colors/design.
However, *Copyrighted* works should eventually go into public domain. The point is that after you are dead, anything - be it a movie, song, cartoon, book, poem --- whatever --- serves a greater good to mankind than it could to its dead creator. I think that a decade or two is too short of a limit for copyright. If I write a book when I'm 20 years old, I should still be allowed to make money off the sale of that book when I'm 40. But when I'm in the grave, it serves me no use.
Now, it could be said that a person who works hard to create pieces of work like movies or books or songs should be allowed to bestow the revenue from that material on their heirs after they are dead. If I write a book that still sells well 20 years after my death, my son and daughter should be allowed to benefit from this copyrighted item in my 'estate'.
But I think that indefinite extensions are ridiculous. I would say that 100 years is bordering on ridiculous. I think that 75 years is reasonable. If I create something when I'm 25, the copyright will outlive me by as much as 25 years.
In fact, I would propose that copyright should be extended to the life of the creator plus 20 years **OR** 50 years. Whichever is less (so if you die two years after the copyright, the copyright is still in effect for another 20 years).
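As a rough sketch of that rule, assuming the 50-year cap is counted from the date of creation (the post doesn't say so explicitly) and using a made-up helper name, `copyright_term`:

```python
def copyright_term(years_from_creation_to_death: int) -> int:
    """Total years of protection, counted from the date of creation.

    Assumption: the term is the lesser of (creator's remaining life + 20)
    and a flat 50 years, both measured from creation.
    """
    if years_from_creation_to_death < 0:
        raise ValueError("creator cannot die before creating the work")
    life_plus_20 = years_from_creation_to_death + 20
    return min(life_plus_20, 50)


# Creator dies 2 years after creation: 2 years alive + 20 after death.
print(copyright_term(2))   # 22
# Creator dies 40 years after creation: the 50-year cap applies.
print(copyright_term(40))  # 50
```

Under this reading, the first case matches the parenthetical above: the copyright runs another 20 years past the creator's death.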
I prefer to phrase it, "Thus Project Gutenberg has raced ahead at an amazing rate. In its 32nd year in existence, the collection has 6,267 etexts, averaging almost 200 etexts per year. That works out to about one book every other day. This is more impressive given that in the first twenty years of the project's existence the Internet didn't exist in anywhere near the form we take for granted today. The popularization of the Internet has just accelerated the rate at which Project Gutenberg grows. With the help of Distributed Proofreaders, a project that allows average people to donate small amounts of time to proofread just one page at a time, Project Gutenberg can expect to add over 400 etexts per year. Clearly Project Gutenberg is thriving."
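For anyone who wants to check those figures, here is the arithmetic as a few lines of Python. The variable names are just for illustration, and a 365-day year is assumed:

```python
ETEXTS = 6267          # collection size quoted above
YEARS = 32             # years the project has existed
DAYS_PER_YEAR = 365    # assumption: ignore leap years

per_year = ETEXTS / YEARS
days_per_book = DAYS_PER_YEAR / per_year

# Projected pace if 400 etexts are added per year.
projected_days_per_book = DAYS_PER_YEAR / 400

print(f"{per_year:.1f} etexts per year")                       # 195.8
print(f"one etext every {days_per_book:.2f} days")              # 1.86
print(f"projected: one every {projected_days_per_book:.2f} days")  # 0.91
```

So "almost 200 per year" and "one book every other day" both hold up, and at 400 per year the pace would be a bit better than one a day.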
Floppy disks get magnetized, hard drives crash, optical disks get scratched... A book can take a beating, man. All the OCR and voice rec in the world won't change this until we can get widespread, cheap cartridged optical media.
One small book also takes up the space of a hard drive, and can't be redownloaded, or backed up. If my roof leaks, or I have a fire, it will cost me thousands of dollars to replace my books, and some will be hard to impossible to replace. If my hard drive crashes, I redownload the files from Gutenberg, and/or restore them from my backups.
That's all for now. Thanks to all the supportive comments in this thread, and to all the constructive criticism. And remember, a page a day is all it takes to contribute!
Greg Newby, Director and CEO
The Project Gutenberg Literary Archive Foundation
www.gutenberg.net