Project Gutenberg Made Accessible
scishop writes "Mazarin is an open-source interface to Project Gutenberg's library. Mazarin increases the accessibility of Gutenberg's 10,000+ books by formatting them for HTML display -- with pagination, a generated table of contents, and other advanced markup features -- and by letting users run full-text searches across the entire library."
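For anyone wondering what pagination plus full-text search over a plain-text book might look like, here is a minimal sketch. It is not Mazarin's actual code; the function names (`paginate`, `build_index`, `search`) and the fixed lines-per-page rule are assumptions made up for illustration.

```python
import re
from collections import defaultdict

def paginate(text, lines_per_page=40):
    """Split a plain-text book into pages of a fixed number of lines."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + lines_per_page])
            for i in range(0, len(lines), lines_per_page)]

def build_index(pages):
    """Map each lowercase word to the set of page numbers it appears on."""
    index = defaultdict(set)
    for page_no, page in enumerate(pages, start=1):
        for word in re.findall(r"[a-z']+", page.lower()):
            index[word].add(page_no)
    return index

def search(index, query):
    """Return the sorted page numbers that contain every word in the query."""
    words = re.findall(r"[a-z']+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

A real system would probably break pages on chapters or paragraphs rather than raw line counts, and would keep the index on disk rather than in memory.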
Interesting idea. I can't get to the website, but one feature I'd want is the content shared over P2P, so you don't have to rely on a central server for it.
;).
A central webpage index could just have ed2k links to the files: sharereactor for books. When they update the book they release a new hash-link and the file onto the network.
Being P2P, it could also open things up to more than just public domain books.
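A rough sketch of the index idea, assuming each book is published under a link derived from its content so that any update produces a new link. SHA-256 stands in for the ed2k hash here, and the `name|size|hash` entry format and both function names are made up for illustration; this is not the real ed2k link syntax.

```python
import hashlib

def content_link(filename, data):
    """Build an index entry for one book file.

    SHA-256 is a stand-in for the ed2k hash (an assumption of this sketch),
    and the 'name|size|hash' layout is hypothetical.
    """
    digest = hashlib.sha256(data).hexdigest()
    return f"{filename}|{len(data)}|{digest}"

def update_index(index, filename, data):
    """Publish the current link for filename; return True if it changed."""
    link = content_link(filename, data)
    changed = index.get(filename) != link
    index[filename] = link
    return changed
```

Because the link is derived from the bytes, re-releasing an unchanged file is a no-op, while a corrected edition automatically gets a fresh entry on the index page.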
10,000+ books. Right, so I've got to read all of them before I can post a comment?
Oh wait, this is Slashdot.
Where's the Kaboom?
There's supposed to be an Earth-shattering Kaboom.
This sounds like it just adds complexity and does not make Gutenberg's data more accessible.
There were several research projects for which I used PG as a corpus. However, PG is a terrible hassle for the first-time researcher, since the format of the introductory text ("we're gutenberg, here's the copyright, blah blah") is inconsistent.
You have to remove the introductory text to avoid bias in the corpus. However, there are so many pathological special cases (different formats, spelling, languages, words used, punctuation, case) that it takes several hours of Perl coding to successfully strip the header text from 75% of the documents with >99% accuracy. Yuk.
If Gutenberg is serious about making their work more accessible, they should think about the simple concern of ensuring consistency in the header text format.
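To give a feel for the problem, here is a minimal sketch of the marker-based approach, in Python rather than Perl. The two patterns are assumed examples of header markers, nowhere near a complete list of the variants in the archive, and `strip_header` and `max_header_lines` are hypothetical names.

```python
import re

# Assumed marker patterns: illustrative only, not a complete list of the
# header variants found in the archive.
START_MARKERS = [
    re.compile(r"^\*+\s*START OF (THE|THIS) PROJECT GUTENBERG", re.IGNORECASE),
    re.compile(r"^\*END\*?\s*THE SMALL PRINT", re.IGNORECASE),
]

def strip_header(text, max_header_lines=600):
    """Return the text after the last header marker found near the top.

    If no marker matches within the first max_header_lines lines, the text
    is returned unchanged so the caller can flag it for manual review.
    """
    lines = text.splitlines()
    cut = None
    for i, line in enumerate(lines[:max_header_lines]):
        if any(p.search(line.strip()) for p in START_MARKERS):
            cut = i  # keep the last marker seen inside the header window
    if cut is None:
        return text
    return "\n".join(lines[cut + 1:]).lstrip("\n")
```

Every file that comes back unchanged is one of the special cases, which is where the hours of coding go.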
since some seem to have trouble on the index page... here it is:
Project Gutenberg is the brainchild of Michael Hart, who in 1971 decided that it would be a really good idea if lots of famous and important texts were freely available to everyone in the world. Since then, he has been joined by hundreds of volunteers who share his vision.
Now, more than thirty years later, Project Gutenberg has the following figures (as of November 8th 2002): 203 new eBooks released during October 2002 and 1975 new eBooks produced in 2002 (up from 1240 in 2001), for a total of 6267 Project Gutenberg eBooks. 119 eBooks have been posted so far by Project Gutenberg of Australia.
Click here for the full PG story and here for the latest News, and learn about the Stockholm Challenge Award recently won by Project Gutenberg in the category Culture.
The key link is the search page.
Do you need a website upgrade?
Charles Franks
Founder, Distributed Proofreaders
If you have a Palm Pilot, I can recommend Weasel Reader.
I've been using it for a couple of years on my Palm V, and despite its small screen size it works perfectly for reading ebooks.
Bah. Posting HTML is so 1996. You can do so much more with these texts. One example is Open Source Shakespeare, which takes all of Shakespeare's texts, indexes them, presents them in an attractive manner, creates a concordance, provides a full-text search engine, organizes the lines by character, etc.
All of the texts are open source, and you can download the database and source code from the site, too. Check it out.
Indeed, there are many, many sites that do all sorts of wonderful things with Project Gutenberg eBooks. That's the wonderful thing about PG: you can do anything you like with the books.
While personally I prefer the original and the best... hey, whatever floats your boat!
It is very much worth noting that Project Gutenberg would have nowhere near as many eBooks as it does without the help of Distributed Proofreaders. Sign up there, and proof just a page a day to make your contribution to preserving literary history. You can proofread as little or as much as you like, and do something worthwhile! Distributed Proofreaders is a great way to spend some of your time.
Quote:
...donating to the good cause. If you don't want to donate money, volunteer to proofread. For writers out there, it might also be worth considering a notation in your will that allows your works to pass either directly into the public domain or, as I have been in contact with lawyers to discuss, simply passes the copyright of your own works on to Project Gutenberg. This gives them more work to publish, and if you're in a contract somewhere that allows for royalty collection, you can set it up so that those royalties switch to Project Gutenberg at the time of your death.
Now might also be a good time to contribute an hour a week to a literacy project, or to make a donation to one. Adult literacy is a serious issue all over the world, and that includes right here in the States, where there really are bright people who could have better lives if they could read. I can't think of a more on-topic place than a Project Gutenberg story to discuss adult literacy and the need both for literacy teaching and for support of free literature for the masses, such as this project provides.
Just my $0.02...
solemndragon
"I'd say 'Have a good time,' but arson is still illegal."