Project Gutenberg's 32nd Birthday


Posted by ryuzaki0 on Friday July 4, 2003 @06:04AM from the read-franklin's-autobiography dept.

David Moynihan writes "July 4th marks the 32nd anniversary of that day in 1971 when Michael Hart first sped an all-caps version of the Declaration of Independence to anyone and everyone then on what later became the web, thus founding Project Gutenberg. Thanks to an army of volunteers and the Distributed Proofreaders, this is the last year PG will have fewer than 10,000 titles. Strangely, Microsoft picked this dual anniversary of literacy and freedom to re-launch their Reader product, with three free bestsellers a week, if you activate the new version with Passport, sign a EULA, etc. Real reason for the upgrade might be that the DRM on MS's old Reader was cracked. If you're not into giving away data, or are running a system other than Windows, maybe you could take the time to tell a friend about free books online, or even help out by visiting the Distributed Proofreaders and editing one page per day."

15 of 178 comments

very timely for me by b17bmbr · 2003-07-04 06:16 · Score: 5, Interesting

i am going to be teaching modern civ next year in high school (i have been at the junior high for 7 years), and have already gone to the site and gotten works from aristotle, plato, locke, montesquieu, et al. thanks guys. there is still something to be said for a classical education. glad somebody is doing all they can to preserve the classics, especially with all the assaults on them from the social reconstructionists.

--
My problem? I was perfectly gruntled, until some numbnuts came by and dissed me.
Really great work by the guys behind the project! by jaemark · 2003-07-04 06:17 · Score: 5, Interesting

There's really a problem though about getting the word out to people, in pretty much the same way the popularity of libraries today has been dropping. A good idea would be a separate advocacy site to come up with lists of texts in the project (i.e. What's New?, Most Popular, etc.) to help people wade in immediately.
Re:'reader' books not much cheaper by Jonathan · 2003-07-04 06:24 · Score: 2, Interesting

So... why in the world would anyone want to use a format that ties them to the computer?? With a paperback, I can read it anywhere, read for as long as I want without having to change batteries, and even pass the book onto a friend

Well, I don't use MS-Reader myself (For commercial e-books I like the cross-platform Mobipocket), but a major reason I like e-books is I like to read them on my PDA -- not to save money. I carry my PDA around anyway, and having e-books means less to carry. I would purchase all my books as e-books if they were available as such.
Too bad... by Insurgent2 · 2003-07-04 06:26 · Score: 5, Interesting

Unfortunately, with the copyright periods being extended so long, the material will only be of (ancient) historical interest. The 98 percent of copyrighted works that are unpublished and should be on there, unfortunately, get to sit collecting dust instead of benefitting mankind.
Business Model by AndroidCat · 2003-07-04 06:27 · Score: 1, Interesting

1. Gather great PD books.
2. Hard work to put them in computer form.
3. ????
4. Profit! (For all humanity.)
Hip-Hip-Hooray for a job well done!

--
One line blog. I hear that they're called Twitters now.
Re:You can't be serious by Aldarondo · 2003-07-04 06:35 · Score: 5, Interesting

As one that has been involved with Distributed Proofreaders for the past 18 months, yes we are serious about having Slashdot people proofread. The last time a story about D.P. ran in November, thousands of new users joined us and helped us grow and expand to our current size.
Go and check it out, there is great work being done there. (I am a bit biased though). Click here for a history of DP.
MS Reader is crapola by blair1q · 2003-07-04 06:41 · Score: 2, Interesting

"cannot open this title on a Terminal Services session"

What bollocks. Free software and free books but you can't read them over a network link to your own compute server? Microsoft, as usual, screws the pooch.

Now. How do I uninstall this without removing my adenoids?
Greenstone by gmaestro · 2003-07-04 06:45 · Score: 4, Interesting

Great to see a project like this run on Free software. Read more at Greenstone's website.
Re:'reader' books not much cheaper by Joe+Tie. · 2003-07-04 06:46 · Score: 3, Interesting

Someone else mentioned the fact that he's got a reader with him all the time anyway, which makes it pretty convenient to have a book or three in there. I'm not going to bring a book around with me everywhere I go just on the off chance that I might get stuck in a long line, or waiting for someone. But when such an event happens, having good reading material right at hand is very nice. Also nice is being able to have a selection of books in there at any one time, just in case I finish one book while waiting somewhere.

Battery life isn't much of an issue for me. I've got an older iPAQ, and even with that I can usually squeeze about ten hours out of it with the addition of an extra battery pack that's small enough to tote around with the PDA. Hooking it up isn't much of an issue. Take out of pocket, plug into PDA. And if at home, the power situation wouldn't be an issue.

--
Everything will be taken away from you.
Speaking of XML markup by Moderation+abuser · 2003-07-04 07:40 · Score: 2, Interesting

http://www.conglomerate.org/

Lovely bit of kit.

--
Government of the people, by corporate executives, for corporate profits.
Re:You can't be serious by tommertron · 2003-07-04 08:04 · Score: 1, Interesting

The thing is, this brings up a somewhat serious point. I've proofread professionally in the past, and I know that it's hard and nobody's perfect at doing it. An open approach might work with software, because anyone can easily test it: there are bugs in the program. But without a wiki-type format (www.wikipedia.org) who is there to make sure it's proofread properly? If this is proofread incorrectly and distributed to schools and stuff, I have to worry about the quality level of the texts students are learning with if they use the free texts. I have in fact read a lot of public domain texts, and find typos and grammatical errors to be fairly common in them. Would a wiki-format help open texts? (Or maybe a moderated wiki-format.)

--
Random rants about technology: http://technorants.blogspot.com
Re:Really great work by the guys behind the projec by Anonymous Coward · 2003-07-04 08:17 · Score: 2, Interesting

Want to know what's new, etc? The Project Gutenberg website admittedly sucks, and their ASCII adherence admittedly verges on dogma, but there is a good substitute:

The Online Books Page
http://digital.library.upenn.edu/books/

It currently has 20,000 FREE titles listed, from hundreds (at least!) of sources, in all subjects, beautifully categorized by title, author and subject--and topped off by an up-to-date what's new listing and a fine search engine. Much props to John Mark Ockerbloom and the University of Pennsylvania for supporting the site.

P.S. Won't one of you nice Slashdotters with time or interest in good works consider doing a complete redesign of the PG site, a full-text on-site search engine for the texts, a better categorization system and just a decent, half-respectable look? It don't get no respect lookin' as it does now. Among other things, the lack of internal organization means that individual texts get shafted in Google rankings.
This is just wrong by Anonymous Coward · 2003-07-04 09:22 · Score: 2, Interesting

XML is not a character encoding. XML does not require the use of non-ASCII characters. What can be represented by an XML document is a superset of what can be represented by a plain ASCII document. XML is a human-readable markup.

MS Word 2000 .doc is a binary format.

I suspect that you have very little idea what you are talking about.

PG already uses XML-like markup to indicate an emphasized portion of a passage, among other things. If we were to accept your argument, then even this alone should be seen as a failure.

After all, what if over the course of 50 years we forget what "blahblahblah" means? What if in some impoverished country, while the people have the processing power to read these documents, they do not have the processing power to parse out ?!

Both of these worries are foolish. If you use an XML format for open content, you have an obligation to provide openly the strict and formal DTD or schema which describes your XML markup.

What if this DTD or schema becomes lost? This won't happen, because you can embed the DTD or schema in the distributed documents (the books) themselves.
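As a rough illustration of embedding the DTD in the document itself, here is a minimal Python sketch. The element names (`book`, `title`, `para`, `emph`) and the helper `declared_elements` are hypothetical, made up for this example and not taken from any actual PG markup. It only shows that a document carrying its own DTD in an internal subset still parses with the standard library, and that the declarations can be read back out of the same file.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical book markup; the element names are illustrative only.
# The DTD travels inside the document as an internal subset.
BOOK = """<!DOCTYPE book [
  <!ELEMENT book (title, para+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT para (#PCDATA | emph)*>
  <!ELEMENT emph (#PCDATA)>
]>
<book>
  <title>The Declaration of Independence</title>
  <para>When in the Course of <emph>human</emph> events</para>
</book>"""


def declared_elements(document):
    """Return the element names declared in the embedded DTD, in order."""
    return re.findall(r"<!ELEMENT\s+([\w.-]+)", document)


# ElementTree parses the document but does not validate it against the DTD.
root = ET.fromstring(BOOK)
print(root.tag)
print(declared_elements(BOOK))
```

Note that this parser does not validate; checking a text against its embedded DTD would need a validating parser, which is outside this sketch.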

What if we forget how to parse XML?

Yes, if there were a terrible war which left the entire planet in shambles for 100 years, then we might forget how to parse XML.

But this is no different than with ASCII. We could just as easily forget how to convert binary data (you know, '1's and '0's) to corresponding ASCII characters.
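To make the comparison concrete, this is roughly all the "knowledge" needed to turn ones and zeros back into ASCII text. It is a small sketch, assuming each character is stored as one 8-bit group; the function name `bits_to_ascii` is made up for the example.

```python
def bits_to_ascii(bits):
    """Decode a string of '0'/'1' characters, 8 bits per character, as ASCII."""
    bits = "".join(bits.split())  # ignore whitespace between groups
    if len(bits) % 8 != 0:
        raise ValueError("expected a multiple of 8 bits")
    # Convert each 8-bit group to its integer code point.
    codes = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    # decode("ascii") rejects any value above 127, since ASCII is a 7-bit code.
    return bytes(codes).decode("ascii")


print(bits_to_ascii("01010000 01000111"))  # prints "PG"
```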

Now, even if there were such a catastrophe, you insult the human creature by suggesting that we would not be able to figure this out, and to figure out the XML DTD or schema. Have you ever read an XML document following a standard article or book DTD or schema? It is painfully obvious what the markup means, and what its use is.

However, all of this discussion is just silly, because there probably will not be such a catastrophe in the near future.

You are forgetting that change is gradual. If a new format becomes popular (and this is unlikely, because XML can describe any possible format), it will be a matter of an hour or two to convert the entire PG library to the new format.

And if the new format is as well defined (as we should hope) as the existing XML format, then this process will be painless.

You are welcome to continue to comment and complain from a position of clear ignorance, or you can admit that there might in fact be some things which you are not an expert on (surprise!), and that others understand better than you.

We are telling you that using a strictly defined XML format would in every sense be the better choice. It does not require the use of non-ASCII characters. It is human readable. It is well defined. Conversion of the XML document (which for your purposes would not be very complex) to plain (as in not XML formatted) ASCII strings can be done by a 15-20 year old processor or by hand if needed.

In fact, since it is human readable, there is no need to do the conversion at all if we some day find ourselves in a situation where we can not automate it (as in after a worldwide nuclear armageddon). The document can be read as is if needed, and the structuring afforded by XML will be just as clear.
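For what it's worth, the conversion to plain text described above can be sketched in a few lines of Python using only the standard library. The sample markup and the function name `xml_to_plain_text` are assumptions for illustration, not an actual PG format; the sketch treats each direct child of the root as one paragraph and simply drops the inline tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element names are illustrative only.
SAMPLE = (
    "<book><title>Sample</title>"
    "<para>We hold these <emph>truths</emph> to be self-evident.</para></book>"
)


def xml_to_plain_text(document):
    """Strip the markup and return plain text, one paragraph per child element."""
    root = ET.fromstring(document)
    blocks = []
    for block in root:  # each direct child is treated as one paragraph
        text = "".join(block.itertext())  # text of the element and its descendants
        blocks.append(" ".join(text.split()))  # normalize internal whitespace
    return "\n\n".join(blocks)


print(xml_to_plain_text(SAMPLE))
```

A real converter would probably want to keep some trace of emphasis rather than discard it, but that choice depends on the markup in use.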
Re:You can't be serious by croddy · 2003-07-05 02:34 · Score: 2, Interesting

well, that was fun. I think it would be more addictive if I got to do pages in order though...
Re:How to spread the word... by Junkster+Julian · 2003-07-05 12:19 · Score: 2, Interesting

I wasn't listing the final specifications for a device in detail. Yes, it would have HTML support, and CSS would be useful to have as well. With HTML, people are going to want images supported, that means a few different libraries there as well.

Ok I'm gonna tone myself down a little... this should be a little less of a rant so hang on. The point I was trying to make is that I think HTML should be the one technology an ebook reader should be able to support unlike even standard desktop browsers. I'm not sure it would be such a stretch to see the "web browser" condensed into a hardware-streamlined product. SGML support would be great but to implement SGML we must first master HTML, and if we can't deliver a machine dedicated to rendering HTML then how much chance would we have in implementing a technology with less sample-base? It's hard to match HTML in terms of demographic penetration at least in so far as actual text-based content... contrast with postscript, pdf, and the like which (for the most part) do not have human-readable source -- essential for "debugging" our ebooks.

Yeah and the pdf reader for WinCE needs, uhh, "work". It is by no means comparable to its desktop cousins... a cheap knock-off from a huge company complaining about the limitations of PDAs. IMHO, avantgo is a considerably better "ebook reader" that's easier to code for and is far more compatible. HTML 3.2, that's it... can't go wrong. Visit my site and you'll know what I'm talking about: popnt.com Keep in mind my work is still beta, but anyways.

And about permanent media.. well.. I'm going to go way out on a limb here and suggest that print cannot truly be compared with your examples.. although I do in all seriousness appreciate your debate. Just for the sake of argument, what distinguishes print from (at least) the three examples you listed (and please I hope this does not escalate) are the following:
1. Stone tablets were never mass-produced in the same way as books (or paper media) were: sure there was Sanskrit, but specifically what distinguishes print as a breakthrough was its potential for industrial mass-production via inventions like the printing press.. ubiquity made the press permanent in many ways.
2. Music (and movies): due to the very recent inventions of the gramophone and that which makes up a motion picture (the camera, film, etc), I'm not sure these can be compared to print media, specifically because of their very recent introductions to society.. note that I am not saying music is a new introduction, rather recorded music.. so in that light, and given the whole MP3 hoopla we're having with the RIAA et al, I think the music/movie industries would have a lot to learn from the print industry -- not the other way around. Also, the music and movie industries themselves use a concept very closely tied in with books in that they are given data to process. I'm not sure music/movies can really compare, in all seriousness to books.. in all honesty, I'm not sure there is much out there that even CAN compare to the print industry. These are secondary industries which require processing that print-media does not. Print is unique in that respect and is therefore again really tough to beat! Even braille is a form of print which requires nothing whatsoever, not even a light-source! What makes print so permanent is its ubiquity -- the sheer volume of static copies whose content and information cannot and will not change over time. No other industry has this power.