Giving Project Gutenberg Recognition
In an email, Michael Hart, the head of Project Gutenberg, says:
"Getting the Etexts to twice as many people is just as important
as creating twice as many Etexts. . .but without MAJOR publicity it
is not likely to happen. . .we constantly get messages from readers
who tell us they have been LOOKING for Etexts for years and just at
that present time FINALLY FOUND US. . . . That means we cannot get
to a major part of our audience with the kind of publicity we have,
we need something more. . . . For example, we were the first in an
entirely new column: "People To Watch" in the November 8th edition
of TIME magazine, but we have received less than a dozen emails per
that article. . .what we really need to do is get on Oprah Winfrey,
and hopefully add something to her book club. Those of you on AOL,
perhaps you could email the show and request they invite us. . . !
We should undoubtedly also try the other talk shows, and "magazine"
shows, etc. All the press we receive is from them contacting us, I
have had no luck "generating" publicity. . .which seems to be easy,
for those who have the knack. . .it's just not MY knack. . .help!!!"
So, if there's anything that you can do to help - do it!
Project Gutenberg has saved me much money on books. When I need one for school, or just to read, I grab it and download it to my Palm. (And when I get bored, I can always switch right over to Tetris or Hardball.) Thanks, Project Gutenberg!
-mark
If your computer says LINUX, run...computers can't talk! [unless you have text-speech software]
Don't think they'll let you in, they'd probably find some clause about you not being the authors or something... Anyway, can you really afford the bribes? =)
- -------------------
On the topic, I find the Etexts QUITE useful... suggested bookmark #1.
--------------------------------------------
Everybody's got something to hide except for me and my monkey...
www.stampede.org
This is an incredible project which needs a lot of support.
Another worthy Gutenberg-style project which deserves help is the Internet Dictionary Project. They are building the world's first copyright-free English-to-French dictionary by typing in the text from an old out-of-copyright dictionary by Spiers dating from 1853. They're asking for help, which involves typing in the text, using some simple formatting rules, for any of the remaining pages by reading the scanned images of the original book.
After joining the project, you download a scanned page and type it in according to their instructions. An important issue is whether they should aim to put the resulting work into the public domain, as they state, or under a licence offering something akin to the GPL's protections.
Try teaming up with Amazon or something.
I've been using PG for a long time, and it's an EXCELLENT resource. The people running it obviously do need help publicizing, and I think once they get started the project will really take off, seeing as they already have tons of information. As the Subject says, once they hit critical mass in terms of public knowledge they're set. Another hassle they're facing is copyright law. The PG site lists some elementary info on copyright law, and it was changed not too long ago to cover a lot more stuff, so unfortunately we'll be seeing that many fewer etexts in our lifetimes. Ah well, what Project Gutenberg provides is nonetheless fantastic, and hopefully it will soon receive due recognition. Dan
Fuck it
By having Project Gutenberg on Slashdot, you can be sure of millions of readers asking for online books, whether you want it or not. Maybe Slashdot should start a generic web portal...
I'm gonna spread the word on IRC and all sites I have (as if anyone visits them). I've always wanted to see books online and finding this site was a godsend, seeing as the only other site that does (did... I think they stopped a month ago) was MCP, and those books (teach yourself jack shit in 2 seconds) weren't worth reading anyway.
If you think you know what the hell is really going on you're probably full of shit.
jdube is who I am.
...but they are hardly the only people producing free e-texts. Yes, I remember that in the pre-Web era their ftp site was about the only place on the net for e-texts, but as the existence of huge archive sites like The Online Books page shows, PG is just one group among many similar groups these days.
Project Gutenberg has been around for literally years, and is a resource I always check when I can't find that elusive book.
In fact, we used to use the text from many Gutenberg documents when I was fiddling around with data compression (specifically compression methods aimed at text, and English in particular).
Also, many of my relatives have asked me "Can you find out about this book on the net, and where I can find it?" and are somewhat surprised when I hand them a disk with the Gutenberg text version of it. First time I did this, they thought it was reviews of the book and details of where to find it, instead of the actual text. "Remember how I said that a floppy disk holds about as much text as one small book?" *grin*
However I think they got one of the most important boosts of advertising they could ever want, an article on good ol' Slashdot. Way to go Hemos! (and CmdrTaco of course) *grin*
PS: Heya to FunkyBob, the guy who did most of the coding on that compression stuff I was mentioning earlier - and when will it ever work properly, damnit! It's only been about 6 years! *grin*
Disclaimer: The opinions expressed are not necessarily my own, as I've not yet had my medication today.
The previous post is very relevant to Project Gutenberg and deserves a higher moderation score.
How about giving Project Gutenberg a free banner ad here on Slashdot? Now that'd generate a lot of traffic and put them right out in the public view!
Whaddya say guys?
Project Gutenberg was one of the first endeavors that really got me interested in the internet. Although etext always seemed to be in tough competition with online porn. However if they really want the Slashdot readership, perhaps a few comments that they are running on a Beowulf cluster and use only GPL software would really kick the /. effect into high gear.
if only for one thing, and this is nitpicky, but I think it might help a little:
They need to make their online books a little more web-friendly. I understand PG's reasoning behind keeping everything in pure vanilla ASCII, but quite frankly, in that state, the etexts don't look terribly great, nor are they conveniently navigable in most web browsers. Appearance and ease of use are, imo, important factors if you want to attain a large audience.
Bono Vox, bono@vox.org
Personally, I remember my first run-in with PG back in the days of the BBS... it was a Taoist text in one of the download sections that had been created for PG. I also seem to remember a very lofty goal at the time, something like a billion downloads...?
/. is a huge start.
At any rate, I think a few areas might provide support...
*Amazon (someone mentioned this) is a _bad_ idea. Profit motive and releasing free documents don't coincide well.
*The Palm computing platform is the big plus. To be able to read in such a convenient form is wonderful, and PG offers a large library of material for consumption. However, PG needs to _market_ to them, meaning convenient little formats, getting linked to, etc...
*Align with the OS movement more, there's plenty of talent that would likely work on such a task, but probably isn't even aware of it. Getting mentioned on
*Make better use of technology... I seem to recall very slow rates of progress, which lowers the level of excitement for those involved (it's sad that this is a factor, but very true)- can't many works simply be OCR'ed?
*The general public (Oprah Winfrey's audience, etc.) is most likely worthless. It seems as though most of the public rarely reads, let alone transcribes... The only thing they might be good for is cash to support the effort.
Just my US$0.02
My laptop's a really old 486 running DOS/Win3.1 : P (Linux won't boot), but wow .. Project Gutenberg saved me a ton of money on not buying books that I wouldn't use ever again .. plus it's less of a load to carry ..
.. eh ?
Maybe they should start posting posters across campuses and such
If any of you have played with the E-book readers out there (Rocketbook or Softbook are the main contenders) you'll notice that 90% or so of the books they offer right now seem to be public domain ones, mostly from the Project Gutenberg collection. And that does make sense - PG is all about etexts, the E-book readers are about reading etexts... Anyway, it seems the two parties ought to get together. But unfortunately, the Ebook vendors seem to be more focused on licensing and copyright issues and making money from selling content, rather than just making and selling their hardware. Can't Dell or somebody like that get into this business and show how it ought to be done?
Anyway, if we could get a bunch of recent books out there in the public domain (or GPL of course) - either under Project Gutenberg or some other auspices - I think that would demonstrate this is a serious option for the future of reading. The technical market might be ideal - how about merging in some of the Linux HOWTOs and the Linux Documentation Project with this kind of effort? Instead of making a buck for yourself and Tim O'Reilly, how about publishing with Project Gutenberg next time? Just as with Linux and the World Wide Web, it could be a way to guarantee readership you would never get by selling the stuff.
By the way, I prepared 2 books for Project Gutenberg many years ago, and did some work on their Encyclopedia project, but I've not been keeping track for the last few years - it's definitely continued to grow and be successful. Despite Michael Hart's quirkiness, it really has come close to fulfilling the original promise (10,000 free etexts by 2000). A hearty congratulations to Michael and all the volunteers!
Energy: time to change the picture.
What PG is doing is a really great service and needs to get publicity. Sadly, copyright restrictions severely limit the inclusion of most recent books. Wouldn't it be nice if the GPL would catch on with books? Why not some "open-source" sci-fi novels?
I've been reading PG books since I've been on the net ('94) and I think they have got to be one of the most important resources available.
People discount PG by saying things like "Oh, you can get free texts anywhere" and "Books are outdated, anyway".
Well, imagine this happening without PG: Copyright laws are changed so that copyright does not run out after 30 years (or whatever it is) - and this is what the film lobby wants.
Then, in 10 years or so, a law is made giving ownership of texts that have become public domain back to the descendants of their owners, who then sell them to film companies or amazon.com.
These companies decide that they only want to sell paper books, and the demand for some titles is so low that you have to get a special publishing run for them.
Then some books get banned for being sexist/sexy/racist/communist or whatever, and you can no longer get them - period!
Books - or at least the text of them - are the lifeblood of civilisation, and PG is something that is making this freely (as in speech) available to all.
Support it!
PS: yes, I know the scenario above wasn't real, and I know "the internet changes everything", but in 5 years, when you are reading "Sherlock Holmes" on your Palm XX, you can thank Project Gutenberg for keeping it free.
--Donate food by clicking: www.thehungersite.com
I like getting my hands on free 19th Century classics as much as the next guy. However, I find Project Gutenberg of dubious usefulness. And I strongly disagree with the claims of some journalists that this project, if completed, will be a great help for the schools of the Third World.
I am sure the Internet and its associated technologies can be used to help impoverished kids worldwide, but I don't think they would benefit much from an electronic version of, say, Boswell's Life of Johnson... in English.
[I know this is slightly off-topic, but I just wanted to prevent someone coming up with the references to rural Kenya that always pop up when discussing Project G.]
Really? I'm rather thrilled to see this here. Until now, I've never heard of 'em.
If Project Gutenberg had been started by Linus or RMS, would it send out hysterical letters every month asking for money to keep the project afloat?
I fail to see how this qualifies as sending out "hysterical letters every month asking for money to keep the project afloat." Rather, it seems to be a request for publicity. It doesn't seem to be all that hysterical at all.
Wondering what this has to do with Slashdot and Open Source and GPL.
I don't see that it has much to do with Open Source and the GPL, but it seems to have a bit to do with Slashdot. "News for Nerds. Stuff that matters." Well, I'm a nerd. I like to read. This is news to me, and stuff like this sure matters to me. End of story here, at least.
Seems PG is a conservative outfit that resists change in technology,
Unfortunately, I fail to comprehend how translating texts printed on paper to an easily reproducible format that can be easily obtained via the internet qualifies as resisting change in technology.
won't cooperate with other free ebook causes,
Proof? Links, quotes, or something, please. If this is true, I'd be interested in seeing something to substantiate this, as I'm not likely to take the word of an A.C. alone as gospel truth.
is intent on producing numbers of poorly proofread texts and admittedly of not top quality,
Try getting 30 of your closest friends and proofread several hundred thousand pages of material and see how well you fare in getting all the errors. Please cite something to prove that they are "intent on producing numbers of poorly proofread texts."
and doesn't accept criticism from outsiders.
Again, proof please!
BTW, the aforementioned letter from Hart is not on the page linked to, and it contains errors of fact about copyright.
I don't see any claims that the letter is on the page linked to. If it does contain errors of fact about copyright, please cite some.
Maybe I'm way off here. I've never heard of this project before today, and thus my knowledge is limited to what I've seen here and my brief perusal of their web site, which more or less only consisted of checking to see what they had by F. Scott Fitzgerald (only This Side of Paradise) and Gabriel Garcia Marquez (nothing). If what you say is true, I'm certain that I, as well as other readers of Slashdot, will benefit from having some primary source material to peruse demonstrating your claims. Right now, all we have to go on are your quite unsubstantiated allegations.
1) Use your domain name: Gutenberg.org
2) Get the crawlers to go through the texts _at your site_. You can wrap them in pre tags...
3) You're ripe for a grant for outreach. If you don't have the "official" framework, contact some CS or English depts and see about some joint work here.
4) oss4lib is a new group that could be seen as having a relation to you...
5) Perhaps some outreach letters to English depts at various levels, from grade school up.
6) Bells and Whistles: how about some history on Gutenberg, past and present. Entertainment.
7) Giving talks at various places helps. You might meet some connected people on the way...
8) In general, of the e-libraries, what tactics are the successful ones using? Seems a good learning place.
I do like the tasteful layout and quickness of your cover page. I have always been impressed with Gutenberg! Good luck
That really is a quite considerable cost, in much the same way that the production of "free" software requires substantial effort.
It is somewhat unfortunate that there have been such peculiar positions as:
It did not add to the project's credibility when they on the one hand indicated that their funding was maxing out at around $30K per year, whilst claiming that they were producing "billions" of dollars in value. (Note that the PostgreSQL HOWTO suffers from the same sort of thing...) A claim of $30K on the one hand, and $Billions on the other, does not reconcile very well.
Not unlike the situation with the FSF, they could probably more readily use contributions of time rather than of money, although some of both doubtless prove valuable to some degree...
If you're not part of the solution, you're part of the precipitate.
I really liked that link.
Here is another: classics.mit.edu
Anyone who is doing the distributed.net project, you can vote for the charity money that will be won to go to Project Gutenberg.
I've followed them and downloaded their etexts for a number of years now, and I must say that Gutenberg is one of the finest, most selfless projects on the internet.
... and the voices in OS 9 are much better than they have been in the past.
My favorite thing to do with Gutenberg etexts is to load them up in TextEdit Plus on my Powerbook and "Speak document" while I work. It's very cool
Three cheers to project Gutenberg, and anyone out there who hasn't already checked them out should do so ASAP!!
Share data. Share code. Share ideas. Share the wealth.
http://stockfilter.org
Is there a copyright-free English dictionary that could be distributed with open-source software like word processors for Linux? This seems to be an important feature that current Linux distributions are missing. I have in mind something better than /usr/dict/words.
Creating a good dictionary from scratch is hard work, but if you can get the structure and the word list e.g. from a copyright-free source then the hardest part is done. Therefore, a good starting point would be to take the structure of the copyright-free Spiers English-French Internet Dictionary, i.e. cut out the French translations to leave the English core. Is anyone else interested in this?
Does anyone know how much volume a site tends to get while being featured on slashdot? For anyone who has maintained a site that got slashdotted, what kind of traffic (in numbers) were you getting while the hammering was going on?
Thanx, I've always been curious about that.
My most memorable Project Gutenberg story:
Back in 1991 when I started my life on the net,
we had an English class teacher that one day
brought one of those competitions into class
that test your knowledge of the English
language and all that. One question asked what
the longest word Shakespeare ever used was.
No one had any idea until I looked on the 'net'
(where!?) and found an electronic copy of his
complete works. Back then there was no real
Linux and I still used DOS (yuck!). My 286-12
needed about 5 minutes to come up with the
correct answer. Yay!
A prelude to today's online homework collections?
I think so. Reinventing the wheel wastes so much
time, doesn't it?
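A search like the one in this story takes only a few lines today. What follows is a minimal sketch, assuming the complete works are available as plain text; the function name and the one-line sample are made up for illustration, and hyphenated compounds are counted as separate words.

```python
import re

def longest_words(text):
    """Return the longest alphabetic word(s) in text, lowercased and sorted."""
    # Treat any run of ASCII letters as a word; hyphens and apostrophes split words.
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return []
    max_len = max(len(w) for w in words)
    # A set removes repeats of the same word in different capitalisation.
    return sorted({w.lower() for w in words if len(w) == max_len})

# Illustrative sample line; for a real run, read the etext file into a string instead.
sample = "Thou art not so long by the head as honorificabilitudinitatibus."
print(longest_words(sample))
```

To run it on a downloaded etext, read the file into a string and pass that in place of the sample.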
Every submitter formats the text differently, and the inline ("bottom of page") footnotes are a real annoyance.
:)
However, I would like to say that via GB, I've read every Charles Dickens and Sir Arthur Conan Doyle novel they have e-published, to much satisfaction. I started on other authors, but then a friend introduced me to the Dune and Hyperion series.
I think it's safe to say now that webifying the text would be a wonderful idea. If you were to index them in the web search engines, you would then definitely get more hits. I'd love to be able to type "to be, or not to be" into a search engine and get sent to the correct page in the GB e-text.
Once you do that, launch an ad banner campaign with suggestive quotes, i.e. "The staircase was darken with gloom...(click here to read more...)"
BTW: I read "Sun Tzu" as well. Way cool...
A friend of mine introduced me to this during undergrad, and I was instantly a fan. Sure, the texts aren't that pretty to look at as is, but dump them into an Emacs buffer, add a few LaTeX markup tags, and suddenly you've got a decent-looking copy of whatever. (This is especially nice with texts which are relatively short -- I remember in particular having a tex-ified version of the Communist Manifesto. :-))
...a text to speech company. It would put a new spin on "AudioBooks."
BTW: Gutenberg texts suffer from a lot of typos - about half way through the work, the quality really started to suffer badly...
I was wondering about how feasible it would be to start a GPL version of this. I.e., start with a bunch of words that would be commonly looked up, and we'd come up with definitions for them paraphrased from a variety of sources, so it wouldn't be plagiarizing.
If the resulting text wasn't stored in plain text (too large) but compressed, there could also be specialized tools to grep for keywords anywhere in the definitions, etc. Is this a decent enough idea? Having noah was really REALLY handy: just type "noah asperity" at a prompt for a good definition of 'asperity'. Really useful.
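The compressed-dictionary-plus-grep idea can be sketched with standard tools. This is only an illustration under assumptions: the file layout (one gzip-compressed line per entry, word and definition separated by a tab), the function name, and the three sample definitions are all invented here and are not taken from noah or any existing dictionary.

```python
import gzip
import os
import tempfile

def search_definitions(path, keyword):
    """Yield (word, definition) pairs whose definition contains keyword."""
    needle = keyword.lower()
    # gzip.open in text mode decompresses on the fly, so the file stays small on disk.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            word, _, definition = line.rstrip("\n").partition("\t")
            if needle in definition.lower():
                yield word, definition

# Build a tiny made-up dictionary file to demonstrate (hypothetical entries).
entries = [
    ("asperity", "roughness of surface, sound, or temper"),
    ("lenity", "mildness or gentleness of temper"),
    ("quire", "a set of twenty-four sheets of paper"),
]
path = os.path.join(tempfile.mkdtemp(), "dict.tsv.gz")
with gzip.open(path, "wt", encoding="utf-8") as f:
    for word, definition in entries:
        f.write(f"{word}\t{definition}\n")

for word, definition in search_definitions(path, "temper"):
    print(f"{word}: {definition}")
```

A linear scan like this is fine for a dictionary-sized file; a lookup by headword alone would want an index, which is left out of the sketch.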
>>Seems PG is a conservative outfit that resists
>>change in technology,
>Unfortunately, I fail to comprehend how
>translating texts printed on paper to
>an easily reproducible format that can be
>easily obtained via the internet qualifies as
>resisting change in technology.
It doesn't. However, PG in general, and Hart in particular (as if you can really separate the two) are stuck in a reasonably old-fashioned mindset when it comes to textual information. Because the project started way back in the Seventies (I believe), the choice to use only plain ASCII might have made sense then. It certainly doesn't do so now.
PG would benefit greatly from a structured information format, preferably one that could be transformed down to plain ASCII when needed (most formats that would be appropriate already do this). Using something like SGML or XML would give them the benefit of structure in the information, like footnotes, italicized sections, page breaks, etc., in a machine-readable format. Also, they would have the option of using Unicode, which would benefit them greatly, since 7-bit really doesn't cut it for anything but English text.
I, and I'm sure many others, would be happy to provide an XML system for them free of charge, but as I've understood from interviews, Hart has his mind set on continuing to use ASCII, because he feels it makes it available to everyone. Personally, I think it reduces everyone to the lowest common denominator, and could be solved in a better way. My two centavos.
--Joakim Ziegler
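To make the "structured format that can be transformed down to plain text" point concrete, here is a rough sketch. The element names (book, p, i, note) are a hypothetical markup chosen for the example, not any format PG uses or has proposed, and the rendering choices (underscores for italics, numbered notes at the end) are just one option.

```python
import xml.etree.ElementTree as ET

def to_plain_text(elem, notes):
    """Flatten one element to text, collecting <note> contents into notes."""
    out = [elem.text or ""]
    for child in elem:
        if child.tag == "i":
            # Italics become the _underscore_ convention common in plain etexts.
            out.append("_" + to_plain_text(child, notes) + "_")
        elif child.tag == "note":
            # Footnotes are moved out of line and replaced by a numbered marker.
            notes.append(to_plain_text(child, notes))
            out.append(f"[{len(notes)}]")
        else:
            out.append(to_plain_text(child, notes))
        out.append(child.tail or "")
    return "".join(out)

def render(xml_source):
    """Render the hypothetical book markup as plain text with end notes."""
    root = ET.fromstring(xml_source)
    notes = []
    paragraphs = [to_plain_text(p, notes) for p in root.iter("p")]
    footer = [f"[{i}] {note}" for i, note in enumerate(notes, start=1)]
    return "\n\n".join(paragraphs + footer)

sample = (
    "<book><p>He read <i>Hamlet</i> twice."
    "<note>In the First Folio text.</note></p>"
    "<p>Then he slept.</p></book>"
)
print(render(sample))
```

The same source could be rendered with footnotes inline or as HTML by swapping the rendering function, and that is the benefit the comment above is arguing for.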
Maybe not Macromedia, but how about HTML, even HTML 2.0? After all, with the Lynx web browser, blind people would be able to read the books just as well as sighted. And Lynx runs on every computer, even terminals. Even ASCII text and word processors are not so portable. That's what the WWW is all about, haven't you heard?
Restricting PG to ASCII also implies forgoing pictures. But moving PG to HTML would mean that graphics browsers could see real pictures (even in color, something not so common nowadays in cheap printed books) and visually impaired readers could read the text descriptions with their text-to-speech synthesizers.
I agree that PG needs to be more web-friendly. PG has started to include a few HTML works. But it suffers from a great stasis of sunk capital. Open Source business methods know how to overcome that problem. It's time for PG to adopt them.
Some folks want Gutenberg to move past ASCII and become more web-friendly, more non-English-language friendly, more Y2K-friendly, whatever. I happen to believe they're on the right track. They are trying to provide a baseline of texts which can be adapted to specific purposes.
That's how I use 'em. I've downloaded a few such texts and made them into Newton books, which I put on my Web site. (I'm a retro-geek. I prefer Newton to Palm.) I couldn't do that with an HTML page, or at least, not as easily.
The one thing I found in doing this myself is that some Gutenberg texts, at least, aren't error-free, even if they have been proofread. I've proofed two such books so far and I've had to correct around a dozen errors in each. Now, the books I'm converting are by a British writer named Ernest Bramah who's completely obscure today. I happen to have original editions in hardback, but with a writer as obscure as Bramah, there are damn few of us out here with original editions to check. I could wish the Gutenberg proofing process were a little more thorough. There isn't even a central place to report such errors to: the Gutenberg help line just told me to forward the corrections to the original text provider, which I did.
On the other hand it does make me feel like I'm actually giving something back.
Can someone explain why the original post, which is very relevant to PG, is (currently) only moderated with Score:1? It was at Score:2 until a few minutes ago when some moderator for some reason decided to downgrade it to 1.
The thing is that PG is an archive of books that are/were in print. This and the IPL, which I found a while ago, are links to webpages that are akin to books (yes, some are actual books, but for the most part...). All of PG's links work because they control the archive, and while I haven't browsed around On-line Books, when someone's site goes down, the book goes down. A few at the IPL I thought might be interesting were down. Yes I know, mail in broken links, but it gets annoying.
I would rather have an ASCII version than a pure web version/PDF (annoying as hell)/whatever other doc format. Easy to transfer, universal, and plain. Yes, footnotes and such get lost or are somewhat squished in, but you really want only books with this, not references. For that, you can head over to britannica.com (when it's working) or some other large site. Plain HTML might be good because a simple script could strip it, but I really like PG's ASCII texts.
That is the point, actually. PG is for books, not encyclopedias and references. A few "How to use..." books are out there, but few. Most are fiction, and things like the Bible or the Declaration of Independence.
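The "simple script could strip it" remark is easy to illustrate. Below is a minimal sketch using Python's standard-library HTML parser; the class name, the set of tags treated as line-ending, and the sample page are assumptions made for the example, and real-world pages would need more care (scripts, tables, nested lists).

```python
from html.parser import HTMLParser

# Closing tags after which a line break is emitted (an illustrative choice).
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "tr", "title"}

class TextExtractor(HTMLParser):
    """Collect the text content of an HTML document, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

def strip_html(html):
    """Return the plain text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)

page = "<html><body><h1>Hamlet</h1><p>To be, or <i>not</i> to be.</p></body></html>"
print(strip_html(page))
```

Entities such as `&amp;` are decoded by the parser itself, so the output is ordinary text that can be saved as a plain file.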
Umm... isn't Linux centralized and dominated by one person? Why should they set up their own site, when it would be much better to put it in a large archive such as PG? And no one is forcing anyone here. How does Open Source really work here? These are already finished, printed books! What improving is there to do? Why do you think people keep yelling about kernel forking? It is better to have it all in one spot than spread out over a maze of websites. Also, this prevents a lot of double books from being created and wasting time.
You're just bashing PG because it isn't a big reference for the type of books it wasn't meant to create. It was not meant as a big library for stuff like science, computer...etc info, but for fiction and speeches and such. I can see the IPL and Books Online suffering from dead links; PG won't. If I need say... Hamlet for something, I scoot over to PG and download it, and it is very likely to be there, even offering several mirrors. Books Online or IPL would offer me a link to another site, which might be helpful, but what if the site is down? What if my ISP charges by the minute and I want to get the whole thing in one move? Uh oh, multiple pages, save as, save as, save as...
I started out somewhat agreeing with you, but now I highly disagree. I repeat: PG is not a library, it is an archive of books. It was mostly meant for stuff like Shakespeare or other important literary works; some other things have seeped in since, but its main purpose is still the same.
Simply make an IPO on NASDAQ. If possible, associate yourself with Linux as well. Maybe get an endorsement from Bob Young of Red Hat. Wall Street will be beating down your door to give you money, without knowing why or what you do. Use the money to buy banner ads. :-)
dragonhawk@iname.microsoft.com
I do not like Microsoft. Remove them from my email address.
Hmm. I looked at that site, and it *looks* like they expect authors to use Word to enter documents. It talks about putting words in italics, which doesn't make much sense for pure ASCII editing. It's not particularly clear, though; will they accept a tagged HTML document?
Also, quick dummy's question: what is the situation with HTML and Unicode? I've always assumed the HTML docs were ASCII, but presumably our international friends have some nicer way to work with HTML and different alphabets.
Ooh, a sarcasm detector. Oh, that's a real useful invention.
If other moderators up the score again, the downgrading moderator will end up losing some of his/her Slashdot karma.
The GPL is really a license for open-source program/documentation development, and nothing more. You don't want 100 different revisions of a fiction text out there, each slightly "improved" by a different author. Can you imagine reading a book, perhaps a chapter at a time, and having it constantly change on you? The plot inconsistencies would make The Phantom Menace look like Shakespeare. I love the GPL, but please, let's be serious about where to use it.
If getting more users is really as important to them as getting more texts online (and there really isn't an awe-inspiring amount there yet, so far as I can tell), then they need to be able to pass the mom test (you know, could my mom use it?). I mean, I really *like* having a book on my Pilot at all times -- it saves me in situations where I'm unexpectedly bored. I'd bet I'm not the only one. PG needs to cater to this.
----
Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
I just tried to register "gutenberg.org" so that I could give them a domain name. gutenberg.com, net and org are all taken. :( Any suggestions for a good domain name I can point at them?
--GnrcMan--
HTML 4.0 includes version 2.1 of the Unicode standard for international characters which assigns a unique identifier to each of 38,887 characters in the set of the world's major languages. This work is being coordinated by the Unicode Consortium.
Beowulf cluster. Awww yeah!!!
Ok, that's a falsehood, I have used it, once. About 2 years ago I downloaded Notes From The Underground. It lingered on my hard drive with some Mark Twain that I had also downloaded at the time. I don't believe that I ever read them, because it's too darn uncomfortable to read a full novel on a computer.
Eventually I picked up Notes from the Underground as a Dover Thrift Edition. It cost me all of $1.00. I couldn't print it myself for that much. Also I picked up Faust, The Theory Of The Leisure Class, The Devil's Dictionary, The Queen of Spades, Oedipus Rex... and the list goes on. These were brand new. None of them were more than $2.00. And that was suggested retail. Used books fall into much the same category, as they are usually $2.00 for a paperback.
In this era we publish more books than ever before but fewer authors than 30 years ago. Why not use E-texts to promote some authors who cannot get published by the big boys like Bantam, Del, Tor, etc...? Why not have a more user-friendly site? Why not invite reviews? Recommendations? Etc...
Why not make it so that PG is accessible to the masses. Let people have their stake in PG, make them a part of something. That is what draws people to participate in these projects. Slashdot is not the best news site out there for news, but it is the best community out there for news.
When I first found PG it seemed like one of those great ideas. I bookmarked it. I stopped back, nothing had changed, A year later I stopped back, still didn't see anything that really caught my eye.
In short, I appreciate what PG is trying to accomplish, but I cannot find where it has any real relevancy to me. Not when the information is available for so little on a user-friendly, portable medium that never needs winding or batteries. To truly draw attention and keep it, you need to fight our pitifully short attention spans, and our desperate need for convenience. Why not encourage people to write for PG, not copy? Why not encourage the stockpiling of information, not fiction? What about an app that facilitates the finding and reading of e-texts, something more than "more"...
PG has been around long enough to have garnered the recognition it deserves. If it is concerned that it is not busy enough, then it should be wondering why. It has always seemed to me that PG tries to lure its readership with the mantra that "This is for the greater good..." Help us... Instead of playing on our consciences, fulfill a need. As of this writing there are ~50 responses from people who have all heard of PG. Some use it, some don't. But they all know about it.
PG, give me the slightest reason to come and keep coming, and I will. Until then, I can get Vonnegut for $0.25 at the library and PK Dick for $2.00 at Novel Futures. And god knows that our independent booksellers are struggling too. (Tangent: Don't buy from book behemoths, as smaller booksellers die out our culture moves further into the realm of vanilla pop garbage!)
~Jason Maggard
"Give me convenience or give me death." ~Jello
I remember reading a Wired interview with the PG founder back in '96 or '97.
They were talking about how movies were beginning to come out of their copyright period, and how he wanted to make a public domain MPG of "Gone with the Wind" before he died.
I'm not quite sure what the copyright status of early (say, pre-WW2) movies is, now. Anyone?
--Donate food by clicking: www.thehungersite.com
Although the Spiers French-English Dictionary website often mentions MS-Word, they do state on the how to join the project page, that
"There are no shortcuts to any place worth going."
"Be regular and orderly in your life, so that you may be violent and original in your work." -Flaubert
Try here
Back in '97, Wired did a feature on PG. The original Gutenberg FTP site was hosted on a UIUC machine. I have some friends who were there at the time, and have regaled me with stories of what a pain in the ass the guy was. The FTP site that is alluded to in this article by one Mark Zinzow was on a machine, mrcnext (which no longer exists but still has a DNS entry), adminned by a friend of mine at one point. Anyway, the point is, this article has a lot of interesting things to say about the Project and especially Michael Hart. Check it out.
--
--
"In Cyberspace, no one can hear you be sarcastic"
I think, actually, that project gutenberg ought to store their files in a simple semi-formatted way.
Like, lines beginning with \ are escapes, with codes for 'title', 'author', 'chapter', 'paragraph' and 'footnote'. Like,
\title The Slashdot Effect
\author Rob Malda
\chapter Chapter One: What is The Web
etc.
(apologies if there's a real book by that title)
Most of it'd just be plain text. With -just- enough formatting that a perl-script (or future-language script) can transform it into the pretty-format of the day with a bit of analysis, but not so much as to make it unreadable.
Just a thought.
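A rough sketch of how little code such a format would need, written in Python rather than perl. The five escape codes come from the list above; the mapping from codes to HTML tags and the function names are assumptions made up for illustration, not anything PG has defined.

```python
import html

# Escape codes taken from the list in the comment above.
KNOWN_CODES = {"title", "author", "chapter", "paragraph", "footnote"}

# Hypothetical mapping from escape code to an HTML tag.
HTML_TAGS = {"title": "h1", "author": "h2", "chapter": "h3", "footnote": "small"}


def parse_line(line):
    """Return (code, text). code is None for an ordinary line of text."""
    if line.startswith("\\"):
        code, _, rest = line[1:].partition(" ")
        if code in KNOWN_CODES:
            return code, rest.strip()
    return None, line.rstrip("\n")


def to_html(lines):
    """Turn semi-formatted lines into simple HTML, one element per line."""
    out = []
    for line in lines:
        code, text = parse_line(line)
        if code is None and not text.strip():
            continue  # skip blank lines
        tag = HTML_TAGS.get(code, "p")  # plain text and \paragraph become <p>
        out.append(f"<{tag}>{html.escape(text)}</{tag}>")
    return "\n".join(out)
```

A line that starts with a backslash but carries an unknown code is passed through as ordinary text, so a stray backslash in a book does not break anything.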
--Parity
'Card carrying' member of the EFF.
Project Gutenberg is a great thing. I have no affiliation with it other than as a reader. I think it's great that slashdot took the time to post this story and I hope that PG is successful in getting more publicity.
I'd like to address some of the rather negative posts that people have made regarding PG because frankly they're short-sighted:
PG's etexts should be made more web-friendly:
Most of the people who make this comment also suggest a lot of nonsense about adding HTML formatting to texts. This would be a huge mistake. The beauty of having the texts in as simple a format as possible is that it is always possible to add the formatting later. I'm sure that the creative readers at slashdot could come up with about fifty ways that the formatting could be added later, dynamically if need be, appropriate to the display device. It is also particularly annoying to see the aforementioned comment on slashdot for the same reason. If you want to see the texts presented in HTML, grab the texts (I have seen many folks on here bragging about their great bandwidth), write up a perl script to format them and start serving them up from your site. BANG! You've just created a great new companion site to PG.
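For what it's worth, the "format them yourself" step is small. Here is a minimal sketch in Python instead of perl, assuming only that paragraphs in the plain-text file are separated by blank lines; the function name and the default title are invented for the example.

```python
from html import escape


def text_to_html(text, title="Untitled etext"):
    """Wrap each blank-line-separated paragraph of a plain-text etext in <p> tags."""
    paragraphs = [
        " ".join(block.split())          # rejoin hard-wrapped lines
        for block in text.split("\n\n")
        if block.strip()                 # ignore empty blocks
    ]
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body></html>"
    )
```

Chapter headings, italics and footnotes are not recovered by this; it only shows that presentation can be layered on after the fact.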
This is not helping folks (somewhere else):
The gist of this comment seemed to be that PG was not useful, because the texts were mainly western literature and in English? This is so bogus it's hard to address. The bottom line is that you have to start somewhere. The first document ever done by PG was the United States Declaration of Independence. Yes, this shows a bias, but you have to start somewhere. I have never seen anything from the project saying that any works were to be excluded, and in the meantime yes, the works that are there are useful to people all over the world. If PG got some more publicity, maybe more people from around the world might hear about it and could contribute.
Anyway, anybody interested in great literature should take a look at PG. They have a lot of great stuff to read available for free.
Stupid moderator, kicks are for trids.
My first ever post to SlashDot! What a moment!
I'm impressed by the texts available - from Charles Dickens to Geoffrey Chaucer, and Mark Twain, and even The Hackers' Dictionary of Computer Jargon.
My question is, what portable devices are available these days for reading texts such as these downloaded from the Internet? I would love to be able to use one of these on the train and tram, on the way to and from work - better than a broadsheet newspaper. I had a look at the Rocketbook and Softbook mentioned by a previous poster, but those devices seem to be very restrictive in terms of availability of books. I guess WinCE machines could be an (expensive) alternative. What about Palms? I don't actually own one myself, so I don't know how hard they are on the eyes for extended periods.
"Who makes Steve Guttenberg a star? We do, we do."
I think there's a much stronger case to be made for using HTML over ASCII: elementary typography. Vanilla ASCII texts contain no italics, no super- or subscripts, no "N" or "M" dashes, all of which are rather important basic typographical features used in many written texts. The problem with the vanilla ASCII text isn't that it's boring (if you think that, you're probably not the sort of person looking for e-texts of works whose copyright has expired; I mean, you're not going to find any of that John Grisham schlock on PG); the problem is that it often isn't faithful to the actual text. Typography is important.
Maybe not the GPL per se, but something along those lines could easily be implemented for a novel. A main author could write an intro chapter, post it to the web, set deadlines for each chapter submission, and then piece together a finished project. Of course the publishing industry being what it is, this is unlikely to happen anytime soon. Nice dream for literature types like me though.
As distasteful as the thought may sound to some of us, it may be time to solicit the help of the government. Specifically, I believe it was California's governor, Gray Davis, who was recently talking about building a virtual library of all the texts in the California State university system. In this case, collaboration may make sense.
Article I, Section 8 of the US Constitution enumerates the relevant federal power as "Congress shall have the power... To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;" The key phrase here is "for limited Times". Retroactively extending the duration of copyrights or bestowing perpetual copyrights is plainly unconstitutional.
Needless to say, unconstitutionality has never prevented legislators from passing unconstitutional acts, and the late Sonny Bono had his day in Congress a year ago and changed the rules, and we'll all suffer for it. There are some legal battles being fought on this issue, but I can't seem to drag up the references.
"If one is really a superior person, the fact is likely to leak out without too much assistance" -- John Andrew Holmes
Palms are nice, and are cheap when compared to other handheld devices, and are amazingly useful, but they are limited to about 13 lines of text per screen, with an average of about 30 characters per line (that's my guess anyway . . .), so I don't think they are best suited for something like this.
The development guides that are available from 3Com even note that they are meant to be used as an auxiliary device to a PC for the most part. That being said, in a pinch, one would certainly work, though.
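To put numbers on that guess: 13 lines of about 30 characters is roughly 390 characters per screen. A small Python sketch of re-wrapping an etext to that size follows; the 30 and 13 are the estimates from above, not measured values, and the function name is made up.

```python
import textwrap


def paginate(text, width=30, lines_per_screen=13):
    """Re-wrap plain text to a narrow screen and split it into screenfuls."""
    lines = []
    for para in text.split("\n\n"):
        # Collapse the original hard wrapping, then wrap to the new width.
        lines.extend(textwrap.wrap(" ".join(para.split()), width=width))
        lines.append("")  # blank line between paragraphs
    if lines and lines[-1] == "":
        lines.pop()  # no trailing blank line
    return [
        lines[i:i + lines_per_screen]
        for i in range(0, len(lines), lines_per_screen)
    ]
```

At that rate a 400,000-character novel is on the order of a thousand screens, which is the real cost of the small display.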
Thank you! The WordNet dictionary seems to be exactly what I was hoping for. Because they've chosen to use such a flexible license, it's likely this dictionary will become very widespread in many applications in future and possibly even the dominant dictionary of the English language. How many words does the dictionary contain? I couldn't see any mention of the current word count on their website.
Ignore Alien Orders
The way that Project Gutenberg gets material is by people contributing [time, not so much money]
Yes, it might help the credibility of Project Gutenberg if Michael Hart were present to respond. Perhaps he should not feel so bad if he reads these posts, because the project is indeed larger than one person, and he might even learn a little about Open Source business processes if he cares to. Perhaps that is what we can offer to help, here a little friendly advice as outsiders, rather than time scanning or money donations.
Just read an article on Di-Ann Eisnor in FastCompany magazine. She started the first (according to them) offline advertising agency for online brands. Apparently she was able to cause a ruckus for MiningCo changing to About.com so maybe they could do something for PG. No, I don't work for them, I just happened to see the article this evening.
If so, he should put a few strategically placed links to Project Gutenberg on SlashDot and any other web sites he has influence on.
Other Slashdotters with web sites should do the same; after all, "Open Source" books should be encouraged as much as "Open Source" code....
Donte Alistair Anderson Roberts - hi son!
Karma: Chameleon
After the Project Gutenberg exposure on /. today I was invested with the idea for a needed piece of software (mentioned in a few /. posts). Two things seem needed, both of the same purpose:
- A client program that can suck an etext out of PG (et al) straight (or nearly directly) into a PDA (palm3 here, etc)
- A cgi/script to do nearly the same thing on the server end (that is, mangle the etext into a DOC and then a pdb) (The example given in the slashpost is AvantGo, a nifty web page caching system for Palms and such. When it works, all you do is click on a link in your browser and the helper/etc queue the pdb up in your Installer)
As is only proper, I intend to hack on this myself (client first), but am hoping to have help. I am a lousy hacker in java, perl and python, but learning as best I can. I figure to start in on the client using the best case scenario: a unix system with palm doc tools, pilot-link, pilot-link-perl and pyrite (the palmos module for python). Once I get something running, I will try and exchange palm doc tools code for perl or python code, eventually getting it into one module. (I intend to attempt it first with python, but I am quite open to perl, too.) I would like to try and implement the serverside as a perl or python CGI (as I am without a better idea). Someone better than me could probably whip out a java1.1 client for multiplatform if some java code for making DOC files can be found, and the same goes for a servlet. I'll poke around and see if any such code is in any of the obvious places.
Anyway, these are my ideas. I would like yours. Feel free to flood my mailbox, etc. Email to adric@adric.com and try and put something like VP in the subject line so I can filter it from the spam :) A copy of this document and anything later will be at my site at: adric.home.mindspring.com under hacks.
Oh yeah, I propose a name for this beast: VacuumPress
This document is copyright 18 nov 99 by adric@adric.com (me!) and any software resulting from it will be DFSG / OpenSource compliant.
It definitely does need to have markup codes, though. I'd personally prefer XML, because it would allow the documents to include book-relevant tags like <chapter number> and such, which would make the e-texts a great deal more machine-readable, and accessible for everyone. (In addition, it would make it a lot easier to re-publish the texts.)
I design and typeset books for a living, so I know what I'm talking about when I say that it's a lot easier to remove or process existing codes than it is to insert them. A machine can easily reformat a marked-up document; if plain text is wanted, a one-line perl script can be written to remove everything within angle brackets. The reverse is not the case. Computers are not currently smart enough to know where to add tags, so right now, every single tag in a document has to be inserted by a person.
All too often, that person is me. ;-)
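The "remove everything within angle brackets" step really is about one line. A sketch in Python rather than perl, with a made-up function name; it assumes the markup never contains a literal ">" inside a tag and it does not decode entities such as &amp;.

```python
import re

# Matches one tag: "<", then anything that is not ">", then ">".
TAG = re.compile(r"<[^>]*>")


def strip_tags(marked_up):
    """Return the text with every <...> tag removed."""
    return TAG.sub("", marked_up)
```

For example, strip_tags('<chapter number="1">Call me <i>Ishmael</i>.</chapter>') leaves just the sentence. There is no comparably short function for the opposite direction.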
I don't really understand licenses that well -- this is just my uneducated opinion.
I don't think the GPL would work well with something other than software. Once I tried to think about how people could copyright music under less restrictive licenses. You'd want to copyright a song (not necessarily a given recording of the song) so that coffee shop/bar bands could legally sing it, but you have to do something to preserve the integrity of the art. I don't think the GPL really does that, because with the GPL people can modify your work and distribute those modifications. This has practical value in the software community, but in the music community people want their work to remain unique and intact. I assume that authors would feel the same way.
What you'd need, in my view, is a copyright license that allows people to distribute an etext freely and ensures that no one down the line can take that freedom away. However, people should be forbidden from altering the etext, and the author should always receive credit for the work. That way, you can give your stuff to Project Gutenberg without fear of compromising its integrity.
This is just an idea, and I know that it isn't a perfect solution yet. But I think that a license based on these ideas could be worked up and actually used by authors, musicians, and artists to promote the exchange of ideas and information. That's really the spirit of the GPL anyway, right?
Take care,
Steve
*Amazon (someone mentioned this) is a _bad_ idea. Profit motive and releasing free documents don't coincide well.
I suggested Amazon earlier, but I guess I should have argued my point rather than just suggesting it. Why don't profit motive and releasing free documents coincide well?
Profit motive and free documents can coincide perfectly, and work to each other's mutual benefit. The free software world shows that the profit motive, demonstrated by companies like Red Hat, may in fact be the *best* way of supporting the development of free stuff (software, documents, and who knows what else). What Project Gutenberg needs is publicity, and who can do publicity better than companies like Amazon with plenty of money and marketing skills?
But why would Amazon want to help PG? For the same reason why enlightened bookstores make it easy for customers to browse through books -- letting customers browse increases sales; putting links to PG texts brings this browsing experience online (imo, there isn't much worry that the customer will just read the whole book online rather than buy it -- reading a whole book online is just too unpleasant).
Furthermore, the PG deals with copyright expired books, so the market is different in most cases; linking to PG is just another value added service that online booksellers like Amazon can provide for their customers.
Why should PG become web-friendly? This project is about the text, the words, not the illustrations, typography or whatever.
I have had a horrible time over the years getting rid of the formatting of online texts I want to read (the HTML Principia Discordia is a good example). I like the raw text format because I can download the text, throw it into a word processor and change the font and print it, make a .pdf, format it for the Palm and upload it somewhere else, not forgetting to cite PG, who make it possible.
Download the ASCII and format it at home for printing, on-screen reading, throw it at a text-to-speech synthesizer or whatever you want to do with it. It is free and you can do with it what you like. The ASCII format is more portable than HTML. I can even boot my old C64 and read the PG text there if I want to.
So if you think PG should become web-friendly, format the texts for the web yourself. PG might need that, but we will still need the e-texts in a standard format.
Unfortunately, as with much of the internet today, the technology to make it really workable and useful isn't there. Until they develop a small Walkman-type thing with a screen big and clear enough to make reading easy, it will never catch on. At the moment the power cable on my PC won't stretch the 12 mile train journey I have to take every day, and my laptop gives me headaches if I stare at it for too long.
A worthy cause though, there is a lot of stuff out there that if it isn't saved now, will never be available, so I'll try to spread the word amongst my more learned friends, both of them.
To be serious for a moment though, as I am into music and I am constantly aghast at all the great music that has never been issued on CD and therefore is currently unavailable, I wonder if there is a similar scheme for albums? With mp3 technology, which the internet gods did get right, surely there must be somewhere to get all those currently deleted albums, anyone got any ideas?
---
This fat chick?
I suppose it's a matter of opinion... What does this have to do with Project Gutenberg anyways?
Someone set us up the bomb, so shine we are!
The question here isn't whether to use ASCII, HTML or LaTeX, because there already is a highly developed, sophisticated markup language for electronic text editions, TEI-SGML, specifically designed to preserve all structural information of the original text. Some e-text projects such as the Victorian Women Writers Project code in TEI-SGML. This is not only good for scholars/literature hacks, but also allows lossless reformatting of the source code into HTML, ASCII, PDF, RTF, etc.
The Gutenberg Project certainly was a good idea and a great achievement when it was founded, but might have to rethink its coding policy. Other e-text projects are already doing better here.
gopher://cramer.plaintext.cc http://cramer.plaintext.cc:70
Try #bookwarez on EFNet...We've got a bunch of stuff too
Why not do an interview with Michael Hart? At
the risk of being labelled a troll for the
second time in a week, he'd be a lot more
interesting than John Vranesevich.
K.
-
-- Proud descendant of semi-nomadic cattle-herders.
This is the text of an email I've just sent to Michael Hart, the director of PG:
Their biggest problem is that the format sucks. They should convert everything over to XML with appropriate tags for headings, chapters, etc.
Then you could use style sheets to make the text readable. What they have now blows. If it isn't pretty, people won't be interested.
Don't get me wrong, I hate M$ just as much as the next slashdotter, but converting the PG books to the openebook format is a good idea. The fact that M$ will doubtless try to co-opt this format is beside the point; the point is that it is a published standard, and more importantly that it is XML-based.
It really doesn't matter whether the PG texts can be plugged directly into a web browser or not, the important thing is to make the texts into structured texts. Really it doesn't matter what XML format is used, but since one already exists for books, it might as well be the format that is used, if it is technically sufficient.
From an XML format, it is trivial to produce the plain-ASCII format that PG seems to be so fond of (one might even say irrationally fond of). It is also trivial to produce a web-ready HTML document for online publishing, one that works with all browsers. Heck, we could even make it work for lynx, for people without a text editor... :-) Storing the documents in HTML is a bad idea, for a few reasons:
The thing is, when someone types/scans/edits a work and submits it to PG, they ought to have a way of specifying the things that are lost in the flat-file format: footnotes, chapter and section headings, bibliographies, etc. Once this happens, the works can be linked to with more granularity than as ftp://.../book.txt. This is critical (as several posters previously have stated) if books are to become full citizens of the noosphere.
Hmm, well when I write a sentence like that last one I know it's time for bed. I hope my arguments are clear. Re openebooks, my main point is that M$'s ownership of the standard is irrelevant as far as its usefulness is concerned. If anyone is interested in discussing these issues and maybe taking a look at what a good XML format would be, please email me.
I couldn't agree more with your and jaso's statements. The point is not to mark up the books so that they look pretty on the web, the point is to add the structural elements that plain-text doesn't preserve back into them. HTML is a bad idea (as I've argued above somewhere), but it is trivial to produce it for presentation on the web if need be. Likewise, from an XML source, it is trivial to produce a flat text file. Going from a flat text file to a format that includes structural information, though, is decidedly non-trivial.
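To make the "trivial in one direction" claim concrete, here is a hedged Python sketch using the standard library's xml.etree.ElementTree. The element names (book, title, author, chapter, p) are an invented toy schema for illustration only; they are not the openebook format or any format PG uses.

```python
import xml.etree.ElementTree as ET
from html import escape

# Toy document in a made-up schema (an assumption, not a real standard).
SAMPLE = """<book>
  <title>A Christmas Carol</title>
  <author>Charles Dickens</author>
  <chapter title="Stave One">
    <p>Marley was dead: to begin with.</p>
  </chapter>
</book>"""


def to_plain_text(root):
    """Flatten the structured book into a plain-text file."""
    parts = [root.findtext("title", "").upper(),
             "by " + root.findtext("author", ""), ""]
    for chapter in root.iter("chapter"):
        parts += [chapter.get("title", ""), ""]
        for p in chapter.iter("p"):
            parts += ["".join(p.itertext()).strip(), ""]
    return "\n".join(parts).rstrip() + "\n"


def to_html(root):
    """Render the same structure as bare-bones HTML."""
    out = [f"<h1>{escape(root.findtext('title', ''))}</h1>",
           f"<h2>{escape(root.findtext('author', ''))}</h2>"]
    for chapter in root.iter("chapter"):
        out.append(f"<h3>{escape(chapter.get('title', ''))}</h3>")
        for p in chapter.iter("p"):
            out.append(f"<p>{escape(''.join(p.itertext()).strip())}</p>")
    return "\n".join(out)


root = ET.fromstring(SAMPLE)
```

Both outputs come from the same source with a few loops. Recovering the chapter boundaries from the flattened text would take guesswork or a human.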
I have noticed some people have mentioned the need for some sort of client software for Project Gutenberg.
It just so happens that I thought this same thing some time ago.
I am working on my own GPL'd project called Gutenbook. Right now it is not much, just a rapidly prototyped Perl/GTK application. It downloads and parses the Gutenberg index and allows you to select a title. Once selected, that Etext is downloaded and displayed for you to page through.
As I say, it is only a *rough* prototype right now and I have been too busy to work on it as much as I want. I have plans to port it to Objective-C and C with GTK+. (I think Objective-C *rocks*.)
I have exchanged emails with Michael Hart and some other of the Gutenberg people and have their support. I just need more time! I would love to get feedback on this.
Please check out the link above. The prototype is available for download. Please also take it easy on the server. It is a lowly Sparc 2. If enough people are interested, feel free to make a mirror.
Nothing can possiblai go wrong. Er...possibly go wrong.
Strange, that's the first thing that's ever gone wrong.
Tyler's words coming out of my mouth.
(Warning: use of "canon" and other English Lit. terminology ahead.) :7)
In England, IIRC, a copy of every book published goes to their national library -- and the same is true of the LOC here in the US. A Slashdot reader's comment to this article suggested that publishers should be compelled to submit an ASCII version of each of their new books to PG. Given the introduction of those systems that print single copies of books on demand [aren't they going to start showing up in Borders RSN?] it's as unlikely that book publishers will give away their raw materials as it is that software publishers will open-source their products.
Project Gutenberg seeks to make as many books as they can, _now_, available. Is this important? Yeah, because publishers won't make much money off of public domain materials, but that doesn't make those works any less important. Look at what books they made you read in school, what pointy-headed academic types call the canon, and see how many of them are available via PG. Other than some of the rubbish that one wild-eyed assistant prof made me read for my English degree, *most* of my high school and college reading lists are downloadable. These are valuable works that we'll still be reading for a long time.
(There are, of course, people who will dismiss most of the canon as Dead White European Males and by extension most of the PG etexts as tools of the patriarchy, but that's a debate that'll never be settled. Read Harold Bloom's book "The Western Canon" for a chapter on each of what he (and many others) consider the most important books of Western literature, and then download the original texts from PG.)
It is possible to read these plaintext books on screen: make the text white and the background black; find a comfy spot and curl up with a laptop, or blow up the point size so that you can sit comfortably at your desk and read it on a monitor; take frequent breaks to rest your eyes. Sure, it's not going to advance your career as a sys admin, but you have to come up for air sometime. Anyone who commutes on a train or subway owes it to themself to read at least one "good" book a season. And the more I talk to people, the more people I'm finding who accidentally read something they skipped in school and rediscover books.
Is this relevant to Slashdot? Sure is, if for no other reason than the "information wants to be free" line.
Someone mentioned putting Gutenberg texts onto Palms or something and that got me thinking... I always thought it was cool on Star Trek when Picard was reading some play or old French book on one of those little computer pads. You know, the thin ones where they write all their reports and stuff. This would essentially be the same thing! Am I the only one that thinks that's really cool? Probably...
Probably the best way to get Project Gutenberg recognition would be to have classics professors mention it in their classes. Hmm, next time I see my cousin (a Greek and Latin professor) I'll suggest the idea to him.
All the creatures will die, And all the things will be broken. That's the law of samurai. (Jubai, 1605)
It would also facilitate content based indexing. After all, it's the content that counts.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
In fact, you can use the Gutenberg text to create your own edition, with your own illustrations, introduction and footnotes, and publish it. Perhaps for distribution to your students if you are a teacher, or perhaps to the world at large.
The Gutenberg copyright restrictions are a lot like the BSD license. They're aimed to get the work used as widely as possible. If you modify the work, you just strip out the Gutenberg notices and it leaves you with the unencumbered text to do what you will.
This is an incredible idea, and one that deserves support. Maybe they should be nominated for a MacArthur grant?
It is interesting that Project Gutenberg have chosen to put things on line only in ASCII text.
I wonder if they might benefit from keeping their master copies in some kind of basic plain text markup language based on XML or SGML. Having things like the chapter structure, or more importantly the scripting information for plays, in the document in a machine readable format would seem to make it easier to search the collection, and also easier to reintroduce formatting to allow prettier looking hard copy. I don't know about anyone else, but I find reading on paper much easier than on screen, and nice formatting in that context is important.
I don't mean to belittle what they are doing - I think it is excellent work - and I emphatically don't want them to keep the master copies in HTML or some equally labile format, or to start to introduce physical markup, but it would be nice to have some idea of the structure written more clearly into the text.
Since the content is freely reusable, strip off the Gutenberg notices and make your own archive of e-texts in your favorite format! They've done the hardest part of the work; HTML-izing the work with CSS should be a breeze.
I think we should get Mike Hart as one of the interviewees on the weekly featured articles... This way we can ask him stuff and see what kind of responses he has to TEI-SGML and other things like it...
Large print giveth, and the small print taketh away
Why don't they use some OCR (optical character recognition) to help with this? They could probably have raw files with the text, formatting required, in a matter of days. Wouldn't that be a big help? Or is there a lack of Open Source OCR software?
Laugh, it's good for you!
I'm with you there. I guess my point is that they should store the books as XML, and produce plain-text versions from it for public consumption. I certainly agree with you (and PG) that plain-text is a necessary format to have available (for end users). The conversion from XML to HTML (or, say, PDF) would be a lot easier than converting from text, though.
I'm not sure if the Open Source frenzy and the GPL are just inherent human cheapness or a trained, almost kneejerk reflex, but get serious. For one, you don't want to GPL a book. Look at the Bible. Collection, rejection, revision, revision, and now we have something that makes very little sense. King James makes his little change, a council in 329 AD kicks out a few books that don't fit in line, and you get this wacky hodgepodge. Just not a great idea. Second, whatever happened to people actually receiving money for hard work? A lot of writers are working 40+ hour jobs already for little pay, and then write these books on the side to make ends meet or perhaps make a better life for themselves. You don't think someone who busts his hump for one to three years on a novel deserves some cash to replace his busted washer? Or that a poet who has to waitress sixty hours a week to support her kid and cribs verse on cocktail napkins doesn't need a little extra cash? If somebody works and produces a product that you use, enjoy, whatever, shell out a few bucks, geez. Making all books (or software, for that matter) free doesn't provide a lot of incentive to people. Do you really want the trust-fund, independently wealthy types as our only writers? And, if you go for the traditional argument of, "Their work will get noticed and someone will employ them doing what they do best," think of the two biggest consequences. One, their work is devalued, since the samples are free. Second, who employs them? Hallmark? You want Harlan Ellison or other good writers pushing out books as resumes and samples, so that Hallmark and the Chicken Soup For The Franchise types can hire them on to write schmaltzy crap full time? Is this what we want?
Its new location is at:
http://digital.library.upenn.edu/books/
The CMU address still works for the opening page, but the site manager is recommending that everybody link and bookmark the new URL.
I like the idea behind Gutenberg, but I have to say that after meeting the guy in charge of the thing, I'll never ever ever contribute to it.
Of course Gutenberg has already been /.ed, so...
but don't you think that people should find it by themselves?
I've been looking for good free (= very old) books; about 2 or 3 years ago I found a reference to Project Gutenberg in every search engine I tried.
So may I suppose that people should try to enhance their searching capabilities?
One problem might be that Micro$oft has claimed in the Caldera suit that it lost the source code to DOS.
Hmm, seven major versions, countless minor versions, over the span of many years...
Ooops, all lost. Even Win95/98... Joe Bob brought them home for an elementary school project and his dog ate them all...
The court didn't actually *believe* that did it? For a lie, that's pretty damn boldfaced!
Logosproject.org is just getting started but is trying to be an audio version of Project Gutenberg, with RealAudio and MP3 versions of public domain books. They are read by anyone willing to spend the time reading a book into their computer and encoding it. Submit something.
I saw that the site Amiktech.com is donating 20% of each CD-ROM they sell of the Gutenberg Collection to Project Gutenberg. If more companies like this started giving back to the cause, Project Gutenberg would receive a lot more exposure. For exact details and more information on Gutenberg read more here
Team up with the guys from PALM and/or HANDSPRING.
For them, the availability to download books into their devices has cash value since it can attract new customers. They'll eventually place some pointer to PG into their hand-held manuals or their web sites.
Try to get some authors to sponsor PG by providing etext stuff (or even books?) donated to PG. Speak to Tim O'Reilly. :)
The /. crew provides a column "My Favorite Literature Download of the Month" which can bring new insights to geeks who usually read more Perl than literature pearls (training the other half of your brain cannot be wrong and might even improve your programming skills ;)
--
One surefire way to raise Project Gutenberg's visibility would be to register the domain name projectgutenberg.com (!)
On more than one occasion, I forgot the promo.net/pg URL and had to hunt around. Fortunately, Google solves that problem nowadays.
The number one dumb mistake that websites make is to pick a URL that doesn't match their name. Remember altavista.digital.com and salon1999.com and mckinley.com (Magellan)? The first two ended up buying their "natural" domain names at great expense, and the latter was bought by Excite and faded into obscurity.
--
--
Bitwise, Andrew.
IMHO, PG needs to establish relationships with websites that have attractive content. People don't wake up one day and think, "Gee, I have a mad urge to read an e-book" -- they cruise sites like mine to get ideas about what books to read. So you need to capture reader interest near the checkout line, so to speak, when the offer of an instant free copy is maximally attractive. For example, I once needed to refer to Dickens's _A Christmas Carol_ on short notice to do a parody -- and PG came to the rescue. Or I got an email from a teenager in like Norway who was reading a Conrad novel in the middle of the night, only to find the last chapter was missing. PG texts are lifesavers in situations like those.
The main web page needs to be simple and powerful. Put the search engine front & center! Don't make me click a link to get to a search engine. Put the A-B-C-D-E-... links for Author and Title lookups right there on the main page, and don't make me have to scroll down to reach them. The least important thing is the gigantic text file of every book you have available, yet you put that on the main page, occupying over half the visual space on my browser. Another huge chunk of visual space is dedicated to FTP sites containing the texts, and even HOW TO USE ftp sites (!) -- the instructions for GETTING THE INSTRUCTIONS take up an entire paragraph.
The fundamental aspect of good web pages for the next century is: MINIMAL WORDS
For example, the bottom of the PG web page says:
I feel uncomfortable wasting my time reading that entire sentence just for the concept: An entire sentence digested down to 3 words, and you can make all 3 a link to the help page, giving the user a larger target to click on than just the word "Help".
The third paragraph, which talks about FTP, also has a discussion of subscribing to a mailing list/newsletter combined within it. Different concepts should be visually separated.
And lastly, there's way way way too much text at the top of every Etext that has nothing to do with what the user is attempting to read. Learn from the GNU project - one simple paragraph with the basic facts, and a pointer to a web page where they can read more. This solves another problem for you - if you have to change that text, you only have to change 1 web page, not the tops of 10,000 documents.
SUMMARY
Make the user's life quicker & easier, and you will get returning visitors. The way your web page looks today, I don't want to come back.
I hope the PG team accepts these comments as constructive criticism, because I strongly believe in the purpose and goals of PG. Keep up the good work!
The comments stating that PG can be of any use for the children of the Third World are just ludicrous. I mean, how paternalistic can you be? These e-texts can be a valuable resource for people that can read English and have the cultural background to understand the Greek classics. Any child from rural Indonesia that fits that description will probably be buying Dover Thrift books through Amazon and having them FedExed to him/her -- overnight.
Changes are coming and PG is well positioned to take advantage of them.
Electronic books (Rocket et al) will eventually make it into the mainstream. I saw one for sale online @ $199 -- $100 less than two months ago. The Peanut Reader for Palm works surprisingly well, and there are lots of Palms out there already.
Popular press prices WILL come down as printing and distribution expenses are eliminated and demand rises. When people are used to downloading books to portable electronic devices and paying very little, PG popularity will take off. Hold on, it's coming!
I suggest charging a nominal fee, say $1 for each download and use the revenue to promote (maybe even advertise) PG. This means actually paying someone to get on TV, talk to newspaper editors, beg for free banner space etc.
Free is good, if you know about it. $1 ain't bad if that's what it takes to find out about it.
Things have been pretty quiet for poor old Steve since the Police Academy movies...
I never thought of it as needing any more publicity than it gets (after all, I know about it.) I spent a summer doing RC5 when Bovine pledged $8K to the Project; and I've often toyed with the notion of sending some of my favorite old novels there. ("Three Weeks", by Elinor Glyn, the uncut "Pelham", by Bulwer-Lytton.) I've often used it to make gift versions of "Agrippa", by William Gibson for friends, as well as my BBS Housewarming Kit (Agrippa, The Hacker Crackdown, and a Blue Box plan), which I've used to get file points for boardz all over the local dialing area and beyond.
So, I guess you might say I'm a fan. I think that what's necessary is a bit of pizzazz. It was OK when it was one of the only things out there, but nowadays, it's not at all thrilling for people who expect anything on the Web to jump, flash, and leap off the screen. It needs to play up the fact that it's not just classic novels: there are movies in there, music, pictures...there are quite a few children's books, a truly classic cookbook or two...a treasury of literature on every reading level for people who might want to learn English, or empower themselves with a knowledge of Western Culture in general. (I don't think that it's bigoted to point out that it's a lot more empowering to learn a foreign culture associated with technology, than it is to try to reinvent the wheel as it pertains to one's own. The West has had to do this several times.) Perhaps a small M$ Bookshelf-like selection included with Linux distros? This is one of the most inspiring things to be put out on the web: I'm sad that it doesn't get eyeballs.
teleny, friend of cats.
One advantage the On-Line Books Page has over PG is that it also organizes the titles by subject.
1. It looks like PG only has Title and Author listings. It would be a lot easier to find books on a topic if a subject listing were created. I would find it very useful.
2. Another idea to make PG works more widely read is to set up a user interface like Amazon.com's, where book reviews can be submitted and where the system suggests other books the user might like.
It seems to me there is a technical solution to give us the best of both worlds: Define an authorised minimal subset of HTML to use. Write a program to automatically strip this simple HTML from the texts to yield ASCII. Write a program to do a 'diff' between an ASCII and HTML version of a text, and update the HTML version with modifications made to the ASCII. Write programs to convert minimal HTML to XML, LaTeX or whatever your favourite format is.
With these tools, you can easily maintain the HTML and ASCII versions synchronised, and add other formats as required with other conversion programs.
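The "strip this simple HTML to yield ASCII" step could look roughly like the sketch below. It is only an illustration: the tag set in `ALLOWED_TAGS` is a made-up stand-in for whatever "authorised minimal subset" would actually be agreed on, the `_word_` convention for italics is an assumption, and the names `MinimalHtmlToAscii` and `html_to_ascii` are hypothetical. It uses Python's standard `html.parser` module and does nothing about non-ASCII characters.

```python
from html.parser import HTMLParser

# Hypothetical "authorised minimal subset" of HTML. These tag names are an
# assumption for illustration, not anything Project Gutenberg has specified.
ALLOWED_TAGS = {"h1", "h2", "p", "i"}
BLOCK_TAGS = {"h1", "h2", "p"}


class MinimalHtmlToAscii(HTMLParser):
    """Collects the text of a minimal-HTML document as plain-text pieces."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _check(self, tag):
        # Refuse anything outside the subset instead of silently dropping it.
        if tag not in ALLOWED_TAGS:
            raise ValueError(f"tag <{tag}> is outside the minimal subset")

    def handle_starttag(self, tag, attrs):
        self._check(tag)
        if tag == "i":
            self.parts.append("_")  # assumed convention: _italic text_

    def handle_endtag(self, tag):
        self._check(tag)
        if tag == "i":
            self.parts.append("_")
        elif tag in BLOCK_TAGS:
            self.parts.append("\n\n")  # a blank line ends every block

    def handle_data(self, data):
        self.parts.append(data)


def html_to_ascii(markup):
    """Return the plain-text rendering of a minimal-HTML string."""
    parser = MinimalHtmlToAscii()
    parser.feed(markup)
    parser.close()
    text = "".join(parser.parts)
    # Normalise whitespace inside each block and drop empty blocks.
    blocks = [" ".join(block.split()) for block in text.split("\n\n")]
    return "\n\n".join(block for block in blocks if block) + "\n"
```

Rejecting unknown tags, rather than ignoring them, is what would keep the HTML master inside the agreed subset, so the conversion stays trivial to rewrite in another language later.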
One problem with this approach is "what about when the language you wrote the programs in becomes obsolete". I have several answers to this: First of all, if the minimal formatting is not too complex, neither will the programs be - they can simply be rewritten. Secondly, FORTRAN and COBOL compilers are still available - once popular languages last forever. In 50 years, Perl and C++ will still be compilable.
Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
Anyway, the reason I'm posting is that I've registered the name anthology.org (not yet active) to provide a directory to etexts from various collections/projects. Sorta like a card catalog (or maybe the interlibrary loan database? whatever) -- probably a yahoo-style navigation. I'm sorta surprised that no-one else has done this (with high visibility, anyway) -- does anyone else think this would be useful?
-
<SIG>
"I am not trying to prove that I am right... I am only trying to find out whether." -Bertolt Brecht
<sig>Guvf vf abg n frperg zrffntr
Plain ASCII is the way to go with PG. When you start down the road of wrapping it up in extra gunk at the base, you have lost the pure simplicity of the Project.
Anyone worth their bytes can take an ASCII file and wrap it up in whatever flavor they want, but only a true fool would require everyone be subjected to that flavor.
Keep it ASCII. Keep it open. All else is geeky wankings.
... is some kind of book recommendation engine. I like to read, but I am no literature expert. And I have no idea what most of the books at PG are about. 99% of the time, the author and title of a novel don't really tell me anything about it. The search engine is a good start, but more could be done to make PG more friendly. Let's say I like the hard-boiled detective novels of Charles Willeford. There's nothing by him, but maybe there's something else I'd like? Sure, I can search on "mystery" and get results, but that's pretty generic. And what comes back is just a list of titles and authors. I have no idea what any of the books are really like. More keywords are needed. A synopsis for each book is needed. Even the LOC has a sentence or two describing each book. Book cover scans would be nifty but not really that important. Reader comments would be great, and (once the system is in place) would not require additional work by PG and the people scanning. Amazon is a good model in terms of what info could be provided about the books. It actually is fun to browse Amazon discovering what books are out there.
There are so many reasons that ASCII is an error. The character set is extremely limited. You can't capture the full formatting of the text at all with carriage returns and spaces scattered about.
They really should have moved to a mark-up language as quickly as they could. There is no reason they couldn't have kept a master copy in SGML marked up with TEI, and then a text version derived from it. If somehow the international standard of SGML disappeared, then the plain text would still be available, and it'd be no worse than it is now.
A quick perusal of the web finds the Oxford Text Archive, where the texts were marked up in SGML. Now the texts are available in ASCII, RTF, and HTML. Using existing free software, the texts can also be converted into TeX, and if you don't like the specifics of the formatting above (say you want to remove the page numbers in the HTML because you don't care about them) you can change it with little effort.
Every day a large section of Slashdotters make it clear they are not interested in progress but simply the ego boost of the Jihad.
PG is one of the best examples of people DOING things rather than fighting for the Jihad, of producing real and usable items rather than hot air and worthless code that gets outdated before it is ever used.
PG has been around longer than many of you have been alive. It has done more over the years than you will ever hope to do in a lifetime. Seeking to gain an addition to your Jihad by cutting PG down to your level is not only foolish but short-sighted.
PG stands as a beacon to those that would do, your Jihad to those that would talk and produce empty code.
Consider this: it may not be ideal, but it is suitable. Reading on a small surface is not so bad. Remember those little bibles that some religions distribute freely? It's the same feeling.
And you can even read in the dark, using the backlight. 3COM rules; when everybody has a Palm nobody will ever need books (even with pictures, I mean, my Principia Discordia has all the pictures of the original book).
Patola
Patola (Claudio Sampaio)
Unix System Administrator
Gutenberg is cool, one of these days I'm gonna buy their CD so I don't have to download everything. But I have one serious beef with them. Everything's in plain ASCII, formatted for an 80 column screen or line printer. (Some of it is even double-spaced). This is nice if you wanna read on a VT100 or something, but what if you want to read on a Palm Pilot, or make some nice printouts, or just on-screen with a nice (proportional-spaced) font, huh? Then you've got a jumble. I can understand the need to have ASCII versions -available- but here's the problem with the Gutenberg Project's assertion that the ASCII version is enough - it's -easy- to convert non-ASCII (HTML, SGML, LaTeX, etc.) into ASCII - it's something that can be completely automated. Going the other way can't be automated so easily - pretty much the best a person can do is mark it up while they're reading it. Of course, it's also dumb to make everyone out there mark up their own copies by hand if they want it in another format. Maybe Proj. Gutenberg could start making all their new E-Texts in SGML or something, so all their hard work doesn't look like shit on-screen. ---GEC
Bow-ties are cool.
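The 80-column complaint above is the one part of "going the other way" that can be roughly automated: joining hard-wrapped lines back into paragraphs and re-wrapping them for a narrower screen. The sketch below assumes paragraphs are separated by blank lines, which is an assumption about how a given Etext is laid out; it would mangle verse, tables, and the double-spaced files mentioned above, and it cannot recover italics or headings. The function names are hypothetical.

```python
import textwrap


def unwrap_paragraphs(text):
    """Join hard-wrapped lines into one string per paragraph.

    Assumes one or more blank lines separate paragraphs.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def reflow(text, width):
    """Re-wrap every paragraph of `text` to at most `width` columns."""
    return "\n\n".join(
        textwrap.fill(paragraph, width=width)
        for paragraph in unwrap_paragraphs(text)
    )
```

For a proportional font, `unwrap_paragraphs` alone is probably the useful half, since the display program can then do its own wrapping.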
> A more significant beef about PG is that it is
> centralized and dominated by one person, who does
> not share the philosophy of Open Source production
> that most of us do. Instead of forcing individuals to
> contribute to this project, why not help them set up their
> own web sites to publish their own works, or other
> works they have scanned?
Read the PG copyright notice, which is at the top of most PG text files. Michael Hart does not force anybody who contributes to Project Gutenberg to post his work on only his one site. PG's copyright terms are actually more liberal than the GPL. I have beta copies (not completely proofread yet) of my Project Gutenberg transcriptions on my own site:
http://www.concentric.net/~Wkiernan/text/Gutenberg_at_Frownland.html
Contributors (and anyone else, too) are allowed, by the terms of the PG copyright, to redistribute PG works on one of two conditions: either they strip off the PG copyright header, in which case they can reprint the work with no further restrictions; or otherwise, if they leave the PG copyright header on, they must contribute 20% of the profits to Project Gutenberg. Certainly that's not requiring too much of a redistributor, to ask him to strip off the PG copyright header from the top of the text file, before he uses it any way he pleases.
And for those HTML fans who criticize PG's least-common-denominator ASCII format, many of the works in the PG library are available in both ASCII and HTML format. While I myself prefer plain text, after I finish transcribing my next two books, I'm going to fall back and make HTML versions of all the books I've done thus far. (It's going to be a while; the next two books amount to a little over 3000 pages. At a hundred pages per weekend, I'm "booked" until about next June.)
Yours WDK - WKiernan@concentric.net
I know nothing about XML. What software reads XML files? Why should I use XML instead of HTML? I have several thousand pages worth of books which I have transcribed or intend to transcribe into ASCII for Project Gutenberg, and I was planning on making HTML versions of them all. But if, as you say, XML is so much better, maybe I want to make XML versions instead? Also, how does XML handle text in foreign languages, with letters with accents, text in the Greek alphabet, and all the rest of that? I would very much appreciate the input of Slashdot readers on this question. If any of you can offer suggestions or pointers, please email me at Wkiernan@concentric.net.
Yours WDK - WKiernan@concentric.net
It would seem to me a version of the Principia without formatting that at least approximated the paper book would be sort of... well, not like the Principia.
Personally, I prefer fnord.org's scanned-in copy for an online version.
These are *MY* opinions.
They will not be *YOUR* opinions until the Orbital Mind Control Lasers are operati
In Feb. 1997, Wired Magazine carried an article on Michael S. Hart and Project Gutenberg. http://www.wired.com/wired/archive/5.02/esgutenberg.html
Here is Denise Hamilton's insightful prediction at the end of the article: If he were any less obsessed, he would have given up a long time ago. Instead, he is doubling each year. But I also wonder how long the project can keep expanding exponentially. Unless Hart can draft reinforcements or hook up with a sponsor, eventually it probably will stall, and that's a shame.
Almost three years later, we are confronted with the same quandary. A socially useful project. Seems to be stalling because of lack of support. Give it some support, and it keeps stalling. Offer some advice, and it is refused. More passive-aggressive crying from a leader who will not lead. What to do now?
I suggest we all learn from Open Source movements and just do it ourselves. When PG started, computers and scanners and software and networks were very expensive and few knew how to use them. Now they are cheap and ubiquitous, and we need not depend on others to do this work for us, and we don't need to pay money for free books online, and we don't need to ask the government or corporations or anybody else to take it over and do it for us. We don't need to wait for Microsoft or anybody else to provide us with geewhiz technology to finally make it possible to read books online.
As one of the first online books put it, "The light which puts out our eyes is darkness to us. Only that day dawns to which we are awake. There is more day to dawn. The sun is but a morning star."
If we want to preserve the classic works of literature or other books we find important enough to leave for our grandchildren, then we should take responsibility ourselves. Spend $25 for a scanner, $70 for OCR software, learn to use them. Check out books you like from the library. Learn to OCR them and edit them into your favorite format. Put up a website somewhere--many sites are free, or somebody will put the books on their website for you. How much computer power is going to waste sitting on your desk?--could you use it to leave something behind, if not great software, if not a great book you wrote, then a great book that changed your life and needs to change the life of some kid in some other country?
Publish ebooks yourself, put your own name on them, be responsible for correcting errors yourself. Type in links to other web pages that have useful information. Do some research and share it with us online. Choose one author and put all of his books online in one place at your site.
What I am suggesting is no more than our realizing the value of the book culture vs. the TV culture in our society. In Ray Bradbury's "Fahrenheit 451" (a book you will not find on the web) we learn of a time when books are banned because they make people unhappy. The few who are left who treasure books are forced to flee to the woods and memorize them in order to preserve them. They became the books, the authors. This is not a technological problem, it is a social one, and we bookpeople can do something ourselves about it. We can learn from the experience of the free software movement and from PG's sad history too. PG has been quite successful but faces an uncertain future. On the other hand, we need to take control of our own lives and publish our own books ourselves. We can do it!
For more on how we can engage in this communal project without necessarily feeding money to one organization, see Get Involved! at the UPenn On-Line Books Page.
Since I entered the Web in '94, Gutenberg has been one of the most important places I have found there. What it has done is of fundamental importance. Let us note that some of the texts are considered world literature. Besides, Project Gutenberg allows us to reach literature that one can hardly find today.
However, I am very critical of Project Gutenberg on another point. It is good to be conservative, especially if we consider the nature of this project. However, it is too conservative.
Project Gutenberg has always suffered from the illness of having a very primitive search interface, or of preserving for too long an interface that is obsolete. The problem is that sometimes it is not enough to search books by author or title. There are a lot of other search classifications and tools. One of the most important is to search for specific content, much like Altavista or Excite do. If Project Gutenberg wants to deliver availability then it needs to work on this.
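Searching inside the texts, rather than only by author or title, usually rests on an inverted index: a map from each word to the texts that contain it. The sketch below is a toy illustration of that idea under simple assumptions (lowercase ASCII words, every query word must match); it says nothing about how Altavista, Excite, or PG actually work, and the function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(texts):
    """Map each word to the set of titles whose text contains it.

    `texts` is a dict of title -> full text.
    """
    index = defaultdict(set)
    for title, body in texts.items():
        for word in tokenize(body):
            index[word].add(title)
    return index


def search(index, query):
    """Return the titles that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Because the index is built once and only looked up at query time, the same structure could sit behind a subject or keyword search as well as a full-text one.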
The other point is the cumbersome nature of the texts. I agree that it would have been rather dangerous to choose a text format that could cause some incompatibility in the future. That reasoning was good in 1994. Today HTML is standard, SGML is standard, XML is new but it is also a standard, TeX may not be so popular but it is also a standard, PDF may carry a commercial tone but anyway it is a standard. And there are tons of tools for converting from one standard to another. So it is time to rethink the standards.
Another point is organisation. Project Gutenberg was and is badly organised. This may look like a sin to some, but I really think that a little bit of marketing would help the project a lot. And maybe a little commercial flavour would help even more, much like what RedHat is to Linux. Gutenberg needs a face. It needs a design. It needs to deliver something to people. Frankly, no one is born with the name Oesopus burned in big letters in the brain.
I don't pretend that Gutenberg should become another Amazon. But I believe that by making literature a free tool, by delivering an infrastructure of a very GPL'ed nature, and by building a commercial basis for more complex tasks and material support, Gutenberg may become another lighthouse of the Web.
Sincerely, it would be sad to see Project Gutenberg closing its doors. Yes, we hackers may give some help by making tools and helping Project Gutenberg with some design and technical support. Humanists may help with translation, classification and analysis. We may try to push a marketing campaign for it with our own resources. But this will not save the project if there is no organisation, if there is no mechanism to show people that the world does not end with The Matrix and Coca-Cola, and if we don't take care to feed the project with the material resources that may be needed for its future.
...I've been wanting to do this for a while, but have been short-handed. But I'm sure there's somebody reading this who can help me.
BiblioBytes (http://www.bb.com) gives away books on the net. We sell ads on the pages, and authors share in the ad revenue... which means authors of more recent works also put their works online, such as fiction from Neil Gaiman, Peter David, Nancy Kress, Barry Longyear, Ron Goulart and others, and nonfiction like The Temp Survival Guide by Brian Hassett.
PG grants reproduction rights to anybody who wants to distribute their books commercially, as long as PG gets 20% of the revenue-- but I'm shorthanded and haven't had the resources to do the conversions necessary on books in the public domain when there are living authors and expiring contracts to take care of first.
So I'm appealing to you folks. Anybody who wants to mark up PG texts to our stylesheet (basically HTML with certain conventions) is invited to contact me to work out the details. In compensation, I'm willing to give an additional 15% of the revenue generated to whoever does the markup and formatting.
The books will then be in a web friendly format, and PG will get funding from it. And so will you.
Send me email at comment@bb.com if you're interested.
Best-- Glenn Hauman, BiblioBytes
http://www.bb.com
Project Gutenberg has about 50 active volunteers. The rest of the 1000 that Michael Hart claims to have are merely recipients of his distribution lists. The volunteers who work, work very hard and are to be congratulated for having produced over 2000 books in the last few years. A wonderful team indeed!
Please also see the Bartleby Library site, which has many full-text books. Concentrations include reference (Bartlett's Quotations, Emily Post's Etiquette, Strunk's Elements of Style, Fowler's The King's English), poetry anthologies (over 1800 poems in six classic collections), and Theodore Roosevelt (8 books, including Autobiography).
http://www.bartleby.com
Sincerely,
Steven van Leeuwen
Editor and Publisher
Bartleby Library