Slashdot Mirror


Giving Project Gutenberg Recognition

A reader wrote to us about a project that deserves a lot of goodwill: Project Gutenberg, which is having a hard time getting attention. Click below to read a letter from the head of the Project.

In an email from Michael Hart, the head of Project Gutenberg, he says:
"Getting the Etexts to twice as many people is just as important as creating twice as many Etexts. . .but without MAJOR publicity it is not likely to happen. . .we constantly get messages from readers who tell us they have been LOOKING for Etexts for years and just at that present time FINALLY FOUND US. . . . That means we cannot get to a major part of our audience with the kind of publicity we have, we need something more. . . . For example, we were the first in an entirely new column: "People To Watch" in the November 8th edition of TIME magazine, but we have received less than a dozen emails per that article. . .what we really need to do is get on Oprah Winfrey, and hopefully add something to her book club. Those of you on AOL, perhaps you could email the show and request they invite us. . . ! We should undoubtedly also try the other talk shows, and "magazine" shows, etc. All the press we receive is from them contacting us, I have had no luck "generating" publicity. . .which seems to be easy, for those who have the knack. . .it's just not MY knack. . .help!!!"

So, if there's anything that you can do to help - do it!

171 of 256 comments

  1. saved me mucho dinero. by marks · · Score: 1

    Proj. Gutenberg has saved me much money on books. When I need one for school, or just to read, I grab it and download to my Palm. (And when I get bored, I can always switch right over to Tetris or Hardball). Thanks Project Gutenberg!

    --

    -mark
    If your computer says LINUX, run...computers can't talk! [unless you have text-speech software]
    1. Re:saved me mucho dinero. by sporty · · Score: 2
      Another project that will save you much dinero is the one @ cmu. It's the yahoo of online books.

      http://www.cs.cmu.edu/books.html

      Btw.. you didn't say first post.. what's wrong with you ;>

      ---

      --

      -
      ping -f 255.255.255.255 # if only

    2. Re:saved me mucho dinero. by marks · · Score: 1

      Thanks for the link :)

      > Btw.. you didn't say first post.. what's wrong with you ;>
      a) I'm not a l-user or an AC
      b) with my luck, in the time it would have taken me to type "fisrt post," it would have been second or third :)

      --

      -mark
      If your computer says LINUX, run...computers can't talk! [unless you have text-speech software]
    3. Re:saved me mucho dinero. by pen · · Score: 1
      Here's another: http://www.bibliomania.com/. This site doesn't have any tech books, but it's a very good resource nonetheless.

      --

  2. Oprah Book Club? by DeadMonkey · · Score: 2

    Don't think they'll let you in, they'd probably find some clause about you not being the authors or something... Anyway, can you really afford the bribes? =)

    On the topic, I find the Etexts QUITE useful... suggested bookmark #1.
    --------------------------------------------- -------------------
    Everybody's got something to hide except for me and my monkey...
    www.stampede.org

  3. RELATED: Copyright-Free English-French Dictionary by Anonymous Coward · · Score: 3

    Another worthy Gutenberg-style project which deserves help is the Internet Dictionary Project. They are building the world's first copyright-free English-to-French dictionary by typing in the text from an old out-of-copyright dictionary by Spiers dating from 1853. They're asking for help, which involves typing in the text, using some simple formatting rules, for any of the remaining pages by reading the scanned images of the original book.

    After joining the project, you download a scanned page and type it in according to their instructions. An important issue is whether they should be aiming to put the resulting work into the public domain as they state, or under a licence offering something akin to GPL's protections?

  4. amazon by gargle · · Score: 1

    Try teaming up with amazon or something.

    1. Re:amazon by Jack+William+Bell · · Score: 1

      Team up with Amazon? Why would they promote a free competitor to their own service?

      Jack
      --
      - -
      Are you an SF Fan? Are you a Tru-Fan?
    2. Re:amazon by Ob1w@n · · Score: 1

      Good idea... only problem... Amazon is in the business of _selling_ books to make money. I just don't see them posting a link... "Buy This Book Now *OR* READ IT FOR FREE!!!!!!" :)

    3. Re:amazon by gargle · · Score: 2

      Project Gutenberg carries only copyright expired texts, so it doesn't really compete directly with Amazon's business. And it may be a good thing for Amazon: it's a nice service for Amazon's customers to be able to read books online.

      Even in the case where Amazon is trying to sell the same books covered by Project Gutenberg (e.g. classics), I think most people still prefer to read the book in printed form, so providing links from the book's page to the Project Gutenberg text provides a useful previewing service and may help to generate more interest (and sales) for the book.

    4. Re:amazon by reflector · · Score: 1

      It's not as farfetched as it seems. Amazon sells hard-copies, not eBooks. A lot of people want a hard copy when they're reading a book. The only place it cuts into their revenues is where someone wants to look up something quick in a book and would be willing to pay for it only if they had to. Ever tried reading a book online? It's just not the same thing. Amazon could get some good press out of this. Market economics will determine whether the publicity is worth the lost revenue to them, unless a high-up individual in the organization is feeling particularly philanthropic and sees the value of this project for all people and looks beyond the dollar bill. Not altogether impossible. It's worth a shot, I'd say.

  5. Critical mass by Kafka_Canada · · Score: 2

    I've been using PG for a long time, and it's an EXCELLENT resource. The people running it obviously do need help publicizing, and I think once they get started the project will really take off, seeing as they already have tons of information. As the Subject says, once they hit critical mass in terms of public knowledge they're set. Another hassle they're facing is copyright law. The PG site lists some elementary info on copyright law, and it was changed not too long ago to cover a lot more stuff, so unfortunately we'll be seeing that many fewer etexts in our lifetimes. Ah well, what Project Gutenberg provides is nonetheless fantastic, and hopefully it will soon receive due recognition. Dan

    --
    Fuck it
    1. Re:Critical mass by Dr.+Weasel · · Score: 2

      I don't quite understand why this post was moderated down to 0 for being a "troll." Maybe I'm just weird but I saw nothing about this post that is in any way a troll. It might be somewhat redundant but certainly not a troll.

  6. Hell by jdube · · Score: 1

    I'm gonna spread the word on IRC and all sites I have (as if anyone visits them). I've always wanted to see books online and finding this site was a godsend. The only other site that does (did... I think they stopped a month ago) was MCP, but those books (teach yourself jack shit in 2 seconds) weren't worth reading anyways.


    --
    If you think you know what the hell is really going on you're probably full of shit.
    jdube is who I am.
    1. Re:Hell by Runna^Muck · · Score: 1

      lol. Teach Yourself Jackshit in 2 Seconds. I love that title. Yep I know, I'm going down.

  7. Not to belittle Project Gutenberg... by Jonathan · · Score: 3

    ...but they are hardly the only people producing free e-texts. Yes, I remember that in the pre-Web era their ftp site was about the only place on the net for e-texts, but as the existence of huge archive sites like The Online Books Page shows, PG is just one group among many similar groups these days.

    1. Re:Not to belittle Project Gutenberg... by John+Fulmer · · Score: 3

      Actually you are belittling them. The link you gave credits Project Gutenberg for most of its information and only lists two groups, Project Gutenberg and Celebration of Women Writers, as actual organized groups doing online texts.

      And while it does list a number of online books, most are small individual online collections, and many are formatted HTML.

      Project Gutenberg is:

      a) Organized
      b) Just the text, no formatting
      c) Extensive

      They are the premier group doing online texts. You really have to give them that.

      jf

    2. Re:Not to belittle Project Gutenberg... by Anonymous Coward · · Score: 3

      You are wrong. PG lists a little over 2,000 books. The On-Line Books Page has links to more than 10,000 in English alone. Therefore you are wrong in saying the OLBP "credits Project Gutenberg for most of its information."

      Instead of calling PG "organized" and "extensive," you might say it is random and shallow. The OLBP and http://www.ipl.org list books by Dewey categories and subject, while PG does not. PG is shallow because it refuses to put the necessary bibliographic information on texts--one can't even find out which year or which edition most of them are.

      Because PG uses only ASCII files downloaded by FTP or gopher, it has not yet joined the World Wide Web. It is not possible to make a deep link to a paragraph or page inside a PG book, for example if one wishes to reference a quotation. It is intended for offline reading only, and in that respect it offers nothing over a paperback book. Instead, other online book projects deliberately produce works that take advantage of computer power, for example by using more readable fonts.

      As far as PG being the "premier" group doing online texts, that smells of a little old American prejudice. Since they use only ASCII, they can't include accented characters for other languages. They have started to include a few works in European languages, with strange conventions to represent those characters, but it will be interesting to see how PG adapts to Unicode and the extension of the World Wide Web to non-English-speaking nations.

      A more significant beef about PG is that it is centralized and dominated by one person, who does not share the philosophy of Open Source production that most of us do. Instead of forcing individuals to contribute to this project, why not help them set up their own web sites to publish their own works, or other works they have scanned? The WWW has made this type of centralized project unnecessary and even harmful. That is what we ought to be discussing here, not how to send money to this project to bail it out once again.

      I am not afraid to belittle Project Gutenberg. I sign my name, too!

    3. Re:Not to belittle Project Gutenberg... by Alan+Shutko · · Score: 3
      > I am not afraid to belittle Project Gutenberg. I sign my name, too!

      That's why you're an anonymous coward, right?

      At 2337 books (last time I ran a count), PG is nearly 1/4 the OLBP. It's been doing this for a long time, and somehow, keeps meeting its goals of exponential production. No, it doesn't list books by Dewey or by category like IPL or OLBP. Also unlike IPL and OLBP, it's actively involved in putting works in online format.

      Indexes are nice, but third parties can do (and are doing) indexes. Without people doing scanning and keyboarding, those indexes won't have much to index. PG provides a single point of contact if you want to scan, proofread, or archive etexts. Ever try looking for a book and found the server down? Collections like PG minimize that problem, because a work is no longer in the hands of a single person who may decide they're sick of it taking up their web space.

      As for preferring rich markup to plain text, it's easy for you to _add_ that markup. Usually, at least 70% of the work can be quickly automated. And keeping plain text means it's maximally useful, since you have a simple base to provide whatever markup you want, be that HTML, LaTeX, MS Word, whatever. If they'd been rich markup from the start, do you think that the texts from 1991 would be in HTML? What about when we switch over to some XML-based markup? And what happens after that?

      If it makes you feel better, don't think of PG as finished product. Think of it as raw material, and put together your own site with richly marked up texts and scripts to do web-cites of specific chapters, sections, etc. I think that would be a very valuable thing to have, but it will certainly be easier for having so much of the grunt work done for you.
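A rough idea of how much of that markup step can be automated, sketched in Python. The conventions here are assumptions for illustration only (paragraphs separated by blank lines, `_word_` marking emphasis); individual PG texts may follow different ones, and `text_to_html` is a made-up helper name, not anything PG provides.

```python
import html
import re


def text_to_html(text: str) -> str:
    """Turn plain text into simple HTML paragraphs (assumed conventions only)."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    rendered = []
    for para in paragraphs:
        # Join hard-wrapped lines, then escape &, < and > before adding tags.
        body = html.escape(" ".join(para.split()), quote=False)
        # Assumption: _word_ marks emphasis in the source text.
        body = re.sub(r"_([^_]+)_", r"<em>\1</em>", body)
        rendered.append(f"<p>{body}</p>")
    return "\n".join(rendered)


if __name__ == "__main__":
    print(text_to_html("It was _very_\ncold.\n\nTom & Jerry"))
```

Whatever the script cannot guess (chapter headings, footnotes, tables) is the part that still needs a human pass.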

    4. Re:Not to belittle Project Gutenberg... by chris.bitmead · · Score: 1

      I think it would be better now if PG made appropriate use of basic XML. Formatting and indexing can EASILY be taken out. But putting it back in is WAY HARD. One example is the PG King James Bible. Parsing and figuring out the chapters, verses etc. is a semi-hard computing problem. But stripping out some XML markers for these things is trivial. Or converting it to some future format is trivial. But if the markers aren't there it's a bit hard.

      Plain text is much better than nothing, but it's not the way to go, especially these days.
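To illustrate the asymmetry, here is a small Python sketch. The element and attribute names (`book`, `chapter`, `verse`, `name`, `n`) are invented for this example and are not a format PG uses; the point is only that once the markers exist, both stripping them and reading the structure back out take a few lines.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for this example.
MARKED_UP = (
    '<book name="Genesis"><chapter n="1">'
    '<verse n="1">In the beginning God created the heaven and the earth.</verse>'
    '<verse n="3">And God said, Let there be light: and there was light.</verse>'
    "</chapter></book>"
)


def strip_markup(xml_text: str) -> str:
    """Throw the markers away and keep one verse per line."""
    root = ET.fromstring(xml_text)
    return "\n".join("".join(v.itertext()) for v in root.iter("verse"))


def verse_refs(xml_text: str) -> list:
    """Recover Book chapter:verse references directly from the markers."""
    root = ET.fromstring(xml_text)
    refs = []
    for chapter in root.iter("chapter"):
        for verse in chapter.iter("verse"):
            refs.append(f"{root.get('name')} {chapter.get('n')}:{verse.get('n')}")
    return refs


if __name__ == "__main__":
    print(strip_markup(MARKED_UP))
    print(verse_refs(MARKED_UP))
```

Going the other way, from the stripped text back to the references, is the part that needs guesswork.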

    5. Re:Not to belittle Project Gutenberg... by GnrcMan · · Score: 2

      They are the premier group doing online texts. You really have to give them that.

      And the oldest! There's pretty much as old as the net.
      I have nothing but respect for PG and what they represent. I actually feel bad because I haven't visited the site in a while, but I remember them being the reason I started using the net...remember Gopher?

      --GnrcMan--

    6. Re:Not to belittle Project Gutenberg... by GnrcMan · · Score: 1

      There's pretty much as old as the net

      "There's"? Sorry, that should be "they're". I'll be going out back to beat myself with a clue stick now.

      --GnrcMan--

    7. Re:Not to belittle Project Gutenberg... by Anonymous Coward · · Score: 1

      If it makes you feel better, don't think of PG as finished product. Think of it as raw material, and put together your own site with richly marked up texts and scripts to do web-cites of specific chapters, sections, etc. I think that would be a very valuable thing to have, but it will certainly be easier for having so much of the grunt work done for you.

      Some of us have tried. But what if PG actually loses information in their process? For example, they are not careful to reproduce italics or bold face or accents or vertical spacing. And they don't give the exact edition used, so that one can go back and compare with the original, even for proofreading obvious typos.

      So, unfortunately, in many cases it turns out to be easier to OCR the book again and get a clean copy to work with.

      And when doing that, with HTML, one can even include the illustrations that were left out of the PG edition--even if the text refers to them.

      Plain ASCII text is only maximally useful if it conveys all the information in the original. Since each PG text has its own conventions for markup (unlike HTML) it should not be called plain ASCII text, but some sort of arbitrary structured ASCII. It's not useful at all, in too many cases.

    8. Re:Not to belittle Project Gutenberg... by John+Fulmer · · Score: 2

      > I am not afraid to belittle Project Gutenberg.
      > I sign my name, too!

      A bit odd for an AC post. :)

      And so hostile, rambling and full of what are apparently personal issues that I can't even comment, since it seems so, er, out there. Sorry can't help you.

      PG is a good project. There should be more like it. And I don't understand all the hostility for a project whose goals are to provide and preserve texts in an electronic form.

      jf

    9. Re:Not to belittle Project Gutenberg... by arcade · · Score: 1

      >> They are the premier group doing online texts. You really have to give them that.
      >And the oldest! There's pretty much as old as the net.

      I remember using Mike's BBS (in Norway), and downloading PG files from there. I thought it was a great resource then - and I still think it is.



      --

      --
      "Rune Kristian Viken" - http://www.nwo.no - arca
    10. Re:Not to belittle Project Gutenberg... by Shotgun · · Score: 1

      A more significant beef about PG is that it is centralized and dominated by one person, who does not share the philosophy of Open Source production that most of us do.

      I'll have to concur on this point. Several years ago I wrote a small hack that would convert PG's version of the King James Bible to HTML. It broke each book down into a separate directory and then created a different file for each page. It created a table of contents as index.html and I even added a truly exhaustive concordance before losing interest in the project. I contacted the PG contact person via email to make a gift of my work to the project, but he was reluctant to try the program because it created sub-directories. He felt that sub-directories were too difficult to remove.

      You don't get a throng of people wanting to join your project with this sort of not-invented-here attitude.

      --
      Aah, change is good. -- Rafiki
      Yeah, but it ain't easy. -- Simba
    11. Re:Not to belittle Project Gutenberg... by Vidar+Hokstad · · Score: 1

      Mm, no. They are nowhere near as old as the net. They are older than the web, though.

    12. Re:Not to belittle Project Gutenberg... by GnrcMan · · Score: 2

      Actually, they are very nearly as old as the net. PG was started in 1971. The original 4 computers of ARPANET were hooked up in 1969.

      --GnrcMan--

  8. Useful stuff! by Cef · · Score: 2

    Project Gutenberg has been around for literally years, and is a resource I always check when I can't find that elusive book.

    In fact, we used to use the text from many Gutenberg documents when I was fiddling around with data compression (specifically compression methods aimed at text, and English in particular).

    Also, many of my relatives have asked me "Can you find out about this book on the net, and where I can find it?" and are somewhat surprised when I hand them a disk with the Gutenberg text version of it. First time I did this, they thought it was reviews of the book and details where to find it, instead of the actual text. "Remember how I said that a floppy disk holds about as much text as one small book?". *grin*

    However I think they got one of the most important boosts of advertising they could ever want, an article on good ol' Slashdot. Way to go Hemos! (and CmdrTaco of course) *grin*

    PS: Heya to FunkyBob, the guy who did most of the coding on that compression stuff I was mentioning earlier - and when will it ever work properly damnit! It's only been about 6 years! *grin*

  9. Great Site by Jonathan+the+Nerd · · Score: 1
    Project Gutenberg is a Godsend for those of us who love to read. Last summer I spent nearly all my free time there. They have a great collection of older literature. Soon after I discovered the site, I had read nearly all their Sherlock Holmes collection, as well as many other books/stories I had wanted to read but never gotten around to checking out from the library. The only thing I dislike about the site is that they have virtually no 20th-century literature. (But that's due to restrictive copyright laws, not because of any failing of the site's administrators.)

    --
    Disclaimer: The opinions expressed are not necessarily my own, as I've not yet had my medication today.
  10. Speaking of Slashdot... Ad banners? by Cef · · Score: 2

    How about giving Project Gutenberg a free banner ad here on Slashdot? Now that'd generate a lot of traffic and put them right out in the public view!

    Whaddya say guys?

  11. Re:Congratulations!!! You just received 1M+ Hits.. by peterbasil · · Score: 1

    Project Gutenberg was one of the first endeavors that really got me interested in the internet. Although etext always seemed to be in tough competition with online porn. However if they really want the Slashdot readership, perhaps a few comments that they are running on a Beowulf cluster and use only GPL software would really kick the /. effect into high gear.

  12. I support PG, but... by Anonymous Coward · · Score: 2

    if only for one thing, and this is nitpicky, but I think it might help a little:

    They need to make their online books a little more web-friendly. I understand PG's reasoning behind keeping everything in pure vanilla ASCII, but quite frankly, in that state, the etexts don't look terribly great, nor are they conveniently navigable in most web browsers. Appearance and ease of use are, imo, important factors if you want to attain a large audience.

    Bono Vox, bono@vox.org

    1. Re:I support PG, but... by Skyshadow · · Score: 2
      I agree here. It wouldn't kill anyone to run everything through a simple Perl script that would tidy the texts up a bit and make them a bit easier on the eyes. It's not as if you have to burn the ASCII copy to make an HTML-formatted version available.

      What I'd really like to see, however, are versions that take advantage of hyperlinking. I once saw an HTML copy of Dante's Inferno which was fully linked up with an in-depth annotation which explained references and other aspects of the work which I would have missed unless I'd taken a class about the book. It was incredibly useful; it still stands out in my mind as the most incredible thing I'd ever seen done to a book. It let me understand Dante in a way I couldn't have otherwise.

      HTML and Palm-formatted versions would be great. Again, it's not like they have to ditch the plaintext version to provide others.

      ----

      --
      Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
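The kind of tidy-up pass suggested above could be as small as the sketch below. It is written in Python rather than Perl, and it assumes (this is not taken from PG's files) that paragraphs are separated by blank lines and that hard line breaks inside a paragraph carry no meaning. `reflow` is a hypothetical helper name.

```python
import re
import textwrap


def reflow(text: str, width: int = 60) -> str:
    """Re-wrap each paragraph to a fixed width, leaving the words untouched."""
    tidy = []
    # Assumption: blank lines separate paragraphs.
    for para in re.split(r"\n\s*\n", text.strip()):
        joined = " ".join(para.split())  # collapse odd spacing and hard wraps
        if joined:
            tidy.append(textwrap.fill(joined, width=width))
    return "\n\n".join(tidy)


if __name__ == "__main__":
    print(reflow("A  short\nline.\n\n\nNext   one."))
```

The original file is not modified, so the plain ASCII copy stays available alongside whatever the script produces. Verse, tables, and other text where line breaks matter would need to be skipped.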
    2. Re:I support PG, but... by arcade · · Score: 1

      They need to make their online books a little more web-friendly.

      Download the ascii and read it at home, stupid.


      --

      --
      "Rune Kristian Viken" - http://www.nwo.no - arca
    3. Re:I support PG, but... by spconner · · Score: 1
      While they may not look great as plain ASCII, as long as they're consistent in formatting with ASCII it's relatively easy to break up and reformat for the web.

      Last week I downloaded a version of the King James Bible (bible10.txt---almost a 5M textfile) and because of the consistent formatting, was able to break it up into book, chapter and verse and create an online version that was primarily (for me) an experiment on making a web-friendly version of a hierarchically structured document (Book.chapter:verse etc).

      More, or inconsistent formatting, would have made the job more difficult than it was.

      -spc (Converting the text was the easy part ... )
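For a sense of why consistent formatting makes this tractable, here is a minimal Python sketch of the book/chapter/verse split. The line layout it expects (a `Book NN Name` heading, verses starting with `CCC:VVV`, continuation lines without a reference) is an assumption made for the example and should be checked against the actual file before relying on it.

```python
import re

# Assumed line shapes; verify against the real text before use.
BOOK_RE = re.compile(r"^Book\s+(\d+)\s+(.+)$")
VERSE_RE = re.compile(r"^(\d+):(\d+)\s+(.*)$")


def parse_bible(lines):
    """Return {book: {chapter: {verse: text}}} from consistently formatted lines."""
    books = {}
    book = None
    current = None  # (chapter, verse) of the verse being accumulated
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        m = BOOK_RE.match(line)
        if m:
            book = m.group(2).strip()
            books[book] = {}
            current = None
            continue
        m = VERSE_RE.match(line)
        if m and book is not None:
            chapter, verse = int(m.group(1)), int(m.group(2))
            books[book].setdefault(chapter, {})[verse] = m.group(3)
            current = (chapter, verse)
        elif current is not None:
            # Continuation of a verse that wrapped onto the next line.
            chapter, verse = current
            books[book][chapter][verse] += " " + line
    return books


SAMPLE = [
    "Book 01 Genesis",
    "",
    "001:001 In the beginning God created the heaven and the",
    "        earth.",
    "001:003 And God said, Let there be light: and there was light.",
]

if __name__ == "__main__":
    print(parse_bible(SAMPLE)["Genesis"][1][1])
```

Two regular expressions cover the whole structure only because every heading and verse follows the same pattern; one stray format and the parser needs special cases.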

    4. Re:I support PG, but... by cabalamat · · Score: 1

      They need to make their online books a little more web-friendly.

      I agree. In the web-age, their site needs to be easily accessible with a web browser. This implies the books being in HTML. (They shouldn't be stored internally in HTML, an SGML or XML markup format would be better).

      Hmmm, I've got a website which could hold HTML-ized versions of their e-texts, I think I'll have a word with PG about it.

  13. How to gain support by RaveX · · Score: 4

    Personally, I remember my first run-in with PG back in the days of the BBS... it was a Taoist text in one of the download sections that had been created for PG. I also seem to remember a very lofty goal at the time, something like a billion downloads...?

    At any rate, I think a few areas might provide support...

    *Amazon (someone mentioned this) is a _bad_ idea. Profit motive and releasing free documents don't coincide well.

    *The Palm computing platform is the big plus. To be able to read in such a convenient form is wonderful, and PG offers a large library of material for consumption. However, PG needs to _market_ to them, meaning convenient little formats, getting linked to, etc...

    *Align with the OS movement more, there's plenty of talent that would likely work on such a task, but probably isn't even aware of it. Getting mentioned on /. is a huge start.

    *Make better use of technology... I seem to recall very slow rates of progress, which lowers the level of excitement for those involved (it's sad that this is a factor, but very true)- can't many works simply be OCR'ed?

    *The general public (Oprah Winfrey's audience, etc.) is most likely worthless. It seems as though most of the public rarely reads, let alone transcribes... The only thing they might be good for is cash to support the effort.
    Just my US$0.02

    1. Re:How to gain support by apsmith · · Score: 2

      Good comments.

      On the question:
      > can't many works simply be OCR'ed?

      Project Gutenberg has been using OCR for years, including some custom OCR software developed along the way. However, they care about quality too, and OCR text ALWAYS has errors, especially when you're OCR'ing something that's 75 or 100 years old, as required by the copyright laws. The major effort is usually in proofreading. However, in some cases it's just faster to re-type the text - that's what I did for the things I worked on for them. I also learned how to touch type at 90 words/minute :-) which never ceases to impress my co-workers.

      --

      Energy: time to change the picture.

    2. Re:How to gain support by A+Big+Gnu+Thrush · · Score: 2

      The Palm computing platform is the big plus

      I read PG texts on my Palm IIIe all the time, but I have to take the time to download the text, convert it to Palm format, then install. If PG were to take their top X number of downloads and make them conspicuously available to Palm users, it might go a long way to increasing visibility.

      The problem is, you may be preaching to the converted. Palm users are tech savvy; tech savvy people are already aware of PG.

      Still, it would help me. Start with Shakespeare.

    3. Re:How to gain support by Hard_Code · · Score: 2

      ""The Wizard of Oz" was originally written to promote the coinage of silver in the late 19th century."

      And L. Frank Baum was also a racist (as many were of his time). He actively promoted and supported the idea of extinguishing all Native Americans to "put them out of their misery" so to speak.

      History differs a lot according to who tells it ;)

      --

      It's 10 PM. Do you know if you're un-American?
    4. Re:How to gain support by jzitt · · Score: 1
      RaveX suggested:

      *Align with the OS movement more, there's plenty of talent that would likely work on such a task, but probably isn't even aware of it. Getting mentioned on /. is a huge start.

      So how about setting up an RDF headline server, or whatever they're called, with a list of recent PG (I keep thinking that means Peter Gabriel) releases, and making it one of the /. slashbox sidebar thingie options?
  14. Guttenburg is a money saver for students by xHost · · Score: 1

    My laptop's a really old 486 running dos/win3.1 : P (linux won't boot), but wow .. Project Gutenberg saved me a ton of money on not buying books that I wouldn't use ever again .. plus it's less of a load to carry ..

    Maybe they should start posting posters across campuses and such .. eh ?

    1. Re:Guttenburg is a money saver for students by dozing · · Score: 1

      This is one of the best ideas I've read yet. Any /. students out there should post a note on message boards around campus when it comes time to buy books. Maybe you could even talk to some English/Literature teachers. I know mine always complained about the price of books. If they knew about PG they might encourage the students to check it out, and possibly rearrange their curriculum to include more freely available texts. (Of course we know this one won't happen... College profs NEVER change the curriculum)

      Well I'm not in a spot to do much in that area, but I've been a PG lover for many years, so you can bet I'll be including a link to them from my page.

      If you don't like it... Don't read it...
      Dozing
      www.dozings.com
      The most fun on the web.

      --
      Dozings.com -- Its kinda funny... If you're as crazy as me.
  15. E-books, Gutenberg, public domain and the GPL by apsmith · · Score: 5

    If any of you have played with the E-book readers out there (Rocketbook or Softbook are the main contenders) you'll notice that 90% or so of the books they offer right now seem to be public domain ones, mostly from the Project Gutenberg collection. And that does make sense - PG is all about etexts, the E-book readers are about reading etexts... Anyway, it seems the two parties ought to get together. But unfortunately, the Ebook vendors seem to be more focused on licensing and copyright issues and making money from selling content, rather than just making and selling their hardware. Can't Dell or somebody like that get into this business and show how it ought to be done?

    Anyway, if we could get a bunch of recent books out there in the public domain (or GPL of course) - either under Project Gutenberg or some other auspices - I think that would demonstrate this is a serious option for the future of reading. The technical market might be ideal - how about merging in some of the Linux Howto's and the Linux documentation project with this kind of effort? Instead of making a buck for yourself and Tim O'Reilly, how about publishing with Project Gutenberg next time? Just as with Linux and the World Wide Web, it could be a way to guarantee readership you would never get by selling the stuff.

    By the way, I prepared 2 books for Project Gutenberg many years ago, and did some work on their Encyclopedia project, but I've not been keeping track for the last few years - it's definitely continued to grow and be successful. Despite Michael Hart's quirkiness, it really has come close to fulfilling the original promise (10,000 free etexts by 2000). A hearty congratulations to Michael and all the volunteers!

    --

    Energy: time to change the picture.

    1. Re:E-books, Gutenberg, public domain and the GPL by fiori · · Score: 1

      Actually Dover buys the rights to out-of-print books. The copyrights on all of Dover's books are still in effect.

    2. Re:E-books, Gutenberg, public domain and the GPL by Hello+Kitty · · Score: 1

      I just completed a review of the RocketBook that says basically the same thing. That's a spiff little unit (and incredibly cool once you get used to it), but I like it best as a PG delivery device. Who wants to pay $20 for an e-book that'll cost you $25 in hardcover form?

      Reading a PG text on a RocketBook, however, is sweet -- much better than reading it on a regular screen (and far more portable). Read in bed! Read in the dark! Read on the stoker's seat of a tandem bicycle! Read the M$ findings-of-fact without wasting 207 pages worth of tree!

      And PG has a secret quality weapon going for it too -- to get into the archive, someone has to like a text enough to go through the process of input (which is pretty tedious even if you're scanning). Keeps the riff-raff out, in a wholly unintentional way...

  16. PG and GPL'd books by kjack · · Score: 1

    What PG is doing is a really great service and needs to get publicity. Sadly, copyright restrictions severely limit which recent books can be included. Wouldn't it be nice if the GPL would catch on with books? Why not some "open-source" sci-fi novels?

    1. Re:PG and GPL'd books by Doomsayer · · Score: 1

      Someone is writing open source sci fi novels:
      Mike Combs, mikecombs@aol.com
      who has his sci fi stories at:
      http://members.aol.com/howiecombs/hard_s-f.htm
      of which I really liked the novel 'A Bridge to Space':
      http://members.aol.com/howiecombs/bridge.htm

      Here are links to free online book sites:
      http://www.stanford.edu/~sothy/books.html
      http://samizdat.mines.edu/
      http://www.icemall.com/free/free_books.html
      http://www.ipl.org/
      http://www.itlibrary.com/
      http://www.cs.cmu.edu/books.html
      including Project Gutenberg:
      http://promo.net/pg/list.html

  17. Why PG is imporatant & relevent by Dacta · · Score: 5

    I've been reading PG books since I've been on the net ('94) and I think they have got to be one of the most important resources available.

    People discount PG by saying things like "Oh, you can get free texts anywhere" and "Books are outdated, anyway".

    Well, imagine this happening without PG: Copyright laws are changed so that copyright does not run out after 30 years (or whatever it is) - and this is what the film lobby wants.

    Then, in 10 years or so, a law is made giving ownership of texts that have become public domain back to the descendants of their owners, who then sell them to film companies or amazon.com

    These companies decide that they only want to sell paper-books, and the demand for some titles is so low that you have to get a special publishing run for them.

    Then some books get banned for being sexist/sexy/racist/communist or whatever, and you can no longer get them - period!

    Books - or at least the text of them - are the life blood of civilisation, and PG is something that is making this freely (as in speech) available to all.

    Support it!

    PS: yes, I know the scenario above wasn't real, and I know "the internet changes everything", but in 5 years, when you are reading "Sherlock Holmes" on your Palm XX, you can thank Project Gutenberg for keeping it free.

    --Donate food by clicking: www.thehungersite.com

    1. Re:Why PG is imporatant & relevent by pb · · Score: 2

      Excellent, I must agree.

      I used it back then when I didn't want to buy a copy of Flatland... I'd already read it before, and it's much easier to search an online version.

      It again proved invaluable when I had to look up lots of random British poetry for a class, and makes searching and citing lines so much easier.

      However, what do you expect for popularity when the web site is somewhat organized and the ftp site (where everything is) is worse--last I checked, organized by the year it was retyped or something... But I haven't looked in a while, and I can usually find a link to it on the web. However, it ain't pretty, even from my '93-'94 web publishing standards. ;)

      I wish there was something similar for movies. Time to start collecting movie scripts! (I found an annotated script, basically, of "Shadow of a Doubt", and it helped a lot with a paper I was writing, but I didn't find a central place for scripts. Although I bet the imdb would take them, or link to a site that had them.)
      ---
      pb Reply rather than vaguely moderate me.

      --
      pb Reply or e-mail; don't vaguely moderate.
    2. Re:Why PG is imporatant & relevent by pb · · Score: 1

      :) Yep, that's the one. It's okay, because I'd read the book before, and other people had copies of it too.

      ASCII illustrations? What are you talking about, that's awesome! Put it in HTML with the PRE tag! ;)

      If they put it in standard HTML, they'd have to use table art, or use separate image files. Then they'd have to decide on (copyrighted) GIF's, (unsupported) PNG's, or (wasteful for line art) JPG's as an image format. Then they'd have multiple files for individual books... And then they'd say "Why didn't we use .pdf's?" And we'd say "Isn't that proprietary-speak for ps.gz files?" Aaaaggghhhh!

      So I can see why they used text, even if there are three different conflicting conventions for ending a line... The nice thing about standards is that there are so many to choose from!
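      For what it's worth, the three line-ending conventions (CRLF on DOS, bare CR on old Macs, LF on Unix) are easy enough to paper over. A minimal sketch, assuming the etext is already loaded as a string; the function name is made up for illustration:

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF (DOS) and bare CR (old Mac) line endings to LF (Unix)."""
    # Replace CRLF first so a DOS line ending does not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Hypothetical sample mixing all three conventions in one string.
sample = "first line\r\nsecond line\rthird line\n"
print(normalize_newlines(sample).splitlines())
```

      The order of the two replacements matters: doing the bare CR first would split every CRLF into two line breaks.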
      ---
      pb Reply rather than vaguely moderate me.

      --
      pb Reply or e-mail; don't vaguely moderate.
    3. Re:Why PG is imporatant & relevent by WNight · · Score: 1

      You deserve a few moderation points on that message. It's a great discussion of why the newest greatest format isn't always the best to use even if it has some whiz-bang feature.


      I'd prefer, in these days of cheap bandwidth, a text version of the writing, and black and white scans of the book so that a future person could add the illustrations into a rich version, but the book could still be read by the lowest common denominator.

      If it can't be viewed properly in lynx, we've lost a good portion of our audience.

      The blind have a right to read public domain books too.

  18. this is *not* helping the kids on Biafra by Anonymous Coward · · Score: 2

    I like getting my hands on free 19th Century classics as much as the next guy. However, I find Project Gutenberg of dubious usefulness. And I strongly disagree with the claims of some journalists that this project, if completed, will be a great help for the schools of the Third World.

    I am sure the Internet and its associated technologies can be used to help impoverished kids worldwide, but I don't think they would benefit much from an electronic version of, say, Boswell's Life of Johnson... in English.

    [I know this is slightly off-topick, but I just wanted to prevent someone coming up with the references to rural Kenya that always pop up when discussing Project G.]

    1. Re:this is *not* helping the kids on Biafra by Li'l+Mark · · Score: 1

      Well, English is widely spoken throughout Africa, in the former British Empire, and I think that e.g. Shakespeare is of value anywhere. The site is perhaps too oriented towards Western literature - there's plenty of Islamic/Indian/Chinese literature available out of copyright (as well as folk tales, mythology, etc). Actually, it would be very interesting to have a related project transcribing old textbooks, classics of science and the like. (I did notice that PG has the 1st million digits of 1/pi)

    2. Re:this is *not* helping the kids on Biafra by engel · · Score: 1

      Yeah! Right on Anonymous Coward! All people in Biafra need is food in their belly that we give to them. They don't need something like EDUCATION or THINKING. Screw that. I mean, if they can think for themselves instead of relying on us to think for them, then they could eventually figure out how to FEED themselves. And if they feed themselves and educate themselves, then we won't feel good about ourselves every 20 years when we decide to donate food for them or have a concert for them.

      There is more to being human than eating and excreting. Even Thoreau, who would be the first to say that 3rd world starving countries should stand for themselves, would agree that the way to MAKE them stand for themselves is to give them as much information and support as possible, and then let them make the decisions that will allow them to function properly.

    3. Re:this is *not* helping the kids on Biafra by hey! · · Score: 2

      English is a widely spoken and important international language. It would help any developing country to have people who read and speak English, because it can improve their access to information, capital and trade. It benefits the country to develop a class of literate people who at least understand western culture, even if they don't have to agree with it.

      To learn English, you need books. Maybe the kids don't have a computer, but it's a fair bet that somebody in the country has a computer and access to a printing press. That person could print Gutenberg texts, with local language introductions and footnotes on difficult words or phrases. They don't have to wait for some western publisher to decide there's a market for third world editions of English language classics.

      Does it solve all the problems of the third world? Of course not. But freedom of information benefits everyone, even when they don't have immediate access to it.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  19. *Ahem!* by fireproof · · Score: 2
    It is sad to see this here.

    Really? I'm rather thrilled to see this here. Until now, I've never heard of 'em.

    If Project Gutenberg had been started by Linus or RMS, would it send out hysterical letters every month asking for money to keep the project afloat?

    I fail to see how this qualifies as sending out "hysterical letters every month asking for money to keep the project afloat." Rather, it seems to be a request for publicity. It doesn't seem to be all that hysterical at all.

    Wondering what this has to do with Slashdot and Open Source and GPL.

    I don't see that it has much to do with Open Source and the GPL, but it seems to have a bit to do with Slashdot. "News for Nerds. Stuff that matters." Well, I'm a nerd. I like to read. This is news to me, and stuff like this sure matters to me. End of story here, at least.

    Seems PG is a conservative outfit that resists change in technology,

    Unfortunately, I fail to comprehend how translating texts printed on paper to an easily-reproducible format that can be easily obtained via the internet qualifies as resisting change in technology.

    won't cooperate with other free ebook causes,

    Proof? Links, quotes, or something, please. If this is true, I'd be interested in seeing something to substantiate this, as I'm not likely to take the word of an A.C. alone as gospel truth.

    is intent on producing numbers of poorly proofread texts and admittedly of not top quality,

    Try getting 30 of your closest friends to proofread several hundred thousand pages of material and see how well you fare in getting all the errors. Please cite something to prove that they are "intent on producing numbers of poorly proofread texts."

    and doesn't accept criticism from outsiders.

    Again, proof please!

    BTW, the aforementioned letter from Hart is not on the page linked to, and it contains errors of fact about copyright.

    I don't see any claims that the letter is on the page linked to. If it does contain errors of fact about copyright, please cite some.

    Maybe I'm way off here. I've never heard of this project before today, and thus my knowledge is limited to what I've seen here and my brief perusal of their web site, which more or less only consisted of checking to see what they had by F. Scott Fitzgerald (only This Side of Paradise) and Gabriel Garcia Marquez (nothing). If what you say is true, I'm certain that I, as well as other readers of Slashdot, will benefit from having some primary source material to peruse demonstrating your claims. Right now, all we have to go on are your quite unsubstantiated allegations.

    --

    /* "A fool does not delight in understanding, but only in revealing his own mind." */

    1. Re:*Ahem!* by Christopher+B.+Brown · · Score: 2
      It surely is like unto free software in that it promotes the free availability of literature.

      It may not use the GPL, per se; it may not attempt to challenge copyright in the way that RMS seeks to challenge the notion of proprietary software.

      But it certainly represents an analogue to free software.

      Books are rather less ephemeral than computer software, as the Gutenberg Project surely shows that there is literature a hundred years old that is still worth reading, whilst much of the computer software of ten years ago isn't worth using. (The "computer literature of UNIX and Lisp" representing occasional literate exceptions...)

      There used to be somewhat wild claims about the value of the Gutenberg Project; as it has grown from nothingness into being a fairly significant library with diverse users, the values have become clearer.

      The project has suffered from some texts being of somewhat questionable quality; their transcriptions of some religious works have been useful in bringing in both a touch of religious fervor, to more actively stimulate verification, as well as in perhaps pulling in some of the "scribely" skills mostly associated with religious traditions.

      I think they had some problems with some of the early OCR technology, finding accuracy to be a bit low. Time, independent OCR attempts on differently published editions of books, and learning curve can provide improving results...

      --
      If you're not part of the solution, you're part of the precipitate.
    2. Re: *Ahem!* by fireproof · · Score: 1
      The original Cmdr. Taco posting at the head refers to this letter, which originally was at the link, but PG changed it. Who is at fault, PG or Taco?

      I suppose that Hemos (who posted, not Taco) wouldn't be at fault if PG changed their site. However, it appear(ed/s) that the link simply points to the PG site itself, judging from the context and the construction of the link. If PG had the text of the letter present in their index.html file, and changed it after the story was posted, I missed out seeing it. If that's the case, I'm clearly in the wrong in my previous post.

      Thus they should be aware that current U.S. copyright law (since the Sonny Bono Copyright Term Extension Act of 1998) extended copyright from 50 to 70 years after the author's death, or extended it from 75 years after publication to 95 years, whichever was applicable. Evidently the content of the letter was copied from some 1992 writing, and needs correction.

      Very well. This is the sort of information I was looking for. Given their mission, they should be on top of this sort of thing, even if they don't handle materials published after 1922.

      And, in retrospect, given Marquez's birthdate and the like, I should have readily realized that his stuff wouldn't have been there. Quite a lapse in my thinking there . . .

      I'm glad you've now heard of Project Gutenberg. Perhaps it will make you think of the differences between it and open source. You have heard of that, haven't you?

      Open what?

      --

      /* "A fool does not delight in understanding, but only in revealing his own mind." */

  20. Some tactics... by B10MA55 · · Score: 4

    1) Use your domain name: Gutenberg.org
    2) Get the crawlers to go through the texts _at your site_. You can wrap in pre tags...
    3) You're ripe for a grant for outreach. If you don't have the "official" framework, contact some CS or English depts and see about some joint work here.
    4) oss4lib is a new group that could be seen as having a relation to you...
    5) Perhaps some outreach letters to English depts at various levels, from grade school up.
    6) Bells and Whistles: how about some history on Gutenberg, past and present. Entertainment.
    7) Giving talks at various places helps. You might meet some connected people on the way...
    8) In general, of the e-libraries, what tactics are the successful ones using? Seems a good learning place.

    I do like the tasteful layout and quickness of your cover page. I have always been impressed with Gutenberg! Good luck
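    On the pre-tag idea in point 2, here is a rough sketch of what wrapping a plain-ASCII etext for the crawlers might look like. The function name and the page skeleton are my own assumptions, not anything PG actually uses; the only real work is escaping the characters HTML treats specially:

```python
import html


def wrap_in_pre(title: str, text: str) -> str:
    """Wrap a plain-text etext in a bare HTML page, keeping its layout via <pre>."""
    # Escape &, < and > so the text cannot be mistaken for markup.
    escaped_body = html.escape(text, quote=False)
    escaped_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{escaped_title}</title></head>\n"
        f"<body><pre>{escaped_body}</pre></body></html>\n"
    )


# Hypothetical two-line etext, just to show the escaping.
print(wrap_in_pre("Sample Etext", "x < y & y > z\n   indented line stays put\n"))
```

    The plain-text file stays the master copy; the HTML page is just a generated wrapper that search engines can index.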

    1. Re:Some tactics... by bockris · · Score: 1

      Your first suggestion is the one that would be a lot of help. If I am at a 'foreign' computer without my bookmarks I have to use Yahoo to find PG. This is not so for every other site I frequent, which have URLs that are closely aligned with their names. Likewise when I tell someone about the site, I can't give them a URL but end up telling them to search Yahoo for it. They would be more likely to visit, if I could write it down for them.
      Change the URL immediately!!!

  21. I'd think there's a need for some that don't save by Christopher+B.+Brown · · Score: 3
    The way that Project Gutenberg gets material is by people contributing the often rather substantial effort of typing in material, as well as working to verify its correctness.

    That really is a quite considerable cost, in much the same way that the production of "free" software requires substantial effort.

    It is somewhat unfortunate that there have been such peculiar positions as:

    A friendly dissuasion from this yielded the first posting of a document in electronic text, and Project Gutenberg was born as Michael stated that he had "earned" the $100,000,000 because a copy of the Declaration of Independence would eventually be an electronic fixture in the computer libraries of 100,000,000 of the computer users of the future.
    It did not add to the project's credibility when they on the one hand indicated that their funding was maxxing out at around $30K per year, whilst claiming that they were producing "billions" of dollars in value. (Note that the PostgreSQL HOWTO suffers from the same sort of thing...)

    A claim of $30K on the one hand, and $Billions on the other, do not reconcile very well.

    Not unlike the situation with the FSF, they could probably more readily use contributions of time rather than of money, although some of both doubtless prove valuable to some degree...

    --
    If you're not part of the solution, you're part of the precipitate.
  22. another link ... by Anonymous Coward · · Score: 1

    I really liked that link.

    Here is another: classics.mit.edu

  23. Vote for Gutenberg on Distributed.net by Cuprous · · Score: 2

    Anyone who is doing the distributed.net project, you can vote for the charity money that will be won to go to Project Gutenberg.

  24. A great project! by dclatfel · · Score: 1

    I've followed them and downloaded their etexts for a number of years now, and I must say that Gutenberg is one of the finest, most selfless projects on the internet.

    My favorite thing to do with Gutenberg etexts is to load them up in TextEdit Plus on my Powerbook and "Speak document" while I work. It's very cool ... and the voices in OS 9 are much better than they have been in the past.

    Three cheers to project Gutenberg, and anyone out there who hasn't already checked them out should do so ASAP!!

    --
    Share data. Share code. Share ideas. Share the wealth.
    http://stockfilter.org
  25. Linux and Copyright-Free ENGLISH DICTIONARY ? by Anonymous Coward · · Score: 2

    Is there a copyright-free English dictionary that could be distributed with open-source software like word processors for Linux? This seems to be an important feature that current Linux distributions are missing. I have in mind something better than /usr/dict/words.

    Creating a good dictionary from scratch is hard work, but if you can get the structure and the word list e.g. from a copyright-free source then the hardest part is done. Therefore, a good starting point would be to take the structure of the copyright-free Spiers English-French Internet Dictionary, i.e. cut out the French translations to leave the English core. Is anyone else interested in this?

    1. Re:Linux and Copyright-Free ENGLISH DICTIONARY ? by superfly · · Score: 1

      Take a look at WordNet. You can use their online version or download it.

      It has also been formatted for the DICT protocol. I wrote a web interface that accesses WordNet and a number of other dictionaries. (dict.org has one too, but I like mine more... and also I noticed theirs after I was finished.)

    2. Re:Linux and Copyright-Free ENGLISH DICTIONARY ? by sumner · · Score: 1
      Is there a copyright-free English dictionary that could be distributed with open-source software like word processors for Linux?

      Yes. Project Gutenberg has an old Webster's dictionary.

      Sumner

      --
      -- rage, rage against the dying of the light
    3. Re:Linux and Copyright-Free ENGLISH DICTIONARY ? by fiji · · Score: 1
      Gutenberg has the complete Websters unabridged dictionary (1913 edition).

      Go to the Gutenberg Search page and search for title dictionary (but turn off match whole words).

  26. GB text is a little hard on the eyes... by wilkinsm · · Score: 3

    Every submitter formats the text differently, and the inline ("bottom of page") footnotes are a real annoyance.

    However, I would like to say that via GB, I've read every Charles Dickens and Sir Arthur Conan Doyle novel they have e-published, to much satisfaction. I started on other authors, but then a friend introduced me to the Dune and Hyperion series. :)

    I think it's safe to say now that webifying the text would be a wonderful idea. If you were to index them in the web search engines, you would then definitely get more hits. I'd love to be able to type in a search engine "to be, or not to be" and get sent to the correct page in the GB e-text.

    Once you do that, launch a ad banner campaign with suggestive quotes. ie. "The staircase was darken with gloom...(click here to read more...)"

    BTW: I read "Sun Tsu" as well. Way cool...

    1. Re:GB text is a little hard on the eyes... by jfunk · · Score: 2

      Every submitter formats the text differently, and the inline ("bottom of page") footnotes are a real annoyance.

      Definitely. I'd like to see a more "interactive" way of doing footnotes. Check out how they're done in LyX. That's really nice. Maybe LyX versions of the books should be done...


      BTW: I read "Sun Tsu" as well. Way cool...

      A suggestion: Read Machiavelli. There are two books, often put together, called "The Prince" and "The Discourses." I started reading the PG version and ended up buying the dead-tree version because I wanted to read it on the bus and didn't want to waste my laptop's battery life.

      Basically, If you enjoyed playing Civilization, you'll find a fair bit of familiar stuff in there. Now I have CTP and my playing style has changed dramatically.

      Now, if I could only fit the texts on my TI-86....

  27. PG plus latex equals much goodness by mattorb · · Score: 1

    A friend of mine introduced me to this during undergrad, and I was instantly a fan. Sure, the texts aren't that pretty to look at as is, but dump them into an Emacs buffer, add a few LaTeX markup tags, and suddenly you've got a decent-looking copy of whatever. (This is especially nice with texts which are relatively short -- I remember in particular having a tex-ified version of the Communist Manifesto. :-))
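    If you'd rather not add the markup by hand, something like the following gets you most of the way. It is only a sketch under my own assumptions: paragraphs are separated by blank lines, hard-wrapped lines get rejoined (which would wreck poetry), and the function names are invented for this example.

```python
# Characters LaTeX treats specially, mapped to their escaped forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape LaTeX special characters one at a time (no double escaping)."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def to_latex(title: str, author: str, body: str) -> str:
    """Turn a plain-text etext into a minimal LaTeX article."""
    # Blank lines separate paragraphs; rejoin hard-wrapped lines within each.
    paragraphs = [" ".join(p.split()) for p in body.split("\n\n") if p.strip()]
    lines = [
        r"\documentclass{article}",
        r"\title{" + escape_latex(title) + "}",
        r"\author{" + escape_latex(author) + "}",
        r"\begin{document}",
        r"\maketitle",
        "",
    ]
    for paragraph in paragraphs:
        lines.append(escape_latex(paragraph))
        lines.append("")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


print(to_latex("Sample Etext", "Anonymous", "First paragraph,\nhard-wrapped.\n\nSecond, 100% plain."))
```

    Chapter headings, italics and footnotes would still need hand markup, since the plain-text files don't mark them consistently.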

    1. Re:PG plus latex equals much goodness by anonymous+cowerd · · Score: 1

      > I remember in particular having a tex-ified
      > version of the Communist Manifesto. :-))

      If you liked Manifesto then I know you'll just love my forthcoming transcription of Capital, in three volumes, just shy of three thousand pages. Coming soon! OK, maybe not soon, as I'm only good for about a hundred pages a weekend, probably the middle of next year.

      Yours WDK - WKiernan@concentric.net

      Criticism has torn up the imaginary flowers from the chain
      not so that man shall wear the unadorned, bleak chain
      but so that he will shake off the chain
      and pluck the living flower. - Karl Marx

  28. Hey! How about associating with... by wilkinsm · · Score: 1

    ...a text to speech company. It would put a new spin on "AudioBooks."

    BTW: Gutenberg texts suffer from a lot of typos - about half way through the work, the quality really started to suffer badly...

  29. Free Dictionary/Encyclopedia (ie, like noah) by Anonymous Coward · · Score: 1
    I was thinking recently of this very same thing. On the Solaris system I used back in college, there was a utility licensed to the engineering dept called 'noah' which was sort of an online dictionary/encyclopedia. It would give just enough definitions to be useful for a word, as well as little related facts, and pronunciations, etc.

    I was wondering about how feasible it would be to start a GPL version of this. Ie, start with a bunch of words that would be commonly looked up, and we'd come up with definitions for them paraphrased from a variety of sources, so it wouldn't be plagiarizing.

    If the resulting text wasn't stored in plain-text (too large) but compressed, there could also be specialized tools to grep for keywords anywhere in the definitions, etc. Is this a decent enough idea? Having noah was really REALLY handy, just at a prompt type, "noah asperity" for a good definition of 'asperity'. Really useful
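    To make the compressed-storage idea concrete, here is a rough sketch. Everything in it is an assumption made for illustration: the file layout (one tab-separated "word, definition" pair per line, gzip-compressed), the function names, and the two sample entries. It has nothing to do with how noah itself stored its data.

```python
import gzip
import os
import tempfile


def write_dictionary(path, entries):
    """Store entries as gzip-compressed lines of 'word<TAB>definition'."""
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for word in sorted(entries):
            out.write(f"{word}\t{entries[word]}\n")


def lookup(path, word):
    """Return the definition of a word, or None if it is not listed."""
    with gzip.open(path, "rt", encoding="utf-8") as src:
        for line in src:
            headword, _, definition = line.rstrip("\n").partition("\t")
            if headword == word.lower():
                return definition
    return None


def grep_definitions(path, keyword):
    """Return every headword whose definition mentions the keyword."""
    keyword = keyword.lower()
    hits = []
    with gzip.open(path, "rt", encoding="utf-8") as src:
        for line in src:
            headword, _, definition = line.rstrip("\n").partition("\t")
            if keyword in definition.lower():
                hits.append(headword)
    return hits


# Hypothetical sample entries, written to a temporary file for the demo.
ENTRIES = {
    "asperity": "roughness of surface; harshness of tone or manner",
    "etext": "a text stored in electronic form",
}
demo_path = os.path.join(tempfile.mkdtemp(), "dict.tsv.gz")
write_dictionary(demo_path, ENTRIES)
print(lookup(demo_path, "asperity"))
print(grep_definitions(demo_path, "electronic"))
```

    A linear scan through the decompressed stream is slow for a full dictionary; a real tool would probably want an index, but this shows that compression and keyword search are not at odds.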

    1. Re:Free Dictionary/Encyclopedia (ie, like noah) by WorkJabez · · Score: 1

      To get this sort of thing working, you'd need to write to a number of dictionary publishers, and ask them nicely for a copy of a previous edition's typeset tape. It'll be in a horrible format, all
      control codes, but it will be possible to reconstitute it to a decent structured form.

      I'm suspicious of the claims made of the 1911 Webster that's going around. Dictionaries have come a long way since then; the style is more discursive, less abbreviated, and there are many key words that just weren't around then.

      I've done a lot of dictionary text conversion; it's not easy, but can be done. With many eyes, the process should be simple.

  30. Resisting change in technology. by Radagast · · Score: 1

    >>Seems PG is a conservative outfit that resists
    >>change in technology,

    >Unfortunately, I fail to comprehend how
    >translating texts printed on paper to
    >an easily-reproducable format that can be
    >easily obtained via the internet qualifies as
    >resisting change in technology.

    It doesn't. However, PG in general, and Hart in particular (as if you can really separate the two) are stuck in a reasonably old-fashioned mindset when it comes to textual information. Because the project started way back in the Seventies (I believe), the choice to use only plain ASCII might have made sense then. It certainly doesn't do so now.

    PG would benefit greatly from a structured information format, preferably one that could be transformed down to plain ASCII when needed (most formats that would be appropriate already do this). Using something like SGML or XML would give them the benefit of structure in the information, like footnotes, italicized sections, page breaks, etc., in a machine-readable format. Also, they would have the option of using Unicode, which would benefit them greatly, since 7-bit really doesn't cut it for anything but English text.

    I, and I'm sure many others, would be happy to provide an XML system for them free of charge, but as I've understood from interviews, Hart has his mind set on continuing to use ASCII, because he feels it makes it available to everyone. Personally, I think it reduces everyone to the lowest common denominator, and could be solved in a better way. My two centavos.

    --
    --Joakim Ziegler
  31. The uses of etexts by Mr.+Protocol · · Score: 3

    Some folks want Gutenberg to move past ASCII and become more web-friendly, more non-English-language friendly, more Y2K-friendly, whatever. I happen to believe they're on the right track. They are trying to provide a baseline of texts which can be adapted to specific purposes.

    That's how I use 'em. I've downloaded a few such texts and made them into Newton books, which I put on my Web site. (I'm a retro-geek. I prefer Newton to Palm.) I couldn't do that with an HTML page, or at least, not as easily.

    The one thing I found in doing this myself is that some Gutenberg texts, at least, aren't error-free, even if they have been proofread. I've proofed two such books so far and I've had to correct around a dozen errors in each. Now, the books I'm converting are by a British writer named Ernest Bramah who's completely obscure today. I happen to have original editions in hardback, but with a writer as obscure as Bramah, there are damn few of us out here with original editions to check. I could wish the Gutenberg proofing process were a little more thorough. There isn't even a central place to report such errors to: the Gutenberg help line just told me to forward the corrections to the original text provider, which I did.

    On the other hand it does make me feel like I'm actually giving something back.

    1. Re:The uses of etexts by Vidar+Hokstad · · Score: 1
      HTML would be bad. But XML would be good.

      XML can easily be converted to a long range of formats, including pure text. XML also supports Unicode well.

      If they started requesting to get the texts in XML it would be trivial to write a few scripts to transform it to pure ASCII text, to HTML, and to other formats, so that they could offer it for download in those formats in addition to the XML source.

      What's important here is that they can still provide pure ASCII texts, while preserving more information in a format that is easy for someone else to automatically process into whatever format they like.
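      To give an idea of how small such scripts could be, here is a sketch. The tag names (book, title, p, i, note) are a made-up schema for this example, not any existing standard and not something PG uses; the point is only that one XML source can feed both a plain-text and an HTML rendering.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical markup: <i> for italics, <note> for an inline footnote.
SAMPLE = """<book>
  <title>Sample Text</title>
  <p>It was a <i>very</i> dark night.<note>Footnote text.</note> The rain fell.</p>
</book>"""


def to_text(elem):
    """Render an element as plain text: _italics_ and [Note: ...] footnotes."""
    parts = [elem.text or ""]
    for child in elem:
        inner = to_text(child)
        if child.tag == "i":
            parts.append(f"_{inner}_")
        elif child.tag == "note":
            parts.append(f" [Note: {inner}]")
        else:
            parts.append(inner)
        # Text that follows the child, up to the next tag.
        parts.append(child.tail or "")
    return "".join(parts)


def to_html(elem):
    """Render the same element as an HTML fragment."""
    parts = [html.escape(elem.text or "")]
    for child in elem:
        inner = to_html(child)
        if child.tag == "i":
            parts.append(f"<i>{inner}</i>")
        elif child.tag == "note":
            parts.append(f'<small class="note">{inner}</small>')
        else:
            parts.append(inner)
        parts.append(html.escape(child.tail or ""))
    return "".join(parts)


book = ET.fromstring(SAMPLE)
for paragraph in book.iter("p"):
    print(to_text(paragraph))
    print("<p>" + to_html(paragraph) + "</p>")
```

      The plain-text output drops nothing a reader needs, while the source keeps the footnote and italics as machine-readable structure.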

  32. Go public by DragonHawk · · Score: 2

    Simply make an IPO on NASDAQ. If possible, associate yourself with Linux as well. Maybe get an endorsement from Bob Young of Red Hat. Wall Street will be beating down your door to give you money, without knowing why or what you do. Use the money to buy banner ads. :-)

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  33. Re:RELATED: Copyright-Free English-French Dictiona by Eccles · · Score: 2

    Hmm. I looked at that site, and it *looks* like they expect authors to use Word to enter documents. It talks about putting words in italics, which doesn't make much sense for pure ASCII editing. It's not particularly clear, though; will they accept a tagged HTML document?

    Also, quick dummy's question: what is the situation with HTML and Unicode? I've always assumed the HTML docs were ASCII, but presumably our international friends have some nicer way to work with HTML and different alphabets.

    --
    Ooh, a sarcasm detector. Oh, that's a real useful invention.
  34. GPL is not a magic bullet by DragoonAK · · Score: 1

    The GPL is really a license for open-source program/documentation development, and nothing more. You don't want 100 different revisions of a fiction text out there, each slightly "improved" by a different author. Can you imagine reading a book, perhaps a chapter at a time, and having it constantly change on you? The plot inconsistencies would make The Phantom Menace look like Shakespeare. I love the GPL, but please, let's be serious about where to use it.

  35. Needs to be Easier by Skyshadow · · Score: 2
    The Project needs to focus on being easier to use. A sort of "Avant Go"-ish interface where I could select a text online and have it sync to my Pilot without my having to think about anything would be a good start. I mean, I know I can put these texts onto my Palm, but I want it to be really easy.

    If getting more users is really as important to them as getting more texts online (and there really isn't an awe-inspiring amount there yet, so far as I can tell), then they need to be able to pass the mom test (you know, could my mom use it?). I mean, I really *like* having a book on my Pilot at all times -- it saves me in situations where I'm unexpectedly bored. I'd bet I'm not the only one. PG needs to cater to this.

    ----

    --
    Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
  36. PG domain name by GnrcMan · · Score: 2

    I just tried to register "gutenberg.org" so that I could give them a domain name. gutenberg.com, net and org are all taken. :( Any suggestions for a good domain name I can point at them?

    --GnrcMan--

    1. Re:PG domain name by goon · · Score: 1

      projectgutenburg.org

      --
      peterrenshaw ~ Another Scrappy Startup
    2. Re:PG domain name by GnrcMan · · Score: 1

      I noticed that that one was free, but that seems a bit long. I really think they deserve a very good domain name (as a side note, etext.* is taken as well!)

      --GnrcMan--

    3. Re:PG domain name by db48x · · Score: 1

      gutenberg.org and gutenberg.net are already registered to Project Gutenberg. They were registered three years ago in 1996. Perhaps someone could tell Hart how to hook www.gutenberg.org up to his website?

      Daniel

      PS: you can use Network Solutions Whois Service to find this information.

  37. Re:Oldest Open-Source project alive by Malacai[GDI] · · Score: 1

    so!!?!?!? What was the freakin answer?

    heheh... I'm no Perl expert and don't care to figure it out myself. Hook a lazy brother up!

  38. Imagine!! by GeorgeMcBay · · Score: 1

    Beowulf cluster. Awww yeah!!!

    1. Re:Imagine!! by arcade · · Score: 1

      Beowulf cluster. Awww yeah!!!

      Argh, I hate it when you do that. I clicked the link, and found this.. fantastic.. writing. Now I've just reformatted it, and am printing it. *urg*. Damn you! Finding me all this time-consuming literature.


      --
      "Rune Kristian Viken" - http://www.nwo.no - arca
  39. Why I've never used it... by HamNRye · · Score: 5

    Ok, that's a falsehood, I have used it, once. About 2 years ago I downloaded Notes From The Underground. It lingered on my hard drive with some Mark Twain that I had also downloaded at the time. I don't believe that I ever read them, because it's too darn uncomfortable to read a full novel on a computer.

    Eventually I picked up Notes from the Underground as a Dover Thrift Edition. It cost me all of $1.00. I couldn't print it myself for that much. Also I picked up Faust, The Theory Of The Leisure Class, The Devil's Dictionary, The Queen of Spades, Oedipus Rex... and the list goes on. These were brand new. None of them were more than $2.00. And that was suggested retail. Used books fall into much the same category, as they are usually $2.00 for a paperback.

    In this era we publish more books than ever before but fewer authors than 30 years ago. Why not use E-texts to promote some authors who cannot get published by the big boys like Bantam, Del, Tor, etc... Why not have a more user friendly site? Why not invite reviews? Recommendations? Etc...

    Why not make it so that PG is accessible to the masses. Let people have their stake in PG, make them a part of something. That is what draws people to participate in these projects. Slashdot is not the best news site out there for news, but it is the best community out there for news.

    When I first found PG it seemed like one of those great ideas. I bookmarked it. I stopped back; nothing had changed. A year later I stopped back and still didn't see anything that really caught my eye.

    In short, I appreciate what PG is trying to accomplish, but I cannot find where it has any real relevancy to me. Not when the information is available for so little on a user-friendly, portable medium that never needs winding or batteries. To truly draw attention and keep it, you need to fight our pitifully short attention spans, and our desperate need for convenience. Why not encourage people to write for PG, not copy? Why not encourage the stockpiling of information, not fiction? What about an app that facilitates the finding and reading of e-texts, something more than "more"...

    PG has been around long enough to have garnered the recognition it deserves. If it is concerned that it is not busy enough, then it should be wondering why. It has always seemed to me that PG tries to lure its readership with the mantra that "This is for the greater good..." Help us... Instead of playing on our consciences, fulfill a need. As of this writing there are ~50 responses from people who have all heard of PG. Some use it, some don't. But they all know about it.

    PG, give me the slightest reason to come and keep coming, and I will. Until then, I can get Vonnegut for $0.25 at the library and PK Dick for $2.00 at Novel Futures. And god knows that our independent booksellers are struggling too. (Tangent: Don't buy from book behemoths, as smaller booksellers die out our culture moves further into the realm of vanilla pop garbage!)

    ~Jason Maggard
    "Give me convenience or give me death." ~Jello

    1. Re:Why I've never used it... by TCaptain · · Score: 1

      I heartily agree with HamNRye. While PG is a laudable venture, it will never really take off unless it can appeal to a greater slice of people. For example, I am a voracious reader and the thought of free Etexts makes me drool but for two points: 1) I don't want to pay the currently high prices for a device to read Etexts at my convenience, nor do I like to sit at my PC to read an Etext, free as it may be. 2) Most of the texts at PG that interest me are available for dirt cheap at many bookstores (used or otherwise; for example, I just picked up a new hard cover containing all of Doyle's Sherlock Holmes writings for 15 bucks Canadian), and even downloading them free and printing them out is more expensive (or close to it when you consider a nicer hard cover book). Once market pressures, society or technical innovation eliminate the above two problems, I can see PG really taking off. Just my two cents.

      --
      "I'm not a procrastinator, I'm temporally challenged"
  40. PG & Movies by Dacta · · Score: 2

    I remember reading a Wired interview with the PG founder back in '96 or '97.

    They were talking about how movies were beginning to come out of their copyright period, and how he wanted to make a public domain MPG of "Gone with the Wind" before he died.

    I'm not quite sure what the copyright status of early (say, pre-WW2) movies is, now. Anyone?

    --Donate food by clicking: www.thehungersite.com

    1. Re:PG & Movies by treat · · Score: 1
      Therefore it is illegal for anyone to copy and preserve these films.

      If someone did make 'illegal' copies of such films, who would sue/press charges?

  41. Answer to Q1: MS-Word is optional by Anonymous Coward · · Score: 1

    Although the Spiers French-English Dictionary website often mentions MS-Word, they do state on the how to join the project page, that

    Typing can be done in Microsoft Word, WordPad, Word Perfect, or other common processing programs, and should be entered exactly as it looks on the page.
  42. The government should have a law for new books! by Bacteriophage · · Score: 2
    I'm sure most authors nowadays use their computers to type up books anyway, but instead of having the list of not-online books continue to grow, all publishers should have to submit the "Plain Vanilla Text" of each book they print to a government database. Then it can be decided if the author will allow the book to contribute to the Gutenberg(spelling?) Project after a year or so, after the book has taken in most of its sales. I believe this is a great idea, and should be put into effect immediately. Don't allow the burden of these great volunteers to grow exponentially. :(

    "There are no shortcuts to any place worth going."

    --
    "Be regular and orderly in your life, so that you may be violent and original in your work." -Flaubert
  43. Hart is a weird character. by regs · · Score: 2

    Back in '97, Wired did a feature on PG. The original Gutenberg ftp site was hosted on a UIUC machine. I have some friends who were there at the time, and have regaled me with stories of what a pain in the ass the guy was. The FTP site that is alluded to in this article by one Mark Zinzow was on a machine, mrcnext (which no longer exists but still has a DNS entry), adminned by a friend of mine at one point. Anyway, the point is, this article has a lot of interesting things to say about the Project and especially Michael Hart. Check it out.

    --
    "In Cyberspace, no one can hear you be sarcastic"
    1. Re:Hart is a weird character. by hey! · · Score: 2

      After reading this link, I'm definitely going to write Mr. Hart a generous check. Not just because the project is worthy (which it is), but because eccentricity on this grand scale deserves support in its own right.

      If I had a dime for every time I compromised on something I believed in because everyone around me seemed to be sure that doing it a different way would be better, I'd send it to him.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  44. Agreed! by Parity · · Score: 2

    I think, actually, that project gutenberg ought to store their files in a simple semi-formatted way.
    Like, lines beginning with \ are escapes with codes for 'title' 'author' 'chapter' 'paragraph' and 'footnote' Like,
    \title The Slashdot Effect
    \author Rob Malda
    \chapter Chapter One: What is The Web
    etc.
    (apologies if there's a real book by that title)
    Most of it'd just be plain text. With -just- enough formatting that a perl-script (or future-language script) can transform it into the pretty-format of the day with a bit of analysis,
    but not so much as to make it unreadable.

    Just a thought.
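
    A minimal sketch of how such a script could look, written in Python rather than Perl. The escape names (`title`, `author`, `chapter`) come from the example above; the mapping to HTML tags and the function name `etext_to_html` are assumptions made up for illustration, not anything Project Gutenberg uses.

```python
import html

# Hypothetical mapping from the backslash codes in the example to HTML tags.
TAGS = {"title": "h1", "author": "h2", "chapter": "h3"}


def etext_to_html(text):
    """Turn lightly escaped plain text into simple HTML.

    Lines starting with a known backslash code become headings;
    runs of ordinary lines separated by blank lines become paragraphs.
    """
    out = []
    paragraph = []

    def flush():
        # Emit the paragraph collected so far, if any.
        if paragraph:
            out.append("<p>" + html.escape(" ".join(paragraph)) + "</p>")
            paragraph.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("\\"):
            code, _, rest = stripped[1:].partition(" ")
            tag = TAGS.get(code)
            if tag is not None:
                flush()
                out.append(f"<{tag}>{html.escape(rest)}</{tag}>")
            else:
                # Unknown code: keep the line as ordinary text.
                paragraph.append(stripped)
        elif stripped:
            paragraph.append(stripped)
        else:
            flush()
    flush()
    return "\n".join(out)


SAMPLE = (
    "\\title The Slashdot Effect\n"
    "\\author Rob Malda\n"
    "\\chapter Chapter One: What is The Web\n"
    "First line\n"
    "second line.\n"
    "\n"
    "Next paragraph.\n"
)

if __name__ == "__main__":
    print(etext_to_html(SAMPLE))
```

    The same loop could emit TeX or any later format instead of HTML by swapping the `TAGS` table and the paragraph wrapper, which is the point of keeping the source format this thin.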

    --Parity

    --
    --Parity
    'Card carrying' member of the EFF.
    1. Re:Agreed! by samantha · · Score: 1

      They could even make formatting structure out-of-line with the main etext if they want to preserve that format. The formatting would have relative position pointers into the text (assuming it changes seldom if ever). Making the text more usable in different contexts and preferences would go a long way to gathering more interest.
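
      A rough sketch of the out-of-line idea, assuming the pointers are plain character offsets stored as `(start, end, tag)` triples. The triple layout and the name `apply_standoff` are hypothetical; the etext itself stays untouched and the formatting lives in a separate list.

```python
def apply_standoff(text, annotations):
    """Wrap spans of `text` in tags described by (start, end, tag) triples.

    The annotations must not overlap. The plain text is never modified;
    a marked-up copy is built and returned.
    """
    out = []
    pos = 0
    for start, end, tag in sorted(annotations):
        out.append(text[pos:start])                      # untouched text before the span
        out.append(f"<{tag}>{text[start:end]}</{tag}>")  # the annotated span
        pos = end
    out.append(text[pos:])                               # whatever follows the last span
    return "".join(out)


ETEXT = "The Slashdot Effect\nby Rob Malda\n"

# Offsets are computed here for readability; a real sidecar file would store numbers.
title_start = ETEXT.index("The Slashdot Effect")
author_start = ETEXT.index("Rob Malda")
ANNOTATIONS = [
    (title_start, title_start + len("The Slashdot Effect"), "h1"),
    (author_start, author_start + len("Rob Malda"), "em"),
]

if __name__ == "__main__":
    print(apply_standoff(ETEXT, ANNOTATIONS))
```

      The caveat in the comment above applies directly: if the etext is edited, every offset after the edit has to be recomputed.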

    2. Re:Agreed! by kugano · · Score: 1

      Sure! No need even for custom scripting. Just use TeX/LaTeX (as your \ commands resemble) with tex2html or one of the other open source TeX to HTML converters.

      -kugano

      --
      kugano
  45. PG is a great thing by Anonymous Coward · · Score: 1

    Project Gutenberg is a great thing. I have no affiliation with it other than as a reader. I think it's great that slashdot took the time to post this story and I hope that PG is successful in getting more publicity.

    I'd like to address some of the rather negative posts that people have made regarding PG because frankly they're short-sighted:

    PG's etexts should be made more web-friendly:

    Most of the people who make this comment also suggest a lot of nonsense about adding HTML formatting to texts. This would be a huge mistake. The beauty of having the texts in as simple a format as possible is that it is always possible to add the formatting later. I'm sure that the creative readers at slashdot could come up with about fifty ways that the formatting could be added later, dynamically if need be, as appropriate to the display device. It is also particularly annoying to see the aforementioned comment on slashdot for the same reason. If you want to see the texts presented in HTML, grab the texts (I have seen many folks on here bragging about their great bandwidth), write up a perl script to format them and start serving them up from your site. BANG! You've just created a great new companion site to PG.

    This is not helping folks (somewhere else):

    The gist of this comment seemed to be that PG was not useful, because the texts were mainly western literature and in English? This is so bogus it's hard to address. The bottom line is that you have to start somewhere. The first document ever done by PG was the United States Declaration of Independence. Yes, this shows a bias, but you have to start somewhere. I have never seen anything from the project saying that any works were to be excluded, and in the meantime, yes, the works that are there are useful to people all over the world. If PG got some more publicity, maybe more people from around the world might hear about it and could contribute.

    Anyway, anybody interested in great literature should take a look at PG. They have a lot of great stuff to read available for free.

    1. Re:PG is a great thing by Anonymous Coward · · Score: 1

      .... a lot of nonsense about adding HTML formatting to texts. This would be a huge mistake.

      Why? How is a book supposed to provide proper illustrations if it is restricted to ASCII characters?

      If you want to see the texts presented in HTML grab the texts ... write up a perl script to format them and start serving them up from your site.

      But the point was that PG loses information in their obsolete process. It is not possible to restore the lost information unless one can refer to the original text. But PG won't tell you which text was used. So it turns out to be easier just to OCR the text and illustrations anew.

      The gist of this comment seemed to be that PG was not useful, because the texts were mainly western literature and in english? This is so bogus it's hard to address. The bottom line is that you have to start somewhere.

      Yes, PG was a good start in 1971. But it hasn't kept up. Its backwardness in restricting everything to ASCII will prevent it from accommodating non-ASCII texts. Already there are other archives publishing works that are not in ASCII. Concentrating our attention--and channelling our funds--to PG wrongly demeans those other projects, that obviously cannot restrict themselves to ASCII.

      PG can do whatever it wants. But what it is doing should be open for discussion, not controlled by one person's prejudices.

    2. Re:PG is a great thing by dingbat_hp · · Score: 1

      Most of the people who make this comment also suggest a lot of nonsense about adding HTML formatting to texts. This would be a huge mistake.

      There's a principle here (quoting from the W3C's WAI) called "The Principle of Closest Markup".

      Basically, if you're ever going to do markup, then you should do that markup as soon as possible and as close to the source of the content. It doesn't matter too much what the format is, just that some computer-readable marker gets placed in there while it's still known exactly where the footnotes and paragraph breaks are. It's much easier to re-format from a marked-up text to plaintext than it is to try and automatically add markup to plaintext.

      OK, so HTML markup isn't the greatest thing out there, but it has the huge advantage for a project like Gutenberg that everyone (and their dog) knows it. Maybe using a subset like the Slashdot Core would add useful functionality for little cost. If you want plaintext, then stripping those would be trivial scripting.

    3. Re:PG is a great thing by Tom+Womack · · Score: 1

      I'd like to address some of the rather negative posts that people have made regarding PG because frankly they're short-sighted:

      PG's etexts should be made more web-friendly:

      Most of the people who make this comment also suggest a lot of nonsense about adding HTML formatting to texts. This would be a huge mistake. The beauty of having the texts in as simple a format as possible is that it is always possible to add the formatting later.


      The problem I've found is that the texts are at present in a format which LEAVES INFORMATION OUT: when I converted the Gutenberg Les Miserables and the Father Brown stories, I had to go through marking up the chapters as chapters (leaving Perl to mark up the paragraphs as paragraphs), then read through the entire text to find paragraphs which actually contained tabular information and had been marked up wrongly.
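
      A sketch of the automated half of that workflow, in Python instead of Perl. It treats blank-line-separated blocks as paragraphs and uses a crude heuristic, invented here for illustration, to flag blocks that look tabular (most lines contain an internal run of two or more spaces that does not follow sentence punctuation). The function names are hypothetical, and the heuristic will still miss cases, which is exactly why the read-through described above was needed.

```python
import html
import re

# Two or more spaces between non-space characters, not preceded by
# sentence-ending punctuation (many etexts put two spaces after a period).
COLUMN_GAP = re.compile(r"[^\s.!?;:] {2,}\S")


def looks_tabular(block):
    """Guess whether a block is a table rather than running prose."""
    lines = block.splitlines()
    hits = sum(1 for line in lines if COLUMN_GAP.search(line))
    return len(lines) > 1 and hits >= len(lines) / 2


def mark_up_paragraphs(text):
    """Wrap each block in <p>, or in <pre> when it looks tabular."""
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    out = []
    for block in blocks:
        if looks_tabular(block):
            out.append("<pre>" + html.escape(block) + "</pre>")
        else:
            out.append("<p>" + html.escape(" ".join(block.split())) + "</p>")
    return out


SAMPLE = (
    "It was a dark\n"
    "and stormy night.  The rain fell.\n"
    "\n"
    "Item    Price\n"
    "Bread   2 sous\n"
    "Wine    5 sous\n"
)

if __name__ == "__main__":
    for element in mark_up_paragraphs(SAMPLE):
        print(element)
```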
  46. Portable device to read texts by hpgoh · · Score: 1

    My first ever post to SlashDot! What a moment!

    I'm impressed by the texts available - from Charles Dickens to Geoffrey Chaucer, and Mark Twain, and even The Hackers' Dictionary of Computer Jargon.

    My question is, what portable devices are available these days for reading texts such as these downloaded from the Internet? I would love to be able to use one of these on the train and tram, on the way to and from work - better than a broadsheet newspaper. I had a look at the Rocketbook and Softbook mentioned by a previous poster, but those devices seem to be very restrictive in terms of availability of books. I guess WinCE machines could be an (expensive) alternative. What about Palms? I don't actually own one myself, so I don't know about how hard they are on the eyes for extended periods.

    "Who makes Steve Guttenberg a star? We do, we do."

    1. Re:Portable device to read texts by Tom+Womack · · Score: 1

      What you want is a Psion 3mx.

      Hand-sized, nice contrasty 480x160 screen - I use mine quite a lot to read ebooks and the like.

      £169 in the UK.

  47. Maybe not the GPL by kjack · · Score: 1

    Maybe not the GPL per se, but something along those lines could easily be implemented for a novel. A main author could write an intro chapter, post it to the web, set deadlines for each chapter submission, and then piece together a finished project. Of course the publishing industry being what it is, this is unlikely to happen anytime soon. Nice dream for literature types like me though.

  48. government sponsorship by gwmccull · · Score: 2

    As distasteful as the thought may sound to some of us, it may be time to solicit the help of the government. Specifically, I believe it was California's governor, Gray Davis, who was recently talking about building a virtual library of all the texts in the California State university system. In this case, collaboration may make sense.

  49. completely unconstitutional, at least in the US by / · · Score: 2

    Article I, Section 8 of the US Constitution enumerates the relevant federal power as "Congress shall have the power... To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;" The key phrase here is "for limited Times". Retroactively extending the duration of copyrights or bestowing perpetual copyrights is plainly unconstitutional.

    Needless to say, unconstitutionality has never prevented legislators from passing unconstitutional acts, and the late Sonny Bono had his day in Congress a year ago and changed the rules, and we'll all suffer for it. There are some legal battles being fought on this issue, but I can't seem to drag up the references.

    --
    "If one is really a superior person, the fact is likely to leak out without too much assistance" -- John Andrew Holmes
    1. Re:completely unconstitutional, at least in the US by Anonymous Coward · · Score: 1

      There are some legal battles being fought on this [copyright extension] issue, but I can't seem to drag up the references.

      See the Berkman Center site for the case to overturn the Bono Act. All the briefs are there. Also, the briefs are being written in a novel openlaw process--everybody, including the other side, gets to contribute!

      Project Gutenberg is not part of the suit. PG is restricting itself to pre-1923 works. Also, Open Source people have had odd reactions--they seem to believe that strong copyright laws are good, as long as there is a free license.

      The spirit that originally motivated Project Gutenbergers ought to move on to a larger movement that unites with Open Source advocates in many fields other than free books. The public domain needs to be as important in our thinking as the environment has become since the 1970s. Open Source and free online book people need to unite with other advocates of a better intellectual property principle for our laws and public policy. This would include the human genome, software patents, patents on the food (agricultural products) necessary for life on this earth, all digital media, vaccines and medicines, and many other areas where large multi-national corporations now based in the U.S. are attempting to assert exclusive intellectual property rights.

      The lawsuit against the copyright term extension is only a first step, but it could present the Supreme Court of the U.S. with some ideas that could form a better intellectual property theory to move on to the 21st century. Otherwise, if we Open Source people are continually turned down, we face being isolated and marginalized, over against a rampant free-market capitalist monopoly of our ideas.

    2. Re:completely unconstitutional, at least in the US by samantha · · Score: 1

      Some of the items listed are quite costly to develop and must recoup those costs to be doable at all, especially hi-tech agricultural products, vaccines and medicines. Much software is also not cheap to produce in terms of effort, scarcity of sufficiently skilled people, creativity and so on. That something is needful for life is not an argument that it must be free of cost to the user or be fully in the public domain.

      I do agree that some things are absurd to patent, like the human genome and most software. I do not agree it is wrong to copyright a book or a software work or a record, or that simply making such works digital makes copyright wrong. Much depends on the nature of the copyright and what its terms enable and disable. The authors and artists must be paid for their works as long as we have a system that functions on money at all. If this can be done w/o copyright then wonderful, but we cannot rush to cry "public domain" w/o bothering to take care of paying our creators and investors well for their efforts. Not if we wish to continue to receive the fruits of those efforts.

    3. Re:completely unconstitutional, at least in the US by jzitt · · Score: 1

      IANAL, but... it would seem to me that extending a copyright for, say, 10,000 years would still count as "limited Times". Sort of like avoiding giving someone a life sentence by sentencing him to three consecutive 50 year sentences, with possibility of parole after 120.

    4. Re:completely unconstitutional, at least in the US by copito · · Score: 2
      Much of the work of the Supreme Court is in ferreting out the original intent and applicability of the Constitution and federal laws. In this case, the original intent is clear, as documented by Tim Phillips. (Well worth reading, as is the entire ~dkarjala site). Thomas Jefferson wanted a term that was equal to the mean remaining life expectancy for an adult. This was 19 years at the time.


      In other writing, he is very explicit about the naturalness of the public domain.

      It would be curious...if an idea, the fugitive fermentation of an individual brain, could, of natural right, be claimed
      in exclusive and stable property. If nature has made any one thing less susceptible than all others of exclusive
      property, it is the action of the thinking power called an idea, which an individual may exclusively possess as
      long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and
      the receiver cannot dispossess himself of it. Its peculiar character, too, is that no one possesses the less,
      because every other possesses the whole of it. He who receives an idea from me, receives instruction himself
      without lessening mine; as he who lights his taper at mine, receives light without darkening me.

      --
      "L'IT c'est moi!"
  50. Palms? Nah . . . by fireproof · · Score: 1
    What about Palms? I don't actually own one myself, so I don't know about how hard they are on the eyes for extended periods.

    Palms are nice, and are cheap when compared to other handheld devices, and are amazingly useful, but they are limited to about 13 lines of text per screen, with an average of about 30 characters per line (that's my guess anyway . . .), so I don't think they are best suited for something like this.
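
    To put rough numbers on that: at 13 lines of about 30 characters, a screen holds at most 390 characters, so a 500,000-character novel (a figure picked purely for illustration) would take well over a thousand screens. A small Python sketch, using the guessed dimensions above as defaults; `paginate` is a hypothetical helper, not part of any Palm toolkit.

```python
import textwrap


def paginate(text, width=30, lines_per_screen=13):
    """Split text into screens of word-wrapped lines.

    The defaults are the rough Palm figures guessed above, not measured values.
    """
    lines = textwrap.wrap(text, width=width)
    return [
        lines[i:i + lines_per_screen]
        for i in range(0, len(lines), lines_per_screen)
    ]


if __name__ == "__main__":
    sample = "word " * 600
    screens = paginate(sample)
    print(len(screens), "screens for", len(sample), "characters")
```

    Word wrapping wastes part of each line, so the real screen count comes out higher than the simple characters-divided-by-390 estimate.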

    The development guides that are available from 3Com even note that they are meant to be used as an auxiliary device to a PC for the most part. That being said, in a pinch, one would certainly work, though.

    --

    /* "A fool does not delight in understanding, but only in revealing his own mind." */

    1. Re:Palms? Nah . . . by larryj · · Score: 1

      I've read a few PG texts on my Pilot. It was easier to read than I thought it would be. If you have a Palm device, definitely give it a shot.

      --
      What if the Hokey-Pokey really is what it's all about?
  51. They have pie... I mean pi. by Crixus · · Score: 1
    I've been excited about PG since about 1993 or so when I first learned they had pi to one million places, and the first 100,000 prime numbers available for download from their FTP site. Since then I have been a big supporter.

    --
    Ignore Alien Orders
  52. Try Eisnor Interactive by Runna^Muck · · Score: 1

    Just read an article on Di-Ann Eisnor in FastCompany magazine. She started the first (according to them) offline advertising agency for online brands. Apparently she was able to cause a ruckus for MiningCo changing to About.com so maybe they could do something for PG. No, I don't work for them, I just happened to see the article this evening.

  53. Is Hemos matching deeds to actions ? by maroberts · · Score: 1

    If so, he should put a few strategically placed links to Project Gutenberg on Slashdot and any other web sites he has influence on.

    Other Slashdotters with web sites should do the same; after all, "Open Source" books should be encouraged as much as "Open Source" code....

    --

    Donte Alistair Anderson Roberts - hi son!
    Karma: Chameleon

  54. PG helper software by adriccom · · Score: 1
    VacuumPress Braindump

    After the Project Gutenberg exposure on /. today I was invested with the idea for a needed piece of software (mentioned in a few /. posts). Two things seem needed, both of the same purpose:

    • A client program that can suck an etext out of PG (et al) straight (or nearly directly) into a PDA (palm3 here, etc)
    • A cgi/script to do nearly the same thing on the server end (that is, mangle the etext into a DOC and then a pdb) (The example given in the slashpost is AvantGo, a nifty web page caching system for Palms and such. When it works, all you do is click on a link in your browser and the helper/etc queue the pdb up in your Installer)
    As is only proper, I intend to hack on this myself (client first), but am hoping to have help. I am a lousy hacker in Java, Perl and Python, but learning as best I can.

    I figure to start in on the client using the best case scenario: a unix system with palm doc tools, pilot-link, pilot-link-perl and pyrite (the palmos module for python). Once I get something running, I will try and exchange palm doc tools code for perl or python code, eventually getting it into one module. (I intend to attempt it first with python, but I am quite open to perl, too.) I would like to try and implement the serverside as a perl or python CGI (as I am without a better idea). Someone better than me could probably whip out a java1.1 client for multiplatform if some java code for making DOC files can be found, and the same goes for a servlet. I'll poke around and see if any such code is in any of the obvious places.

    Anyway, these are my ideas. I would like yours. Feel free to flood my mailbox, etc. Email to adric@adric.com and try and put something like VP in the subject line so I can filter it from the spam :) A copy of this document and anything later will be at my site at: adric.home.mindspring.com under hacks.

    Oh yeah, I propose a name for this beast: VacuumPress

    This document is copyright 18 nov 99 by adric@adric.com (me!) and any software resulting from it will be DFSG / OpenSource compliant.
    --
    <script>alert("I never liked JavaScript, really; it just seemed a bad idea.");</script>
  55. It needs to be more computer-friendly by jaso · · Score: 2
    I agree that Project Gutenberg is a great thing, and I also think it's wonderful that it's getting some additional publicity.

    It definitely does need to have markup codes, though. I'd personally prefer XML, because it would allow the documents to include book-relevant tags like <chapter number> and such, which would make the e-texts a great deal more machine-readable, and accessible for everyone. (In addition, it would make it a lot easier to re-publish the texts.)

    I design and typeset books for a living, so I know what I'm talking about when I say that it's a lot easier to remove or process existing codes than it is to insert them. A machine can easily reformat a marked-up document; if plain text is wanted, a one-line perl script can be written to remove everything within angle brackets. The reverse is not the case. Computers are not currently smart enough to know where to add tags, so right now, every single tag in a document has to be inserted by a person.

    All too often, that person is me. ;-)
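
    For illustration, the easy direction really is about one line. A Python equivalent of the Perl one-liner, under the same simplifying assumption that everything inside angle brackets is a tag; the element names in the sample are made up, and character entities such as `&amp;` would need separate handling.

```python
import re


def strip_tags(marked_up):
    """Remove everything within angle brackets, leaving the plain text."""
    return re.sub(r"<[^>]*>", "", marked_up)


# Hypothetical markup; the tag and attribute names are illustrative only.
SAMPLE = (
    '<chapter number="1">\n'
    "<title>Loomings</title>\n"
    "<p>Call me Ishmael.</p>\n"
    "</chapter>\n"
)

if __name__ == "__main__":
    print(strip_tags(SAMPLE))
```

    There is no comparably short function for the reverse direction, since nothing in the stripped output says which line was the title.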

    1. Re:It needs to be more computer-friendly by slim · · Score: 2

      I absolutely agree. PG make the argument that ASCII text is the only format universally readable to (almost) all computers.

      However, if they were to mark up their texts in the XML-derived-markup-language of their choice, then their work would be so much more of a service to humanity.

      From Elliotte Rusty Harold's "XML: Extensible Markup Language" (a bit old now), discussing Jon Bosak's marked up versions of the complete plays of Shakespeare:

      "What does this system offer over a book or even a plain text file? To a human reader, the answer is not much. To a computer doing textual analysis, however, it offers the opportunity to easily distinguish between the different elements into which the plays have been divided. For instance, this system makes it simple to extract all lines spoken by Romeo in Romeo & Juliet."


      Then there's stuff like text to speech -- markup would help the reader with intonation, etc.

      ... and so on.
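
      The Romeo example from the quote can be sketched in a few lines of Python. The element names `SPEECH`, `SPEAKER` and `LINE` are an assumption about how such a play might be marked up, and the embedded sample is a tiny hand-made fragment rather than one of the real files.

```python
import xml.etree.ElementTree as ET

# Hand-made fragment; the element names are assumed for this sketch.
SAMPLE = """<PLAY>
  <SPEECH><SPEAKER>ROMEO</SPEAKER>
    <LINE>But, soft! what light through yonder window breaks?</LINE>
    <LINE>It is the east, and Juliet is the sun.</LINE>
  </SPEECH>
  <SPEECH><SPEAKER>JULIET</SPEAKER>
    <LINE>O Romeo, Romeo! wherefore art thou Romeo?</LINE>
  </SPEECH>
</PLAY>"""


def lines_spoken_by(xml_text, speaker):
    """Return every LINE inside a SPEECH attributed to `speaker`."""
    root = ET.fromstring(xml_text)
    lines = []
    for speech in root.iter("SPEECH"):
        speakers = [s.text for s in speech.findall("SPEAKER")]
        if speaker in speakers:
            lines.extend(line.text for line in speech.findall("LINE"))
    return lines


if __name__ == "__main__":
    for line in lines_spoken_by(SAMPLE, "ROMEO"):
        print(line)
```

      Doing the same against a plain-text edition would mean guessing at speaker labels from capitalization and indentation, which is the kind of lost information the markup preserves.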



  56. We need a more restrictive version of the GPL by Stephen+VanDahm · · Score: 2

    I don't really understand licenses that well -- this is just my uneducated opinion.

    I don't think the GPL would work well with something other than software. Once I tried to think about how people could copyright music under less restrictive licenses. You'd want to copyright a song (not necessarily a given recording of the song) so that coffee shop/bar bands could legally sing it, but you have to do something to preserve the integrity of the art. I don't think the GPL really does that, because with the GPL people can modify your work and distribute those modifications. This has practical value in the software community, but in the music community people want their work to remain unique and intact. I assume that authors would feel the same way.

    What you'd need, in my view, is a copyright license that allows people to distribute an etext freely and ensures that no one down the line can take that freedom away. However, people should be forbidden from altering the etext, and the author should always receive credit for the work. That way, you can give your stuff to Project Gutenberg without fear of compromising its integrity.

    This is just an idea, and I know that it isn't a perfect solution yet. But I think that a license based on these ideas could be worked up and actually used by authors, musicians, and artists to promote the exchange of ideas and information. That's really the spirit of the GPL anyway, right?

    Take care,

    Steve

    1. Re:We need a more restrictive version of the GPL by Tim+Pierce · · Score: 1

      ...with the GPL people can modify your work and distribute those modifications. This has practical value in the software community, but in the music community people want their work to remain unique and intact. I assume that authors would feel the same way.

      That's very strange. I would have said that the exact opposite is true.

      There is a great tradition in music of creating new derivative works from earlier songs. These days we mostly get "cover songs" that are usually heavily bastardized versions of classic popular songs, not really worthwhile, but jazz and blues music is deeply rooted in a history of borrowing, begging and stealing music. Jimi Hendrix's cover of The Star-Spangled Banner is a good well-known example.

      That is not to say that the GPL would be a good model to use for music, just that musicians are hardly averse to reusing and reinterpreting each other's work.

    2. Re:We need a more restrictive version of the GPL by Stephen+VanDahm · · Score: 1

      musicians are hardly averse to reusing and reinterpreting each other's work.

      You have a good point -- jazz musicians have a huge repertoire of "standards" that they reinterpret every time they perform one of them. In fact, it's considered an honor if a song you write is accepted into the canon of standards.

      I'm not opposed to the reuse and reinterpretation of a musician's work -- I was thinking of blatant plagiarism when I wrote what I wrote. It is one thing to offer your own interpretation of Hoagy Carmichael's "Stardust," but it is another to claim that you wrote "Stardust" yourself from scratch. It's also wrong to slap together some crap and claim that Hoagy Carmichael wrote it (I don't know why you'd want to do that, but it's still wrong). Ideally musicians should be free from as much legal crap as possible, but still be assured that their own work and reputations are safe from plagiarism and other abuses of the sort.

  57. Amazon by gargle · · Score: 2

    *Amazon (someone mentioned this) is a _bad_ idea. Profit motive and releasing free documents don't coincide well.

    I suggested Amazon earlier, but I guess I should have argued my point rather than just suggesting it. Why don't profit motive and releasing free documents coincide well?

    Profit motive and free documents can coincide perfectly, and work to each other's mutual benefit. The free software world shows that the profit motive, demonstrated by companies like Red Hat, may in fact be the *best* way of supporting the development of free stuff (software, documents, and who knows what else). What Project Gutenberg needs is publicity, and who can do publicity better than companies like Amazon with plenty of money and marketing skills?

    But why would Amazon want to help PG? For the same reason why enlightened bookstores make it easy for customers to browse through books -- letting customers browse increases sales; putting links to PG texts brings this browsing experience online (imo, there isn't much worry that the customer will just read the whole book online rather than buy it -- reading a whole book online is just too unpleasant).

    Furthermore, the PG deals with copyright expired books, so the market is different in most cases; linking to PG is just another value added service that online booksellers like Amazon can provide for their customers.

  58. Re:I support PG, but...web-friendliness? by sporri · · Score: 1

    Why should PG become web-friendly? This project is about the text, the words, not the illustrations, typography or whatever.

    Download the ASCII and format it at home for printing, on screen reading, throw it at a text to speech synthesizer or whatever you want to do with it. It is free and you can do with it what you like. The ASCII format is more portable than html. I can even boot my old C64 and read the PG text there if I want to.

    I have had a horrible time over the years getting rid of the formatting of online texts I want to read (the HTML Principia Discordia is a good example). I like the raw text format because I can download the text, throw it into a word processor and change the font and print it, make a .pdf, format it for the Palm and upload it somewhere else, not forgetting to cite PG, who make it possible.

    So if you think PG should become web-friendly, format the texts for the web yourself. PG might need that, but we will still need the e-texts in a standard format.

  59. If only we had the technology by Cigs · · Score: 1
    After my initial disappointment that this wasn't a campaign to have the American Film Institute digitally remaster the first five Police Academy films (a noble cause methinks) it seemed like a good idea.

    Unfortunately, as with much of the internet today, the technology to make it really workable and useful isn't there. Until they develop a small Walkman-type thing with a screen big and clear enough to make reading easy, it will never catch on. At the moment the power cable on my PC won't stretch the 12-mile train journey I've got to take every day, and my laptop gives me headaches if I stare at it for too long.

    A worthy cause, though: there is a lot of stuff out there that, if it isn't saved now, will never be available, so I'll try to spread the word amongst my more learned friends, both of them.

    To be serious for a moment though, as I am into music and I am constantly aghast at all the great music that has never been issued on CD and therefore is currently unavailable, I wonder if there is a similar scheme for albums? With mp3 technology, which the internet gods did get right, surely there must be somewhere to get all those currently deleted albums, anyone got any ideas?

    ---

    1. Re:If only we had the technology by Cigs · · Score: 1
      Good call, good points.

      I know in the UK there is the BPI (British Phonograph Institute?), which would be the equivalent of Billboard, and aside from compiling charts and royalties etc., one of its duties should be (or perhaps is) to protect music.

      If something is not currently available, no one is making money from it. An on-line archive of music which is available to download at a very small cost would be useful; it would also preserve it. Anyways, it's my idea, I'm going to IPO it next week on Nasdaq, I'll make a fortune and I'll live happily ever after.

      ---

  60. Re:slashporn by Ziviyr · · Score: 1

    This fat chick?

    I suppose it's a matter of opinion... What does this have to do with Project Gutenberg anyways?

    --

    Someone set us up the bomb, so shine we are!
  61. Code Quality of Gutenberg texts by Florian · · Score: 2
    Being a professional philologist, I must criticize the code quality of Gutenberg e-texts. Gutenberg texts rarely acknowledge the edition they rely upon and lack any structural markup (indicating the pagination, italics, spelling variants etc. of the original text). From the viewpoint of scholars and 'professional' readers, they are practically unusable because of that. Imagine Linux and GNU were not cleanly coded re-implementations of a sophisticated operating system (Unix), but a DOS clone hacked in BASIC, and you get the picture.

    The question here isn't whether to use ASCII, HTML or LaTeX, because there already is a highly developed, sophisticated markup language for electronic text editions, TEI-SGML, specifically designed to preserve all structural information of the original text. Some e-text projects such as the Victorian Women Writers Project code in TEI-SGML. This is not only good for scholars/literature hacks, but also allows lossless reformatting of the source code into HTML, ASCII, PDF, RTF, etc..

    The Gutenberg Project certainly was a good idea and a great achievement when it was founded, but might have to rethink its coding policy. Other e-text projects are already doing better here.

    --
    gopher://cramer.plaintext.cc http://cramer.plaintext.cc:70
    1. Re:Code Quality of Gutenberg texts by dmso · · Score: 1

      I had the same problem with the Milton and Shakespeare texts. Even the Oxford Text Archive (are they still around?), which tried to mark up its texts, wasn't really that good. So, I ended up rolling my own -- ah, the joys of transcribing _Richard III_, 1597 quarto ed... it did teach me SGML, though. Anyhoo... the texts are handy enough for light reading, but aren't really appropriate at all for scholarly work. I'd rather support etexts that were flexible and open enough for both uses.

  62. Slashdot Interview? by K. · · Score: 1

    Why not do an interview with Michael Hart? At
    the risk of being labelled a troll for the
    second time in a week, he'd be a lot more
    interesting than John Vranesevich.

    K.
    -

    --
    -- Proud descendant of semi-nomadic cattle-herders.
  63. Open letter to Michael Hart by cabalamat · · Score: 1

    This is the text of an email I've just sent to Michael Hart, the director of PG:

    Michael,

    I'm currently reading the coverage on Slashdot of Project Gutenberg.

    I agree that you don't seem to get much publicity. I would like to see PG achieve greater prominence. If more people knew about it, more people would read its books, and more people would input new books into it, which would again cause more people to read its books.

    I think that one way to help PG would be for its books to be put on the web in HTML. This would make it easier for people to read them online. It would also help people to find the books in the first place. The most usual way for people to find websites is via search engines. If all the books are stored in HTML, preferably with the right META tags to describe the content, then web search engines are more likely to point to PG pages.
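    To make the META-tag idea concrete, here is a minimal sketch in Python. The function name `etext_to_html` and the choice of description and keywords are only assumptions for illustration, not anything PG or any search engine prescribes.

```python
from html import escape


def etext_to_html(title, author, paragraphs):
    """Wrap plain-text paragraphs in a small HTML page with descriptive META tags.

    Hypothetical helper: the tag set (author, description, keywords) is an
    illustrative choice, not a PG convention.
    """
    description = "Full text of %s by %s" % (title, author)
    keywords = ", ".join([title, author, "etext"])
    head = [
        "<title>%s</title>" % escape(title),
        '<meta name="author" content="%s">' % escape(author),
        '<meta name="description" content="%s">' % escape(description),
        '<meta name="keywords" content="%s">' % escape(keywords),
    ]
    body = ["<h1>%s</h1>" % escape(title)]
    body += ["<p>%s</p>" % escape(p) for p in paragraphs]
    return "<html>\n<head>\n%s\n</head>\n<body>\n%s\n</body>\n</html>\n" % (
        "\n".join(head),
        "\n".join(body),
    )


if __name__ == "__main__":
    print(etext_to_html("Alice's Adventures in Wonderland", "Lewis Carroll",
                        ["Alice was beginning to get very tired."]))
```

    Escaping the title and author keeps quotes and ampersands in book titles from breaking the generated markup.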

    I've read your arguments in favour of using plain ascii, and I partially agree with them: I think all the books should be available in ascii. However, I also think they should be available in HTML -- because this is the easiest format for people to read them on the web. As you say:

    The Project Gutenberg Etexts should be so easily used that no one should ever have to care about how to use, read, quote and search them

    I agree with this sentiment. The best way to have both ascii and HTML versions of each etext is to use an internal markup format, from which the texts can be easily converted to any output format, with output formats including ascii, HTML, and perhaps others such as LaTeX or RTF.

    I have a website (http://www.comuno.com/) which currently contains other publicly-available e-texts, i.e. the FAQs for the Usenet newsgroups. I would like to host the PG e-texts on my website too, in HTML format. I realise that this would require work to be done:

    1. the internal markup format would have to be defined

    2. software would have to be written to convert from the internal format into ascii, html, and the other output formats

    3. new etexts would preferably be written in the markup format

    4. 'glue' code would be needed to automatically collate the etexts on the website.

    I would like to collaborate with you on doing this. I would like any software that I write as part of this to be placed under an open source licence. Some words about my background: I have been a professional programmer for 13 years, and over the last few years have been heavily involved with web- and HTML-based work. I also have experience in designing markup formats and programs to convert to multiple display formats.

    Project Gutenberg could also use Netscape's RSS format (see http://www.byte.com/column/BYT19990916S0002 for details). This would allow PG to publish to other websites when new etexts are released, which should help gain publicity.

    BTW, another idea is to use the obvious web address of www.gutenberg.org. Actually, I'm surprised you aren't using this already, since you have registered the domain.

    --
    *** Philip Hunt. Reply to phil@comuno.com ***
    *** Linux: because there's no Bill to pay ***

  64. No, it's all about the XML... by speck · · Score: 1

    Don't get me wrong, I hate M$ just as much as the next slashdotter, but converting the PG books to the openebook format is a good idea. The fact that M$ will doubtless try to co-opt this format is beside the point; the point is that it is a published standard, and more importantly that it is XML-based.

    It really doesn't matter whether the PG texts can be plugged directly into a web browser or not, the important thing is to make the texts into structured texts. Really it doesn't matter what XML format is used, but since one already exists for books, it might as well be the format that is used, if it is technically sufficient.

    From an XML format, it is trivial to produce the plain-ASCII format that PG seems to be so fond of (one might even say irrationally fond of). It is also trivial to produce a web-ready HTML document for online publishing, one that works with all browsers. Heck, we could even make it work for lynx, for people without a text editor... :-) Storing the documents in HTML is a bad idea, for a few reasons:

    • HTML is hard to parse. Ok, it isn't that hard these days, but it's still harder than XML. Besides, formatting should not be stored with the text (see below).
    • Who knows where HTML will be in 10 years? Odds are that the spec will change considerably as browsers and the needs of the web evolve. I think it's a pretty good bet that XML itself will stay relatively stable, though.
    • PG was right not to use particular markup schemes in their documents, chiefly because of the volatility of markup-scheme systems. What an XML scheme does is to store the structure of documents. PG's flat-text format clearly loses that structure.

    The thing is, when someone types/scans/edits a work and submits it to PG, they ought to have a way of specifying the things that are lost in the flat-file format: footnotes, chapter and section headings, bibliographies, etc. Once this happens, the works can be linked to with more granularity than as ftp://.../book.txt. This is critical (as several posters previously have stated) if books are to become full citizens of the noosphere.

    Hmm, well when I write a sentence like that last one I know it's time for bed. I hope my arguments are clear. Re openebooks, my main point is that M$'s ownership of the standard is irrelevant as far as its usefulness is concerned. If anyone is interested in discussing these issues and maybe taking a look at what a good XML format would be, please email me.

    1. Re:No, it's all about the XML... by hey! · · Score: 2
      I agree, but I wouldn't denigrate the need for pure ASCII representations either. XML is more useful than plain ASCII, but ASCII is going to be useful to everyone who can pipe the output to a printer, no matter what software they have.

      I think they should accept XML submissions, process the XML to produce pure ASCII, and then make both available. That way they have the power of XML while still having archived the least common denominator.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  65. Yes, by gum! XML, not HTML by speck · · Score: 1

    I couldn't agree more with your and jaso's statements. The point is not to mark up the books so that they look pretty on the web, the point is to add the structural elements that plain-text doesn't preserve back into them. HTML is a bad idea (as I've argued above somewhere), but it is trivial to produce it for presentation on the web if need be. Likewise, from an XML source, it is trivial to produce a flat text file. Going from a flat text file to a format that includes structural information, though, is decidedly non-trivial.
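    As a rough illustration of why the XML-to-text and XML-to-HTML directions are the easy ones, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. The element names (`book`, `title`, `chapter`, `heading`, `p`) are made up for this example; they are not the openebook format, TEI, or anything PG uses.

```python
import textwrap
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical structured source; the element names are invented for this sketch.
SAMPLE = """<book>
  <title>A Sample Etext</title>
  <chapter>
    <heading>Chapter I</heading>
    <p>It was a dark &amp; stormy night; the rain fell in torrents.</p>
  </chapter>
</book>"""


def _clean(text):
    """Collapse runs of whitespace so source line breaks do not leak into output."""
    return " ".join((text or "").split())


def to_text(root, width=70):
    """Render the structured book as wrapped plain text."""
    lines = [_clean(root.findtext("title", default="")).upper(), ""]
    for chapter in root.iter("chapter"):
        lines += [_clean(chapter.findtext("heading", default="")), ""]
        for para in chapter.iter("p"):
            lines += [textwrap.fill(_clean(para.text), width), ""]
    return "\n".join(lines).rstrip() + "\n"


def to_html(root):
    """Render the same structure as bare-bones HTML."""
    parts = ["<h1>%s</h1>" % escape(_clean(root.findtext("title", default="")))]
    for chapter in root.iter("chapter"):
        parts.append("<h2>%s</h2>" % escape(_clean(chapter.findtext("heading", default=""))))
        for para in chapter.iter("p"):
            parts.append("<p>%s</p>" % escape(_clean(para.text)))
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    book = ET.fromstring(SAMPLE)
    print(to_text(book))
    print(to_html(book))
```

    Both renderers walk the same tree. The reverse trip, guessing where chapters and headings were from a flat file, has nothing comparable to walk.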

  66. Gutenbook! by knife_in_winter · · Score: 1

    I have noticed some people have mentioned the need for some sort of client software for Project Gutenberg.

    It just so happens that I thought this same thing some time ago.

    I am working on my own GPL'd project called Gutenbook. Right now it is not much, just a rapidly prototyped Perl/GTK application. It downloads and parses the Gutenberg index and allows you to select a title. Once selected, that Etext is downloaded and displayed for you to page through.

    As I say, it is only a *rough* prototype right now and I have been too busy to work on it as much as I want. I have plans to port it to Objective-C and C with GTK++. (I think Objective-C *rocks*.)

    I have exchanged emails with Michael Hart and some other of the Gutenberg people and have their support. I just need more time! I would love to get feedback on this.

    Please check out the link above. The prototype is available for download. Please also take it easy on the server. It is a lowly Sparc 2. If enough people are interested, feel free to make a mirror.


    Nothing can possiblai go wrong. Er...possibly go wrong.
    Strange, that's the first thing that's ever gone wrong.

    --

    Tyler's words coming out of my mouth.
  67. A reason why PG is important; debate on the canon by himself · · Score: 1

    (Warning: use of "canon" and other English Lit. terminology ahead.)
    In England, IIRC, a copy of every book published goes to their national library -- and the same is true of the LOC here in the US. A Slashdot reader's comment to this article suggested that publishers should be compelled to submit an ASCII version of each of their new books to PG. Given the introduction of those systems that print single copies of books on demand [aren't they going to start showing up in Borders RSN?] it's as unlikely that book publishers will give away their raw materials as it is that software publishers will open-source their products. :7)

    Project Gutenberg seeks to make as many books as they can, _now_, available. Is this important? Yeah, because publishers won't make much money off of public domain materials, but that doesn't make those works any less important. Look at what books they made you read in school, what pointy-headed academic types call the canon, and see how many of them are available via PG. Other than some of the rubbish that one wild-eyed assistant prof made me read for my English degree, *most* of my high school and college reading lists are downloadable. These are valuable works that we'll still be reading for a long time.

    (There are, of course, people who will dismiss most of the canon as Dead White European Males and by extension most of the PG etexts as tools of the patriarchy, but that's a debate that'll never be settled. Read Harold Bloom's book "The Western Canon" for a chapter on each of what he (and many others) consider the most important books of Western literature, and then download the original texts from PG.)

    It is possible to read these plaintext books on screen: make the text white and the background black; find a comfy spot and curl up with a laptop, or blow up the point size so that you can sit comfortably at your desk and read it on a monitor; take frequent breaks to rest your eyes. Sure, it's not going to advance your career as a sys admin, but you have to come up for air sometime. Anyone who commutes on a train or subway owes it to themself to read at least one "good" book a season. And the more I talk to people, the more people I'm finding who accidentally read something they skipped in school and rediscover books.

    Is this relevant to Slashdot? Sure is, if for no other reason than the "information wants to be free" line.

  68. Cool Idealistic Imagery by GnomeAttic · · Score: 1

    Someone mentioned putting Gutenberg texts onto Palms or something and that got me thinking... I always thought it was cool on Star Trek when Picard was reading some play or old French book on one of those little computer pads. You know, the thin ones where they write all their reports and stuff. This would essentially be the same thing! Am I the only one that thinks that's really cool? Probably...

  69. College Professors by ronfar · · Score: 1

    Probably the best way to get Project Gutenberg recognition would be to have classics professors mention it in their classes. Hmm, next time I see my cousin (a Greek and Latin professor) I'll suggest the idea to him.

    --
    All the creatures will die, And all the things will be broken. That's the law of samurai. (Jubai, 1605)
  70. Re:I support PG, but...web-friendliness? by hey! · · Score: 2
    I think that XML would be a better choice. It would be a bad idea to stake our intellectual heritage on a platform that is subject to a fight between vendors. However, XML could be rendered a number of different ways, including browser specific HTML or plain old ASCII, and will eventually be rendered directly by browsers.


    It would also facilitate content based indexing. After all, it's the content that counts.

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  71. Re:Numbers for a slashdot effect anyone? by Vidar+Hokstad · · Score: 1
    The first time our page was on Slashdot we had visits from about 10.000 unique IPs, resulting in more than 100.000 hits, within the first 24 hours. We then kept on getting hits...

    (To the ones that bombarded me with mail after the second time we were featured on Slashdot: Be patient, we'll get to your mail, eventually... Let's just say I got MANY mails)

  72. It's not just for reading; it's about publishing by hey! · · Score: 2
    I think people misunderstand when they complain the PG texts are too hard to read because they're not formatted or they don't like reading on their computer.


    In fact, you can use the Gutenberg text to create your own edition, with your own illustrations, introduction and footnotes, and publish it. Perhaps for distribution to your students if you are a teacher, or perhaps to the world at large.


    The Gutenberg copyright restrictions are a lot like the BSD license. They're aimed to get the work used as widely as possible. If you modify the work, you just strip out the Gutenberg notices and it leaves you with the unencumbered text to do what you will.


    This is an incredible idea, and one that deserves support. Maybe they should be nominated for a MacArthur grant?

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  73. Plain Vanilla ASCII ? by SimonK · · Score: 1

    It is interesting that Project Gutenberg have chosen to put things on line only in ASCII text.

    I wonder if they might benefit from keeping their master copies in some kind of basic plain text markup language based on XML or SGML. Having things like the chapter structure, or more importantly the scripting information for plays, in the document in a machine readable format would seem to make it easier to search the collection, and also easier to reintroduce formatting to allow prettier looking hard copy. I don't know about anyone else, but I find reading on paper much easier than on screen, and nice formatting in that context is important.

    I don't mean to belittle what they are doing - I think it is excellent work - and I emphatically don't want them to keep the master copies in HTML or some equally labile format, or to start to introduce physical markup, but it would be nice to have some idea of the structure written more clearly into the text.

  74. Then fork a new distro... by hey! · · Score: 2
    Most people seem to agree that the PG provides a useful service by putting paper texts into electronic format. Almost everyone thinks that it would be better in some other format (ASCII, XML, SGML etc.)


    Since the content is freely reusable, strip off the Gutenberg notices and make your own archive of e-texts in your favorite format! They've done the hardest part of the work; HTML-izing the work with CSS should be a breeze.
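    For anyone who wants to try the "strip off the notices" step, a minimal Python sketch follows. It assumes the notice ends at a line containing some known marker string. The marker is passed in as a parameter because the exact wording of the notice is not given here, and the one in the usage example is purely made up.

```python
def strip_notice(etext, end_marker):
    """Return the text that follows the first line containing end_marker.

    Hypothetical helper: end_marker is whatever line closes the notice in the
    file at hand. If the marker is not found, the text is returned unchanged
    rather than guessing where the notice ends.
    """
    lines = etext.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if end_marker in line:
            # Drop the notice and any blank lines directly after the marker.
            return "".join(lines[index + 1:]).lstrip("\n")
    return etext


if __name__ == "__main__":
    sample = "Some legal notice\n*** END OF NOTICE ***\n\nChapter 1\nText\n"
    print(strip_notice(sample, "*** END OF NOTICE ***"))
```

    Returning the input untouched when the marker is missing is a deliberate choice: silently chopping the wrong part of a book is worse than leaving a notice in place.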

    --
    Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
  75. Re:I disagree. by Jerf · · Score: 2

    Who says it has to be this way [mostly meant for stuff like Shakespeare or other important literary works]?

    Project Gutenberg.

    Who says books have to be this way to be online?

    Project Gutenberg.
    http://www.promo.net/pg/history.html#beginningphil

    Where does it say that this is PG's purpose,

    Here: http://www.promo.net/pg/history.html#theselection

    and who wrote that?

    Project Gutenberg.

    Wow, you've impressed me with your fine reading skills. Next time, try reading the source material before engaging in combative behavior.

    You are doing a hell of a lot of whining about something that is completely free and explicitly tells people to add markup if they so choose. If you can't understand them, it's probably because they are thinking in the long term, and you aren't. Quote from the above referenced page:

    Alice in Wonderland, the Bible, Shakespeare, the Koran and many others will be with us as long as civilization. . .an operating system, a program, a markup system. . .will not.

    Quit whining, AC.

  76. Typing out pages from a book... by Psychofreak · · Score: 1

    Why don't they use some OCR (optical character recognition) to help with this? They could probably have raw files with the text, formatting required, in a matter of days. Wouldn't that be a big help? Or is there a lack of Open Source OCR software?

    --
    Laugh, it's good for you!
  77. I agree... by speck · · Score: 1

    I'm with you there. I guess my point is that they should store the books as XML, and produce plain-text versions from it for public consumption. I certainly agree with you (and PG) that plain-text is a necessary format to have available (for end users). The conversion from XML to HTML (or, say, PDF) would be a lot easier than converting from text, though.

  78. The On-line Books Page has moved. by LadyNymphaea · · Score: 1

    Its new location is at:

    http://digital.library.upenn.edu/books/

    The CMU address still works for the opening page, but the site manager is recommending that everybody link and bookmark the new URL.

  79. How the hell did Microsoft lose the DOS source? by jfunk · · Score: 2

    One problem might be that Micro$oft has claimed in the Caldera suit that it lost the source code to DOS.

    Hmm, seven major versions, countless minor versions, over the span of many years...

    Ooops, all lost. Even Win95/98... Joe Bob brought them home for an elementary school project and his dog ate them all...

    The court didn't actually *believe* that did it? For a lie, that's pretty damn boldfaced!

  80. Audio Public Domain Books by 23D · · Score: 1

    Logosproject.org is just getting started but is trying to be an audio version of Project Gutenberg, with RealAudio and MP3 versions of public domain books. They are read by anyone willing to spend the time reading a book into their computer and encoding it. Submit something.

  81. How to promote PG? by hzo · · Score: 1

    Team up with the guys from PALM and/or HANDSPRING.
    For them, the availability to download books into their devices
    has cash value since it can attract new customers.
    They'll eventually place some pointer to PG into
    their hand-held manuals or their web sites.

    Try to get some authors to sponsor PG by providing
    etext stuff (or even books?) donated to PG.

    Speak to Tim O'Reilly. ;)

    The /. crew provides a column "My Favorite
    Literature Download of the Month"
    which can
    bring new insights to geeks who usually read more
    Perl than literature pearls (training the other
    half of your brain cannot be wrong and
    might even improve your programming skills :)


    --

  82. PG banner ads? by & · · Score: 1
    I visited the PG web pages looking for a PG banner ad to put on some web pages I'm designing, but there weren't any there. I think it'd be a good idea to design some and make them available from the PG web sites, to make it easier for others to support PG through advertizing. I'd make them myself, but graphic design isn't my forte.

    --
    Bitwise, Andrew.

  83. Re:Excellent - a free ENGLISH DICTIONARY exists! by divbyzero · · Score: 1

    Thanks for the vote of confidence! I actually wrote the web interface to WordNet that's at the Princeton site, along with the Tcl/Tk version of the offline WordNet browser.

    Although it does the trick pretty nicely as a dictionary, it might help to remember that WordNet's real strength is in the fact that it is really a type of thesaurus. A dictionary is basically an associative array (words to definitions) but WordNet is an n-dimensional graph or net (hence the name) of relationships between words (synonyms, antonyms, "is a type of", "is a part of", etc). This makes it far easier to actually browse than a dictionary.
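    To show the difference in data-structure terms, here is a toy Python sketch. The words, relation names and data below are invented for illustration; this is not WordNet's actual database format or API.

```python
# A dictionary: one flat mapping from word to definition.
dictionary = {
    "dog": "a domesticated canine",
    "canine": "a member of the dog family",
}

# A WordNet-style net: labeled edges between words (toy data, not WordNet's).
relations = {
    ("dog", "is_a"): ["canine"],
    ("canine", "is_a"): ["mammal"],
    ("mammal", "is_a"): ["animal"],
    ("tail", "part_of"): ["dog"],
    ("hot", "antonym"): ["cold"],
}


def neighbors(word, relation):
    """Words reachable from `word` by following one edge labeled `relation`."""
    return relations.get((word, relation), [])


def ancestors(word):
    """Follow "is a type of" edges repeatedly, collecting every broader term."""
    found = []
    frontier = [word]
    while frontier:
        current = frontier.pop()
        for parent in neighbors(current, "is_a"):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


if __name__ == "__main__":
    print(dictionary["dog"])
    print(ancestors("dog"))
```

    A plain dictionary lookup stops at the definition, while the net lets you keep walking from one word to related ones, which is the browsing advantage described above.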

    To answer your question, I'm afraid I don't remember the number of words in the WordNet database. If you download the kit from the site at Princeton, it'll tell you there. You'll also have the added benefit of it running much faster, since the network and server at Princeton are both pretty slow.

    Fun bit of history as to why WordNet is open source? It has the same kind of background as the Internet itself... an academic effort sponsored by DARPA.

    -- Div.

    --
    But my grandest creation, as history will tell,
    Was Firefrorefiddle, the Fiend of the Fell.
  84. A pragmatic action by Troutgirl · · Score: 1
    I run a couple of book review and recommendation websites (MysteryGuide.com and ScienceBookGuide.com). We often review out of copyright titles (e.g. GK Chesterton, Charles Darwin, Wilkie Collins). I'll start putting links to PG editions of these books on my review pages as my little contribution to the cause.

    IMHO, PG needs to establish relationships with websites that have attractive content. People don't wake up one day and think, "Gee, I have a mad urge to read an e-book" -- they cruise sites like mine to get ideas about what books to read. So you need to capture reader interest near the checkout line, so to speak, when the offer of an instant free copy is maximally attractive. For example, I once needed to refer to Dickens's _A Christmas Carol_ on short notice to do a parody -- and PG came to the rescue. Or I got an email from a teenager in like Norway who was reading a Conrad novel in the middle of the night, only to find the last chapter was missing. PG texts are lifesavers in situations like those.

  85. Problems with PG's web page by linux2000 · · Score: 1
    The biggest problem with PG that I see is the lame web page. How we access data is sometimes more important than the data itself. PG has all the fundamental parts - indexed by title, by author, and a search engine - but the web page presents it all very poorly.

    The main web page needs to be simple and powerful. Put the search engine front & center! Don't make me click a link to get to a search engine. Put the A-B-C-D-E-... links for Author and Title lookups right there on the main page, and don't make me have to scroll down to reach it. The least important thing is the gigantic text file of every book you have available, yet you put that on the main page, occupying over half the visual space on my browser. Another huge chunk of visual space is dedicated to FTP sites containing the texts, and even HOW TO USE ftp sites (!) -- the instructions for GETTING THE INSTRUCTIONS take up an entire paragraph.

    The fundamental aspect of good web pages for the next century is: MINIMAL WORDS

    For example, the bottom of the PG web page says:

    If you try to download a book and you get an error, try to find a solution in our Help page
    I feel uncomfortable wasting my time reading that entire sentence just for the concept:
    HELP with downloading
    An entire sentence digested down to 3 words, and you can make all 3 a link to the help page, giving the user a larger target to click on than just the word "Help".

    The third paragraph talking about FTP also has combined within it discussion of subscribing to a mailing list/newsletter. Different concepts should be visually separated.

    And lastly, there's way way way too much text at the top of every Etext that has nothing to do with what the user is attempting to read. Learn from the GNU project - one simple paragraph with the basic facts, and a pointer to a web page where they can read more. This solves another problem for you - if you have to change that text, you only have to change 1 web page, not the tops of 10,000 documents.

    SUMMARY
    Make the user's life quicker & easier, and you will get returning visitors. The way your web page looks today, I don't want to come back.

    I hope the PG team accepts these comments as constructive criticism, because I strongly believe in the purpose and goals of PG. Keep up the good work!

  86. Hold on. Its coming! by jamesl · · Score: 1

    Changes are coming and PG is well positioned to take advantage of them.

    Electronic books (Rocket et al) will eventually make it into the mainstream. I saw one for sale online @ $199 -- $100 less than two months ago. The Peanut Reader for Palm works surprisingly well, and there are lots of Palms out there already.

    Popular press prices WILL come down as printing and distribution expenses are eliminated and demand rises. When people are used to downloading books to portable electronic devices and paying very little, PG popularity will take off. Hold on, it's coming!

    I suggest charging a nominal fee, say $1 for each download, and using the revenue to promote (maybe even advertise) PG. This means actually paying someone to get on TV, talk to newspaper editors, beg for free banner space, etc.

    Free is good, if you know about it. $1 ain't bad if that's what it takes to find out about it.

  87. Gutenberg! by teleny · · Score: 1
    I've been a PG fan from about 7 years ago, when I still thought of using Gopher as "hacking". (The fact that I was doing it through a semilegal tapline into Yale U. is probably part of it.) I remember the pleasure of reading "Alice in Wonderland" (I wanted that first book I read off a screen to COUNT) sitting upright in bed with a PowerBook perched on my lap, and wires strung all around.

    I never thought of it as needing any more publicity than it gets (after all, I know about it). I spent a summer doing RC5 when Bovine pledged $8K to the Project, and I've often toyed with the notion of sending some of my favorite old novels there ("Three Weeks", by Elinor Glyn; the uncut "Pelham", by Bulwer-Lytton). I've often used it to make gift versions of "Agrippa", by William Gibson, for friends, as well as my BBS Housewarming Kit (Agrippa, The Hacker Crackdown, and a Blue Box plan), which I've used to get file points for boardz all over the local dialing area and beyond.

    So, I guess you might say I'm a fan. I think that what's necessary is a bit of pizzazz. It was OK when it was one of the only things out there, but nowadays, it's not at all thrilling for people who expect anything on the Web to jump, flash, and leap off the screen. It needs to play up the fact that it's not just classic novels: there are movies in there, music, pictures...there are quite a few children's books, a truly classic cookbook or two...a treasury of literature on every reading level for people who might want to learn English, or empower themselves with a knowledge of Western Culture in general. (I don't think that it's bigoted to point out that it's a lot more empowering to learn a foreign culture associated with technology than it is to try to reinvent the wheel as it pertains to one's own. The West has had to do this several times.) Perhaps a small M$ Bookshelf-like selection included with Linux distros? This is one of the most inspiring things to be put out on the web: I'm sad that it doesn't get eyeballs.

    --
    teleny, friend of cats.
  88. ASCII vs HTML (XML, RTF, LaTeX, ...) by Michael+Woodhams · · Score: 1
    There is some debate on the relative merits of ASCII (as used by PG) and non-ASCII (particularly HTML) formats. ASCII will always be readable and can easily be used as a base for more elaborate formats. HTML allows formatting, convenient viewing, and features such as links and illustrations.

    It seems to me there is a technical solution to give us the best of both worlds: Define an authorised minimal subset of HTML to use. Write a program to automatically strip this simple HTML from the texts to yield ASCII. Write a program to do a 'diff' between an ASCII and HTML version of a text, and update the HTML version with modifications made to the ASCII. Write programs to convert minimal HTML to XML, LaTeX or whatever your favourite format is.

    With these tools, you can easily maintain the HTML and ASCII versions synchronised, and add other formats as required with other conversion programs.
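
    The first of those tools (stripping a minimal HTML subset down to ASCII) might look roughly like the sketch below. The subset chosen here (p, h1, h2, i/em, b/strong) and the _underscore_ and *asterisk* conventions for emphasis are assumptions made for illustration, not anything PG has defined; the names MinimalHtmlToAscii and html_to_ascii are likewise hypothetical.

```python
from html.parser import HTMLParser
import textwrap

# Hypothetical "authorised minimal subset" (an assumption for this sketch).
BLOCK_TAGS = {"p", "h1", "h2"}
INLINE_MARKS = {"i": "_", "em": "_", "b": "*", "strong": "*"}


class MinimalHtmlToAscii(HTMLParser):
    """Collects the text of each block element as one plain-text paragraph."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # finished paragraphs
        self.current = []  # text pieces of the paragraph being built

    def _flush(self):
        # Collapse runs of whitespace and store the paragraph if non-empty.
        text = " ".join("".join(self.current).split())
        if text:
            self.blocks.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()
        elif tag in INLINE_MARKS:
            self.current.append(INLINE_MARKS[tag])
        else:
            # Anything outside the subset is rejected rather than guessed at.
            raise ValueError(f"tag <{tag}> is outside the minimal subset")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()
        elif tag in INLINE_MARKS:
            self.current.append(INLINE_MARKS[tag])

    def handle_data(self, data):
        self.current.append(data)


def html_to_ascii(html, width=70):
    """Return plain text: one wrapped paragraph per block, blank line between."""
    parser = MinimalHtmlToAscii()
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n\n".join(textwrap.fill(block, width) for block in parser.blocks)
```

    Rejecting unknown tags keeps the subset honest: a text that sneaks in a table or an image fails loudly instead of producing a silently mangled ASCII edition.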

    One problem with this approach is "what about when the language you wrote the programs in becomes obsolete". I have several answers to this: First of all, if the minimal formatting is not too complex, neither will the programs be - they can simply be rewritten. Secondly, FORTRAN and COBOL compilers are still available - once popular languages last forever. In 50 years, Perl and C++ will still be compilable.

    --
    Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
  89. ANTHOLOGY.ORG - Sugestions? by turg · · Score: 2
    I don't know why they don't use the name (names) they own.

    Anyway, the reason I'm posting is that I've registered the name anthology.org (not yet active) to provide a directory to etexts from various collections/projects. Sorta like a card catalog (or maybe the interlibrary loan database? whatever) -- probably a yahoo-style navigation. I'm sorta surprised that no-one else has done this (with high visibility, anyway) -- does anyone else think this would be useful?
    -
    <SIG>
    "I am not trying to prove that I am right... I am only trying to find out whether." -Bertolt Brecht

    --
    <sig>Guvf vf abg n frperg zrffntr
  90. I USE my PalmIII to read e-books by Patola · · Score: 1
    I currently use my Palm III handheld to read a lot of books. From Principia Discordia to Aristotle's Rhetoric, and also for reference (I have a lot of awk, vi, emacs etc. reference books on my Palm), I regard them as being useful indeed. There are lots of sites for electronic texts converted to the common "doc" format (it is not MS Word DOC), proving that a lot of Palm users actually use this handheld device as an e-book reader.


    Consider this: it may not be ideal, but it is suitable. Reading on a small surface is not so bad. Remember those little bibles that some religions distribute freely? It's the same feeling.


    And you can even read in the dark, using the backlight. 3COM rules; when everybody has a Palm nobody will ever need books (even with pictures, I mean, my Principia Discordia has all the pictures of the original book).


    Patola

    --
    Patola (Claudio Sampaio)
    Unix System Administrator
  91. Why the argument for "Vanilla ASCII" is crap by Tetsujin · · Score: 1

    Gutenberg is cool, one of these days I'm gonna buy their CD so I don't have to download everything. But I have one serious beef with them. Everything's in plain ASCII, formatted for an 80 column screen or line printer. (Some of it is even double-spaced). This is nice if you wanna read on a VT100 or something, but what if you want to read on a Palm Pilot, or make some nice printouts, or just on-screen with a nice (proportional-spaced) font, huh? Then you've got a jumble.

    I can understand the need to have ASCII versions -available- but here's the problem with the Gutenberg Project's assertion that the ASCII version is enough - it's -easy- to convert non-ASCII (HTML, SGML, LaTeX, etc.) into ASCII - it's something that can be completely automated. Going the other way can't be automated so easily - pretty much the best a person can do is mark it up while they're reading it.

    Of course, it's also dumb to make everyone out there mark up their own copies by hand if they want it in another format. Maybe Proj. Gutenberg could start making all their new E-Texts in SGML or something, so all their hard work doesn't look like shit on-screen.

    ---GEC
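
    To make the asymmetry concrete, here is a minimal sketch of the "other way": un-wrapping 80-column text so a reader program can re-flow it. It assumes paragraphs are separated by blank lines, which is only a heuristic; the function name unwrap_paragraphs is made up for this example. A double-spaced etext, verse, or a table would defeat it, which is the point being made above.

```python
import re


def unwrap_paragraphs(text):
    """Join hard-wrapped lines into one long line per paragraph.

    Assumption: one or more blank lines separate paragraphs. Verse,
    tables and double-spaced files would need special handling.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Collapse every internal line break and run of spaces to one space.
    return [" ".join(paragraph.split()) for paragraph in paragraphs]
```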

    --
    Bow-ties are cool.
  92. Project Gutenberg "forces" no one by anonymous+cowerd · · Score: 1

    > A more significant beef about PG is that it is
    > centralized and dominated by one person, who does
    > not share the philosophy of Open Source production
    > that most of us do. Instead of forcing individuals to
    > contribute to this project, why not help them set up their
    > own web sites to publish their own works, or other
    > works they have scanned?

    Read the PG copyright notice, which is at the top of most PG text files. Michael Hart does not force anybody who contributes to Project Gutenberg to post his work on only his one site. PG's copyright terms are actually more liberal than the GPL. I have beta copies (not completely proofread yet) of my Project Gutenberg transcriptions on my own site:

    http://www.concentric.net/~Wkiernan/text/Gutenberg_at_Frownland.html

    Contributors (and anyone else, too) are allowed, by the terms of the PG copyright, to redistribute PG works on one of two conditions: either they strip off the PG copyright header, in which case they can reprint the work with no further restrictions; or otherwise, if they leave the PG copyright header on, they must contribute 20% of the profits to Project Gutenberg. Certainly that's not requiring too much of a redistributor, to ask him to strip off the PG copyright header from the top of the text file, before he uses it any way he pleases.

    And for those HTML fans who criticize PG's least-common-denominator ASCII format, many of the works in the PG library are available in both ASCII and HTML format. While I myself prefer plain text, after I finish transcribing my next two books, I'm going to fall back and make HTML versions of all the books I've done thus far. (It's going to be a while; the next two books amount to a little over 3000 pages. At a hundred pages per weekend, I'm "booked" until about next June.)

    Yours WDK - WKiernan@concentric.net

  93. Why XML? by anonymous+cowerd · · Score: 1

    I know nothing about XML. What software reads XML files? Why should I use XML instead of HTML? I have several thousand pages worth of books which I have transcribed or intend to transcribe into ASCII for Project Gutenberg, and I was planning on making HTML versions of them all. But if, as you say, XML is so much better, maybe I want to make XML versions instead? Also, how does XML handle text in foreign languages, with letters with accents, text in the Greek alphabet, and all the rest of that? I would very much appreciate the input of Slashdot readers on this question. If any of you can offer suggestions or pointers, please email me at Wkiernan@concentric.net.

    Yours WDK - WKiernan@concentric.net

  94. Re:GPL != Holy_Grail by AxelBoldt · · Score: 1
    > If somebody works and produces a product that you use, enjoy, whatever, shell out a few bucks, geez. Making all books (or software, for that matter) free doesn't provide a lot of incentive to people.

    People who write books or software generally don't need a monetary incentive; they do it because they love doing it. Certainly the best people do it because they love it, just like in almost any field. The most compelling incentive is always that you are proud of what you are creating, not some meager pay check.

    > Do you really want the trust-fund, independently wealthy types as our only writers?

    No. Do you mainly see the trust-fund, independently wealthy types writing free software? No? Why then would it be different for free literature? People, rich or poor, long to write, you don't have to pay them. Just like people love to program, even without payment.

  95. Principia without formatting? by Skid · · Score: 1

    It would seem to me a version of the Principia without formatting that at least approximated the paper book would be sort of... well, not like the Principia.

    Personally, I prefer fnord.org's scanned-in copy for an online version.

    --
    These are *MY* opinions.
    They will not be *YOUR* opinions until the Orbital Mind Control Lasers are operati
  96. Gutenberg needs organisation by Ektanoor · · Score: 2

    Since I first got on the Web in '94, Gutenberg has been one of the most important sites I have found there. What it has done is of fundamental importance. Let us note that some of the texts are considered world literature. Besides, Project Gutenberg lets us reach literature that one can hardly find today.

    However, I am very critical of Project Gutenberg on another point. It is good to be conservative, especially considering the nature of this project. However, it is far too conservative.

    Project Gutenberg has always suffered from a very primitive search interface, or from keeping an obsolete interface for too long. The problem is that searching for books by author or title is sometimes not enough. There are a lot of other search classifications and tools. One of the most important is searching for specific content, much like Altavista or Excite do. If Project Gutenberg wants to deliver availability, then it needs to work on this.
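
    As a rough illustration of what searching inside the texts involves, here is a toy inverted index: a map from each word to the set of titles containing it. This is only a sketch under simple assumptions (lowercased English words, every query word must match); build_index and search are hypothetical names, and real engines add ranking, stemming and phrase search on top.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z']+")


def build_index(texts):
    """Map each word to the set of titles whose body contains it.

    `texts` is a dict of {title: full text}.
    """
    index = defaultdict(set)
    for title, body in texts.items():
        for word in WORD.findall(body.lower()):
            index[word].add(title)
    return index


def search(index, query):
    """Return the titles that contain every word of the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```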

    The other point is the cumbersome nature of the texts. I agree that it was rather dangerous to choose a text format that could cause incompatibility in the future. That caution was good in 1994. Today HTML is a standard, SGML is a standard, XML is new but it is also a standard, TeX may not be so popular but it is also a standard, and PDF may carry a commercial tone but it is a standard anyway. And there are tons of tools for converting from one standard to another. So it is time to rethink the standards.

    Another point is organisation. Project Gutenberg was and is badly organised. This may look like a sin to some, but I really think that a little bit of marketing would help the project a lot. And maybe a little commercial flavour would help even more, much like what RedHat is to Linux. Gutenberg needs a face. It needs a design. It needs to deliver something to people. Frankly, no one is born with the name Oesopus burned in big letters into the brain.

    I don't claim that Gutenberg should become another Amazon. But I believe that by making literature a free tool, by delivering an infrastructure of a very GPL'ed nature, and by building a commercial basis for more complex tasks and material support, Gutenberg may become another lighthouse of the Web.

    Sincerely, it would be sad to see Project Gutenberg closing its doors. Yes, we hackers may help by making tools and giving Project Gutenberg some design and technical support. Humanists may help with translation, classification and analysis. We may try to push a marketing campaign for it with our own resources. But this will not save the project if there is no organisation; if there is no mechanism to show people that the world does not end with The Matrix and Coca-Cola; and if we don't take care to feed the project with the material resources it may need for its future.

  97. Okay, anybody who wants to help PG... by Glenn+Hauman · · Score: 1

    ...I've been wanting to do this for a while, but have been short-handed. But I'm sure there's somebody reading this who can help me.

    BiblioBytes (http://www.bb.com) gives away books on the net. We sell ads on the pages, and authors share in the ad revenue... which means authors of more recent works also put their works online, such as fiction from Neil Gaiman, Peter David, Nancy Kress, Barry Longyear, Ron Goulart and others, and nonfiction like The Temp Survival Guide by Brian Hassett.

    PG grants reproduction rights to anybody who wants to distribute their books commercially, as long as PG gets 20% of the revenue-- but I'm shorthanded and haven't had the resources to do the conversions necessary on books in the public domain when there are living authors and expiring contracts to take care of first.

    So I'm appealing to you folks. Anybody who wants to mark up PG texts to our stylesheet (basically HTML with certain conventions) is invited to contact me to work out the details. In compensation, I'm willing to give an additional 15% of the revenue generated to whoever does the markup and formatting.

    The books will then be in a web friendly format, and PG will get funding from it. And so will you.

    Send me email at comment@bb.com if you're interested.

    --
    Best-- Glenn Hauman, BiblioBytes

    http://www.bb.com

  98. Bartleby.com by stevevl · · Score: 1

    Please also see the Bartleby Library site, which has many full-text books. Concentrations include reference (Bartlett's Quotations, Emily Post's Etiquette, Strunk's Elements of Style, Fowler's The King's English), poetry anthologies (over 1800 poems in six classic collections), and Theodore Roosevelt (8 books, including Autobiography).

    http://www.bartleby.com

    Sincerely,
    Steven van Leeuwen
    Editor and Publisher
    Bartleby Library