Project Gutenberg Made Accessible
scishop writes "Mazarin is an open-source interface to Project Gutenberg's library. Mazarin increases the accessibility of Gutenberg's 10,000+ books as it formats the books for HTML display -- providing paginations in addition to generating table of contents and other advanced markup features -- along with enabling users to carry out full-text searches on the entire library."
I can not test the claim of all 10k works, but I tested what I thought would be most likely to be left out, and I found that they were there.
I tested Martin Luther.
(If it were not for the printing press, the Reformation would not have been as successful as it was.)
But did they have to make the tutorial presentation a fullscreen flash file?
Most of PG's more well-known texts are already formatted as HTML.
I searched on "oil" and came up with numerous passages from various versions of the Bible, and a few recipes from an Italian cookbook. Attempted to search again, but amazingly the site fails to respond...
Nothing but the finest in meaningless drivel
Interesting idea, I can't get to the website but a feature I'd want is the content shared P2P so you don't have to rely on a central server for the content.
;).
A central webpage index could just have ed2k links to the files: sharereactor for books. When they update the book they release a new hash-link and the file onto the network.
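A rough sketch of that index idea, in case it helps. It assumes SHA-256 as a stand-in for ed2k's own hash, and the `book://` link scheme and the `publish` helper are made up for illustration, not any real network's format. The point is only that a corrected file gets a new hash, so the index entry changes with it.

```python
import hashlib

def content_link(title: str, data: bytes) -> str:
    # Hypothetical link format: title, size in bytes, content hash.
    # SHA-256 is an assumption here; it is not the hash ed2k uses.
    digest = hashlib.sha256(data).hexdigest()
    return f"book://{title}|{len(data)}|{digest}"

def publish(index: dict, title: str, data: bytes) -> str:
    # Releasing an updated book replaces the old link in the central index.
    link = content_link(title, data)
    index[title] = link
    return link
```

The index page itself stays tiny; only the links live on the central server, and the files come from whoever is sharing them.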
It being P2P, it could open it up to more than just public domain books too.
Hmm, nicely formatted error messages. Does anyone know what this is? I'm assuming it's a mod_perl handler of some sort.
-- Sorry, I can't think of anything funny to say here.
10,000+ books. Right, so I've got to read all of them before I can post a comment?
Oh wait, this is Slashdot.
Where's the Kaboom?
There's supposed to be an Earth-shattering Kaboom.
I would say not; it was needed. Luther saw the abuses of the Church in Rome and tried to correct them. He never wanted to break from the church, and in fact the break did not officially happen until 200 years after Luther died, when Rome said there are two churches, "them and us."
A guess would be that the script is accessing the database remotely. Thus, if the server is getting slashdotted, there is no way it can talk to the remote database. Instead of die, they should have sent a small text message of "Remote database unreachable."
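Something along these lines is what I mean, as a minimal sketch. I have no idea what the site actually runs, so `connect` and `db.search` are hypothetical placeholders for whatever the real script calls.

```python
def fetch_page(query, connect):
    """Return search results, or a short notice if the database is unreachable."""
    try:
        db = connect()           # hypothetical: opens the remote database connection
        return db.search(query)  # hypothetical: runs the full-text search
    except (ConnectionError, TimeoutError):
        # Fail with a small, readable message instead of dying with a stack trace.
        return "Remote database unreachable. Please try again later."
```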
;)
Hindsight is 20/20.
In a place beyond time and space, in a land far better than this, look for me there...
This sounds like it just adds complexity and does not make gutenberg's data accessible.
There were several research projects for which I used pg as a corpus. However, pg's a terrible hassle for the first-time researcher, since the format of the introductory text ("we're gutenberg, here's the copyright, blah blah") is inconsistent.
You have to remove the introductory text to avoid bias in the corpus, however there are so many pathological special cases (different formats, spelling, languages, words used, punctuation, case) that it requires several hours of Perl coding to successfully strip the header text from 75% of the documents with >99% accuracy. Yuk.
If gutenberg is serious about making their work more accessible, they should think about the simple concern of ensuring consistency in the header text format.
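For anyone facing the same chore, here is a minimal sketch of the approach in Python rather than Perl. The two marker patterns are assumptions based on headers I have seen, and they will certainly miss some of the special cases. The function returns None when it finds no marker, so those files can be set aside for hand checking instead of being stripped wrongly.

```python
import re

# Assumed end-of-header markers; real PG headers vary and these will miss some.
HEADER_END = re.compile(
    r"^\*+\s*(START OF (THE|THIS) PROJECT GUTENBERG|END\s*\*?\s*THE SMALL PRINT)",
    re.IGNORECASE,
)

def strip_header(text: str, search_limit: int = 400):
    """Return the text after the last header marker, or None if none is found."""
    lines = text.splitlines()
    cut = None
    # Only scan the top of the file so a marker-like line deep in the book is ignored.
    for i, line in enumerate(lines[:search_limit]):
        if HEADER_END.match(line):
            cut = i
    if cut is None:
        return None
    return "\n".join(lines[cut + 1:]).lstrip("\n")
```

Taking the last match within the scan window, not the first, is deliberate: some headers contain more than one marker-like line before the book starts.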
I think a lot of the unfortunate twists in European history are due to the Catholic church becoming so corrupt as to cause a reformation in the first place.
Anyone want to buy an indulgence?
since some seem to have trouble on the index page... here it is:
Project Gutenberg is the brainchild of Michael Hart, who in 1971 decided that it would be a really good idea if lots of famous and important texts were freely available to everyone in the world. Since then, he has been joined by hundreds of volunteers who share his vision.
Now, more than thirty years later, Project Gutenberg has the following figures (as of November 8th 2002): 203 New eBooks released during October 2002, 1975 New eBooks produced in 2002 (they were 1240 in 2001) for a total of 6267 Total Project Gutenberg eBooks. 119 eBooks have been posted so far by Project Gutenberg of Australia.
Click here for the full PG story and here for the latest News, and learn about the Stockholm Challenge Award recently won by Project Gutenberg in the category Culture.
The key link is the search page.
Do you need a website upgrade?
What's the best way to read online texts? There are a bunch of PG texts I might like to read, but reading them in a web browser, as a big text file, gets tiring after ten minutes or so. I'm not sure why I can read a book for hours but a screen for only minutes, but there you have it. I don't think that HTML will help this problem -- does anyone have recommendations for better ways to read these files?
I love sexy robot voice tutorials! mazarin tutorial
"If the facts don't fit the theory, change the facts." -Albert Einstein
Karma? There's a serial modder out there.
Bah. Posting HTML is so 1996. You can do so much more with these texts. One example is Open Source Shakespeare, which takes all of Shakespeare's texts, indexes them, presents them in an attractive manner, creates a concordance, provides a full-text search engine, organizes the lines by character, etc.
All of the texts are open source, and you can download the database and source code from the site, too. Check it out.
Monday May 24, @03:14PM : Project Gutenberg made accessible
Monday May 24, @03:15PM : Project Gutenberg made inaccessible
It was very convenient for the Roman Church to have a practical monopoly on what was widely acknowledged at the time to be the main source of information, the Holy Bible. When the printing press was invented, this diluted that monopoly, since then the ordinary people could afford their own copies of the Bible and became independent from the Church for information. Luther was one of the first to realize that, when he urged people to read the Bible. A consequence of that was that people learned to read. Until early in the 20th century, the literacy rate in countries which are mostly Lutheran, e.g. Scandinavian countries and parts of Germany, was much higher than in southern Europe, where people were mostly Catholic.
A modern analogy:
Catholic Church --> RIAA
Lutheranism --> P2P
"Project Gutenberg Made Accessible"
Oh, the irony that is slashdot.
The way to a man's heart is through the left ventricle
Information doesn't want anything. It merely is.
and what is wrong with monopoly? Uniformity breeds community.
Gotta turn a living you know...
http://www.gutenberg.net/etext04/awbv110.txt
there in HTML.
The first volume was converted to HTML by hand by someone else and to pdf, by machine, I think, whereas my site simply has the e-text:
http://rjs.org/gutenberg/Stevens_Thomas/
So an automated process would be a boon. What I'd really like to see is an OS text-to-voice reader program. I wrote a wxPython program to assist conversion from scanned text to PG format: http://rjs.org/gutenberg/OCR2Gutenberg/, but I have never been able to find a free set of spoken word wave files or speech library.
Ray
http://rjs.org/ - biking, astronomy, photography
Good grief. I think you should review your church history! At the time the Roman Catholic church was a massively corrupt bureaucracy that suppressed ordinary people, was largely usurped by those who wanted power, and didn't teach about God's grace to mankind. In fact, much of the doctrine taught was contrary to the gospels. Papal bull, anyone?
I take a different view: just imagine all the problems that we'd still be dealing with if the Reformation had never happened!
Wouldn't it be great if Google were involved in Gutenberg in a major way?
Quote:
...donating to the good cause. If you don't want to donate money, volunteer to proofread, or it might be worth it for writers out there to consider a notation in your will that will allow your works to pass either directly into the public domain, or, as i have been in contact with lawyers to discuss, simply passing the copyright of your own works on to project gutenberg. This allows them more work to publish, and if you're in a contract somewhere that allows for royalty collection, you can set it up so that those royalties switch to project gutenberg at the time of your death.
Now might also be a good time to contribute an hour a week to a literacy project, or to make a donation there. Adult literacy is a serious issue all over the world, and that includes right here in the states, where there really are bright people out there who could have better lives if they could read. I can't think of a more on-topic subject than project gutenberg to discuss adult literacy and the need for both literacy teaching and to support free literature for the masses such as this project provides.
Just my $0.02...
solemndragon
"I'd say 'Have a good time,' but arson is still illegal."
At the risk of pointing out the obvious, Michael Hart's decision to make the basic format of PG texts "plain vanilla ASCII" has resulted in texts that are highly accessible by any meaning I can think of for that word. They are also compact, platform-agnostic, and durable. Texts contributed in the 1980s are fully usable today.
While there have been constant complaints about PG using the "wrong" format, opinions on the "right" format have been the flavor-of-the-month (or at least several flavors per decade). Had PG decided to use a "better" format, all of their volunteer time would probably have been taken up converting (say) WordPerfect to RTF to HTML to SGML to XML, leaving relatively little time to digitize and proofread texts.
"How to Do Nothing," kids activities, back in print!
It's great - I now have that on my laptop hard drive, mountable by Alcohol, so I'll never be short of anything to read, especially when the web's not available...
I can't find the torrent file I got it through, but if it helps the filename is pgdvd.iso and the size is 4,139,646,976 bytes.
When the printing press was invented, this diluted that monopoly, since then the ordinary people could afford their own copies of the Bible and became independent from the Church for information.
Not only that, but Luther translated the Bible into the common tongue. He used to hang out in pubs and the market and make notes of how people really spoke so that his translation would reflect day-to-day usage. The result - which is solidly argued in The Sovereign Individual and elsewhere - is that the common man realised once he read the Bible for himself that he didn't have to prop up the corrupt and extravagant monstrosity that was Rome then - economically or otherwise.
Catholic Church --> RIAA
The modern nation state is not a bad analogy either - extortion of taxes by force and the threat of jail, mean, grasping and extravagant - and totally unnecessary for true free enterprise. But that's a whole other discussion...
--- Hot Shot City is particularly good.
How do you know that? Apart from the religious dogma that postulates the existence of a homunculus called the "soul", we do not know much about how consciousness arises. What we do know is that information doesn't exist in a vacuum. Information needs a physical medium to exist. Check "An Introduction to Information Theory", by John R. Pierce, Dover Publications, ISBN 0-486-24061-4, chapter 10 - "Information Theory and Physics" for a basic explanation why. Now, assuming a certain body of information and a system to handle that information, we have no idea if a sufficiently large amount of information with the right manipulation system will have consciousness. Sometime in the next few decades we will have machines with the same complexity and information-handling power as a human brain, then perhaps we will be able to create a conscious machine with free-will.
Anyhow, that's not the point. "Information wants to be free" is just an easier way to say that human beings have an urge to share whatever information they have with other humans. History has shown that, given efficient communication media, it's very difficult to keep information secret.
and what is wrong with monopoly?
Intrinsically, nothing. Some public utilities are natural monopolies; it wouldn't be practical to run several different water, gas, and electricity supplies to each house, for instance. Sometimes a monopoly is useful in developing a new technology. The Bell Telephone Co., in the first half of the 20th century, did create a relatively cheap and efficient phone system using a monopoly. Microsoft created a widely used personal computer standard using a monopoly. There are some circumstances under which a new technology spreads faster if a monopoly exists. But a monopoly also induces slackness. Monopoly holders will not be eager to try harder. When growth starts levelling off, a monopoly usually stagnates. That was bad for Christianity, it was bad for the telephone system, it was bad for personal computers... may I generalize?
hell the Gutenberg Project is faster than /. for news.
Well, in my conversations with information I have determined that while some information does indeed want to be free, other information does not want to rock the boat. Some information simply wants to be left alone. There are also some sub-groups of information that are blissfully ignorant of their situation and do not realize that they are not already free.
I have not had the time to speak with all information, so this is merely anecdotal evidence of the diversity of opinion among informations.
--
As a matter of fact, I am a lawyer. But I play an actor on TV.
Information wants you to give me a dollar.
If I had a diet of worms, I'd be abrasive and confrontational as well.
This message has been scanned for memes and dangerous content by MindScanner, and is believed to be unclean.
I've created an RSS feed from the Project Gutenberg list of etexts. The RSS feed contains titles, authors, descriptions and links to the relevant page or file on http://www.gutenberg.net/
PGDB.rss PGDB.rss.gz
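For anyone who wants to roll their own, a feed like this can be sketched with the Python standard library. This is not the script behind the files above; the channel text and the dictionary keys (`title`, `author`, `description`, `link`) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

def build_rss(etexts, site="http://www.gutenberg.net/"):
    """Build a minimal RSS 2.0 document from a list of etext dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Channel-level metadata; the wording here is a placeholder.
    ET.SubElement(channel, "title").text = "Project Gutenberg etexts"
    ET.SubElement(channel, "link").text = site
    ET.SubElement(channel, "description").text = "Titles from the Project Gutenberg list"
    for e in etexts:
        # One <item> per etext: title plus author, link, description.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = f'{e["title"]}, by {e["author"]}'
        ET.SubElement(item, "link").text = e["link"]
        ET.SubElement(item, "description").text = e["description"]
    return ET.tostring(rss, encoding="unicode")
```

ElementTree escapes characters such as & and < in titles and descriptions on output, which saves the usual hand-escaping bugs.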
Quite to the contrary, these books were added by Rome at Trento. Until then they were usually copied along with the Bible without being considered part of the Canon, just like the Shepherd of Hermas before the Montanist heresy.
It was only when Luther decided to have them printed apart from the Bible that Rome decided to try to accuse him of taking them out of where they never belonged...
You misquote, actually "you are Peter, and upon this rock I will build my church", Mt XVI:18 RSV, and the unanimous consent of the Fathers of the primitive churches is that the rock wasn't Peter, but his confession.
Again you misquote: "I will give you the keys of the kingdom of heaven, and whatever you bind on earth shall be bound in heaven", Mt XVI:19 RSV. This is spoken of the church which started with Peter and the Apostles, and it goes without saying that an institution having for its head a man instead of Christ ceases to be a legitimate church.
In fact, the analogy of the keys relates to the custom of giving the keys of a city to the person appointed by the king to have authority there. The authority is taken by the king if it is not duly used.
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin