Digital Future of the Library of Congress
lesinator writes "On Monday the 28th the US Library of Congress is holding the eighth lecture in its series on Managing Knowledge and Creativity in a Digital Context. Previous speakers include David Weinberger on blogging, Brewster Kahle, founding member of archive.org and the Wayback Machine, and Lawrence Lessig on intellectual property and the Creative Commons. After the lecture, questions will be taken from the audience and the internet. C-SPAN will be broadcasting the lecture live at 6:30 PM EST, and also has archives of previous lectures. Audio archives of previous lectures are available at Audible.com in the Selected Free Media section."
Maybe the fine folks at Audible.com might consider making their audio clips available by means other than the Real or MS media players?
I Want To Believe
Here are some interesting talks they might give:
i) What if the Apostles had had technological means to prevent the reproduction of the New Testament?
ii) Would our culture be diminished if the people who rediscovered Beowulf had been unable to decrypt the manuscript?
iii) Is the continual repetition and reworking of myth and fable through the Oral Tradition disrespectful of the content creators who first recorded these stories?
Athletic Scholarships to universities make as much sense as academic scholarships to sports teams.
While it's an interesting question, it really depends on how you want to store the contents of each book.
Would you store each page of each book as an image? As flat ASCII text (except for pictures and diagrams, of course!)? What kind of indexing would you do? Basic indexing of book names? Full-text indexing of the contents? All that storage adds up!
In summary, the Library of Congress (depending on the method used) could probably fit into something ranging from a couple of gigabytes to a couple of petabytes.
Online Starcraft RPG? At
Dietary fiber is like asynchronous IO-- Non-blocking!
It is amusing that this story follows directly after a story about Microsoft proprietary file formats.
The Library of Congress should insist that all 'publications' be submitted to it in open formats. What good is it if they have something on file that nobody can read! The extreme is that they have to have a licensed copy of every piece of software that ever created a file. If all the formats have to be open then at least historians can cobble together something that can read a file of interest.
With the IP laws as stupid as they are now, we run the real risk of losing the record of our age.
C-SPAN is clearly concerned with ratings. Didn't you see the stuff they pulled out for Sweeps week? I think it was something like "old guy reading boring text to empty room."
They might even be able to generate revenue by having the ASCII text freely available and searchable, while the images would cost money. That way folks just interested in the text can find it easily, while scholars and others who need to see the source material can have access at a moderate price.
If brevity is the soul of wit, then how does one explain Twitter?
"Managing Knowledge and Creativity with DRM"...
Sponsored by Apple and Microsoft!
I can never understand why there isn't more acknowledgment of our debt to Project Gutenberg on these issues.
Michael Hart was digitizing books before digitizing books was cool, as far back as 1971, and the Project's efforts have been hugely successful on very little money. Nevertheless, I rarely see any official or media acknowledgment of the Project's efforts. If anyone should be on that panel for their ability to give advice from practical experience and performance in this field, while on a shoestring budget, it would be Hart!
OoO
Please do not publish outside of
With the current wave of outsourcing, privatization, and government use of commercial contractors, I wonder if Amazon or Google don't have a major role to play in the process of cataloging/archiving/serving digital content in the future.
Although LOC could never be replaced by a Google or Amazon, these private companies could provide services that augment or reduce the cost of LOC-like services. For example, if Amazon scans a book, why should LOC scan it too?
Two wrongs don't make a right, but three lefts do.
Also, you would generally split the load between 4-6 of these scanners for a job this big. The software is automated, and will OCR/Convert/Archive the file in one step.
As a general rule, you can fit 10,000 b/w text pages in 1GB of storage.
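To put rough numbers on that rule of thumb, here is a back-of-the-envelope sketch. The 10,000 pages per GB figure and the 4-6 scanner split come from the comment above; the book count and pages-per-book values are made-up placeholders, not real Library of Congress figures, and the function names are just for illustration.

```python
# Back-of-the-envelope storage estimate for scanned b/w text pages.
# PAGES_PER_GB is the rule of thumb quoted in the comment above.
PAGES_PER_GB = 10_000


def storage_gb(pages: int, pages_per_gb: int = PAGES_PER_GB) -> float:
    """Approximate storage in GB needed for the given number of pages."""
    return pages / pages_per_gb


def pages_per_scanner(total_pages: int, scanners: int) -> int:
    """Pages each scanner handles if the load is split evenly (rounded up)."""
    return -(-total_pages // scanners)  # ceiling division


if __name__ == "__main__":
    # Hypothetical collection size, chosen only to make the arithmetic concrete.
    books = 1_000_000
    pages_per_book = 300
    total_pages = books * pages_per_book

    gb = storage_gb(total_pages)
    print(f"{total_pages:,} pages is roughly {gb:,.0f} GB ({gb / 1000:,.0f} TB)")

    # The comment suggests splitting a big job across 4-6 scanners.
    for scanners in (4, 5, 6):
        share = pages_per_scanner(total_pages, scanners)
        print(f"{scanners} scanners: about {share:,} pages each")
```

Swap in whatever collection size you believe; the point is only that the estimate scales linearly with page count, so the answer depends almost entirely on the assumptions you feed it.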
DAMN YOU OCTODOG! DAMN YOU TO HELL!
The LOC has announced that they are accepting volunteers to digitize texts. Their first volunteer is Earl the night janitor, who has been busily keying in the last 20 years of New York City phone books. He hopes to move on to Chicago soon.
I'm not good in groups. It's difficult to work in a group when you're omnipotent. - Q
Are they requiring publishers to submit PDF files for new entries yet? Or files in another open format? Man, I'd hate to see taxpayers' money wasted on doing work that they could avoid doing by simply mandating PDF submissions from publishers.
I can see that some publishers may just say, "oh, my book isn't gonna be in libraries if I don't submit PDF, so much the better, I'll sell more copies". I hope these fellas realize how badly they're shooting themselves in the foot.