Slashdot Mirror


Digital Future of the Library of Congress

lesinator writes "On Monday the 28th the US Library of Congress is holding the eighth lecture in its series on Managing Knowledge and Creativity in a Digital Context. Previous speakers include David Weinberger on blogging, Brewster Kahle - founding member of archive.org and the Wayback Machine, and Lawrence Lessig on intellectual property and the Creative Commons. After the lecture questions will be taken from the audience and the internet. C-Span will be broadcasting the lecture live at 6:30 PM EST, and also has archives of previous lectures. Audio archives of previous lectures are available at Audible.com in the Selected Free Media section."

11 of 141 comments

  1. Re:At last! by Shadow+Wrought · · Score: 4, Interesting
    Well, I would think that they would have to start with an image first. Once they OCR'd it and generated ASCII text files, they could save a tremendous amount of space by simply deleting the images. However, after that much effort in imaging all those pages, I just can't see them doing that. The best bet is probably two databases, one of ASCII text and one of images.

    They might even be able to generate revenue by having the ASCII text freely available and searchable, while the images would cost money. That way folks just interested in the text can find it easily, while scholars and others who need to see the source material can have access at a moderate price.

    --
    If brevity is the soul of wit, then how does one explain Twitter?
  2. Re:Nice, but how long? by yuriismaster · · Score: 3, Interesting

    Well, I would imagine that unless they have a massive staff and many OCR scanners or automation with REALLY good OCR, this may take a LOONNNG time.

    I'm not quite sure about the length of a BLOC, but this is a job for not-quite-manual labor. Each book requires a simple task: Scan page 1, flip page, scan page 2, page 3, flip, ad infinitum.

    One way to save on time would be to contact the publishers of any book made after 1985-ish, where you can get electronic copies from the author. Some older books may have been already digitized, but it's still going to take more than 25 years unless there's a massive army working on this.

  3. Re:At last! by WillAdams · · Score: 2, Interesting

    There's a cue for a question I've been wondering about for a while.

    What was the first reference / usage of ``LoC'' as a unit of knowledge measurement?

    The first time I recall seeing it was in Michael Gear's novels, _The Artifact_ if memory serves, ~1976.

    Anyone have an earlier instance?

    William

    --
    Sphinx of black quartz, judge my vow.
  4. Hello, Project Gutenberg?!? by Infosquawk · · Score: 5, Interesting

    I can never understand why there isn't more acknowledgment of our debt to Project Gutenberg on these issues.

    Michael Hart was digitizing books before digitizing books was cool, as far back as 1971, and the Project's efforts have been hugely successful on very little money. Nevertheless, I rarely see any official or media acknowledgment of the Project's efforts. If anyone should be on that panel for their ability to give advice from practical experience and performance in this field, while on a shoestring budget, it would be Hart!

    --


    OoO

    Please do not publish outside of /.
  5. Re:Outsource parts of LOC to Google or Amazon? by HeedlessYouth · · Score: 2, Interesting

    You mean like this?

  6. Publication of New Testament by dpilot · · Score: 3, Interesting

    Authorship of the New Testament is not a simple question at all. First off, the Apostles didn't sit down and start collecting the New Testament. That was done hundreds of years later by some chaps in Rome or Turkey who also had political axes to grind. Every few decades or centuries, there's also Yet Another Translation, and in the foreword they talk about the prayer, consideration, and attempts to divine the True Word of God that went into it. Common belief is that over the centuries there has been so much prayer, consideration, and attempts to divine the True Word of God that today's bibles MUST be correct. Yet in spite of all that, I have this feeling that precedent is even stronger in the Bible than in the US legal system, and that we're still carrying the weight of perhaps improper decisions made over a thousand years ago, plus trying to justify them.

    Then you also get to the issue of what is and isn't in the Bible. Consider "The suppressed Gospels and Epistles of the original New Testament of Jesus the Christ, Complete" http://www.gutenberg.org/etext/6516 for an example. Would the Apostles have wanted them published, or not? What about "The Forgotten Books of Eden"? Or less/more controversial, how about Maccabees, Sirach, Tobit, and company - the ones in the Catholic, but not the Protestant Bible? (Perhaps Maccabees is the most historically verifiable book IN the Bible, too.)

    By the way, most of the Bible ended up being written down much later - after even US copyrights would have expired. Good thing Steamboat Willie doesn't date back to BC.

    --
    The living have better things to do than to continue hating the dead.
  7. What about a backup copy? by voss · · Score: 3, Interesting

    It would seem if the LOC is going to have X number of Petabytes on computers...why not have a second copy stored AWAY from DC. If something were to happen to DC at least we would have backup copies of everything...and we probably should have a separate backup location at a third site.

  8. Re:Some ideas by Anonymous Coward · · Score: 4, Interesting

    It's been continually re-written. For example, until 1954 Jesus never actually said "I am the Son of God"; when Pontius Pilate accused him of claiming to be the Jewish Messiah, he cryptically responded "It is you who said it." The fact Jesus didn't claim to be the Son of God but was surrounded by intense believers was one of the essential "mysteries" of Christianity that you were supposed to accept as a Christian.

    In 1954, the American "New International" edition just edited the trial dialog and "re-interpreted" "it is you who said it" into "I am the Son of God." I don't think the European and Catholic churches have edited that part yet.

  9. Small representations. by Grendel+Drago · · Score: 2, Interesting

    Have you ever seen someone's hundred and fifty page thesis, diagrams and all, fit onto a 3.5" floppy? People who wrote their theses in TeX or LaTeX, with a few postscript diagrams. I was impressed by how tiny the code for a real, well-produced book could be.

    'Course, the problem is that these representations work if you're entering in the content with that method in the first place.

    --grendel drago

    --
    Laws do not persuade just because they threaten. --Seneca
  10. Are they requiring publishers to submit PDF files? by melted · · Score: 4, Interesting

    Are they requiring publishers to submit PDF files for new entries yet? Or files in another open format? Man, I'd hate to see taxpayers' money wasted on doing work that they could avoid doing by simply mandating PDF submissions from publishers.

    I can see that some publishers may just say, "oh, my book isn't gonna be in libraries if I don't submit PDF, so much the better, I'll sell more copies". I hope these fellas realize how badly they're shooting themselves in the foot.

  11. Re:At last! by caseydk · · Score: 2, Interesting


    I was working on this project just a few years back (2001-2002).

    Our estimates projected that by 2005, it would take about 4 TB of digitization EACH day to keep pace.

    The first storage phase called for a 180 TB server.
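
    To put those two figures side by side, here is a rough back-of-the-envelope sketch in Python. It assumes a constant intake of 4 TB per day and that all 180 TB is usable, with no compression, redundancy, or growth in the daily rate; none of that is stated in the estimates above, and the function names are hypothetical, chosen only for illustration.

```python
def days_until_full(capacity_tb: float, intake_tb_per_day: float) -> float:
    """Days a store of capacity_tb lasts at a constant daily intake.

    Assumption: the whole capacity is usable and intake never changes.
    """
    if intake_tb_per_day <= 0:
        raise ValueError("intake must be positive")
    return capacity_tb / intake_tb_per_day


def yearly_intake_tb(intake_tb_per_day: float, days_per_year: int = 365) -> float:
    """Total intake over one year at a constant daily rate."""
    return intake_tb_per_day * days_per_year


if __name__ == "__main__":
    # Figures quoted in the comment: 180 TB first phase, 4 TB per day by 2005.
    print(days_until_full(180, 4))
    print(yearly_intake_tb(4))
```

    Under those assumptions the first phase would cover about a month and a half of intake at the 2005 rate, so it reads more like a starting point than a long-term store.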