Internet Archive Challenges Google
richards1052 writes "The Internet Archive, whose main claim to fame is the Wayback Machine, designed to archive the internet's web history, has created a new project: the Open Content Alliance. Its purpose is to open the nation's library collections to universal web search.
A number of major library systems, including the Boston Public Library and the Smithsonian, have refused to sign up with competing ventures by Microsoft and Google because they do not provide for universal access to digitized books. These commercial ventures prohibit competing search engines from accessing the books.
So far, 80 libraries and research institutions have signed on with Open Content Alliance. They must pay for the scanning of their books while Google and Microsoft offset that cost for their participating institutions."
I believe I've commented on something like this before. Might be a good idea to archive the books lest somewhere in the future we re-live something like the Spanish Inquisition, where important literature was lost. It's also making this society a bunch of couch potatoes. Whatever happened to walking into a quiet library, the smell of stale books, looking around at people? It's slowly being replaced by reading books online and hitting ctrl-w to close annoying popups while you read. Currently I have 30+ Cisco (CCIE/NP/IP/etc) books and each comes with its PDF. At first I thought, neat, I can read them on my laptop... Nowadays I find it's easier to just open the book; nothing like butchering my books up with highlighters... This world is coming to one where companies will be fighting to keep us locked in our houses. Call me a troll; it's just speculation.
Infiltrated dot Net
... but on a much larger scale?
Have EVDO, will travel.
The "Libraries Shun Deals to Place Books on Web" story in The New York Times covers the subject fairly well.
I buy a lot of books. I've got probably 10,000 or so. I wish I could search through them, sometimes for reference, sometimes because I read something that sounds familiar and I want to find where I first read it. I'd also like to read them on my PC sometimes, or even on my phone, like when I'm waiting for a while somewhere. And I'd like to copy/paste short passages from them into messages I send on the Internet.
If this project is really "open", can I have my own library scanned? How much does it cost? I own the rights to copy my own books for my own personal use. Does something make these other "official" libraries eligible to use their full rights to their content in a way that I cannot?
--
make install -not war
How many of these libraries think of Open Source and software platform choice? How many of them make sure their web sites are platform agnostic, equally accessible from all browsers? These people are willing to stand up and are willing to pay more to preserve their liberty. Hats off to them. But does this stand also extend to not having their documents locked down in a proprietary format encumbered with licenses and restrictions? I would very much like it if such ideas, like being independent of vendors, extended to Corporate America too.
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
In particular, if you accept that free exchange of ideas will promote intellectual progress, then is it not also reasonable to suggest that free exchange of artistic content will promote cultural progress? This is the central notion that Lawrence Lessig advocates: that overly restricting the distribution, reuse, and remixing of art and entertainment will inherently stifle culture. (Note that Lessig does not advocate wanton infringement nor abolition of copyright: merely a 'sane' balance between the rights of content creators and the rights of content users.)
With respect to this current initiative, it would appear that they intend to scan and index books that are oriented towards information, as well as those oriented towards entertainment. In my opinion, this is a good thing. People can learn and grow a great deal by having easier access to ideas, where "ideas" means both informational and artistic sources.
In case you missed this discussion back on October 2, Carnegie Mellon has a service which helps to better digitize these books. It's called reCAPTCHA, and it uses otherwise wasted human cycles to convert text that was hard for computers to OCR.
There's a story about this in The New York Times this morning (free reg required). It begins:
The opposition between the Open Content Alliance and Google may not be as great as it seems at first glance. From the NYT article:
It looks like Google will digitize the collection for free in exchange for exclusive rights to offer searches of the digital data, but the libraries don't give up the right to have someone else digitize the material again and do with it as they see fit. So they can go with Google for now if they want and the O.C.A. later as they have the resources. This seems pretty reasonable to me. I don't know what the deal Microsoft is offering looks like, but I wouldn't be surprised if it's much more restrictive.
"You call it a new way of thinking; I call it regression to ignorance!" -- Operation Ivy
Nobody expects the Spanish Inquisition.....
(Can't believe I'm the first one to respond with that. Of course, by now I'm probably not.)
Nobody expects the Spanish Inquisition!
(I couldn't bear to leave you hanging.)
Ben Hocking
Need a professional organizer?
I agree that a nice, hardbound book is, at the moment, more pleasant to read. However, technologies such as e-Ink and others that allow you to read something digitally without the eye-strain of using a backlit monitor are catching on. I think a few factors make digital copies more advantageous: cost of duplication, storage, protection from damage, and searchability.
Storage: I just moved, and I moved three bookcases full of books. That sucked. If those were all digital, I'd have hauled my computer from A to B and brought all of my books with me. In addition, I moved to a smaller house. Trying to find a place for my three bookcases of books has been impossible.
Cost of duplication: With digital copies, books can be distributed without the overhead costs of printing and shipping.
Protection from damage: Many of the books housed in libraries, particularly places like the Smithsonian, are no longer in print. If one is destroyed, regardless of whether it's an accident or a malicious act, it's gone. The library may be able to get another copy from a benevolent individual, or the last copy may have just been destroyed. With a digital copy, you can make back-ups of your back-ups... safeguarding the content of that book.
Searchability: This is my favorite... Who hasn't spent 30 minutes skimming a book trying to find THAT ONE PAGE!? It drives me nuts. Searching would make books sooo much more convenient.
You are using English. Please learn the difference between loose and lose; they're, there, and their; your and you're.
Yeah, conspiracy theories are usually quite easy to posit. That doesn't mean they have a bit of merit. Get over yourself—you're the majority, and you're not being persecuted in this country. (Yes, there are Christians being persecuted in countries where they're not the majority, and it is genuinely a travesty. Don't you dare try to use their suffering to perpetuate your persecution complex in this country.) That future you posit is actually less likely than Bush masterminding 9/11 (which he didn't).
Ben Hocking
Need a professional organizer?