Microsoft Joins Yahoo! Book Search Plan
tanman writes "The BBC is reporting that Microsoft has signed on to 'work with the Open Content Alliance (OCA), set up by the Internet Archive, to initially put 150,000 works online. The move comes as Google faces growing legal pressure from publishers over its own global digital library plans.'"
...never innovate.
How many of those books will be Microsoft books?
How is this in any way linked to Yahoo? I thought that Google was the one doing this. Methinks the poster has been into the Halloween candy a little early..... and that candy corn had an "E" on it for some strange reason.
I couldn't fail to disagree with you any less.
www.openlibrary.org is the website for content of the Open Content Alliance.
After all, you didn't expect to hear about them supporting Google's plan, did you ;)
liqbase
It should be pretty obvious .. MS has joined up to ensure this online library ends up being in one of their proprietary "xml" formats.
Danger, Danger Will Robinson!
-GenTimJS
This is obviously a MS conspiracy to "digitize" all forms of print media, thereby making paper irrelevant, and thus create a lack of work in the pulp industry, which will reduce deforestation, and unemployed loggers will have to work in salt mines that Bill Gates owns. The miners will need to sign up on Yahoo Messenger, but they won't realize that Y! is merging protocols with MSN Messenger next year, and since they already have MSN passports they'll have duplicate identities. They'll forget to use one of the identities, so that Microsoft clones can take over their unused identity, and thus a clone army will be born to crush Google.
And you thought it was a simple effort to make it easier to access print resources online! Ha!
Saskboy's blog is good. 9 out of 10 dentists agree.
One important fact that's overlooked, though, is that if Google has digital copies of all those works, that "digital database" could be stolen or compromised. If that were to happen, publishers could never totally eradicate all the stolen books that would be floating around on the Internet or darknets.
Furthermore, it's possible that technical weaknesses in Google's online book search implementation might be used to reconstruct the entire book. For example, search for what you know to be the first sentence in a book. When Google returns an excerpt with the second, third, and fourth sentence, then just do another search for the fourth sentence, and Google will return an excerpt with the fifth, sixth, seventh sentence, etc. I'm not claiming that's how Google's search feature will work; I'm merely presenting the possibility that technical weaknesses might be exploited to the detriment of the publishing industry.
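To make the idea concrete, here is a minimal toy sketch in Python. It does not touch any real service: `excerpt_search` is a made-up stand-in for a search feature that returns the next few sentences after an exact match, and the window of 3 sentences, the tiny "book", and the assumption that every sentence is unique are all hypothetical. It only shows how chained queries could walk through a text if a search feature behaved this way.

```python
import re

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def excerpt_search(sentences, query, window=3):
    """Hypothetical search: return up to `window` sentences that follow
    an exact match of `query`, or [] if there is no match or nothing follows."""
    for i, sentence in enumerate(sentences):
        if sentence == query:
            return sentences[i + 1 : i + 1 + window]
    return []

def reconstruct(sentences, first_sentence, window=3, max_queries=1000):
    """Chain searches: always query the last sentence seen so far.
    Assumes sentences are unique; max_queries guards against looping otherwise."""
    recovered = [first_sentence]
    queries = 0
    while queries < max_queries:
        excerpt = excerpt_search(sentences, recovered[-1], window)
        queries += 1
        if not excerpt:
            break
        recovered.extend(excerpt)
    return recovered, queries

# Toy stand-in for a book; a real text would have repeated sentences.
book = "One. Two. Three. Four. Five. Six. Seven. Eight."
sentences = split_sentences(book)
recovered, queries = reconstruct(sentences, "One.")
print(" ".join(recovered) == book, queries)
```

In this toy, the number of queries grows roughly as the sentence count divided by the excerpt window, so a smaller window makes the chain slower but does not stop it on its own.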
Continuing in their quest to copy everything Google does? Even Paul Thurrott, the infamous Windows guy, is making snide remarks.
Seriously. Microsoft would NEVER be doing this if Google didn't exist and hadn't been doing it.
"Sufferin' succotash."
This is an opt-in system, compared to Google's opt-out deal. Google should follow MSN and Yahoo on this one. If you look at the contributors this could really go strong.
Microsoft just keeps on begging for lawsuits!
Coding projects blog - Code Slim
The press has concentrated on Microsoft's joining which is fantastic, but we also had 14 key libraries join which is also great news.
http://www.opencontentalliance.org is a good site for this stuff.
Something I am jazzed about is a cool bookviewer at http://www.openlibrary.org/ showing the first books from University of California sponsored by Yahoo! and the "vision book" there tells the story of what we envision and some of the announcements.
onward!
-brewster Digital Librarian Internet Archive (administers the Open Content Alliance)
So I guess MS's plans for the war on Google is so far just to back up enemies of Google?
As far as I understand it, Google is merely indexing the works, so one could locate a book, and would then be able to get it from somewhere else. This (Microsoft) idea is to actually make the full texts available. Both services are useful, but they are very certainly two different services.
I like Google's version better. How do we sign on as supporters to their version of the project? Do they have something set up already for local public and school libraries to be able to use? That seems like one way that they could get a lot of endorsements and awareness from the public about what they're doing.
Abstinence is a government conspiracy. www.SafeSexZone.co
I have one simple demand. I want every single book, magazine, and recording available on the internet. There are hundreds of thousands, if not millions, of books and periodicals that are unread and unsearchable right now because they are rotting away in some library or private collection. Human knowledge needs to be preserved and expanded. Is it unreasonable for me to be able to have access to every single textbook on C++? Forget the legal issues. We'll get some country to pass a law that it is ok to archive information like this...but it needs to be done. Too much knowledge is being lost.
"Seriously. Microsoft would NEVER be doing this if Google didn't exist and hadn't been doing it."
KDE agrees with you.
Both projects actually aim to do the same thing, although their approaches vary. Google's plan is to digitize and index every single book in the world. While this work is being done, any author or publisher can opt out by telling Google not to make their books available to read. Google will still let a small amount of text from these books be read, as permitted by fair use. What counts as fair use in this case is still being determined by the courts, under two separate lawsuits. Google is working with publishers and major universities and libraries. Information on Google's publisher program can be found here: https://print.google.com/publisher/?hl=en_US and info on Google's library program can be found here: http://print.google.com/googleprint/library.html Basically, any book written before 1922 is considered public domain, and both services will index these books and let anyone read them from cover to cover. The OCA (Open Content Alliance) is digitizing and indexing all books that were written before 1922, plus only those later works whose publishers opt in, or other works which are considered public domain. Hope that helps.
"Microsoft said it would initially focus on works already in the public domain. This opens up a whole new innovation from Microsoft that will allow all users access to otherwise restricted works - if they have a Hotmail account and use MSN messenger on XP sp2" Yeah!
The enemy of my enemy is my friend.
Well, friend... We -are- talking about Microsoft here. I'm wondering when Yahoo will find MS's knife in its back.
Take life easy: one bit at a time.
Why does your website have "website" in its URL?
I'm one of the software engineers who worked on the Open Library's Flipbook viewer. I just put up a blog post with further technical details on what we have done here:
http://codinginparadise.org/weblog/2005/10/introducing-open-library-and-ajax.html
Check it out.
Brad Neuberg
This will actually end up helping Google. Now, instead of Google trying to convince publishers by themselves that what they are doing is right, they have Microsoft and Yahoo's help. A hesitant publisher may think Google is just getting cocky with its ideas, but with Microsoft and Yahoo also pressuring them to allow this, they will probably hop onto the bandwagon.
When Google and Yahoo get done with this, I'm going to search for all instances in the public domain of the word "a". Any wagers on the number of search results?
Sasktel.net is the provider's site, and so they distinguish between their pages and customer pages by putting the annoying "website" in the URL. I think their department of redundancy department thought that up.
MS will probably have a little easier time with publishers, thanks to their advocacy of DRM. It'll be interesting to see if/how the works they archive are crippled.
I saw it on Slashdot, it must be true!
The Internet Archive has been involved with this for more than 8 years. Amazon has also had Search Inside the Book for longer than Google has been running Google Print.
I would expect that Google would not display context past the end of the chapter, so you'd have to know the beginning to each chapter. Also, any technical book with illustrations or figures would be useless for harvesting, if the text relied on the illustrations to make the point. Sure, it might be possible to game Google Library, especially for a novel, but it's going to be more difficult than buying it, or borrowing it from the library.
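A toy sketch of that chapter-boundary point, again in Python and again with everything hypothetical: `excerpt_search` is an invented stand-in that never returns context past the end of the chapter containing the match, and the two short "chapters" are made up. Nothing here reflects how any real service is implemented.

```python
def excerpt_search(chapters, query, window=3):
    """Hypothetical search that caps context at the end of the matching chapter."""
    for chapter in chapters:
        for i, sentence in enumerate(chapter):
            if sentence == query:
                # Slicing within the chapter list means we never cross into the next one.
                return chapter[i + 1 : i + 1 + window]
    return []

def reconstruct(chapters, first_sentence, window=3, max_queries=100):
    """Chain searches from a known sentence until the search returns nothing."""
    recovered = [first_sentence]
    for _ in range(max_queries):
        excerpt = excerpt_search(chapters, recovered[-1], window)
        if not excerpt:
            break
        recovered.extend(excerpt)
    return recovered

chapters = [
    ["A1.", "A2.", "A3.", "A4."],
    ["B1.", "B2.", "B3."],
]
print(reconstruct(chapters, "A1."))  # stops at the end of the first chapter
print(reconstruct(chapters, "B1."))  # needs the opening sentence of chapter two
```

Under these assumptions the chain stalls at every chapter end, so whoever is doing the harvesting needs one known starting sentence per chapter rather than one per book.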
Actually the two systems are completely apples and oranges.
The Google system is a search system through open and closed content.
The Open Content Alliance's goal is to make the content available.
So the Google system should be able to index OCAs work just as Yahoo will do.
The overlap comes in areas where Google has already secured rights or where the work is in the public domain, in which case Google is providing the content as well.
The "opt-in" part of making your content available is available to everyone irrespective of Google or OCA. It's called a license to redistribute. Creative Commons licenses are quite flexible and already fill that niche pretty well.
I see OCA as more of a book-focused initiative to get public domain and licensed work available to the public.
-- John.
Would it mean that the new releases of MSOffice would have Thesaurus referencing literature works for the examples of word usage?
"I have one simple demand. I want every single book, magazine, and recording available on the internet."*
Piracy puts the kibosh on that idea.
"There are hundreds of thousands, if not millions, of books and periodicals that are unread and unsearchable right now because they are rotting away in some library or private collection."
1) Most libraries know how to take care of their collections.
2) America already is the fattest nation. Do we need to make it easier to be lazy?
"Human knowledge needs to be preserved and expanded."
Human knowledge NEEDS no such thing. Stop with the anthropomorphizing.
"Is it unreasonable for me to be able to have access to every single textbook on C++?"
Here's the big secret of the book publishing industry. Most of the material that's in books isn't original. How many books are needed to restate the ins and outs of C++?
"Forget the legal issues."
Funny how those who have no stake in the creation of information always take this stance. "Forget the legal issues", my job's going overseas.
"We'll get some country to pass a law that it is ok to archive information like this...but it needs to be done."
Government is the solution to the "They will not Gimme!" problem.
"Too much knowledge is being lost."
Fight for equitable copyright laws. Unfortunately it means leaving your chair.
*Better yet, Mr. Charity. Why don't you buy it all up, and then release it to the web? You all did it with Blender. Put your money where your ideals live.
USSR -- US ... Cold War...
... Search Engine Cold War!
Yahoo -- Google
===
Seems to be a game of 'who can do the most and seem the coolest'...
This is simultaneously good and bad for everyone... (kind of like the Cold War)
MoM++ - A Classic Expanded - [Master of Magic 1.5]
http://mompp.sourceforge.net/
Quoted from "Microsoft to offer book search":
"Principally and philosophically, we are aligning with the notion that intellectual property should not be proprietarily owned by any commercial company," Tiedt (MSN manager) said.
...if we can get some of those out of print Windows programming books from this. You know, the ones that go for $150 used.
Coder's Stone: The programming language quick ref for iPad
Once they start putting college texts online, then we'll talk. Paying $160 for the book that you use for one semester, and getting $30 for giving it back to the school so that they can resell it again next semester for $175... Psshhtt. Let's talk searchable texts, or downloading only one chapter for a partial price. Kind of like iTunes - don't want the full cd? Buy one song. Novels? Give me the paper version any day. Of course, the sooner I go blind from staring at the beautifully unnatural glow of my computer screen, the sooner I don't have to worry about this issue anyway. How about something that gives suggestions based on what you've read? We have that for music, shouldn't be too hard for books. There are so many possibilities available to us if these things are available online. Everyone is so uptight about "rights" that they don't see what can really be done. The problem isn't the people that "steal" - it's the system that's not working. When you overcharge for something, people find other ways of getting it. They share books. They download music and movies. Instead of persecuting them, take a look at why the system is having problems, and fix that instead.
This sentence is false.
Microsoft is an industry-leading company (whether y'all like it or not) -- for them to begin copying everything Google does establishes Google's domination over them.
They just keep making the door wider and wider for Google to step right into their own markets with behavior like this. I firmly believe that the Microsoft marketing machine is making some serious mistakes in their fight against Google.
Berto
Microsoft: We'll help in your project. But we'll make the files unreadable to all other pieces of reader software by breaking standards and making people pay hundreds of dollars to access your free files.
Job? I don't have time to get a job! Who will sit around and bitch about being broke and unemployed then?
Is Microsoft already resigning to playing second-fiddle to ALL of Google's ideas? Can they not innovate anything on their own?
What have they 'innovated'? Flight Simulator was bought from the Bruce Artwick Organization. Visio was purchased. Solomon from Great Plains Software. Excel came from the same spreadsheet software the 'others' came from. Even Hotmail was purchased from someone else. Has Microsoft released anything that wasn't already available or previously available under the original/previous owner (prior to Microsoft buying the company or the product)?
"Microsoft is an industry-leading company (whether y'all like it or not) -- for them to begin copying everything Google does establishes Google's domination over them."
In the same way GM and Ford used to be in the U.S. But look how easily that was upset by early Japanese efforts. Hell, even the Yugo posed a challenge. I think that speaks volumes to "industry-leading" being used to describe market dominance; which is probably the same thing for anyone trying to break in the market?
"They just keep making the door wider and wider for Google to step right into their own markets with behavior like this. I firmly believe that the Microsoft marketing machine is making some serious mistakes in their fight against Google."
I think you're right on with this. They are being led by someone else's innovation, which means they are copying successful initiatives produced by others. Google Reader is built upon the success of Google's Gmail - not a successful product from another company. Microsoft seems incapable of coming up with new stuff. And that is how companies start to lose their dominance: look at Kodak (the only film producer/processor prior to digital camera) and Xerox (the only document replicator 25 years ago).
The Luddites were ahead of their time.
I'm glad they're doing something even if its only scrap work to something someone else already came up with. The idea of having digital versions of all books available to all(ala startrek) is a wonderful thing. Its a sad thing that we can't do all books under copyright or work out some sort of agreement. Knowledge to all.
For some reason I refuse to use either spell check or the spacebar properly.
Sounds like a nice idea, but for some reason I just can't feel any enthusiasm for anything Microsoft does these days; only irritation and anger. In this case, there are bound to be strings attached that will make this new book-searching service of theirs pretty much useless to non-Windows users. Everything always has to tie into their monopoly product. That's their core business strategy and that's the way it'll remain until something (Linux?) or someone (Google?) succeeds in making Windows irrelevant.
"A) I don't see that as a leak of that size as a likely scenario. That much data doesn't escape by accident."
Oops! CC numbers.
"B) Oh what a nightmare if it did and we had an electronic backup of every book in existence..."
That's not the nightmare, and you know it.
"The fact is that copyright infringement of books is already easy. All it takes is an automatic document feeder and a good PDF generator. $500."
Bet you I can infringe copyright faster with Nero, than you can with ADF and a scanner.
"I seriously doubt that illegal trading of music would be so big if iTunes or something like it had been around from the beginning. But the industry couldn't get their act together."
With the attitude on patents. I seriously doubt anyone would invest in a time machine to find out.
Microsoft + Yahoo runs absolutely no risks by doing this, as opposed to Google. I think Google's aims to "please everyone" just happened to backfire this time around. While it's a good idea on paper, it wasn't one in reality thanks to the regular copyright paranoia.
Beware: In C++, your friends can see your privates!
MS realizes that what Google is trying to do is huge. If Google succeeds they will have a monopoly on the searching of digital books. MS is not entering this market because it cares about the market. It is doing so to nip Google's dominance in the bud.
This is just an analysis of the business rationale. So, please don't reply with statements like, "Yea but wouldn't you rather Google have the monopoly than Evil MS?".
What is interesting is that Google is not in it for the pure benefit to humanity of books being digitally searchable. They are in it for the advertising dollars. Period. If there were no ad dollar potential in digitized books Google wouldn't spend money on the project.
Don't kid yourselves, people. Every move Google makes is made with one goal in mind: Advertising. And to Google, Advertising == Money.
in the same sentence without the words "destroy" or "useless" between them? Nevahr!
It looks like another nail in the Google Print coffin. Authors Guild and AAP both suing. Google must be so incredibly pissed today. I know they try to put on a good public face, but Eric Schmidt must be throwing a chair across the room right now saying "I'm gonna kill them". ;-) With MS, Yahoo, and most importantly the authors and publishers backing this, Google Print is just about done.
Can we add this to the growing list of projects that Google has released that just haven't panned out, dare I say flopped? Google Search appliance, Google Web Accelerator, GTalk, Google Reader, Personalized Search, Google Ride Finder, Google Personalized Home Page, ummm... yes we can add it, Google Print. LOL.
Google should change their motto from "Don't be Evil" to "Throw enough shit against the wall, something will stick".
Go ahead flame away............... oh and by all means, all you google fan boys, write me and tell me how great those services are, it was just the rest of the world who rejected them that is stupid.
I grew up in a beach resort. You could walk into any grocery store and pick up a free booklet titled Sunny Day. There would be helpful maps, tips for tourists and cool coupons.
So was everybody walking around talking about how Sunny Day is so good for humanity and is a beacon of light in a greedy world? No. Everybody knew that Sunny Day was making money on the publication. That's why they put out the booklet. If they couldn't make money on it anymore, guess what? No more free maps. No more free coupons.
Google is just a big Sunny Day. They want to make money. They think free maps are cool, sure. But if free maps, free email, freely searchable books, free internet searching, etc. didn't contribute to advertising dollars anymore they'd probably put those on the back burner and work on other projects that made Google richer.
MS wants more money and so does Google. Google just gives away free stuff.
"It looks like another nail in the Google Print coffin. Authors Guild and AAP both suing."
OK. They have been sued over their regular page indexing as well, but that did not end Google searching. Google has the legal precedent here and seems likely to prevail.
"Can we add this to the growing list of projects that Google has released that just haven't panned out, dare I say flopped? Google Search appliance, Google Web Accelerator, GTalk, Google Reader, Personalized Search, Google Ride Finder, Google Personalized Home Page, ummm... yes we can add it, Google Print. LOL."
Do you have any idea what you're talking about? Google search appliances do good business. Plenty of places buy them to index their internal networks. GTalk? It has barely entered beta and you call it a flop? You know what? Some Parkinson's researchers I know were just commenting the other day how useful Google Scholar is and how they use it all the time. They had not heard of Google Print yet, but all of them were interested when I mentioned it. Google has dozens of projects going, mostly just to test the waters, and a lot of them end up integrated into Google search. I know I use it for research. Maybe you should get a clue. These things may not be really popular, but they are profitable and useful and people use a lot of them every day. Speculating that legal action will kill a project is all well and good, but it would be even better if you had a clue about the subject or had read the laws and precedent-setting cases before making said uninformed speculations.
Because Amazon asked the publishers if they wanted to be included in "Search Inside the Book" instead of demanding that the publishers tell them if they didn't want to be included.
The combined holdings of the key libraries Google is working with is staggering. Each library has millions of volumes (admittedly, many duplicates). The number mentioned by the OCA is 150,000 volumes to start. I imagine the OCA, if it works out, will ramp up that number. Still, it is inconsequential compared to Google's very ambitious goal.
"Microsoft: Your Passion, Our Profit."
Why do all the big players have to do everything the next guy is doing? I miss the days when companies actually focused on one, two, or a few things. It seems like this is no longer the case.
-Slashdot Junky
Landfill Mining Co.
Managing the (Un)natural Resources of Tomorrow
"I think you all are missing the point: If I want to research a topic...lets say, books about clockwork...I would have to go to numerous large research libraries and spend countless hours finding materials. This is assuming that I can get access to those libraries. I might not be able to afford to travel from NJ to CA to go to the free university library."
I guess they don't have Interlibrary loans in your country. In America we do.
Plus people have been going to libraries to do research for decades. What makes you think you're special?
"Look, I am all for protecting author's rights. I think author's should be paid for their work. But I don't understand why books that are no longer published, even though still protected by copyright, can't be available, for a nominal or no fee, on the internet. It just makes sense to me."
Look. Don't insult our intelligence. You said "every single" and you meant "every single".
"Knowledge should be a public good. It should not be the domain of a few. This is not the middle ages."
Then you don't understand what copyright is. I suggest you get all the books you can on copyright and READ THEM! Don't get your information from Slashdot.
Actually, all of the things I listed do not turn a profit. Each one costs far more than it makes. On top of that they have not been very popular. So while a small community (like your friends that you cited) might like using these ridiculously expensive tools for free... the reality is that Google is using the same model that all the internet dot bombs used. Spend spend spend, the good pr will get people to like us and we'll figure out a way to make a profit later.
Oh and your "beta" comment. You are the one who really should get a clue. There have been tons of articles talking about why Google leaves almost everything in beta... it's so they can avoid legal action, e.g. Google News.
Moving to the lawsuit. You must not be familiar with the law or legal precedent as it applies here. Maybe you just haven't followed the case. Google has effectively lost the print case before it has even gone to trial. For example, one of the founders throwing a temper tantrum on his blog while a case is pending... who does that? Well, we know who does. The Hail Mary linchpin they are trying to use is that each individual author has not contacted them; however, organizations like the Authors Guild represent many many many authors. Google can't pick and choose who hands them legal documents on behalf of the authors.
By the way, Google search appliance does not do good business. I don't know what world you are living in, but even Google has admitted that to date it's not doing as well as they had hoped. They had promised improvements. Some businesses have tried it, but few are happy with it. I'd get into the technical reasons why it doesn't do well (I ordered one for a client when they first came out, so I also know first hand), but I doubt you have ever used one and it would be wasted.
Anyways, enjoy this Google stuff now, because it won't last if Google doesn't change their model, and soon. They need to start creating revenue other than their click ads from web search or they aren't going to last the decade. Yahoo and MS have far too much money, and make no mistake, they will catch Google's click ads. MSN is doing really well internationally, as is Yahoo. Don't forget that the two of them combined have more traffic than Google. They also have portal revenue that Google hasn't been able to achieve. If MS buys into AOL, Yahoo becomes the biggest overnight.
In any case, as you can tell, I'm not a Google fan boy like yourself. I'm more of a realist.
"Google is using the same model that all the internet dot bombs used. Spend spend spend, the good pr will get people to like us and we'll figure out a way to make a profit later."
Yeah, because all the dot bombs were profitable for four years running in the post dot bomb era. Google is making money. Guess what, Microsoft has not broken even yet on the Xbox. Is it going to be cancelled too?
"There have been tons of articles talking about why Google leaves almost everything in beta... it's so they can avoid legal action, e.g. Google News."
Really? Care to provide some links? I've seen some idle speculation to that effect by uninformed yahoos that don't know designating something as beta has no meaning to most end users and thus provides no legal protection. They are in beta because they are side projects that have not finished development yet. Google talk is months old, has one bare-bones client for one platform. It is a beta, not a finished service called a beta.
"You must not be familiar with the law or legal precedent as it applies here."
He asserts, yet has nothing to back it up. Look up "Kelly v. Arriba Soft Corp." and note that all but one appellate court (the one in which the Google case was filed) has filed supporting precedent. Why would that be, do you suppose? Maybe because they hope to force the issue to the Supreme Court (which is virtually guaranteed with split appellate rulings), thus tying the whole thing up in court for years while they try to pass laws to make what Google is doing illegal. Or did you not bother actually looking into this and you're just parroting uninformed crap you read somewhere?
"The Hail Mary linchpin they are trying to use is that each individual author has not contacted them; however, organizations like the Authors Guild represent many many many authors. Google can't pick and choose who hands them legal documents on behalf of the authors."
That is a minor point of procedure and unimportant. Google is within its rights to copy entire works and republish even if authors ask them not to. The whole offer to not publish works if an author asked is just a courtesy.
"By the way, Google search appliance does not do good business. I don't know what world you are living in, but even Google has admitted that to date it's not doing as well as they had hoped."
I am pretty sure I read that the appliance division became profitable shortly after the first mini appliance sales cycle, but I don't have a link to back it up. Do you have one to repudiate it?
"but I doubt you have ever used one and it would be wasted."
Actually I have used one, but never set up or configured one. It beat our old solution by a mile, although I have no idea what the cost differential was.
"Anyways, enjoy this Google stuff now, because it won't last if Google doesn't change their model, and soon."
I see, and what successful, profitable, multimillion dollar company do you run?
"In any case, as you can tell, I'm not a Google fan boy like yourself. I'm more of a realist."
I'm not a "Google fan boy" as you claim. I don't even use Google for my daily searches (well, it is one of several whose results I aggregate). I do, however, appreciate a lot of the products they bring to market. Their search, maps, etc. are groundbreaking not just for the technology, but for not being annoying like all their competitors' offerings. Subtle, well-targeted ads make a huge difference. Also, their tendency to use open standards, like Jabber for their IM offering, is something that could greatly improve instant messaging for everyone. It is the first offering from a major player that puts the customer ahead of trying to get a lock-in on the market. I understand the limitations and disagree with some of their choices, but I also appreciate what they have done. I see idiots on Slashdot all the time bitch and moan about how often Google related articles are posted. Here's a