Why Are There More Old Songs On iTunes Than Old eBooks?
New submitter Paul J Heald writes "The vast majority of books and songs from the 20th Century are out-of-print. New data show music publishers doing an admirable job of digitizing older content, but book publishers fail miserably at putting old works in eBook form. I've done some research in an attempt to explain why: 'Music publishers can proceed with the digitization of their back catalog without competing to re-sign authors or hiring lawyers to renegotiate and write new contracts. Research has revealed no cases holding that music publishers must renegotiate in order to digitize their vinyl back catalogs. The situation for book publishers is substantially the opposite. In the landmark case of Random House v. Rosetta Books, the Second Circuit held that Random House had to renegotiate deals with its authors in order to publish their hard copy books in eBook format. ... Another advantage that the music industry may have is the lower cost of digitization. A vinyl album or audio master tape can be converted directly to a consumable digital form and be made available almost immediately. A book, on the other hand, can be scanned quite easily, but in order to be marketed as a professional-looking eBook (as opposed to a low quality, camera-like image of the original book), the scanned text needs to be manipulated with word processing software to reset the fonts and improve the appearance of the text.'"
Sounds like a business opportunity for someone, jump on it.
There's a tidy selection of "old books" (really old) on http://www.gutenberg.org/
Try buying Spinoza's Philosophy in paper: it's expensive, but you can get it at Gutenberg for free.
"If any question why we died, Tell them because our fathers lied."
I'm sure we all already knew this though.
the scanned text needs to be manipulated with word processing software to reset the fonts and improve the appearance of the text
No, really, at the scale that this is happening, the scanned text actually needs to be converted into TEI using some sane heuristics. It's a world-wide problem that needs a reasonable (less-)semi-(more-)automatic solution, not millions of people unsystematically fiddling in their word processors.
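As a rough illustration of what such a conversion targets, the sketch below wraps blank-line-separated paragraphs of plain text in a minimal TEI skeleton using only Python's standard library. The function name `text_to_tei` and the placeholder header strings are invented for this example. Real heuristics for headings, page breaks, and footnotes would be far more involved.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def text_to_tei(title, raw_text):
    """Wrap blank-line-separated paragraphs in a minimal TEI document.

    Hypothetical helper: the only heuristic here is "blank line = new paragraph".
    """
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

    def q(tag):
        return f"{{{TEI_NS}}}{tag}"

    root = ET.Element(q("TEI"))

    # Minimal header: title, publication statement, source description.
    file_desc = ET.SubElement(ET.SubElement(root, q("teiHeader")), q("fileDesc"))
    ET.SubElement(ET.SubElement(file_desc, q("titleStmt")), q("title")).text = title
    ET.SubElement(ET.SubElement(file_desc, q("publicationStmt")), q("p")).text = (
        "Placeholder publication statement."
    )
    ET.SubElement(ET.SubElement(file_desc, q("sourceDesc")), q("p")).text = (
        "Converted from scanned text."
    )

    # Body: one <p> per paragraph, with line wraps collapsed to single spaces.
    body = ET.SubElement(ET.SubElement(root, q("text")), q("body"))
    for block in raw_text.split("\n\n"):
        paragraph = " ".join(block.split())
        if paragraph:
            ET.SubElement(body, q("p")).text = paragraph

    return ET.tostring(root, encoding="unicode")
```

Calling `text_to_tei("Ethics", scanned_text)` returns the document as a string. The point is only that the output is structured markup that later tools can restyle, rather than hand-tweaked word processor files.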
Ezekiel 23:20
iTunes is for profit. Lovers of preserving old texts are on Project Gutenberg. http://www.gutenberg.org/
The truth shall set you free!
Project Gutenberg has over 45,000 public domain eBooks available, free for download.
http://www.gutenberg.org/
1. go to ebookoid.com
2. download any of the million ebooks
Thank the crazy Russians for bringing us resources like this
...had a lot of acerbic observations on the topic.
"I said this in 1971, in the very first week of PG, that by the end of my lifetime you would be able to carry every word in the Library of Congress in one hand - but they will pass a law against it. I realized they would never let us have that much access to so much information." http://samvak.tripod.com/busiw...
He was scathing on the topic of the attempts (which are largely succeeding) to convert us from an ownership society to a rentier society:
http://comments.gmane.org/gman...
"I worry that 100 years from now that 99% of foods will be GMO's [Genetically
Manipulated/Manufactured Organisms] and hence under copyright. . .and this
will enforce a copyright-powered hunger/starvation/malnutrition of the body
just as current copyright extensions are powering such for the mind.
The goal of WIPO is that EVERYTHING should HAVE to be paid for, plus a
royalty for the intellectual property. . .at a time when everyone COULD
have everything pretty much free of charge from replicator technology.
100 years ago the atom-powered Nautilus and atomic bomb were fiction,
only 50 years later the Nautilus was being built, and it sailed into
my own home town and their crew came to my school. . . .
Do you REALLY think it won't be even more different in the future?
But WIPO still wants to charge hugely for replicated food, just as
it does for replicated books."
"How to Do Nothing," kids activities, back in print!
The music industry has a long and sordid history of ripping off the artists...in the main there's nothing to negotiate because the music publishers own the republishing rights.
Book publishers contract to publish the book in one format.
(The same negotiations often have to take place for paperback rights as well, so it's not like this is something new, and it is, in the main, simple boilerplate contracting with the author, author's agent, or estate.) The renegotiations are hard because the publishers are greedy.
Of course all of your basic /. 'intellectual property is theft' technomarxists who never had to make their living off their own intellectual property couldn't be arsed to comprehend this...while musicians can sometimes eke out a living playing live (when they still own their own music, that is), there's not a lot of call for authors to read their books in front of adoring crowds night after night...
While book scanning can be done by machine, the machinery is going to be expensive and complicated. Your typical bibliophile can't afford it. Scanning a book by hand can take hours, even with a V-shaped book-scanning fixture and two cameras.
The technology for digitizing audio is much easier to acquire and use. Any audiophile can afford the hardware and software to do a tolerable audio rip. Anyone can set up a rip, or several rips, and do real work while the rip takes place in the background. The quality might not be to audiophile standards, but will satisfy most casual users.
Even after you've created an "ebook" of page images, it isn't really suited for use in modern ereaders. For that you need an ePub format, or something similar. The text has to literally be in text format to allow reformatting. A decent modern ebook can adapt the text to different display sizes and different type sizes. This is hard to do.
Compare a typical book produced by Project Gutenberg and a typical book scanned into the Internet Archive. Gutenberg produces true ePubs consisting of text possibly sprinkled with digitized illustrations. Gutenberg might start with automatic text recognition, but its books go through a distributed proofreading process before they're released.
While I value what the Archive does (any digitization is better than none at all) I've discarded most ePubs I've downloaded from them. There are simply too many typos in the text recognition. Their scanned raw images and PDFs are usable, though they lack the flexibility of true ebooks.
Simple, because they are morons. For the question at hand for those that really do not see the glaringly obvious: Recording sound to digital is orders of magnitude easier than making good OCRed ebooks out of print copies.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Authors' estates are notoriously greedy and short-sighted. I've seen several efforts come to grief on the fact that the heirs frequently have highly inflated ideas of what the books are worth (Hey, they're classics!), and by God they want their "cut." Project Gutenberg had to fend off efforts by one "estate manager" to claim that materials which were clearly in the public domain weren't (sort of a dwarf Warner Music). Another effort to publish "the complete Murray Leinster" foundered the same way.
I've never understood why, even really old, journal articles are routinely available digitally but textbooks aren't. The only thing more pissing annoying than this is when Google has digitised (and OCRed) the book but, due to copyright reasons, can't show you more than about three sentences at a time. Despite the fact that the original publisher cba to make a digital version available. It's there... but you can't get it. Incredibly frustrating.
One thing with old books is that their value is small. Apple wants to sell everything for $10. I can go to a used book store and buy an old book for a couple dollars. I can go to Amazon and buy out-of-copyright books for a couple dollars. I can go to Amazon and buy new books for a few dollars. Even at Amazon, though, many older books are more expensive than what one can find elsewhere. The difference between books and songs is that iTunes provided a new way to monetize old music. Sell single tracks to those who won't buy the used music at the resale shop. It is simple, fast, and converting a track to digital is not hugely expensive. Here is another difference. Music no longer has DRM. I have many tracks from iTunes because it was always possible to remove the DRM. I have few books from iBooks because the only place I can read them is on an Apple device. Amazon at least has the advantage of having readers on many devices. So, one buys an older book on iBooks, one pays more, one can only read it on limited devices, and publishers have to pay huge fees to Apple.
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
Lots of books are out of print that were printed since the publishing industry went to digital production systems. That fiction book that's more than a year old and wasn't selling well? It's not coming out on dead trees again, but they've got it in Word. Older books may be in older formats, but even if they're proprietary formats, extracting the text (for books without pictures) isn't that hard.
It's a problem with publishing rights and contracts and publishers' predictions about profitability.
And even with books that require scanning, Dover Books did surprisingly good business for years selling fuzzy images of out-of-copyright books; these days it wouldn't be too hard to OCR and reimage most of them, but alternatively bits are cheap enough these days that they could be available in image formats instead of OCR.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
I worked in the music industry (in IT). I have no idea where the idea came from that the music publishers didn't have to renegotiate contracts to get digital rights to the music. In reality, when digital rights became important, the music companies spent a huge amount of time and money having teams for at least a decade tracking down rights-holders and negotiating digital rights in order to sell their back catalog, and of course made sure that their new contracts covered selling through the digital service providers. Book publishers have essentially the same legal challenge (though admittedly the details are different).
What is really different is the production logistics.
Music has been digitally produced for a very long time, using open standard formats, and for pre-digital material it's relatively easy to digitize audio (and video) from master tapes, so you only need to do "work" to deal with some very old, obscure media, which is only done selectively. And the music publishers have built systems that are very, very good at managing and format converting huge libraries of audio and video. So, 99% of the time, digitally selling back-catalog music and video is logistically fairly easy - QA, package, price, and send the files to the digital service providers.
Books, however, have been authored in a series of random formats, and for older books there's only the physical book or manuscript and nothing digital. Which means that you often need to physically scan every page in the book/manuscript, OCR it, clean it up, QA the result, etc. And even for the digitally authored books, you need to track down whatever specific physical media and formats each publisher or author used (MacAuthor on 3.5" floppy, LaTeX, MS Word 3 on 5.25" floppy, etc.). So, overall, physically and logistically really complex to deal with for every single back-catalog book.
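To give a feel for the "clean it up" step, here is a minimal sketch of two common OCR clean-up heuristics: rejoining words hyphenated across line breaks and merging wrapped lines into paragraphs. The function name `clean_ocr_page` is hypothetical, and the de-hyphenation rule is deliberately naive. It would wrongly merge genuine compounds such as "well-known" when they break across lines, which is one reason the QA pass is still needed.

```python
import re


def clean_ocr_page(raw):
    """Apply a few simple clean-up heuristics to one page of OCR output.

    Returns a list of paragraphs. Illustrative only; real pipelines also
    handle running headers, page numbers, ligatures, and misread characters.
    """
    # Rejoin words split by a hyphen at a line break: "pub-\nlisher" -> "publisher".
    # Naive: a real compound broken at its hyphen is merged too.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)

    # Blank lines separate paragraphs; single newlines are just line wraps.
    paragraphs = re.split(r"\n\s*\n", text)

    # Collapse every run of whitespace inside a paragraph to one space.
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return [p for p in cleaned if p]
```

For example, `clean_ocr_page("The pub-\nlisher scanned\nevery page.\n\n  Then QA.  \n")` yields two paragraphs with the hyphenated word rejoined.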
Look at what Project Gutenberg has produced - an amazing collection, but it required a massive investment of (volunteer) effort to process the books into digital formats.
Enable 3D printed prosthetics!
Do you really want the state deciding what is sufficient 'talent' for people to be full-time writers?
Skipping the intense creepiness factor entirely I know exactly how that would go down in Texas. They'd object to it as a handout program until they realized they could set it up as handing out funds only to Biblical "scholars" or other people who just happened to coincidentally have similar political interests. If you tried to set up an objective board they would try to get control over who sat on the board.
I began my collection with these 80.000 ebooks.
magnet:?xt=urn:btih:5ed9585fb9db0489a9fcd437d7880ba59a4ceabe&dn=largest+fiction+library+english+ebooks+80000+authors+9000&tr=udp%3A%2F%2Ffr33domtracker.h33t.com%3A3310%2Fannounce&tr=udp%3A%2F%2Fopen.demonii.com%3A1337
Why are there more old songs online than old e-books? That's simple: songs are in a format readily convertible to digital. Old masters just go through an analog-to-digital conversion that can be pretty much automated. Most don't even need that, as they were converted to digital when CDs first came on the scene.
Books, on the other hand, particularly prior to electronic publishing often dealt with paper manuscripts. Those have to be scanned and converted, a much more labor intensive process. Even if they had been converted to an electronic format for editing and typesetting purposes and that format still exists, it needs to be converted into something modern PCs and tablets can read.
While the latter conversion is less labor intensive than converting paper manuscripts, one then needs to look at the potential market. There are many more buyers for an early Frank Sinatra recording than there are for a copy of The Red Pony. Unless there is demand for a product, an old book, in this case, suppliers won't produce it. And, as a corollary, even if there is demand, it has to be at a price point that is profitable for the supplier.
In the end, the answer to the question of "Why are there more old songs online than old e-books?" is simple economics.
The problem is this:
Making an ebook is way more time consuming than you think. OCRing is just part of the story. It's the copyediting that takes time and money.
Which is the main issue here: money. Old books just won't pull in the revenue needed to defray the costs of turning them into ebooks. That plus the rights issues? That's why we focus mainly on frontlist books as ebooks, as well as backlist from sellable authors. The rest will fall through the cracks and go away, since there is no business model in place to make it profitable for the publisher to spend the effort building ebooks from old titles.
Perhaps this could be solved through crowd-sourcing? It's totally worth investigating, since so many old books will simply disappear.
Try buying Spinoza's Philosophy in paper: it's expensive, but you can get it at Gutenberg for free.
Spinoza in a modern English translation with a proper introduction and notes will save the reader time and pain. I have tried reading the classics in Gutenberg, but they always send me back to Penguin Books and other sources.
The music industry has a long and sordid history of ripping off the artists...
This.
The music industry got cold sweats from the diversity of available media (vinyl, magnetic tapes, optical disks, whatever) and the ease of internet sharing, and bound the artists with all-encompassing contracts, taking the music out of their hands: you are not allowed to perform your own songs in public without your label sanctioning it first (and making millions from your fans by selling them beer) because, technically, they are not your songs any more.
In return, the label sends its armies of lawyers (along with corrupt government elements) to Hell and beyond to track, terrorize and imprison teenagers who had the audacity to publicly share even a small excerpt of the (formerly yours) work, and grants you your 0.00000000001% cut from the process.
I only know the "established" rules of book publishing a little, and, unexpectedly, they are not much in favor of the writer either.
A good initiative, especially in the music industry, is to have different classifications: streaming is not vinyl is not concert, and those things need to be handled separately. Artists need to stop signing those all-encompassing deals, and a very good start would be, e.g., to use one label to produce an LP, then go to a different label and let them handle your content via streaming on the internet: do NOT give up all rights to your art, because that is what you live from. The labels will resist, of course, but if enough artists do it, then it is done.
The three laws of thermodynamics:(1) You can't win. (2) You can't break even. (3) You can't even quit.
Why do you encourage Apple by buying stuff there?
Why aren't there more e-books? Is it because there aren't the resources needed to produce them inexpensively?
Folks: if people wanted e-books, then the industry would have come up with a machine to produce them. Henry Ford didn't do anything special except notice the huge demand in the public for automobiles. If there was a demand for e-books, someone would have pulled a Henry Ford and invented a way to produce them inexpensively too.
I think there's no demand principally because it's hard to read e-books on a computer display due to display issues. Writing is high contrast and susceptible to visual perceptions of pixellation. People don't see the lack of definition in a scenic image, but the eye does notice the wanderings of the edges of black letters against white backgrounds. When was the last time that you chose to receive a paper by fax? 300dpi is about the minimum that I can stand to read, and on a piece of paper that works out to be 2200 x 3000 pixels. How many devices are you aware of that have that resolution and are larger than 7 x 10 inches (18 x 25 cm)?
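The resolution arithmetic above is easy to check. The sketch below (function names are made up for illustration) converts a page size and print density into the pixel grid a display would need, and computes the pixel density of a screen from its pixel dimensions and diagonal. A 7 x 10 inch page at 300 dpi comes out at 2100 x 3000 pixels, close to the figure quoted, while a US Letter page needs 2550 x 3300.

```python
import math


def pixels_needed(width_in, height_in, dpi=300):
    """Pixel grid required to show a page of the given size at the given density."""
    return round(width_in * dpi), round(height_in * dpi)


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density of a screen, from its resolution and diagonal in inches."""
    return math.hypot(width_px, height_px) / diagonal_in


print(pixels_needed(7, 10))     # page the size of a large tablet
print(pixels_needed(8.5, 11))   # US Letter sheet
```

Plugging a device's published resolution and diagonal into `pixels_per_inch` shows how far it falls short of, or exceeds, the 300 dpi threshold.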
If you add to that other issues of convenience, I think you'll have your reasons why e-books haven't yet taken off. To read a book, you spend about three seconds in picking it up off the shelf and opening it. To read an e-book, you grab your tablet/computer/whatever, power it up, find your application for reading the book, and swipe through the screens until you reach the right spot. With a book, I can thumb through the pages to find my place, and I can insert a bookmark (or put my sticky notes if I have more than one.) With my e-book app, I probably get one bookmark. I can write and highlight in a book with a pen. I might be able to highlight in the app, and if I can then I have to remember the command for doing that and the gesture for marking out what I want to highlight. If I use books, I can put as many open books out on the table as will fit. If I use the usual e-book app, I can't look at more than one book at a time.
People have had hundreds of years to come up with the form of books: it will be a few more years before the e-book and tablet people better it.
Where do you live that used bookstores are struggling? The one in my hometown is at least as busy as the local Barnes & Noble.
These words, "needs to be," I do not think they mean what you think they mean.
I don't think the story submitter is knowledgeable about audio. His statement "A vinyl album or audio master tape can be converted directly to a consumable digital form and be made available almost immediately" seems pretty far from accurate.
Old master tapes degrade. Dynamic range can be very poor. There is hiss, short dropouts are possible, etc. etc.
Vinyl -- even when in mint condition -- has a plethora of problems as well. Pitch, wow, flutter, rumble, clicks, pops, poor stereo separation, etc. etc. I can easily spend the best part of a weekend properly restoring a vinyl album. Automatic declickers/denoisers/debuzzers don't do a satisfactory job. And yes, despite being a consumer, I do have the proper software at my disposal.
Bottom line: creating properly restored and mastered digital audio from old analog sources is quite hard. I don't buy the argument that digitizing books is harder.
1. Different licensing terms.
2. Differing technical hurdles involved in the digitization process.
Both have merits to some degree. If text conversion is more costly than that for audio and the rights negotiations represent a higher risk barrier to overcome, then fewer people will undertake the time/capital expenditure to digitally re-publish text works. Since the legal underpinnings of copyright are in part for the benefit of society as a whole, then it would behoove us to lower this cost/risk barrier. Using both a carrot and stick approach, give the holders of copyright a tax incentive to revise their publishing agreements to include ePub and other future formats. For those that drag their heels, revise the terms of copyright law to expire rights that don't keep up with publishing technologies.
Have gnu, will travel.
In days of old, when knights were bold
ebooks weren't invented
Then came Jobs and his iTunes mobs
now no old books are rented.
http://www.gutenberg.org/
I'm not sure if this is current, but iBooks used to be harder to get into than Amazon or B&N if you were self-publishing a book. Thus an author who recovered the rights to their book would have problems self-publishing it in iBooks.
I own a literary agency that has specialized in SF/F for over sixty years. I can speak to these issues authoritatively.
What the situation is with books printed by publishers is very simple: the publishers do not own the books. Just some rights, under some conditions.
Typically, an author engages a literary agency to represent their work to publishers. This includes submitting the work to publishers who normally handle that particular genre or otherwise might specifically be interested at the moment; negotiating a contract with the publisher, and finding a balance between getting the work published, while retaining any rights not specifically purchased by the publisher. The publisher in turn relies on the expertise the specific agency has in a particular genre or genres. Most large publishers will not deal directly with an author. Most small publishers give it up after trying it for a while -- it's definitely "its own thing."
From the author's POV, the agency brings expertise on rights negotiation, knowledge of publishers, knowledge of foreign sales (either directly or through associate agencies), tax issues, and direct access to editors at the publishers they deal with. From the publisher's POV, they don't have to deal with authors who have no knowledge of the legal territory, and whom, in most cases, they would never consider publishing anyway.
The agency also performs triage: the publisher can be assured that the agency felt the work, and the author, was worth representing, and that the work is in a genre the publisher wishes to address, and that takes a huge amount of cruft off the table (look at the self-published stuff on Amazon to see what I mean there. The vast majority of it is truly awful.) It does not guarantee a sale; but it makes it a great deal easier on the publisher, and it does make sales easier for all parties involved.
So when a deal is struck, what publishers purchase from the agency on behalf of the author is the right to produce a work in a particular format under negotiated conditions. For instance, a hardcover edition. That does not give them the right to produce it as an audiobook, or a softcover, or a movie, or a play, or a radio show, etc., although they may also negotiate those rights -- each contract is specific, and the better the contract, the more specific it is.
Most such contracts are most explicit in what rights they confer, and under what conditions, and in terms of time. Others go even further and are explicit in what rights they do not confer.
Another issue here is whether something is in print. Again, the initial contract negotiates what happens when and if the book goes out of print, and what defines that. The rights may revert immediately; they may revert after a period of time; they may not revert at all; they may revert if an additional print run is not done within X period of time, etc. It's all about the contract the publisher accepts.
The specific rights negotiated, particularly on older titles, will vary by publisher and by agency; the most careful agencies have been reserving electronic reproduction rights since the 1950's. At the time, it was a "so what" issue to the publishers. Today, it isn't.
So in many cases, unless the copyright for a book has expired completely, the author controls the e-book rights through the agency representing the author. In others, if those rights were not reserved, the publisher has control (this should be relatively uncommon.) If copyright has expired completely, then the works are in the public's hands.
What with the recent changes in copyright law in favor of longer copyright terms, a huge amount of what we think of as modern works are still under control of whatever contracts are extant, or the rights have reverted to the author, the author's estate, or the author's representative (typically a literary agency.)
When e-books hit the market, there was a great upset in the publishing industry, and they suddenly became extremely conservative on several front
Takes more time to scan in all the old books versus encoding the various tones and warbles we call music.
> A book, on the other hand, can be scanned quite easily, but in order to be marketed as a professional-looking eBook (as opposed to a low quality, camera-like image of the original book), the scanned text needs to be manipulated with word processing software to reset the fonts and improve the appearance of the text.'"
Instead of converting images to text, why not simply identify the rectangles boxing all the words, and reflow them? If I have a small PNG for each word of a book, and their positions in the page, I could write an algorithm to reflow the words into any desired row width and page size. Images could be captured the same way. They don't need to actually OCR the text unless they want to implement search.
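A minimal sketch of the reflow idea, assuming each word has already been cropped into its own image and only its pixel width and height are known. The `WordBox` type, the gap sizes, and the greedy line-filling strategy are all illustrative choices, not anything specified in the comment. Baseline alignment, hyphenation, and paragraph breaks are ignored.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    word_id: int  # index of the cropped word image (hypothetical)
    width: int    # pixels
    height: int   # pixels


def reflow(boxes, page_width, gap=8, line_gap=6):
    """Greedily place word images left to right, wrapping at page_width.

    Returns a list of (word_id, x, y) placements for the top-left corners.
    """
    placements = []
    x = y = 0
    line_height = 0
    for box in boxes:
        # Wrap when the word would overflow, unless it is the first on the line.
        if x > 0 and x + box.width > page_width:
            y += line_height + line_gap
            x = 0
            line_height = 0
        placements.append((box.word_id, x, y))
        x += box.width + gap
        line_height = max(line_height, box.height)
    return placements


words = [WordBox(0, 40, 20), WordBox(1, 50, 20), WordBox(2, 30, 24)]
print(reflow(words, page_width=100))
```

Changing `page_width` re-lays out the same word images for a different screen, which is the whole appeal of the approach. The cost is that the "text" is still a pile of images.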
please compare the size of a png of the word 'the' and its size in UTF-8 (3 bytes, or 24 bits.)
consider that people use ebooks on bandwidth and storage constrained devices.
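The size gap can be measured directly. The sketch below builds about the smallest valid PNG for a word-sized image (an all-white 1-bit grayscale bitmap, with the 24 x 12 pixel size chosen arbitrarily) and compares it with the UTF-8 encoding of the word. The helper names are invented for this example. A real scanned word, with actual glyph pixels, would compress worse than a blank bitmap.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(kind, data):
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    body = kind + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))


def minimal_png(width, height):
    """Encode an all-white, 1-bit grayscale image of the given size."""
    # IHDR: width, height, bit depth 1, color type 0 (grayscale),
    # compression 0, filter 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 1, 0, 0, 0, 0)
    # Each row: one filter-type byte, then pixels packed 8 per byte.
    row = b"\x00" + b"\xff" * ((width + 7) // 8)
    idat = zlib.compress(row * height, 9)
    return (PNG_SIGNATURE + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", idat) + png_chunk(b"IEND", b""))


text_size = len("the".encode("utf-8"))
image_size = len(minimal_png(24, 12))
print(f"UTF-8: {text_size} bytes, blank PNG: {image_size} bytes")
```

The signature, IHDR, and IEND chunks, plus the IDAT chunk's length, type, and CRC fields, already cost 57 bytes before any pixel data, so even an empty word image is well over an order of magnitude larger than the three bytes of text.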
This is a concern.
http://www.spiderrobinson.com/melancholyelephants.html