Why Are There More Old Songs On iTunes Than Old eBooks?
New submitter Paul J Heald writes "The vast majority of books and songs from the 20th Century are out-of-print. New data show music publishers doing an admirable job of digitizing older content, but book publishers fail miserably at putting old works in eBook form. I've done some research in an attempt to explain why: 'Music publishers can proceed with the digitization of their back catalog without competing to re-sign authors or hiring lawyers to renegotiate and write new contracts. Research has revealed no cases holding that music publishers must renegotiate in order to digitize their vinyl back catalogs. The situation for book publishers is substantially the opposite. In the landmark case of Random House v. Rosetta Books, the Second Circuit held that Random House had to renegotiate deals with its authors in order to publish their hard copy books in eBook format. ... Another advantage that the music industry may have is the lower cost of digitization. A vinyl album or audio master tape can be converted directly to a consumable digital form and be made available almost immediately. A book, on the other hand, can be scanned quite easily, but in order to be marketed as a professional-looking eBook (as opposed to a low quality, camera-like image of the original book), the scanned text needs to be manipulated with word processing software to reset the fonts and improve the appearance of the text.'"
1. go to ebookoid.com
2. download any of the million ebooks
Thank the crazy Russians for bringing us resources like this
...had a lot of acerbic observations on the topic.
"I said this in 1971, in the very first week of PG, that by the end of my lifetime you would be able to carry every word in the Library of Congress in one hand - but they will pass a law against it. I realized they would never let us have that much access to so much information." http://samvak.tripod.com/busiw...
He was scathing on the topic of the attempts (which are largely succeeding) to convert us from an ownership society to a rentier society:
http://comments.gmane.org/gman...
"I worry that 100 years from now that 99% of foods will be GMO's [Genetically
Manipulated/Manufactured Organisms] and hence under copyright. . .and this
will enforce a copyright-powered hunger/starvation/malnutrition of the body
just as current copyright extensions are powering such for the mind.
The goal of WIPO is that EVERYTHING should HAVE to be paid for, plus a
royalty for the intellectual property. . .at a time when everyone COULD
have everything pretty much free of charge from replicator technology.
100 years ago the atom-powered Nautilus and atomic bomb were fiction,
only 50 years later the Nautilus was being built, and it sailed into
my own home town and their crew came to my school. . . .
Do you REALLY think it won't be even more different in the future?
But WIPO still wants to charge hugely for replicated food, just as
it does for replicated books."
The music industry has a long and sordid history of ripping off the artists...in the main there's nothing to negotiate because the music publishers own the republishing rights.
Book publishers contract to publish the book in one format.
(The same negotiations often have to take place for paperback rights as well, so it's not like this is something new; in the main it is simple boilerplate contracting with the author, the author's agent, or the estate.) The renegotiations are hard because the publishers are greedy.
Of course all of your basic /. 'intellectual property is theft' technomarxists who never had to make their living off their own intellectual property couldn't be arsed to comprehend this...while musicians can sometimes eke out a living playing live (when they still own their own music, that is), there's not a lot of call for authors to read their books in front of adoring crowds night after night...
While book scanning can be done by machine, the machinery is going to be expensive and complicated. Your typical bibliophile can't afford it. Scanning a book by hand can take hours, even with a V-shaped book-scanning fixture and two cameras.
The technology for digitizing audio is much easier to acquire and use. Any audiophile can afford the hardware and software to do a tolerable audio rip. Anyone can set up a rip, or several rips, and do real work while the rip takes place in the background. The quality might not be to audiophile standards, but will satisfy most casual users.
Even after you've created an "ebook" of page images, it isn't really suited for use in modern ereaders. For that you need an ePub format, or something similar. The text has to literally be in text format to allow reformatting. A decent modern ebook can adapt the text to different display sizes and different type sizes. This is hard to do.
Compare a typical book produced by Project Gutenberg and a typical book scanned into the Internet Archive. Gutenberg produces true ePubs consisting of text possibly sprinkled with digitized illustrations. Gutenberg might start with automatic text recognition, but its books go through a distributed proofreading process before they're released.
While I value what the Archive does (any digitization is better than none at all) I've discarded most ePubs I've downloaded from them. There are simply too many typos in the text recognition. Their scanned raw images and PDFs are usable, though they lack the flexibility of true ebooks.
One thing with old books is that their value is small. Apple wants to sell everything for $10. I can go to a used book store and buy an old book for a couple of dollars. I can go to Amazon and buy out-of-copyright books for a couple of dollars. I can go to Amazon and buy new books for a few dollars. Even at Amazon, though, many older books are more expensive than what one can find elsewhere. The difference between books and songs is that iTunes provided a new way to monetize old music: sell single tracks to those who won't buy the used music at the resale shop. It is simple and fast, and converting a track to digital is not hugely expensive. Here is another difference: music no longer has DRM. I have many tracks from iTunes because it was always possible to remove the DRM. I have few books from iBooks because the only place I can read them is on an Apple device. Amazon at least has the advantage of having readers on many devices. So if one buys an older book on iBooks, one pays more, one can only read it on limited devices, and publishers have to pay huge fees to Apple.
I worked in the music industry (in IT). I have no idea where the idea came from that the music publishers didn't have to renegotiate contracts to get digital rights to the music. In reality, when digital rights became important, the music companies spent a huge amount of time and money having teams for at least a decade tracking down rights-holders and negotiating digital rights in order to sell their back catalog, and of course made sure that their new contracts covered selling through the digital service providers. Book publishers have essentially the same legal challenge (though admittedly the details are different).
What is really different is the production logistics.
Music has been digitally produced for a very long time, using open standard formats, and for pre-digital material it's relatively easy to digitize audio (and video) from master tapes, so you only need to do "work" to deal with some very old, obscure media, which is only done selectively. And the music publishers have built systems that are very, very good at managing and format converting huge libraries of audio and video. So, 99% of the time, digitally selling back-catalog music and video is logistically fairly easy - QA, package, price, and send the files to the digital service providers.
Books, however, have been authored in a series of random formats, and for older books there's only the physical book or manuscript and nothing digital. Which means that you often need to physically scan every page in the book/manuscript, OCR it, clean it up, QA the result, etc. And even for the digitally authored books, you need to track down whatever specific physical media and formats each publisher or author used (MacAuthor on 3.5" floppy, LaTeX, MS Word 3 on 5.25" floppy, etc.). So, overall, physically and logistically really complex to deal with for every single back-catalog book.
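To give a feel for the "clean it up" step, here is a minimal Python sketch of one common chore: undoing the line breaks and end-of-line hyphenation that OCR carries over from the printed page. The function name `clean_ocr_text` and the sample text are made up for illustration; a real publisher's pipeline would do far more than this.

```python
import re

def clean_ocr_text(raw):
    """Unwrap OCR output: rejoin words hyphenated across line breaks
    and merge the lines of each paragraph into one line.

    Assumption: paragraphs are separated by blank lines in the OCR output.
    """
    # "pub-\nlisher" -> "publisher"
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Split on blank lines, then collapse whitespace inside each paragraph.
    paragraphs = re.split(r"\n\s*\n", text)
    return "\n\n".join(" ".join(p.split()) for p in paragraphs if p.strip())

sample = "The pub-\nlisher scanned\nthe book.\n\nThen it was\nproofread.\n"
print(clean_ocr_text(sample))
```

Note that the hyphen rule will also wrongly join a genuinely hyphenated word that happened to break at the end of a line, which is one reason the QA pass cannot be skipped.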
Look at what Project Gutenberg has produced - an amazing collection, but it required a massive investment of (volunteer) effort to process the books into digital formats.
Well, we are talking about keeping historical books around, in form as close to the originals as possible, right?
Well, my impression of TFA was that we were talking about recent books (20th century) which were out-of-print or unavailable -- not "historical" preservation. And it's about making such texts available to a mass-market audience for purchase. It's not about annotating manuscripts or some sort of academic analysis of manuscripts from hundreds of years ago.
Also, TEI keeps the semantics around, not just the fact that sentence such-and-such is printed in Garamond.
I don't understand. Your first sentence says you want to keep the "form as close to the originals as possible" but now you want to add tags, metadata, and semantic information which was not present in the original texts. Which one is it?
I'm not saying that TEI is bad -- I'm saying it's actually about adding information to make old texts more navigable and superimposing a particular type of analysis on them, not just preserving things in original form. Most people who just want to buy a copy of a book from 1990 in an ebook form don't care about some complex set of metadata that could be used for linguistic analysis or something. They just want the text, retypeset in an electronic form.
But you can always go down the lossy road and re-typeset it for whatever reason
Methinks you do not know what "lossy" means. In the case of preserving old books in electronic format, the full "non-lossy" version would be something like a set of full-color images, showing all details of page layout, font usage, spacing, placement of figures and images, etc. (Even there, for old manuscripts you might want to preserve binding information, gathering structure, etc. which would be lost in just a set of images.) That's desirable for analysis of a manuscript or something very old where reconstructing the exact layout of things is important.
Retypesetting the text is "lossy," in that it loses layout, typesetting, and pagination information -- but, for most people with standard books, they care more about the text than the layout. Also, it's helpful to process texts in this way for electronic formats, because it allows readers to take advantage of tools in readers like changing font size or other layout options.
What you're talking about with TEI is just as "lossy" as retypesetting (though some elements of page layout can be preserved, if desired), but you're also talking about adding in information that wasn't present in the original text. Kinda like ripping an mp3 track off of a vinyl record, and then recording some optional commentary over top of it: "If you listen here, you can hear the bridge, followed by a return to the refrain with backup vocals," etc.
TEI has its place, but I'm not sure it's particularly relevant to the simple idea of making ebooks available from books that may have been published a couple decades ago.
Try buying Spinoza's Philosophy in paper: it's expensive, but you can get it at Gutenberg for free.
Spinoza in a modern English translation with a proper introduction and notes will save the reader time and pain. I have tried reading the classics in Gutenberg, but they always send me back to Penguin Books and other sources.
Tax Executives Institute?
Hint: Google's search results are personalized. What was the first hit for you might not be the first hit for somebody else.
Why aren't there more e-books? Is it because there aren't the resources needed to produce them inexpensively?
Folks: if people wanted e-books, then the industry would have come up with a machine to produce them. Henry Ford didn't do anything special except notice the huge demand in the public for automobiles. If there was a demand for e-books, someone would have pulled a Henry Ford and invented a way to produce them inexpensively too.
I think there's no demand principally because it's hard to read e-books on a computer display due to display issues. Writing is high contrast and susceptible to visual perceptions of pixellation. People don't see the lack of definition in a scenic image, but the eye does notice the wanderings of the edges of black letters against white backgrounds. When was the last time that you chose to receive a paper by fax? 300dpi is about the minimum that I can stand to read, and on a piece of paper that works out to be 2200 x 3000 pixels. How many devices are you aware of that have that resolution and are larger than 7 x 10 inches (18 x 25 cm)?
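As a quick check on the arithmetic above, pixel dimensions are just inches times dots per inch. A small Python sketch (the function name `pixels_needed` is only for illustration):

```python
def pixels_needed(width_in, height_in, dpi=300):
    """Return (width, height) in pixels for a page of the given size in inches."""
    return round(width_in * dpi), round(height_in * dpi)

# The 7 x 10 inch page mentioned above, at 300 dpi.
print(pixels_needed(7, 10))    # (2100, 3000)

# A US Letter page (8.5 x 11 inches) for comparison.
print(pixels_needed(8.5, 11))  # (2550, 3300)
```

So a 7 x 10 inch page at 300 dpi comes to 2100 x 3000 pixels, close to the figure quoted, and a full Letter page needs somewhat more.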
If you add to that other issues of convenience, I think you'll have your reasons why e-books haven't yet taken off. To read a book, you spend about three seconds picking it up off the shelf and opening it. To read an e-book, you grab your tablet/computer/whatever, power it up, find your application for reading the book, and swipe through the screens until you reach the right spot. With a book, I can thumb through the pages to find my place, and I can insert a bookmark (or sticky notes, if I need more than one). With my e-book app, I probably get one bookmark. I can write and highlight in a book with a pen. I might be able to highlight in the app, and if I can, then I have to remember the command for doing that and the gesture for marking out what I want to highlight. If I use books, I can put as many open books out on the table as will fit. If I use the usual e-book app, I can't look at more than one book at a time.
People have had hundreds of years coming up with the form of books: it will be a few more years before the e-book and tablet people will better it.
Indeed, for e-books, "preserving layout" beyond just keeping the paragraphs (and sections, so that no individual section is too big for the reader's RAM) separated is a detriment, as it interferes with readers' ability to change the layout themselves for various reasons.
My older family members, for instance, like to change the font to a very large size, something that is not possible if the publisher spends too much effort getting the typesetting just right and freezing it in instead of allowing the device to do it on the fly.
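A toy illustration of laying text out on the fly, using Python's standard `textwrap` module: as long as the paragraph is stored as plain text, the same words can be re-wrapped to any line width. The `reflow` helper and the column counts are assumptions for the example, with a narrow column standing in for a very large font on a small screen; real e-readers do this with proper font metrics, not character counts.

```python
import textwrap

def reflow(paragraph, columns):
    """Re-wrap one paragraph of plain text to the given line width."""
    # Collapse any existing line breaks first, then wrap afresh.
    return textwrap.fill(" ".join(paragraph.split()), width=columns)

text = ("People have had hundreds of years coming up with "
        "the form of books.")

print(reflow(text, 40))  # "small font": more characters per line
print()
print(reflow(text, 20))  # "large font": fewer characters per line
```

A page image or frozen typesetting has no equivalent of this: the line breaks are baked in.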
I own a literary agency that has specialized in SF/F for over sixty years. I can speak to these issues authoritatively.
What the situation is with books printed by publishers is very simple: the publishers do not own the books. Just some rights, under some conditions.
Typically, an author engages a literary agency to represent their work to publishers. This includes submitting the work to publishers who normally handle that particular genre or otherwise might specifically be interested at the moment; negotiating a contract with the publisher; and finding a balance between getting the work published and retaining any rights not specifically purchased by the publisher. The publisher in turn relies on the expertise the specific agency has in a particular genre or genres. Most large publishers will not deal directly with an author. Most small publishers give it up after trying it for a while -- it's definitely "its own thing."
From the author's POV, the agency brings expertise on rights negotiation, knowledge of publishers, knowledge of foreign sales (either directly or through associate agencies), tax issues, and direct access to editors at the publishers they deal with. From the publisher's POV, they don't have to deal with authors who have no knowledge of the legal territory, and whom, in most cases, they would never consider publishing anyway.
The agency also performs triage: the publisher can be assured that the agency felt the work, and the author, was worth representing, and that the work is in a genre the publisher wishes to address, and that takes a huge amount of cruft off the table (look at the self-published stuff on Amazon to see what I mean there. The vast majority of it is truly awful.) It does not guarantee a sale; but it makes it a great deal easier on the publisher, and it does make sales easier for all parties involved.
So when a deal is struck, what publishers purchase from the agency on behalf of the author is the right to produce a work in a particular format under negotiated conditions. For instance, a hardcover edition. That does not give them the right to produce it as an audiobook, or a softcover, or a movie, or a play, or a radio show, etc., although they may also negotiate those rights -- each contract is specific, and the better the contract, the more specific it is.
Most such contracts are quite explicit about what rights they confer, under what conditions, and for how long. Others go even further and are explicit about what rights they do not confer.
Another issue here is whether something is in print. Again, the initial contract negotiates what happens when and if the book goes out of print, and what defines that. The rights may revert immediately; they may revert after a period of time; they may not revert at all; they may revert if an additional print run is not done within X period of time, etc. It's all about the contract the publisher accepts.
The specific rights negotiated, particularly on older titles, will vary by publisher and by agency; the most careful agencies have been reserving electronic reproduction rights since the 1950's. At the time, it was a "so what" issue to the publishers. Today, it isn't.
So in many cases, unless the copyright for a book has expired completely, the author controls the e-book rights through the agency representing the author. In others, if those rights were not reserved, the publisher has control (this should be relatively uncommon.) If copyright has expired completely, then the works are in the public's hands.
With the recent changes in copyright law in favor of longer copyright terms, a huge number of what we think of as modern works are still under the control of whatever contracts are extant, or the rights have reverted to the author, the author's estate, or the author's representative (typically a literary agency.)
When e-books hit the market, there was a great upset in the publishing industry, and they suddenly became extremely conservative on several front