War and Nookd — eBook Regex Gone Haywire
PerlJedi tips a story that highlights one of the downsides to ebooks. A blogger who recently read Tolstoy's War and Peace on his Nook stumbled upon some odd phrases, such as: "It was as if a light had been Nookd in a carved and painted lantern..." After seeing the word 'Nookd' a few more times, he found a dead-tree version of the book and discovered that the word was supposed to be 'kindled.' Every instance of the word 'kindle' in the ebook had been replaced with 'Nook.'
"The Superior Formatting Publishing version isn’t a Barnes and Noble book, so this isn’t the work of a rogue Nook marketer from B&N. Rather, it’s likely that Superior Formatting Publishing ported its Kindle version of War and Peace over to the Nook — doing a search and replace to make sure that any Kindle references they’d inserted, such as in the advertising at the end of the book about their fine Kindle products, were simply changed to Nook. The unwitting hilarity of a publisher doing a 'find and replace' and accidentally changing the text of a canonical work of Western thought is alarming. Many versions of e-books are from similar outfits, that distribute public domain works formatted for Kindle or Nook at the lowest possible prices. The great democratizing factor of the ebook formats – that anyone can easily distribute – can also mean that readers can never be quite sure that they are viewing the texts as the author intended."
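For anyone wondering how "kindled" ends up as "Nookd", here is a minimal Python sketch. Nobody outside the publisher knows what tool or pattern was actually used, so both function names below are hypothetical and the case-insensitive substring replace is only an assumption that happens to reproduce the symptom.

```python
import re


def naive_port(text: str) -> str:
    """Blanket, case-insensitive substring replace (assumed cause of 'Nookd')."""
    return re.sub(r"kindle", "Nook", text, flags=re.IGNORECASE)


def careful_port(text: str) -> str:
    """Replace only the capitalized product name when it stands as a whole word."""
    return re.sub(r"\bKindle\b", "Nook", text)


if __name__ == "__main__":
    tolstoy = "It was as if a light had been kindled in a carved and painted lantern"
    advert = "See our fine Kindle products"
    print(naive_port(tolstoy))    # the verb gets mangled
    print(careful_port(tolstoy))  # the verb is left alone
    print(careful_port(advert))   # the product name is still swapped
```

Even the careful version is not safe on its own: a sentence that opens with the verb "Kindle" would still be rewritten, so a human should review every hit before the file ships.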
But I went back and searched every kindle and cranny to set every instance of the word back to kindle to fix it.
I'm only human.
My work here is dung.
Amazing tools such as diff and grep would probably astonish them.
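In the same spirit as diff, a rough Python sketch of a word-level comparison between a trusted reference text and the ported copy. The function name is made up for illustration, and it assumes you have a clean reference edition to compare against in the first place.

```python
import difflib


def word_changes(reference: str, ported: str) -> list[tuple[str, str]]:
    """Return (reference_words, ported_words) pairs wherever the two texts differ."""
    ref_words = reference.split()
    new_words = ported.split()
    matcher = difflib.SequenceMatcher(a=ref_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record the differing stretch from each side as plain text.
            changes.append((" ".join(ref_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


if __name__ == "__main__":
    reference = "a light had been kindled in a carved and painted lantern"
    ported = "a light had been Nookd in a carved and painted lantern"
    for before, after in word_changes(reference, ported):
        print(f"{before!r} -> {after!r}")
```

Run over the whole book after a search and replace, a list like this should contain only the advertising copy; anything from the novel itself is a red flag.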
"Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
"I accidentally Western Literature, is that bad?"
It's not just intentional malice you need to look out for but also just pure distilled stupidity.
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
sed -i s/wand/wang/g Harry\ Potter*
Don't blame me, I voted for Kodos
Unless it is in Russian. Any translation runs the risk of not being "as the author intended".
Starships were meant to fly, Hands up and touch the sky - Nicki Minaj
So, this story is definitely an amusing anecdote, but I feel like TFA has the wrong takeaway. The fact is, while this specific issue is obviously e-book related, the overall problem of poor-quality, low-cost public domain publications is in no way specific to e-books. There have always been low-budget publishing houses that print poorly edited, poorly translated versions of public domain works. Spend some time digging around used book sales and you'll find an endless supply of these, most notably from the 1960s and 1970s.
No, the sad part is full-price books from Amazon with incoherent pagination, horribly over-recompressed JPEGs and a verdant sea of spelling errors. I'd give Project Gutenberg a pass for those sorts of things, except that the majority of PG books I've read are actually pretty well done.
When I'm paying top dollar for a product, I'd like some attempt at quality control....
Faster! Faster! Faster would be better!
They really shouldn't mess with the clbuttics.
:wq
Just more of the same clbuttic errors.
(Hint: "ass" was one of the 13 words.)
But soft, what light through yonder Linux breaks?
It is the east, and Juliet is the Oracle(TM).
Arise, fair Oracle(TM), and kill the envious moon,
Who is already sick and pale with grief
That thou, her maid, art far more fair than she
Every novel should have an MD5 hash....
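A minimal sketch of that idea in Python, assuming someone publishes a digest of a reference edition to check against (no such registry is described in the story). The function name is hypothetical. MD5 is used because the comment names it, though it is no longer collision-resistant, so SHA-256 would be the safer pick against deliberate tampering.

```python
import hashlib


def text_digest(text: str, algorithm: str = "md5") -> str:
    """Hex digest of the UTF-8 encoded text using the named hashlib algorithm."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    original = "It was as if a light had been kindled in a carved and painted lantern"
    ported = original.replace("kindle", "Nook")
    # Any change to the text, however small, gives a different digest.
    print(text_digest(original))
    print(text_digest(ported))
    print(text_digest(original) == text_digest(ported))
```

The catch is that a hash only tells you that something changed, not what. Legitimate differences such as a different translation, edition, or even whitespace would also change it, so it complements a diff and does not replace one.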
It's a dangerous world of low cost ebooks out here
Nah, some of the expensive ebooks are worse; I've seen a number of people complain about e-books of recent high-priced novels where they've clearly OCR-ed the print book rather than use the actual digital text it was created from, because it's full of uncorrected OCR errors or 'corrections' to the OCR errors which are even further from what the text should say.
This is the problem exactly. I can deal with odd formatting from a PG book (though as you say, most are fine); what pisses me off is recent, full price ebooks where there has obviously not been the slightest attempt at editing or typesetting. One I got recently had a consistent problem where quoted text changed font & size after the first paragraph, which is pretty jarring. A full price book on my Nook should be a better experience than PG or scanned & OCR'd pdb were on my old Palm Pilot but sometimes these types of glitches just take you out of the experience & actually seem worse.
The Oatmeal's book "5 Very Good Reasons to Punch a Dolphin in the Mouth" I luckily got out of the library (through Overdrive) - the images are so small as to be unreadable, both on the PC & iPad. If you look at the Play store, there are lots of good reviews, but they're all from Goodreads & such for the paper version. I'm sure it's funny, if you can read it; if I'd paid money for this pile of bits I'd be pissed. Does the publisher not own an iPad or a Kindle Fire? Did they not load it on one single device & say to themselves, "hmm, this really sucks, let's fix it"?
There is a Wikipedia article about this issue:
http://en.wikipedia.org/wiki/Scunthorpe_problem
"The problem was named after an incident in 1996 in which AOL's dirty-word filter prevented residents of the town of Scunthorpe, North Lincolnshire, England from creating accounts with AOL, because the town's name contains the substring cunt.[1] Years later, Google's filters apparently made the same mistake, preventing residents from searching for local businesses that included Scunthorpe in their names.[2]"
There is also a stub article about a specific instance of the replacement effect: http://en.wikipedia.org/wiki/Medireview
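The difference between the two filtering styles can be sketched in a few lines of Python. The one-word blocklist and both function names are hypothetical, chosen only for illustration; this is not how AOL's or Google's filters were implemented.

```python
import re

BLOCKLIST = {"ass"}  # hypothetical blocklist, for illustration only


def blocked_by_substring(text: str) -> bool:
    """Flag the text if any blocked word appears anywhere, even inside another word."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)


def blocked_by_whole_word(text: str) -> bool:
    """Flag the text only if a blocked word appears as a complete token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(token in BLOCKLIST for token in tokens)


if __name__ == "__main__":
    for sample in ("A classic novel", "what an ass"):
        print(sample, blocked_by_substring(sample), blocked_by_whole_word(sample))
```

Whole-word matching gets rid of the Scunthorpe-style false positives, but it is easier to dodge with creative spelling, so real filters end up trading one kind of error for the other.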
You do realize that you can actually post the word "nigga" on slashdot, right?
apparently AKabral is one of many avatars of Ironyman.
oh, and the word being referred to is nigger