Slashdot Mirror


Scan a Book In Five Minutes With a $199 Scanner? (teleread.com)

New submitter David Rothman writes: Scan a 300-page book in just five minutes or so? For a mere $199 and shipping — the current price on Indiegogo — a Chinese company says you can buy a device to do just that. And a related video is most convincing. The Czur scanner from CzurTek uses a speedy 32-bit MIPS CPU and fast software for scanning and correction. It comes with a foot pedal and even offers WiFi support. Create a book cloud for your DIY digital library? Imagine the possibilities for Project Gutenberg-style efforts, schools, libraries and the print-challenged as well as for booklovers eager to digitize their paper libraries for convenient reading on cellphones, e-readers and tablets. Even at the $400 expected retail price, this could be quite a bargain if the claims are true. I myself have ordered one at the $199 price.

107 of 221 comments

  1. CCD on a stick by Anonymous Coward · · Score: 1

    You still have to turn pages manually, I had expected they would have automated that (well, perhaps better if you still want to return the book to the library later).

    Any digital camera on a tripod can do the same thing.

    1. Re:CCD on a stick by naughtynaughty · · Score: 4, Informative

      A digital camera on a tripod PLUS ...
      Proper lighting
      Foot pedal interface
      Lots of software to take the pictures, manipulate the images and stitch them all together into an eBook
      So a bit more than just a digital camera and a tripod.

    2. Re:CCD on a stick by Applehu+Akbar · · Score: 2

      "Any digital camera on a tripod can do the same thing."

      Both of the smartphone OSes have apps for that, and they perform just as well as a digital camera on a tripod, and rival a good flatbed. Back in the nineteen hundreds, if I wanted to save an article I was reading at the library, I had to check out the volume and bring it home, or bring it to the reserve librarian, who would make a not-very-good paper copy for me at a buck a page - assuming that some horrible copyright objection wasn't raised.

      Now, wherever I might be, I just whip out my iPhone and run JotNot, which snaps a picture of each page and saves it as a PDF, just like a flatbed scanner. I love living in the future!

    3. Re:CCD on a stick by stms · · Score: 1

      I wonder why something like this isn't included
      https://www.youtube.com/watch?...
      it doesn't seem that mechanically complex.

    4. Re:CCD on a stick by arglebargle_xiv · · Score: 1

      You still have to turn pages manually, I had expected they would have automated that (well, perhaps better if you still want to return the book to the library later).

      Any digital camera on a tripod can do the same thing.

      In theory, yes, in the same way that anyone can build their own home from raw materials. Scanners like this have been around for a while, and if you can afford the five-figure price tag they do a good job. What these guys have done is lower the cost from five figures to three. If it works as advertised (in other words, as well as a $50,000 equivalent), it's a pretty amazing piece of technology. I'd really like to see some independent, third-party reviews of how well it performs before I go out and buy one, though. Even something like curvature correction is a major task in image processing when you have to deal with things like line diagrams.

    5. Re:CCD on a stick by naughtynaughty · · Score: 2

      Video cameras don't have high enough resolution to produce good quality scans of printed material. A standard 300dpi scan of an 8.5 x 11" sheet of paper results in about 8.4m pixels. This particular device claims it has 16m pixels, which would be about right to cover a scanning surface area that appears to be bigger than an 8.5 x 11" sheet. Another approach might be to detect when a page has been turned using a low resolution video sensor and use that to trigger the higher resolution camera.
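      The resolution arithmetic above can be checked with a short sketch. The function names are made up for illustration, and the 17 x 11" two-page spread is only an assumed scanning area, not a published spec of the device; the 16m pixel figure is the one claimed in the comment.

```python
def pixels_for_scan(width_in: float, height_in: float, dpi: int) -> int:
    """Total pixels needed to capture a page of the given size at the given dpi."""
    return round(width_in * dpi) * round(height_in * dpi)


def effective_dpi(sensor_pixels: int, width_in: float, height_in: float) -> float:
    """Approximate dpi if the sensor exactly covers the area.

    Assumes the sensor has the same aspect ratio as the area being scanned.
    """
    return (sensor_pixels / (width_in * height_in)) ** 0.5


# One US letter page at 300 dpi: 2550 x 3300 pixels.
print(pixels_for_scan(8.5, 11, 300))  # 8415000

# Hypothetical two-page spread (17 x 11 inches) on a 16-megapixel sensor.
print(round(effective_dpi(16_000_000, 17, 11)))
```

      Under those assumptions a 16m pixel sensor lands a little under 300 dpi on a full letter-size spread, which fits the "about right" estimate.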

    6. Re:CCD on a stick by doccus · · Score: 1

      I hate reading scanned books. The barely legible text curls up near the binding if the scanner was trying to preserve the book. Works great if you rip the pages out before scanning. Kinda not recommended with first edition Dickens though... :-)

    7. Re: CCD on a stick by Michael+Qbs · · Score: 1

      Seems the product is still in its crowdfunding campaign. Do you already have it?

    8. Re:CCD on a stick by Michael+Qbs · · Score: 1

      The most important thing is how to improve the scan result using software.

    9. Re:CCD on a stick by gzuckier · · Score: 2

      You still have to turn pages manually, I had expected they would have automated that (well, perhaps better if you still want to return the book to the library later).

      Any digital camera on a tripod can do the same thing.

      Heck. Get a fine tooth saw blade and separate the pages from the spine, then load them into a scanner with a page feed.

      --
      Star Trek transporters are just 3d printers.
    10. Re:CCD on a stick by gzuckier · · Score: 1

      "Any digital camera on a tripod can do the same thing."

      Both of the smartphone OSes have apps for that, and they perform just as well as a digital camera on a tripod, and rival a good flatbed. Back in the nineteen hundreds, if I wanted to save an article I was reading at the library, I had to check out the volume and bring it home, or bring it to the reserve librarian, who would make a not-very-good paper copy for me at a buck a page - assuming that some horrible copyright objection wasn't raised.

      Now, wherever I might be, I just whip out my iPhone and run JotNot, which snaps a picture of each page and saves it as a PDF, just like a flatbed scanner. I love living in the future!

      Get the book you want in audio format, then run it through voice recognition software.

      --
      Star Trek transporters are just 3d printers.
    11. Re:CCD on a stick by 1u3hr · · Score: 1

      Check if Library Genesis has an epub of it.

  2. Welcome to 2006 by ShooterNeo · · Score: 4, Insightful

    You've been able to do this for years and years a different way.

    1. Get a sheet fed scanner like a Fujitsu ScanSnap ($400)
    2. Cut the binding off the book
    3. Place the stack of pages into the scanner
    4. Get a coffee

    And you're done, the thing's 600 DPI and does both sides in the same pass. It creates a PDF directly, and you then want to OCR the PDF, running a sharpen filter on the text, and decide on how much you want to compress the PDF. A 1000 page textbook ends up being about 700 megabytes, in crystal clear quality.

    1. Re:Welcome to 2006 by DavidRothman9947 · · Score: 5, Insightful

      Thanks, but what about those of us who might prefer nondestructive scanning? Also consider other factors--for example, the speed and quality of the scans, as well as the price. The Czur appears to be several times faster than a $600 model from Fujitsu that allows nondestructive book scans. If you're scanning lots of books, that won't be a trivial detail. As for quality, the Fujitsu is good but not nirvana. Let's see if the Czur will do better.

    2. Re:Welcome to 2006 by DrXym · · Score: 2

      Yes but... you destroy your book in the process. The ideal scanner would be one which allows a book to be scanned without destroying it and could correct for page distortion and other artifacts.

    3. Re:Welcome to 2006 by gnupun · · Score: 1

      2. Cut the binding off the book

      No need to cut anything off with this scanner (if you've seen the demo YouTube video). So will users just check out books from the university/public library and scan them at home? Later they can upload them to bittorrent or other sharing sites.

      Is it likely this device will be banned because it allows easy circumvention of copyright laws?

    4. Re:Welcome to 2006 by Maxo-Texas · · Score: 3, Insightful

      Several comments:

      1 So do VCR's.

      2 Most books are available within a day on multiple sites.

      3 Most books are available within days at libraries.

      4 This only slightly speeds up/makes the process easier. Anything you can read can be transcribed.

      5 80 people can transcribe 80 different books quickly.

      Who knows- they might try- but it seems like a waste of their money to me.

      --
      She was like chocolate when she drank... semi-sweet at first and then increasingly bitter.
    5. Re:Welcome to 2006 by gnupun · · Score: 2

      4 This only slightly speeds up/makes the process easier. Anything you can read can be transcribed.

      The speedup is very high... any book scanned in an hour at zero cost (other than the one-time $199 scanner cost). Try transcribing manually (for example typing the contents of a book into your editor) and see how long and tedious a task that is.

      If the OCR quality is as good as they say it is, the book's pdf file size will be really small (less than 50 MB).

    6. Re:Welcome to 2006 by Aereus · · Score: 1

      The problem I foresee with this is for books that won't stay open on their own, or ones that barely do and have significant page curl. Still possible with the foot pedal I guess, but a lot more annoying.

    7. Re:Welcome to 2006 by gnupun · · Score: 1

      Go to 1:13 in the video: https://www.youtube.com/watch?... . They appear to have solved page curl via the "Flattening Curve" process.

    8. Re:Welcome to 2006 by drinkypoo · · Score: 1

      I saw a DIY on this a while back. You build an acrylic cube with two adjacent open faces, and a 90 degree book stand. Then you stick two cameras in the cube...

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    9. Re:Welcome to 2006 by naughtynaughty · · Score: 1

      I have a ScanSnap; its sheet feeder won't hold an entire book, and the process of scanning hundreds of pages each from many books will generate substantial wear on a ScanSnap. There also tend to be misfeeds that you need to manually fix. ScanSnap is great at what it does, but I wouldn't want to destroy books and manually feed them through it if a cheaper, faster, non-destructive method existed.

    10. Re:Welcome to 2006 by trout007 · · Score: 2
      --
      I love Jesus, except for his foreign policy.
    11. Re:Welcome to 2006 by houghi · · Score: 2

      Prototype 1 could scan the majority of books without damage, but may tear one or two pages in some books. Out of 50 books tested, 45% had one or two of their pages either torn or folded. This is a very early prototype and there are many areas for improvement in the design.

      --
      Don't fight for your country, if your country does not fight for you.
    12. Re:Welcome to 2006 by Panoptes · · Score: 3, Interesting

      There are two curses of modern book publishing that cause problems whatever hardware you use. The first is so-called 'perfect binding' in which the folds of page gatherings, through which the sections are traditionally sewn together, are instead sliced off and glued to make a rigid spine with an exceedingly narrow angle of opening; the second is the use of low-grade, thin paper with high show-through that mucks up the scan.

      The best software I've found to scan and collate is Softi ScanWiz. With it you may scan one stack of pages, flip the stack and scan the other side - the program then shuffles the page images into the correct order. It also automatically adjusts brightness and contrast so as to minimise ink show through.
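      The page shuffling described above can be sketched roughly as follows. This is not ScanWiz's actual code, just an illustration that assumes flipping the whole stack means the back sides come through the scanner in reverse order; the `collate` name is made up.

```python
def collate(fronts, backs_after_flip):
    """Interleave front-side and back-side scans into reading order.

    fronts:            images of pages 1, 3, 5, ... in scan order
    backs_after_flip:  images of the back sides, scanned after flipping the
                       whole stack, so they arrive last sheet first
    """
    if len(fronts) != len(backs_after_flip):
        raise ValueError("front and back passes must have the same number of sheets")
    ordered = []
    # Undo the reversal caused by the flip, then alternate front/back.
    for front, back in zip(fronts, reversed(backs_after_flip)):
        ordered.extend((front, back))
    return ordered


print(collate(["p1", "p3", "p5"], ["p6", "p4", "p2"]))
# ['p1', 'p2', 'p3', 'p4', 'p5', 'p6']
```

      If the back sides were instead fed in the same sheet order as the fronts, the `reversed` call would have to go.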

    13. Re:Welcome to 2006 by smchris · · Score: 1

      I have run over 1200 pounds of paper from townhouse through Canon to recycling. A few thousand books, mags and newsletters. Basically, I agree with the premise that it can be done with a $400 autofeeder and a spine cutter and I also agree with the objections. Is this a webcam on a tripod and something like gscan2pdf? Maybe. How well the software handles things like page curl is important to how worthwhile it is. But he is only asking a fraction of what an autoscanner setup would cost so it is not that expensive and it might even be a good supplement for the books you do not want to destroy. Tempting. It is not like autofeeders do not require some attention at hand too and you have to think about the cost of those replacement rollers.

    14. Re:Welcome to 2006 by Antique+Geekmeister · · Score: 1

      For older, more fragile paper, such as non-acid-free paper that's been sitting out in sunlight at all, or well-thumbed technical manuals, the paper feed will shred them.

    15. Re:Welcome to 2006 by shaitand · · Score: 1

      The current process is removing the spine and running through a document feeder not typing manually.

    16. Re:Welcome to 2006 by shaitand · · Score: 1

      They claim... no before and after evidence presented.

    17. Re:Welcome to 2006 by ncc74656 · · Score: 1

      They appear to have solved page curl via the "Flattening Curve" process.

      Also, if you're using your fingers or thumbs to hold the book open, the software is supposed to erase them from the image, so there's less for the page-flattening algorithm to do.

      --
      20 January 2017: the End of an Error.
    18. Re:Welcome to 2006 by Maxo-Texas · · Score: 1

      It only speeds it up one time when you have personal access to the book.

      Perhaps I should be clearer above, 80 different people can transcribe/scan 80 different books much faster than you can transcribe 80 books (even with the device).

      This device is more targeted at the individual user platform shifting books.

      My main point is that trying to pass a law against it is unlikely because it is unlikely to make pirated books available any faster than they are now.

      --
      She was like chocolate when she drank... semi-sweet at first and then increasingly bitter.
    19. Re:Welcome to 2006 by shaitand · · Score: 1

      Slashdot was actually hiding some of the conversation.

      But the current non-destructive method still isn't transcription regardless of what the GP said. It is a phone + app to take pictures, then apply a few tools to ocr and combine the text into your choice of digital file. This basically just puts the camera on a stand, provides a footpedal, and automates the toolchain to perform the same process. Not really a huge speedup.

      Honestly though, most pirated books come from simply cracking the encryption on the popular formats from B&N and Amazon. Which gives much higher quality, eliminates OCR errors, and preserves the book's digital structure, TOC, etc. You can then enjoy all the advantages including having your place sync'd between devices and so forth. There are dumps of every kindle book on Amazon.com periodically.

      Not to say publishers won't attack this nonsensically, but this is how people pirated books in the 90s, not so much today. This really is only useful for people who want to legitimately back up and enjoy their own personal books.

    20. Re:Welcome to 2006 by gzuckier · · Score: 1

      You've been able to do this for years and years a different way.

      1. Get a sheet fed scanner like a Fujitsu ScanSnap ($400) 2. Cut the binding off the book 3. Place the stack of pages into the scanner 4. Get a coffee

      And you're done, the thing's 600 DPI and does both sides in the same pass. It creates a PDF directly, and you then want to OCR the PDF, running a sharpen filter on the text, and decide on how much you want to compress the PDF. A 1000 page textbook ends up being about 700 megabytes, in crystal clear quality.

      I vaguely recall something recently about an IR scanner or something that could be focused finely enough to read the pages sequentially down through a closed book.

      --
      Star Trek transporters are just 3d printers.
    21. Re:Welcome to 2006 by ShooterNeo · · Score: 1

      That's incredible that it is even possible, though I suspect it might not ever become the common way to do this (the common way to do this is the author of the book just exports his digital file to a format compatible with e-readers, like word->mobi/epub or pdf. All new books are being published like this)

    22. Re:Welcome to 2006 by david_thornley · · Score: 1

      Assuming there's large enough margins on the pages, it's possible to rebind the book, perhaps in the original binding, and it'll be nearly as good as before.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  3. Ob by Hognoxious · · Score: 1, Funny

    Much cheaper, same functionality

    P.S. If anyone from Staples is reading, your website is a bag of arse.

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    1. Re:Ob by Hognoxious · · Score: 1

      This one came up first when I googled shredders. http://www.staples.com/InfoGua... But it wouldn't let me link to a specific product, just a list. Just picked it out of the history and now it's showing a printer.

      Just refreshed it & it's showing

      {"pricing":{"id":"StaplesUSCAS/en-US/1/CL167883/1781826","listPrice":99.99,"finalPrice":59.99,"savings":0,"nowPrice":0,"instantSavings":40

      ... and much more of the same.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    2. Re:Ob by Chris+Mattern · · Score: 1

      Odd. The link takes me right to a Staples page for an 8-page crosscut shredder.

    3. Re:Ob by Hognoxious · · Score: 1

      It does now, but it certainly didn't earlier. Maintenance?

      P.S. Since I've neither installed nor removed any extensions in the meantime the AC higher up can fuck off.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  4. The actual big news here: by tlambert · · Score: 4, Interesting

    The actual big news here: The company doing the indiegogo is located in Shenzhen, China.

    This is the first one of these I've seen. It struck me as very odd that the video narrator had an almost perfect Midwest accent but terrible grammar and word choice; when I looked at the location of the startup, it became more obvious that this was actually an Indiegogo out of China.

    Anyway, good on them; I expect that we will be seeing a lot more people doing crowd-sourcing from non-U.S. locations, given that VC tends to be pretty tight outside of specific regions of the U.S. (which is, in turn, why most startups that go anywhere are U.S. based, rather than being in Europe, or elsewhere, where the funding climate is pretty terrible).

    1. Re:The actual big news here: by AmiMoJo · · Score: 1

      It's not the first time. I bought a smart LCD module that was produced by a Chinese company and crowd funded on Kickstarter.

      Sony also has its own crowd funding site in Japan, just for Sony products.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    2. Re:The actual big news here: by stephanruby · · Score: 1

      Shenzhen, China is the capital of the world for making electronics. It actually makes perfect sense that we see hardware-based crowd-sourced projects from that city.

      Bringing a non-trivial hardware consumer product, at a reasonable price, to market is hard. It's a lot harder than software.

  5. Finally a foot pedal for hands free applications! by deviated_prevert · · Score: 1

    The only reason devices that can display printed sheet music like tablets and e-ink readers are not popular is that they are essentially useless for sight reading. A foot pedal for page turns could easily create a reader for musicians. It would catch on like wildfire and the music publishers could finally start to distribute good editions again. I have been saying this for years and no one listens, it is the usual routine with industry not seeing the forest for the trees that are still being cut to print music.

    Forget everything you assume about whether or not there is a market for large format e-readers. Categorically there is and all it would take is a foot pedal. So simple but currently the great music publishing houses are in crisis because of digital equipment and unless they get on the program and start to distribute standard editions in digital form they will all die and be bought out by the large corporate bastards who have essentially hamstrung the music publishing industry with senseless worries about DRM. All they have succeeded in doing is to force musicians to cheat and file share scans of music and in doing so have also greatly degraded the once esteemed high art of printed music notation and distribution. Precious few are only now realizing the mistake they made with their fears about their copyrights being broken.

    I will gladly pay reasonable amounts for well edited digital sheet music, in fact I still buy from the best publishers that are still around. If I can also not have to waste money on the ink and printing racket so I can play music that is out of print I would be in heaven. I know most real musicians who read and understand the importance of well done sheet music will also do the same and pay for decent music editions in digital format.

    There is much more than just books and the literary arts at a crossroads because of today's technology! Let's get together and put the ink and paper out of business once and for all, I say it has become archaic and far too costly environmentally and socially.

    --
    This message was not sent from an iPhone because Peter Sellers really was a deviated prevert without a dime for the call
  6. reading by Tom · · Score: 1

    convenient reading on cellphones, e-readers and tablets.

    Strangely, most people seem to disagree with that very idea. Reading is not convenient on electronic devices. Paper still is the best medium for books. If I have the book, why would I want to read it digitally?

    The one thing an electronic library is good for is rapid searching. If you need a vast amount of knowledge available at a fingertip, and on the road, not in your library, then it's great.

    For everything else, I and most other people prefer to turn around, take the book from the shelf and look it up there.

    --
    Assorted stuff I do sometimes: Lemuria.org
    1. Re:reading by nospam007 · · Score: 1, Insightful

      "Strangely, most people seem to disagree with that very idea. Reading not convenient on electronic devices. Paper still is the best medium for books. If I have the book, why would I want to read it digitally?"

      Because you can select the typeface, the font size, the border, there are built-in bookmarks, there's a search function where you can jump from place to place containing the search expression, there's a built-in word explanation/translation/Wikipedia search, you can highlight passages without damaging the book, you can synchronize it with the reader on the toilet, so that you read the exact same book also there, but the reader can stay in the bathroom and lots of other things.

      Hint: That's why thousands of bookstores are closed, because people prefer eBooks over paper ones.

    2. Re:reading by Chris+Mattern · · Score: 3, Informative

      Hint: That's why thousands of bookstores are closed, because people prefer eBooks over paper ones.

      No, thousands of bookstores are closed because people can select from a much wider selection from Amazon. Paper book sales increased 2.4% last year.

    3. Re:reading by houghi · · Score: 1

      I prefer paper books. The advantage for me of ebooks is portability. Say you want to take a book with you on a trip. So you carry a book. What if you want to take two? Now you have either 2 books or a machine the size of one.

      I personally have an ipad mini (gift from the company, so no monies from me). I commute by train (again paid for by the company) so I use that as a reader.

      The obvious downside is that if you break the reader or do not have access to power, a book will be way better.

      --
      Don't fight for your country, if your country does not fight for you.
    4. Re:reading by Chris+Mattern · · Score: 1

      Dead tree books can be damaged too and much more easily.

      Okay, let's do this. I'll drop my book from a height of ten feet. You do the same with your book reader.

      Also, when you damage a book, you've damaged one book. When you break your reader, you've lost *all* your books.

    5. Re:reading by Lunix+Nutcase · · Score: 1

      And yet e-book readers are an extremely common app used on phones and tablets. That a vocal group may not like e-readers does not necessarily translate into it being an opinion of "most people".

    6. Re:reading by AthanasiusKircher · · Score: 1

      Hint: That's why thousands of bookstores are closed, because people prefer eBooks over paper ones.

      No, thousands of bookstores are closed because people can select from a much wider selection from Amazon.

      THIS. And, well, there's the fact that Amazon can basically undercut any actual physical bookstore's prices, without having to pay for as many facilities (more expensive in high-traffic areas), staff to deal with customers... and of course the fact that Amazon seemingly doesn't actually need to even make a profit (ever, really) to keep investors pouring in.

      Physical bookstores obviously have a lot of trouble competing against something like that. Which is why so many have closed.

      Paper book sales increased 2.4% last year.

      And depending on whom you ask (and whose figures you believe), ebook sales have recently stalled in their increases as a percentage of books sold, or perhaps have even gone down slightly in the past year or so. Many publishers have started increasing stocks of paper books again this year; a number of them have declared that trends seemed to show that the so-called "inevitable" death of the paper book market was much further off, leading them to reinvest in more warehouse space again, etc.

      I'm NOT against ebooks, and I recognize lots of people like them. But the evidence so far seems to be that many people still prefer paper books at least for some use cases. Many people still seem to use both ebooks and paper books in different scenarios. There are advantages to both, and so far the dire predictions that paper books would become a "niche market" within a few years don't seem to be coming true.

      Maybe things will be different in a few years. But for now we've seen the decline of physical bookstores mostly because of consolidation to giant internet sellers, not because of the predicted imminent demise of physical sales.

    7. Re:reading by iggymanz · · Score: 1

      Wait till you get older, reading normal books gets almost painful. On my Kindle I can make the fonts as big as I want. I hardly read paper books any more since now that I'm over half a century old it's kind of tiring after more than 30 minutes. But I can go for hours on an e-reader

    8. Re:reading by BarbaraHudson · · Score: 1

      You can legally lend, give or sell your copy of a physical book to someone else - no DRM.

      --
      "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
    9. Re:reading by Tom · · Score: 1

      According to other statistics, e-book sales are already levelling off after an initial explosive growth.

      Also, people with actual arguments don't need to use insults. That's usually a sign that your argument is so weak you are embarrassed by it.

      --
      Assorted stuff I do sometimes: Lemuria.org
    10. Re:reading by Tom · · Score: 1

      Let's talk again in 5 years, when your e-book reader is outdated and DRM prevents you from moving your books to a new one.

      Tell me that you can be 100% sure that you will still be able to read those books in 50 years, then name one computer program from 1965 that you can get running.

      --
      Assorted stuff I do sometimes: Lemuria.org
    11. Re:reading by Tom · · Score: 1

      But their expected growth was in the double-digits.

      e-books have a place. But the hype is over and now they are settling into a normal market. And they haven't replaced printed books. They are like DVDs to cinemas, not like cars to horse carriages.

      --
      Assorted stuff I do sometimes: Lemuria.org
    12. Re:reading by david_thornley · · Score: 1

      An eBook can also be displayed in whatever type size is desired (within reason). I have a relative with a degenerative disease that has affected her eyes. The only reason she can still read is that we gave her a Nook.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  7. OCR is the main problem by DrXym · · Score: 4, Interesting
    I read a lot of books from OpenLibrary (an awesome resource for old books). Most e-books are offered for download in EPUB and PDF format. The PDF is a direct book scan, the EPUB is OCR'd from the scan. Invariably the EPUB is filled with errors caused by OCR - hyphenated words not joined back together, page numbers appearing in the middle of text, words autocorrected to something else, chapter headings screwed up etc. Sometimes the OCR gives up entirely.

    It's simply easier to read the PDF although the file size is enormous and you're basically looking at images of some yellowing old book which means lots of panning and zooming particularly on small devices. And forget reading it on an e-reader.

    So yeah I think you could automate scanning of books, but the second step of getting it into EPUB format is the tricky part.
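    Two of the OCR artifacts listed above (hyphenated words not joined back together, and page numbers landing in the middle of the text) are the kind of thing a small post-processing pass can reduce. A minimal sketch follows; the `clean_ocr_text` helper is hypothetical, it is not what OpenLibrary runs, and it will wrongly merge genuine compounds such as "well-known" when they happen to break across a line.

```python
import re


def clean_ocr_text(lines):
    """Join words hyphenated across line breaks and drop bare page-number lines."""
    # Drop lines that contain nothing but a 1-4 digit number (likely a page number).
    kept = [ln.strip() for ln in lines if not re.fullmatch(r"\s*\d{1,4}\s*", ln)]
    text = "\n".join(kept)
    # "ti-\nmes" -> "times": remove a hyphen that sits directly before a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Every remaining line break becomes a single space.
    text = re.sub(r"\s*\n\s*", " ", text)
    return text.strip()


print(clean_ocr_text(["It was the best of ti-", "42", "mes, it was the worst"]))
# It was the best of times, it was the worst
```

    Headings and autocorrected words need more context than a regex has, which is part of why the EPUB step stays tricky.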

    1. Re:OCR is the main problem by Visarga · · Score: 1

      > means lots of panning and zooming particularly on small devices

      Read my post above about how to reflow the scanned image of a page to fit the mobile devices.

    2. Re:OCR is the main problem by guestapoo · · Score: 1

      Cuneiform was open-sourced from enterprise-grade software that is on a par with ABBYY (they are competitors in the Russian market).
      In my experience, the open-source version of Cuneiform has limitations and has not been updated since 2011, but it is still more accurate than tesseract.

    3. Re:OCR is the main problem by guestapoo · · Score: 1

      As far as I know, Openlibrary books have been scanned and OCRed by Webarchive.
      They use ABBYY version 8, which is very old.

  8. Re:Finally a foot pedal for hands free application by Aereus · · Score: 1

    Buy a cheap set of USB racing foot pedals and a micro-usb adapter and voila, you can probably already do that. Or at the most a simple driver to interface the pedals as standard inputs and assign macros to them.

  9. Searching in scanned books by Visarga · · Score: 1

    I have scanned 100 books from my personal library and realized I can't find nice open source software to OCR the images and search over the text of the entire library for keywords. At some point I created my own clone of Google Books, with OCRopus for translating the images and my own front end for searching and highlighting keyword matches. It would be very useful if we had a way to manage searching in hundreds of books, taking notes and remembering the page/citation. It would work like a research library.
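    The search side of such a library can be sketched with a plain inverted index. This is not the poster's actual front end; it assumes the OCR step has already produced one text string per page, and `build_index` and `search` are made-up names.

```python
import re
from collections import defaultdict


def build_index(library):
    """Map each word to the set of (title, page number) pairs where it occurs.

    library: {book_title: [text_of_page_1, text_of_page_2, ...]}
    """
    index = defaultdict(set)
    for title, pages in library.items():
        for page_no, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((title, page_no))
    return index


def search(index, query):
    """Return sorted (title, page) pairs whose page contains every query word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


library = {
    "Book A": ["the quick fox", "lazy dog"],
    "Book B": ["quick brown dog"],
}
index = build_index(library)
print(search(index, "quick dog"))  # [('Book B', 1)]
```

    Each hit carries the title and page number, which covers the page/citation part; notes could be stored against the same keys.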

    1. Re:Searching in scanned books by temcat · · Score: 1

      Do your high-quality Linux OCR solutions include one that allows me to:
      1) select rectangular OCR areas of "image", "text", and "table" types for different OCR behavior;
      2) add or subtract rectangular sub-areas to or from these areas;
      3) OCR those areas while retaining basic character and paragraph formatting;
      - and all of that using a stable GUI-based software?

      I'm a professional technical translator who would like to be able to work on Linux. Being free as in speech/beer is not required, I'm prepared to pay the equivalent of FineReader price or slightly more. I did my own research recently, and none of the free or, bizarrely, proprietary stuff for Linux had all the required features. But I may have missed something.

    2. Re:Searching in scanned books by temcat · · Score: 1

      Sorry, I now see you said "OSS", not specifically "Linux". But I still hope you have something for Linux, too.

  10. how to make it more convenient to read by Visarga · · Score: 1

    It is often too difficult to read PDFs and scans on mobile devices. We could use software to identify individual words in the scanned page and reflow the text to match the narrow screen size of phones and tablets. The reflowed document would use the original images of the words; only the line and page breaks would change. Then we could read without panning and zooming.
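
A rough sketch of the reflow step, assuming word segmentation has already produced the pixel width of each word image in reading order. The function name `reflow` and the `gap` parameter are made up for illustration; this is plain greedy line breaking, not any existing tool's algorithm.

```python
def reflow(word_widths, screen_width, gap=4):
    """Greedy line breaking over word images.

    word_widths: pixel widths of the cropped word images, in reading order.
    screen_width: usable width of the target screen in pixels.
    gap: horizontal spacing inserted between neighbouring words.

    Returns a list of rows, each a list of word indices. A word wider
    than the screen ends up alone on its own row.
    """
    rows, current, used = [], [], 0
    for i, width in enumerate(word_widths):
        needed = width if not current else used + gap + width
        if current and needed > screen_width:
            rows.append(current)          # close the full row
            current, needed = [], width   # start a new row with this word
        current.append(i)
        used = needed
    if current:
        rows.append(current)
    return rows

# Five word images reflowed onto a 120-pixel-wide screen.
print(reflow([50, 60, 40, 120, 30], screen_width=120))
```

Rendering would then paste the original word images row by row, so the scan's typography survives even though the line breaks move.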

    1. Re:how to make it more convenient to read by david_thornley · · Score: 1

      I bought a low-end tablet with a large screen (Azpen something) to read PDFs. It works great. It's crap for almost any other purpose, since it was really low-end, but it serves my purposes well.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  11. Good for the sharing community by Anonymous Coward · · Score: 1

    Personally I prefer to just download the books with utorrent, usually somebody already scanned it, so no need to spend a couple of hundred bucks.

  12. It's all in the software by Anonymous Coward · · Score: 1

    This is just a camera and some CPU board for image processing and interfacing (Wifi, USB, HDMI).
    If they opened their algorithms, you could probably do the same with a RPi and its camera module (assuming there is no AF or aperture control built in).

  13. Re:Finally a foot pedal for hands free application by wonkey_monkey · · Score: 2

    Forget everything you assume about whether or not there is a market for large format e-readers. Categorically there is

    Categorically? Have you done any market research? Or are you just projecting your own desire (so strong that you've essentially posted off-topic to bring it up) onto everyone else, because you can't imagine why they wouldn't want the same thing?

    A large format e-reader would be considerably heavier than a few dozen pages of sheet music. Yes, it could store more data, but that's not really going to be of much use to someone playing a fixed set. You can't fold it down the middle to save space. You can't make arbitrary notes on it. It (probably) doesn't photocopy too well to share with your fellow musicians (and you certainly couldn't put it into a feeder and leave it to copy while you make a cup of tea). It would probably be disproportionately expensive as well, since you would not be manufacturing them in the kind of numbers they make, for example, Kindle Paperwhites in - imagine the costs of equipping an entire orchestra. Page turns would have to be faster, and that black-white refresh would be a hell of a distraction. E-readers - last time I checked - are still not quite as bright or as crisp as printing on actual paper. And e-readers, reliable as they are, still have failure modes. The battery can run out, or simply fail. The footpedal is a separate mechanical device that can fail. Paper doesn't have a failure mode, apart from being actively destroyed.

    --
    systemd is Roko's Basilisk.
  14. Perhaps this entry should be marked as an Ad by Rob+Lister · · Score: 5, Informative

    Since this product gets free placement here at /., I figure it is okay to put in a word for the good folks at Distributed Proofreaders.

    Books are scanned and [sometimes roughly] OCR'd.
    Each and every word, period, hyphen, and ellipsis on each and every page is scrutinized by at least three proofreaders.
    Each bold, italic, underline and indent is evaluated by at least two formatters.
    The work is finalized in HTML, proofread as a whole, and published to Project Gutenberg in various formats, txt, pdf, html and epub.

    The resulting publication typically has far fewer publishing errors than the original book. This is especially true of books from the 17th century where drinking was part of a typesetter's expectation.
    Be a part of it.
    Sign up at http://www.pgdp.net/c/

    1. Re:Perhaps this entry should be marked as an Ad by Rob+Lister · · Score: 1

      Strictly speaking, Public Domain books are the only ones you should be converting anyway.

      But I'm happy letting others sleep with whatever morals they so choose.

    2. Re:Perhaps this entry should be marked as an Ad by Anonymous Coward · · Score: 1

      Strictly speaking, Public Domain books are the only ones you should be converting anyway.

      Uhh, why?

      Format shifting falls under fair use and is completely legal so long as you don't distribute or use it for commercial purposes.

    3. Re:Perhaps this entry should be marked as an Ad by Greyor · · Score: 1

      Man, this just made my day. Love the idea and I just created an account on their site. Sounds like a fun way to help preserve old books!

  15. Copy of the Fujitsu ScanSnap SV600 by jolyonr · · Score: 1

    I've had one of these for quite some time now, and it looks pretty much the same except more expensive and without the foot pedal option (great idea!)

    The important thing is the software rather than the hardware which is meant to be able to detect the curvature of the pages on a bound book and adjust for it. It sort of works most of the time on the SV600 but it's not especially fast and neither is it entirely reliable.

    I gave up on it mostly because the software for the Mac was pretty unreliable. I do note they release updates for it very regularly so maybe I should try it again as I haven't touched it in over half a year.

    Jolyon

    --


    Please read my Canon EOS tech blog at http://www.everyothershot.com
    1. Re:Copy of the Fujitsu ScanSnap SV600 by Anonymous Coward · · Score: 1

      ScanSnap SV600 Contactless Scanner @ 3 seconds per page - $795
      Czur Scanner and foot pedal @ Less than 1 second per two pages - $199

    2. Re:Copy of the Fujitsu ScanSnap SV600 by jolyonr · · Score: 1

      The scanning speed is one thing, the processing speed of the scanned files is another. I haven't tried this new system (obviously) but the Fujitsu is certainly pretty slow.

      --


      Please read my Canon EOS tech blog at http://www.everyothershot.com
    3. Re:Copy of the Fujitsu ScanSnap SV600 by guestapoo · · Score: 1

      I have seen this kind of product (lamp-style scanners) for years, all from China (or produced there), under different brands or no name at all. Now I know where they copy from.
      I didn't care much until I saw this appear on Slashdot with a title that suggests an innovation. I expected a scanner like the ones featured on Slashdot before:

      Japanese Researchers Develop World's Fastest Book Scanner
      or
      Google's Book Scanning Technology Revealed

      or a DIY scanner, with two old point-n-shoot cameras:
      DIY High-Speed Book Scanner from Trash and Cheap Cameras
      All steps in one page

    4. Re:Copy of the Fujitsu ScanSnap SV600 by guestapoo · · Score: 1

      I use 'scantailor' for post-processing scanned pages (nearly automatic: dewarping, cleaning, etc.), then use cuneiform for OCR (the output must be hOCR-format data; it's faster and more accurate than tesseract but has not been updated since 2011), then convert to DJVU and embed the OCRed text layer into it.
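
To show why the hOCR output matters for the text layer: hOCR is HTML in which recognized elements carry their pixel bounding box in the `title` attribute (for example `title="bbox 10 20 300 48"`). Below is a minimal sketch of reading line boxes and text back out, assuming the OCR engine marks lines with the `ocr_line` class and writes well-formed XHTML. The class `HocrLines` and the helper `parse_hocr_lines` are hypothetical names, and the exact classes a given engine emits may differ.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")

class HocrLines(HTMLParser):
    """Collect (bbox, text) for every element whose class includes ocr_line.

    Assumes well-formed XHTML, i.e. every tag inside a line is closed.
    """

    def __init__(self):
        super().__init__()
        self.lines = []     # finished (bbox, text) pairs
        self._bbox = None   # bbox of the line currently being read
        self._depth = 0     # tag nesting depth inside the current line
        self._chunks = []   # text fragments of the current line

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._bbox is not None:
            self._depth += 1  # nested element inside the current line
            return
        match = BBOX_RE.search(attrs.get("title") or "")
        if match and "ocr_line" in (attrs.get("class") or "").split():
            self._bbox = tuple(int(n) for n in match.groups())
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if self._bbox is None:
            return
        self._depth -= 1
        if self._depth == 0:  # the line element itself just closed
            text = " ".join("".join(self._chunks).split())
            self.lines.append((self._bbox, text))
            self._bbox = None

    def handle_data(self, data):
        if self._bbox is not None:
            self._chunks.append(data)

def parse_hocr_lines(hocr_text):
    """Return a list of ((x0, y0, x1, y1), text) for each OCR line."""
    parser = HocrLines()
    parser.feed(hocr_text)
    parser.close()
    return parser.lines

sample = (
    '<div class="ocr_page"><span class="ocr_line" title="bbox 10 20 300 48">'
    'Hello <span class="ocrx_word" title="bbox 80 20 300 48">world</span>'
    '</span></div>'
)
print(parse_hocr_lines(sample))
```

Those box-plus-text pairs are what gets positioned over the page image when the hidden text layer is embedded, which is why plain-text OCR output is not enough for this workflow.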

  16. that's like 40 ebooks to break even by known_coward_69 · · Score: 1

    figure ebooks average $10. little more for some new releases and a lot less for catalog titles. why spend $400 to pirate paper books?

    1. Re:that's like 40 ebooks to break even by jolyonr · · Score: 3, Insightful

      There are a lot of things that simply aren't available on ebooks. And if I purchased the book and I'm using the pdf for my own use then it's not piracy. At least it's not morally wrong to me, and that's the only thing that matters as far as I am concerned.

      --


      Please read my Canon EOS tech blog at http://www.everyothershot.com
    2. Re:that's like 40 ebooks to break even by Lunix+Nutcase · · Score: 1

      Because $400 is significantly less than the thousands of dollars required to replace any significant-sized collection of books?

    3. Re:that's like 40 ebooks to break even by ranton · · Score: 1

      When I read your title, I assumed you were going to comment on how cheap the device was because it breaks even after only 40 books. I didn't expect you to think a 40 count book collection would be considered large.

      --
      -- All that is necessary for the triumph of evil is that good men do nothing. -- Edmund Burke
    4. Re:that's like 40 ebooks to break even by known_coward_69 · · Score: 1

      how many of those do you reread on a regular basis? how many are so old you can buy them for a dollar or two in the kindle store? or simply put them into a wishlist and wait for the periodic sales to buy them for a dollar or two? i have over a thousand books in my kindle collection. lots of classics are free. lots of books you can buy on sale and read later. the only one i've ever read more than once is A Song of Ice and Fire

  17. Re:Finally a foot pedal for hands free application by Lunix+Nutcase · · Score: 1

    Paper degradation is an extremely common problem with document archiving.

  18. Re:Finally a foot pedal for hands free application by aitikin · · Score: 3, Informative

    The only reason devices that can display printed sheet music like tablets and e-ink readers are not popular is that they are essentially useless for sight reading. A foot pedal for page turns could easily create a reader for musicians. It would catch on like wild fire and the music publishers could finally start to distribute good editions again. I have been saying this for years and no one listens, it is the usual routine with industry not seeing the forest for the trees that are still being cut to print music.

    You clearly have done zero research. There's a number of options, the most popular I've come across is the AirTurn, although the Cicada works well too from what I've heard.

    --
    "Don't meddle in the affairs of a patent dragon, for thou art tasty and good with ketchup." ~ohcrapitssteve
  19. inconvenient and likely vaporware by NostalgiaForInfinity · · Score: 1

    I've tried scanning some of my books with a camera. This is simply an overhead scanner with manual page turning; you can buy them already. Realistically, it probably takes around 2-3s to scan a page, so it's about 20 minutes to scan a 500 page book. That's a lot of time to sit at a table turning pages.

    But let's say you're willing to put in the work. The hard part in making this work is the software, not some $200 digital camera on a stick. And the really hard part in making this work is not on books that are as well behaved and flat as the ones they use in their demo, but on thicker hardcovers, exactly the kind of expensive books you want to preserve by scanning. Unfortunately, they don't talk about their software much, which leads me to believe that they haven't completed it yet. If they had, they could already be selling it without the hardware scanner.

  20. Brittle books don't mix w/ a flat spine by oneiros27 · · Score: 1

    If you're dealing with old books, you want a scanner that can cradle the book without opening it up flat.

    And 60 pages per minute is actually pretty slow for these scanners. As you're imaging two pages at once, you only need to approach a page flip a second to get 120 pages/minute:

    http://arstechnica.com/gadgets...

    Note that the costs have gone up since that article was written. It used to be $500+electronics ... it's now $1200 + electronics + shipping. (as it's no longer someone doing it in his free time, and now a company doing it ... but it also now comes painted).

    If you have access to a plywood cutting machine, all of the cutting patterns are available under GPL:

    http://www.diybookscanner.org/

    But as it holds the pages flat (with glass that presses down on the pages), rather than the book's spine flat, you don't have to worry about trying to correct for the distortion from curved pages. (or damage your books in the process)

    --
    Build it, and they will come^Hplain.
  21. Looks like a Camera on a Copy Stand by McGruber · · Score: 1

    To me, the device looks like a camera on a copy stand.

    My guess is that it uses a camera from a cell phone, some LEDs to provide illumination, and the foot pedal is the shutter trigger.

    To scan, you hit the foot pedal to snap a photo, turn the page, hit the foot pedal again to snap another photo, turn the page, snap another photo, turn the page again, snap another photo, etc. Software then combines the photos into a scanned document.

  22. The context by XB-70 · · Score: 1

    What's with the BlackBerry Passport and stock footage of Toronto in the video - from a Chinese company?

    --
    *** Don't be dull.***
  23. chinese company by Noah+Haders · · Score: 1

    $199 for a scanner that will scan a book in 5 mins and send a copy back to the chinese govt.

  24. I Am Giving It A Try by crunchygranola · · Score: 1

    Lots, and lots and lots of reasons to dis this offering here. No new tech, just buy eBooks at $10 a pop, who wants hundreds of books on a device, what's so hard about destroying a book, OSS software already exists that does this, etc. etc.

    Here's the thing for me: I want a research library I can take wherever I go. I am a heavy research library book user, and I buy a lot of used books, trying to get out of print texts. When I need a book, I need that exact book, and no substitute will do, because none exists. No, many of the books I want CANNOT be downloaded because someone else scanned it, or as an eBook. I cannot destroy library books, and if I own the book I'd rather have the paper copy too. Whether something is possible with similar tech and software is irrelevant. It needs to be fast and convenient. A well integrated system to do this is worth a lot, a lot more than $234 (the final cost if you buy it now). I have worked with a number of Windows and Linux-based processing chains for scanning, page clean-up, compression, OCR, etc., and they are painful to use without exception; some are more painful than others, but none is anything like painless.

    Will this really do the job I hope it will? I don't know, but I will find out.

    --
    Second class citizen of the New Gilded Age
    1. Re:I Am Giving It A Try by AHuxley · · Score: 1

      Re "I want a research library I can take wherever I go." So true :)
      The ability to get the distance, light and lens right makes the capture easier. A fast CPU and good software then take over to convert every word into text.
      So many other solutions have difficult methods, resolution-restricted lenses, or huge, bulky capture systems. Standalone software for the later OCR step might expect flat scanner pages, color-corrected, with perfect text.
      The good part about this system is the understanding of the shape of the book, shadows and layout as part of the work flow.

      --
      Domestic spying is now "Benign Information Gathering"
  25. Re:Does this work for textbooks? by crunchygranola · · Score: 1

    Your concerns are valid. No, OCR does not handle mathematical formulas very well (not in any that I have seen). But remember - OCR does not replace the image, it only augments it (at least if you're doing PDFs, I don't know about eBook formats). You are still reading the scanned image. OCR simply provides fast searching and indexing capabilities, a huge win. So formulas can't be searched for? Well, in my experience, no math or science book includes formulas in its indices anyway, so this is no different (the names of formulas yes, the formulas themselves, no).

    --
    Second class citizen of the New Gilded Age
  26. Luddites in academia by h8sg8s · · Score: 1

    (Some) Luddites in academia will still object if you show up in class and pull out a tablet with the book digitized on it. The dead-tree-textbook-publishing racket will die a slow and painful death as the publishing professors and companies seek to maintain their monopoly. $400 for a "new" Calculus textbook printed this year when the previous edition of that same book was in print for only 2 years? In most other areas of life this would be called extortion.

    --
    Organization? You must be joking..
    1. Re:Luddites in academia by GuB-42 · · Score: 1

      And what about a xerox copy of the book?
      I remember at school: most of us bought copies of textbooks from a shady copy shop for about the same price as a paperback novel. Sometimes the copies were actually better than the real deal for studying because of the format they were printed on.
      And before you ask: yes, it was commercial-scale piracy. But it shows that you don't need eBooks to counter the extortion.

  27. Re:The speed of scanning in the video seems fake by guestapoo · · Score: 1

    I have seen some ads from Chinese companies, such as for their tablets: these could play HD videos while multitasking without any lag, the touch response is amazingly fast, etc... AND the price is less than $99. ;)

    About this scanner: they claim it can scan 300 pages in 5 minutes, which means 2 pages every 2 seconds (the scanner scans 2 pages at once). That's possible, but what I doubt is the quality of the output at that speed and price.
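
The arithmetic behind that figure can be checked with a small sketch. The function name `seconds_per_capture` is made up for illustration, and it assumes a steady pace with no pauses between page turns.

```python
def seconds_per_capture(total_pages, minutes, pages_per_capture=2):
    """Seconds available for each shot (page turn plus capture).

    Assumes a constant pace and that every shot covers
    pages_per_capture pages of the open book.
    """
    captures = -(-total_pages // pages_per_capture)  # ceiling division
    return minutes * 60 / captures

# The claim from the summary: a 300-page book in 5 minutes.
print(seconds_per_capture(300, 5))
```

So the claim needs one page turn and one shot every two seconds for five minutes straight. A camera that captures only one page at a time would have to manage a shot every second.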

  28. Is this a Cloud-only system? by timg11 · · Score: 5, Insightful

    The Indiegogo site says "Your sketches, paintings, and notes can be scanned and stored in the Czur cloud".
    Do we have the option to use our choice of server (maybe local)?
    What if I don't want everything that I scan going to a company in China?
    What if one day the "Czur cloud" is gone - is the scanner then unusable?

    Has anybody tracked down these answers? The product seems appealing if non-cloud, independent operation is allowed.

  29. Re:You can also use a Smartphone/Camera/ScanTailor by BarbaraHudson · · Score: 1

    Scantailor is garbage. It doesn't even do OCR, whereas the software for Czur does.

    You do know that you can use the OCR software of your choice on the images, don't you?

    --
    "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
  30. Re:Finally a foot pedal for hands free application by wonkey_monkey · · Score: 1

    You might want to actually read what I've written, which is in reply to someone suggesting e-ink for sheet music.

    --
    systemd is Roko's Basilisk.
  31. Re:Finally a foot pedal for hands free application by wonkey_monkey · · Score: 1

    It's not about document archiving; it's about "live" documents printed to be used, not archived, and the impracticalities of applying e-ink as the solution in certain cases.

    --
    systemd is Roko's Basilisk.
  32. Re:You can also use a Smartphone/Camera/ScanTailor by BarbaraHudson · · Score: 1

    The process may be longer, but you can automate it, so ...

    --
    "Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
  33. You value something you pay for - more by CrashNBrn · · Score: 1

    So you have terabytes of books --- that you will never read. Bonus for you.

  34. Not something I'd get by RockDoctor · · Score: 1

    People have mentioned a number of important points like lighting. From the materials presented, there is no built in lighting. The scans produced in the promotional video are horribly lighted, with the top and bottom of the pages very dark, and the middle over-exposed. Horrible.

    I would be rather dubious about getting adequate quality images for OCR without controlling the lighting better. (I also wouldn't consider trying a task like this without pretty good OCR. That is near enough a solved problem these decades, given reasonable original images.)

    The same goes for getting decent enough images to accurately render figures: graphs or, in one book I scanned previously, the tear-down/re-build photos for the wheel hub on a broken car I owned. As presented, there is no effort at controlling the curvature of the pages. That is incredibly annoying to attempt to read, and is going to be highly destructive to attempts to OCR the images. Text size will vary along each line, along with the focus.

    With a HP flat bed scanner, running a stack of open source OCR components, and manually turning the book and the pages, I could get 4 - 5 pages per minute, which was adequate. Otherwise, find a reliable scanning company in India, and post the books over there, if your time is more valuable than my off-shift time is.

    --
    Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
    1. Re:Not something I'd get by rpstrong · · Score: 1

      I suggest you watch the videos again. As presented, the Czur has both built in LED lighting AND curvature correction.

  35. Re:reading and dead tree technology by rlh100 · · Score: 1

    Ah, another aficionado of dead tree technology. I find reading long documents online is very tiring. That is why I prefer using dead tree technology by printing the document.

    Dead tree technology has many benefits:
    It never needs to be recharged.
    It is very portable. Just toss it into your bag. No cords or power supply.
    It is very easy to share with someone. Just hand the book to them. Remember to put your name in it.
    It has a very user friendly user indexing system called "dog ear".
    Simply fold a corner of a page over and you can find your place again.
    It is very easy to make notes with a pen or yellow highlighter technology. But only if it is your own book.
    Character image resolution is excellent. No "jaggies" in the font.
    Reading a book has a great tactile feel.
    Holding it in your hands, turning the pages.

    The only drawback is that it requires an external light source. Sunshine and daylight are great to read by but indoor lights work just as well. Even a flashlight under the bed covers.

    Yes, I do like reading using my "dead tree" technology. The only problem is that in a decade or two, children will be asking me about my odd hand held device. Do I really never have to charge it? How can I use it if it does not connect to the Internet? What if I have a question or want to text my friends? Do I really need a different one for each book I want to read?

    Apologies for this being off topic.
    RLH

  36. Slashdot now a product advertising site by unclefred · · Score: 1

    Slashdot a product review and best deals site? What next, fruit and vegetable sales? Slashdot was once a repository of great tech info doled out by tech snobs; now it's so downmarket it reads like it's being produced in a basement by work experience interns. A sad, sad, sad day. PS I am looking for a good home delivery service............

  37. Re:reading and dead tree technology by david_thornley · · Score: 1

    Other disadvantages of dead tree books: they come in one type size per book, they take up room, and they weigh a lot. (We had a structural engineer in to compensate for the weight of the bookshelves.)

    --
    "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes