Slashdot Mirror


The DIY Book Scanner

azoblue writes "Daniel Reetz did not want to lug around heavy textbooks, so he built a book scanner to create digital copies. '... over three days, and for about $300, he lashed together two lights, two Canon Powershot A590 cameras, a few pieces of acrylic and some chunks of wood to create a book scanner that's fast enough to scan a 400-page book in about 20 minutes (PDF). To use it, he simply loads in a book and presses a button, then turns the page and presses the button again. Each press of the button captures two pages, and when he's done, software on Reetz's computer converts the book into a PDF file. The Reetz DIY book scanner isn't automated — you still need to stand by it to turn the pages. But it's fast and inexpensive.'"
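A rough sketch of the pace the summary implies, using only its own figures (400 pages, two pages per button press, about 20 minutes). The function and constant names are made up for illustration, and the result is an average that ignores loading the book and any retakes.

```python
# Figures from the summary above; names are illustrative only.
PAGES_IN_BOOK = 400
PAGES_PER_PRESS = 2      # each button press captures two pages
TOTAL_MINUTES = 20       # "about 20 minutes" per the summary


def presses_needed(pages, pages_per_press):
    """Number of button presses, rounding up for an odd page count."""
    return -(-pages // pages_per_press)  # ceiling division


def seconds_per_press(pages, pages_per_press, minutes):
    """Average time available for each press-and-turn cycle."""
    return minutes * 60 / presses_needed(pages, pages_per_press)


presses = presses_needed(PAGES_IN_BOOK, PAGES_PER_PRESS)
pace = seconds_per_press(PAGES_IN_BOOK, PAGES_PER_PRESS, TOTAL_MINUTES)
print(f"{presses} presses, about {pace:.0f} seconds per press-and-turn")
```

So the claimed speed works out to one page turn and one button press roughly every six seconds.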

48 of 177 comments

  1. Too bad slavery is illegal by Anonymous Coward · · Score: 3, Funny

    This would be a good activity for the winter months when farming isn't possible.

    1. Re:Too bad slavery is illegal by Anonymous Coward · · Score: 2, Funny

      This would be a good activity for the winter months when farming isn't possible.

      That's why God gave us illegal immigrants.

  2. Look out! by Chris+Tucker · · Score: 3, Insightful

    Here come the Publishers' Copyright Enforcement Gundams to give you "What For!".

    Imagine that, thinking you could actually DO Something like that with your very own property.

    What cheek!

    --
    Guaranteed! This comment 100% Anthrax free!
    1. Re:Look out! by John+Hasler · · Score: 2, Insightful

      Right. After all, scanners have only been around for about fifty years: the publishers just haven't noticed yet. This homebrew effort is sure to bring the matter to their attention.

      --
      Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
    2. Re:Look out! by Mr.+Freeman · · Score: 2, Insightful

      Are you fucking stupid? They're doing the EXACT SAME THING RIGHT NOW with music, see how that turned out?
      Let's recap:
      -Consumers aren't having their rights protected
      -Some courts are actually ruling in favor of removing customers' rights (i.e. every time the RIAA has won some part of a case)
      -Legislation to remove rights from consumers is getting more and more popular
      -The RIAA and other organizations are making bank off of their sue, settle, and drop campaign. (Sue at random, settle for thousands, drop the case if they fight back).

      The first time they try to sue someone, it won't fly back in their face. They'll settle out of court because no average person can afford to fight a company, much less a book publisher.

      --
      -1 disagree is not a modifier for a reason. -1 troll, flamebait, redundant, overrated are NOT acceptable substitutes.
    3. Re:Look out! by Mr.+Freeman · · Score: 2, Interesting

      Hell yes it will. Prior to now it's been a pain in the ass to actually scan a book. You either had to shell out big money for a professional-model scanner, which no one except large companies does, or you had to scan in every page with a flatbed, which generally comes out poorly because the crease in the center casts shadows, and the resulting image isn't appealing to look at and read.

      This allows people to generate high-quality scans of books, especially with the price of high-quality cameras dropping.
      I see this being amazingly popular with college students. The absurd amount of money publishers charge for books, combined with the fact that college students are the group most likely to put up with having no physical document and settle for a PDF version, means that a drop in sales is not unlikely. Now, take into consideration that at most college campuses with engineering programs there are generally a few people clever enough to build one of these, and all of a sudden you see a business model start to fall.

      As for the liberal arts universities... well...

      --
      -1 disagree is not a modifier for a reason. -1 troll, flamebait, redundant, overrated are NOT acceptable substitutes.
    4. Re:Look out! by nametaken · · Score: 2, Insightful

      Not sure he did it for his own property. But it does prove that books have the best DRM of all.

  3. A bargain by thethibs · · Score: 4, Informative

    Except for the lack of an automatic page-turner, Daniel's device is the same as one you can buy commercially for about $20,000 (http://www.treventus.com/bookscanner_pageturner.html).

    He was wise to decide on manual page-turning.

    --
    I'm a Programmer. That's one level above Software Engineer and one level below Engineer.
    1. Re:A bargain by Surt · · Score: 2, Insightful

      The automatic page turner costs an additional $19,700 / 833 hours = $23.65 per hour. Hire a high school student for $8.
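A quick check of that arithmetic, as a sketch. The $19,700 is presumably the roughly $20,000 commercial price from the parent comment minus the $300 DIY build; the 833 hours is the commenter's own figure and is not explained in the thread. The names below are illustrative only.

```python
# Prices from the thread; OPERATOR_HOURS is the commenter's unexplained figure.
COMMERCIAL_PRICE = 20_000   # approximate commercial scanner price
DIY_PRICE = 300             # Reetz's build cost
OPERATOR_HOURS = 833        # hours of page turning assumed by the commenter
STUDENT_WAGE = 8            # hourly wage suggested in the comment


def extra_cost_per_hour(commercial, diy, hours):
    """Price premium of the commercial unit spread over hours of use."""
    return (commercial - diy) / hours


rate = extra_cost_per_hour(COMMERCIAL_PRICE, DIY_PRICE, OPERATOR_HOURS)
print(f"Automation premium: ${rate:.2f}/hour vs. ${STUDENT_WAGE}/hour for a human")
```

Rounded to the cent, the premium comes out at about $23.65 per hour, roughly three times the suggested wage.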

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    2. Re:A bargain by Farhood · · Score: 3, Interesting

      I have Kinko's/Staples/Office Depot cut off the spine ($1-$5), clip it on all sides, and go home to my Fujitsu ScanSnap for ADF scanning, auto color/b&w selection, and OCR. Oh, and you press the button once and walk away.

    3. Re:A bargain by GNUALMAFUERTE · · Score: 3, Funny

      Better yet, how much does a high school student fetch on ebay?

      --
      WTF am I doing replying to an AC at 5 A.M on a Friday night?
  4. Heh by sys.stdout.write · · Score: 3, Insightful

    I do this for my law school textbooks (unless you're a book publisher, in which case I am joking and would never break the law).

    I was excited when I read this because it is a pain in the ass to turn the pages in a 1000 page Constitutional Law textbook. Thus, you can imagine my disappointment when I read that his machine doesn't automate this.

    Most universities have at least one library which has a Ricoh scanner that does exactly what his does, i.e. it writes out a PDF onto your USB stick. I don't know where he's a graduate student, but I bet if he looked in his library he could have saved himself $300.

    1. Re:Heh by atarkri · · Score: 4, Informative

      The school is NDSU. Yes, we (he) looked. No, our library does not have one.

      He has details of the reasons on his blog: danreetz.com/blog

    2. Re:Heh by TubeSteak · · Score: 2, Insightful

      I do this for my law school textbooks (unless you're a book publisher, in which case I am joking and would never break the law).

      What law are you breaking?
      Whether you scan it and convert the OCRed text into an audio book, rip all the pages out and turn it into an art exhibit, or use the book for toilet paper, the publisher has no legal right (AFAIK) to stop you.

      --
      [Fuck Beta]
      o0t!
    3. Re:Heh by rdnetto · · Score: 3, Insightful

      Not sure where you live, but in most areas format shifting is usually recognized as fair use. Whether or not torrenting the PDF counts as format shifting isn't a question that the courts have answered yet, but it's currently the most convenient method.

      --
      Most human behaviour can be explained in terms of identity.
    4. Re:Heh by Toonol · · Score: 3, Insightful

      You might want to read the front matter of just about every book published to see that they specifically address feeding the book into a computer in any way possible and say it is a violation of the copyright if done without permission.

      It doesn't matter what they say. It matters what the law says, and if they tell you that you can't do something the law says you can, the law wins. The more books add legal crap in order to be more like software EULAs, the more lies they will incorporate, like software EULAs.

      I doubt there's much of a chance at all that you would be found guilty of copyright infringement for making a format change of your own book, for your own use. That's nearly the most straightforward example of fair use you could imagine. If you distributed it, sure; that's not fair use.

  5. Inevitable DMCA smackdown coming? by i_want_you_to_throw_ · · Score: 2, Insightful

    How soon before the manufacturer of the $20,000 commercial version files a lawsuit against him? That would be extraordinarily sad because the American system of patent/copyright only serves to stifle independent innovation like this.

  6. Cameras usually stink for this.... by Slugster · · Score: 3, Insightful

    It may work well enough for basic textbooks, but the problem is that (for high-quality scans) you can't ever get the same image quality from an $800 camera that you can from an $80 scanner. At 1200 DPI, a scanner is equivalent to a ~384 MP camera. Even scanning at "only" 300 DPI is ~90 MP, a far bigger image than any consumer-grade camera can provide.

    The cameras he used were only five megapixels.

    Might work for looking at the pages on your iPhone. Not gonna look very readable on your laptop screen, and forget about reading the book's footnotes.....
    ~

    1. Re:Cameras usually stink for this.... by bloobloo · · Score: 2, Informative

      There's no problem with the resolution.

      9" x 6" page, scanned at 300 dpi = 2700 x 1800 pixels = 4.86 MP.
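The same calculation as a small sketch, so other page sizes and resolutions can be tried. The 9 x 6 inch page at 300 dpi is from the comment above; the US Letter rows are an added assumption for comparison and do not come from the thread.

```python
def scan_megapixels(width_in, height_in, dpi):
    """Pixel dimensions and megapixel count for a page scanned at a given dpi."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px / 1_000_000


# Page from the comment above: 9 x 6 inches at 300 dpi.
print(scan_megapixels(9, 6, 300))      # (2700, 1800, 4.86)

# Assumed for comparison only: a US Letter page (8.5 x 11 inches).
for dpi in (300, 1200):
    w, h, mp = scan_megapixels(8.5, 11, dpi)
    print(f"Letter at {dpi} dpi: {w} x {h} px, about {mp:.1f} MP")
```

Megapixels scale with the square of the dpi, so quadrupling the resolution multiplies the pixel count by sixteen.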

    2. Re:Cameras usually stink for this.... by maxume · · Score: 2, Informative

      Lots of book scanners use CCDs. They are good enough. No one really wants a 'portable' scanned document that weighs in at 3 gigabytes anyway; current laptop I/O makes that a pain in the ass.

      --
      Nerd rage is the funniest rage.
    3. Re:Cameras usually stink for this.... by smallfries · · Score: 4, Informative

      You haven't actually tried this, have you? I've had various flatbed A4 scanners over the years, all at much higher resolution than a camera, and hence everything got down-sampled afterwards for my display, which is only 1.5MP anyway. Then I switched to using a phone camera with only a 2MP CCD, but a really good lens and decent macro mode (Sony Ericsson Cybershot for those that are interested). As long as the focus was good it produced perfectly readable shots, and so it became my portable scanner. These days I mostly shoot stuff at home, so I have a 12MP DSLR to hand. It's huge overkill, and I massively down-sample stuff afterwards, but it's entirely readable. So your basic claim that this can't be done with a camera, based on the resolution compared to a scanner, is a complete load of bollocks. The focus of the lens tends to be the important issue.

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    4. Re:Cameras usually stink for this.... by Chris+Tucker · · Score: 2, Informative

      FYI, the color camera on the Mars Rovers.

      One Megapixel. Really spiffy and detailed images of the Martian landscape for only one megapixel, don't you think?

      Also, TFA states he's using OCR to create a PDF.

      If the image from the camera is sharp enough, the OCR software should have no trouble "reading" it.

      --
      Guaranteed! This comment 100% Anthrax free!
    5. Re:Cameras usually stink for this.... by Chris+Tucker · · Score: 4, Funny

      I am well aware of how the Mars cameras work, having done a metric shitload of B&W "color" photography via filters myself.

      And you, obviously, know exactly dick about not being an asshole.

      --
      Guaranteed! This comment 100% Anthrax free!
  7. I've by Kamineko · · Score: 3, Funny

    What a coincidence! I too have a book scanner that scans books, and requires a human operator to attend to turning the pages.

    It's called a scanner.

    1. Re:I've by iammani · · Score: 2, Interesting

      We would love to see you scan 400 pages in 20 minutes with your 'book scanner'.

    2. Re:I've by fbjon · · Score: 2, Insightful

      Really? Scanning takes a fair number of seconds, then you need to lift the book in order to turn the page, set it down correctly, and start the next scan. Compare with: push button, turn page, push button. Limited pretty much by how fast you turn the page.

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    3. Re:I've by The+-e**(i*pi) · · Score: 5, Informative

      http://www.geocities.jp/takascience/lego/fabs_en.html

      turning the pages and scanning is child's play

    4. Re:I've by Jeremy+Erwin · · Score: 2, Funny

      Has anyone tried shotgun scanning yet? Irregularly shred the books, feed the shreds into a bulk scanner, and use a computer to reassemble.

  8. repost by AnonGCB · · Score: 4, Informative

    http://bkrpr.org/doku.php

    Same thing, much cheaper (I built mine for ~150 USD.)

    --
    http://CryoLANparty.com/ A lan I'm staff on!
    1. Re:repost by idji · · Score: 5, Informative

      yeah, but you have to press 2 buttons and then lift your two cameras with your 4 sided PMMA/perspex/plexiglass box every time - he has a hinged L-shaped piece of perspex and one button - a more elegant solution - half the button presses, the cameras don't move and less weight.

    2. Re:repost by idji · · Score: 3, Informative

      the bottom two sides of the box are holding the pages flat for the cameras. He has to lift the box to turn the page.

      Your idea would end up with bent pages.

  9. Re:Are there scanners that accept a stack of sheet by hansonc · · Score: 5, Informative

    You must not have ever gone to college. A textbook for $15? Get real.

  10. He's just pretending by phantomcircuit · · Score: 2, Insightful

    He keeps talking about how expensive the books are. Clearly he is just using this to scan other people's books to avoid paying.

    Still a pretty cool build though :P

    1. Re:He's just pretending by atarkri · · Score: 3, Informative

      Actually, the motivation behind the project stems from Dan's stay in Russia before his graduate studies. He realized that there are tons of old posters, pictures, and other Soviet propaganda floating around the country's libraries that many people in the Western world would like to view but are unwilling to go to Russia to see. He wanted to digitize some of these posters (works of art, in his view) in order to circulate them on the web. He soon became very frustrated with using a flatbed scanner and stopped. Zoom ahead a few years: Instructables was having a contest to win an Epilog laser cutter, so he decided to build a book scanner out of recycled (read: trash) materials, submitted the project, and won. He says he's surprised at how well the project has resonated with the web community.

    2. Re:He's just pretending by fwarren · · Score: 4, Insightful

      He may be scanning books to pirate them. However, I am a college student as well, and saving money by pirating the books is not my objective.

      I am in my 40's and my eyesight is not what it used to be. Here is why I would buy the books and scan them.

      1. To be legal and comply with the law. I may very well buy the books used, to get them as cheaply as possible. But I will buy them.
      2. It is much lighter for me to carry one laptop around on campus, perhaps with copies of all the books I have used for all terms up to the current term.
      3. I can zoom the pages to a comfortable size to read the text.
      4. I now have the ability to search through the text.
      5. I can use a text-to-speech reader to listen to the book, I can even make an mp3 of the book if I so desired.

      To me it sounds like a bargain

      --
      vi + /etc over regedit any day of the week.
  11. high quality digital cameras doom textbooks by Surt · · Score: 4, Interesting

    This is a market that relies on outrageous reproduction prices, just like CDs used to. They are equally doomed. I know a LOT of college students who no longer buy books ... they rent them for free by buying them, shooting them, and returning them. It may take a couple of hours to do manually without a device like this, but $80 per hour is pretty good wages for a college student.

    --
    "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
  12. better way by cinnamon+colbert · · Score: 3, Informative

    from the comments with the article
    posted by: irrational | 12/11/09 | 11:56 pm

    I do it in 5 steps, and you get rid of the book when you’re done since you don’t need to store it. After you get done putting 200 hours into your creation, you’ll have spent thousands of dollars worth of your time. I solved this problem much more quickly years ago:

    1. Buy a good sheet-fed and high-speed scanner. I have a Panasonic KV-S2026 color.
    2. Get a decent jigsaw from Home Depot. Use metal cutting blades (24 teeth/inch or better)
    3. Saw the spines off the book and for God’s sake use some C-clamps on each end of the book. Preferably sandwich them between two flat boards.
    4. Remove and feed sheets through the scanner to OmniPage and text recognize the pages.
    5. Save as PDF.
    6. Repeat. You now have searchable digital books!

    1. Re:better way by Surt · · Score: 3, Insightful

      Even thousands of dollars worth of your time can be recouped easily over 4-5 years of college book costs. And rarely will a college student find a job that pays better than scanning their own books to save book costs.

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
  13. Well, ironically by Anonymous Coward · · Score: 3, Insightful

    Ironically, all these books that he and others are trying to scan into a digital format were created in a digital format from the start, sitting on a publisher's computer somewhere.

    Thanks copyright laws! Thank you very little.

    1. Re:Well, ironically by Ralph+Spoilsport · · Score: 2, Informative

      Even more evil: because some students are blind or vision impaired, they need digital copies they can have their computers blow up in size on screen or read audibly to them.

      This means that every textbook HAS a doc or PDF version you can get from the publisher. As a professor I regularly get PDF versions of my textbooks for "disabled" students who can't afford the $95 these leeches charge for the text I use.

      I'm in the process of putting together a "text pack" that consists of short excerpts from dozens of books and journals that I will put together as a pdf and give to the students. Fuck these leeches. They piss me off.

      RS

      --
      Shoes for Industry. Shoes for the Dead.
  14. Re:Are there scanners that accept a stack of sheet by Surt · · Score: 3, Insightful

    One semester's worth of books in college today runs around $1000. With this device you can return the books after you've scanned them. If you rip out the binding, most bookstores are going to frown on returns.

    So this device saves about $700 the first semester, and $1000/semester after that.
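The savings claim as a sketch, under the comment's own assumptions: a $300 build (from the summary), about $1000 of books per semester, and a full refund on every returned book. The eight-semester total is an added illustration, not a figure from the thread.

```python
# Figures from the thread; full refunds on returns are the commenter's assumption.
DEVICE_COST = 300            # one-time cost of building the scanner
BOOKS_PER_SEMESTER = 1000    # approximate book spend per semester


def net_savings(semesters):
    """Cumulative savings after a number of semesters, net of the build cost."""
    return semesters * BOOKS_PER_SEMESTER - DEVICE_COST


print(net_savings(1))   # first semester: 700
print(net_savings(8))   # assumed four-year degree, eight semesters: 7700
```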

    --
    "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
  15. Did the same thing with just a single camera by milesw · · Score: 3, Informative

    I'm amazed at how good OCR has gotten. I did the same thing without building anything: just connected my Canon PowerShot A540 to a tripod, laid the tripod on a coffee table, put the book on the floor, and started snapping away. Fed the JPGs to ABBYY FineReader 10, and it spit out plain text that was *at least* 97-98% accurate on every page. I did not use any special lights, do not know anything about photography, and frankly thought I'd have to buy all sorts of special equipment. The only other thing I added for convenience's sake was Dirk's CanoRemote so that I would not move the camera (however imperceptibly) every time I pressed the shutter.

    1. Re:Did the same thing with just a single camera by hansamurai · · Score: 2, Insightful

      I was reading about OCR accuracy in my Game Developer magazine just last night, and they were lamenting that 98% accuracy really wasn't good enough for them. I know that the difference between personal and professional use is rather wide, but they printed a few sentences with 98% accuracy and I will admit, it was distracting. Of course, if they hadn't mentioned it, would I have noticed?
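To put 98% in perspective, here is a rough sketch of how many errors that leaves on a page. It assumes the accuracy figure is measured per character and that a page holds about 2000 characters; both are assumptions for illustration, since the thread states neither.

```python
# Hypothetical page length; the thread does not give one.
CHARS_PER_PAGE = 2000


def expected_errors(chars, accuracy):
    """Expected number of misrecognized characters at a given accuracy."""
    return chars * (1 - accuracy)


for accuracy in (0.97, 0.98, 0.999):
    errors = round(expected_errors(CHARS_PER_PAGE, accuracy))
    print(f"{accuracy:.1%} accurate: about {errors} wrong characters per page")
```

Under those assumptions, 98% still means dozens of wrong characters on every page, which may be why it reads as distracting in print.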

    2. Re:Did the same thing with just a single camera by ShooterNeo · · Score: 2, Interesting

      When you OCR the resulting PDFs from using a scanner, you use a mode that includes data from the original scan. For instance, I just use Adobe Acrobat's "clear scan" OCR mode. It OCRs the text and uses the OCR data to sharpen the scan of the letters in the PDF document. It then downsamples all the image data to a resolution that you specify. Basically, the resulting PDF is a hybrid between an OCRed file and the original image data that was scanned in. You can easily read all of the text, even letters that were not recognized properly by the OCR. The only problem this creates is that the text is not fully searchable: sometimes a word that wasn't OCRed right will not be found in a text search, even though it's perfectly readable on the page. What you do then is do it old school: scroll to the bottom of the PDF of the book and look at the actual index. Then type the page number into the box at the top, and Acrobat will jump right to that page.

      Problems: Acrobat is kind of slow on most computers. I think once I get a quad core with an SSD it'll be instantly fast, though. The second problem is that these hybrid PDF files are huge. A textbook takes up about a gigabyte at the quality level I scan them at. That's not a problem at all, though, if you are reading the files on a beefy desktop PC with huge high-resolution displays (and such a PC would ironically cost less than a semester or two worth of textbooks... the PC would cost roughly $1500-$2000).

  16. Dupe ? by eulernet · · Score: 2, Informative
  17. See also the BookLiberator, a more compact design by kfogel · · Score: 4, Interesting

    See also the BookLiberator, a somewhat more compact cube-in-cradle design, that's also easy to build. Although soon you won't have to build your own: we're prototyping a manufacturable, flat-packed kit to sell from our online store; see questioncopyright.org/bookliberator for more about the project. It should be ready next year.

    None of which is to detract from Reetz's accomplishment, of course. This renaissance in personal book scanners is going to make it easier for all of them, in the long run, especially as we can share the same open source software among all the scanners.

    --
    http://www.red-bean.com/kfogel
  18. Re:I wonder if anyone in my area has such a rig? by Thing+1 · · Score: 4, Insightful

    I wonder how long the copyright will last on this book?

    Based on the last 40 years of Disney legislation?

    For-fucking-ever.

    --
    I feel fantastic, and I'm still alive.
  19. Just saw the spine off! by saccade.com · · Score: 2, Informative

    A while back I got a Fujitsu ScanSnap S510. Now when I want to scan a book, I just saw the spine off (table saw, band saw or even a steel ruler and X-acto knife will do the trick). Take the loose sheets, about 40 at a time, and put them into the ScanSnap. The ScanSnap comes with Acrobat Pro and does a fine job of making a searchable PDF file of the book. The paper? Into the recycle bin. I've cleared off several feet of shelf space.