DIY Scanner for the poor, lazy, and space-challenged

mrwarper
Posts: 18
Joined: 29 Dec 2012, 21:50
E-book readers owned: 10x iRex DR1000, 15x iRex DR800
Number of books owned: 10000
Country: Spain

Post by mrwarper »

Hi everyone,

Having known this website for a couple of years, I have long reflected on the designs shown here and decided to try my own non-canonical (as per this website ; ) approach to DIY scanning. I share the same goals as most here (good-quality, speedy book scanning), but I differ on a number of other considerations: I wanted my setup to be as cheap and as easy to build as possible, so that it can be extremely portable and/or nearly disposable, and easy to replace with minimal effort. I'm here now to share my results with you and see what you have to say about them.

My setup is based on a hand-held, ruler-like scanner running on AA batteries as the capture unit. These beasts can be bought new on eBay for $45 or less (see this one, for example), and give you color or grayscale page scans with a simple swipe, at either 300 or 600 dpi. Images are stored directly on a microSD memory card (add a few more bucks), so the thing operates without a computer, just like most cameras do, and integrates nicely with any software processing we concoct. Downsides / constraints imposed by these units:
-Images are stored directly as JPG with possibly non-optimal parameters that can't be changed. However, I can barely notice compression artifacts at 300 dpi, and I have to hunt them down at 600 dpi.
-One dimension of the scan is limited to 22 cm / 9 inches, and the maximal area is limited to about Letter/A4 size. 99% of my books have either a spine or a page content area that's shorter than those 22 cm, though.
-It eats batteries like popcorn, so add a couple of batteries ($0.50) per 1500 scans, give or take.
-Upper and lower margins (usually the left and right margins of the page) must be > 1 cm. This can be overcome with some DIY surgery on the unit, which can be discussed later on.
Higher-end units can do 900 dpi and feature a rechargeable Li-Po battery and a few other goodies I don't need (Bluetooth coupling with smartphones for previews, for example).
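For a rough feel of what those figures mean in practice, here is a back-of-the-envelope sketch. It only uses the numbers quoted above (22 cm scan width, 300/600 dpi, $0.50 of batteries per roughly 1500 scans) plus the standard 29.7 cm long side of A4; the function names are made up for illustration, and the battery figure is a "give or take" estimate, so treat the output as approximate.

```python
CM_PER_INCH = 2.54

def pixels(length_cm, dpi):
    """Approximate pixel count along a length scanned at a given resolution."""
    return round(length_cm / CM_PER_INCH * dpi)

def battery_cost(scans, cost_per_set=0.50, scans_per_set=1500):
    """Rough battery cost in dollars, using the per-1500-scans estimate."""
    return scans / scans_per_set * cost_per_set

if __name__ == "__main__":
    for dpi in (300, 600):
        # 22 cm is the fixed scanner width, 29.7 cm the long side of A4
        print(dpi, "dpi:", pixels(22, dpi), "x", pixels(29.7, dpi), "pixels at most")
    print("batteries for a 600-page book: about $%.2f" % battery_cost(600))
```

In other words, battery cost is negligible next to the price of the unit, and image size grows fourfold when going from 300 to 600 dpi.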

What really turns this type of unit into a worthwhile DIY scanner (for a total budget of maybe $50!) is the addition of an easel that's really easy to build. It makes for comfortable, convenient and speedy use: you can go through a whole book, open at 90 degrees, at nearly 600 pages per hour without breaking a sweat (I started at little more than 300, before getting some practice and extensively tweaking my initial setups). It's basically a big L of hard cardboard made rigid by two small steel Ls I took from a router wall mount kit. This can be glued or held in place with bits of string, so it can be dismantled and rebuilt in less than one minute. Behind the upright side of the L I fixed a hard cardboard shaft. This, along with one last cardboard bit held in place with a simple rubber band, holds any book open and keeps the pages out of the way while you scan. The rubber band and the lengthy shaft and holder arm let you easily calibrate the strength with which the book is held, while pages can still be turned speedily and with ease.
It's as simple as it looks, really!
DSC00301.JPG (51.37 KiB) Viewed 20777 times
And it takes up as little space as one book on your desk!
DSC00300.JPG (55.37 KiB) Viewed 20777 times
Now place the easel in an accessible corner (I use the one that my monitor base forms against my desktop, so I don't need extra space), and natural pressure and motions alone keep everything in place -- you're ready to start turning pages down from the upright side. As you swipe the scanner over them, pages are kept flat by the desktop or the book cover itself at first, and after a while by the previously scanned pages underneath the current one. You get nearly perfect, non-distorted scans of every odd or even page, and it's a matter of repeating until you're done. If you swipe the scanner with one hand and turn pages with the other in the meantime, it's really fast -- it actually takes me as long (3 seconds?) to ensure the scanner edge is properly aligned with the content (thus avoiding issues with poorly guillotined pages, which are not rare). Turn the book around and do the same again.
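Since the two passes leave you with one batch of odd pages and one batch of even pages, the files have to be interleaved before further processing. The following is only a minimal sketch of that step, not the script actually used for this setup: the function names, the `page_` prefix and the `second_reversed` option are all assumptions made for illustration. In particular, whether the second pass comes out in reverse order depends on how you turn the book around, so check a few pages before renaming anything.

```python
from itertools import zip_longest

def interleave_passes(first_pass, second_pass, second_reversed=False):
    """Merge two single-sided scanning passes into reading order.

    first_pass      -- file names from the first pass, in capture order
    second_pass     -- file names from the second pass, in capture order
    second_reversed -- True if the second pass ran from the back of the
                       book towards the front (an assumption to verify)
    """
    second = list(reversed(second_pass)) if second_reversed else list(second_pass)
    merged = []
    for a, b in zip_longest(first_pass, second):
        # zip_longest pads the shorter pass with None; skip the padding
        if a is not None:
            merged.append(a)
        if b is not None:
            merged.append(b)
    return merged

def numbered_names(merged, prefix="page_", ext=".jpg"):
    """Pair each source name with a zero-padded sequential target name."""
    width = max(4, len(str(len(merged))))
    return [(src, f"{prefix}{i:0{width}d}{ext}")
            for i, src in enumerate(merged, start=1)]
```

The pairs returned by `numbered_names` can then be fed to `os.rename` or `shutil.copy` (copying is the safer choice while you are still checking the order).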

OK, I've left a few details out because this looks already like a long post and I'm sure they'll pop up in the discussion anyway if there's any interest in this. I'm attaching a couple of pics below to show how well this works before going for any post-processing (these haven't even been cropped).
Attachments
Scan sample #1, mostly text
Scan sample #2, mostly image
ai4px
Posts: 33
Joined: 12 Dec 2012, 12:47
Number of books owned: 0
Country: United States

Re: DIY Scanner for the poor, lazy, and space-challenged

Post by ai4px »

That looks like just the thing for a vacuum pickup scanner! Much cheaper than cannibalizing a flatbed scanner
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: DIY Scanner for the poor, lazy, and space-challenged

Post by daniel_reetz »

Really cool. Thanks for sharing your setup. Can you possibly share a brief outline of your postprocessing steps?
mrwarper

Re: DIY Scanner for the poor, lazy, and space-challenged

Post by mrwarper »

daniel_reetz wrote:Really cool. Thanks for sharing your setup.
Thanks! Given the current price of ruler scanners*, I considered it a must to have a go at them before trying any 'canonical' DIY scanning. Before I actually tried mine myself, my cousin grabbed it along with a few books he needed, and he gave up scanning after trying for a couple of days! Now, the kid is not dumb, so I figured it wouldn't be completely obvious to everyone how to make the thing productive. After seeing how I could also boost my own productivity by nearly a factor of two with some tweaking, I concluded the results might be worth sharing. Especially when the result is oh so KISS : )

*I always thought 'ruler scanners' were an inevitable invention so I've been waiting for ages until someone actually made a market-viable form (remember, this is DIY scanning for the lazy). All the better if you can buy it WITHOUT paying for unnecessary stuff like scanning software. The design still has some room for improvements, but now what DIY is left is at a level that doesn't scare me off. Tweaking the optical unit is another matter, though.
daniel_reetz wrote:Can you possibly share a brief outline of your postprocessing steps?
Well, no; I'm afraid you'll have to endure the full version ; )
Seriously, I'm not completely settled on some aspects of post-processing, so I'll probably post 3 threads on this (slightly different depending on book contents, and increasingly open to discussion) in software and processing how-tos. Just give me a little time to figure out how to write it legibly. I'll post a link here when the first one's ready.
ai4px wrote:That looks like just the thing for a vacuum pickup scanner! Much cheaper than cannibalizing a flatbed scanner
Sure, but I'm not sure about the 'vacuum pickup' (perhaps I'm misunderstanding it). Another reason why I wanted to try ruler scanners was that judging from my readings here, pulling up the platen, reaching for the book, turning one page and getting the platen back into place was an important and inevitable time sink (and possibly a source of back problems), whereas my setup (besides practically forcing you to stay comfortably seated), for a roughly equal 'hand-scan' time, would yield images with minimal skewing or rotating problems (if any at all) every time, and certainly no de-warping or other nasty and annoying stuff to correct, without the need to readjust a single thing.

Now, if you devise anything that can automatically (and highly reliably) turn pages, why would I want to spend time swiping the scanner over the book pages? I'd certainly go for unattended picture taking even if it meant a bit more time (but certainly less than that used to scan) processing the images. I could save that time for the other inevitable sinks anyway, like checking all pages are in place, tweaking the final mark-up, etc.
mrwarper

Re: DIY Scanner for the poor, lazy, and space-challenged

Post by mrwarper »

Just some quick notes:
1) I just found out I can't edit my previous posts any more (I'm not sure if per site policy or due to some web / browser glitch).
2) Because of 1 I can't add proper links to my first post detailing my post-processing; I'm doing it here just in case it went unnoticed by anyone interested.
3) Because of 1 I can't reword my post above. What I meant is this: I spend roughly as much time on manual or otherwise attended labour with my easel as I would with canonical designs, which don't make unattended page scanning a reality yet, so I stick with my ruler scanner + easel build because of its (IMHO) better output quality. However, I'll be more than happy to sacrifice this so-called 'superior quality' (mostly a lack of need for much post-processing) the moment I can be spared the manual labour, whether via a vacuum pickup device or any other way, for I doubt any post-processing that then becomes necessary will take an amount of time even in the same order of magnitude as attending the book scan process.

And now proper material for a new post.

Those inclined to run and acquire a ruler scanner like mine must be aware that out-of-the-box units (at least the cheap ones, but the expensive ones featuring software and stuff look identical) must undergo some almost trivial surgery before scans like the first two samples I posted are the norm and not the exception. The optical unit of the scanner is triggered by the motion of several rubber rollers as the scanner is swept over pages. The axis these rollers revolve around has some unnecessary sideways freedom of movement that must be blocked. If you don't block it, most of your scans will look like the one below, or worse. While that's still mostly OK for OCR, I would consider such images 'bad quality' and definitely not suitable for long-term storage, so I thought I'd warn you (this is a re-sampled, lower-quality image that merely illustrates the distortion).

More to come soon.
Attachments
sample_distortion.png (126.72 KiB) Viewed 20564 times
KenoshaKid
Posts: 1
Joined: 21 Nov 2010, 07:45

Re: DIY Scanner for the poor, lazy, and space-challenged

Post by KenoshaKid »

How do you deal with books that have rather narrow margins?
I own a similar device, and my main problem with this workflow is that the space between the scanning optics and the wheel that records the scanning speed is too narrow.
Often the wheel will be in mid-air while the scanner is still at the edge of the text, meaning that the characters on the border either get skewed beyond recognition or are not scanned at all.
It kind of works if you put some heavier paper under the page, but changing this every 5 pages (beyond that, the problem reappears) makes for a very cumbersome and frustrating workflow: borders will often be distorted or cut off despite the paper underneath (usually around 20%), and you first have to transfer all images to your computer, rotate them, check for erroneous images, then scan again, replace the bad images, etc.
Did you encounter that problem as well and if yes, how do you deal with it?

I attached a sample, so you can see what I mean.
Attachments
Sample cutoff/distortion
mrwarper

How to deal with rather narrow margins

Post by mrwarper »

KenoshaKid wrote:[...]my main problem [...] the space between the scanning optics and the wheel [...] is too narrow.
Often the wheel will be in mid-air, while the scanner is still at the edge of the text[...]
I think you mean the opposite: the optical unit stops scanning before getting all the content in if the roller reaches the page border beforehand -- that may happen if the rollers are too far away from the optical unit (further apart than the margin width, anyway).

I just use an aluminium sheet which is very rigid in spite of its thinness (about 0.5mm) and bigger than most books. I got it cut for free from a scrap at a window shop, so you shouldn't have problems getting one either. Just don't forget to file the borders...
easel+sheet.jpg (59.71 KiB) Viewed 20206 times
This may sound cumbersome but it isn't. Of course it slows you down, but you can still work at ~2/3 of your normal speed without too many problems. After each sweep, the scanner rests near the sheet edge -- I take the sheet with my thumb, pull scanner and sheet away, and start turning the page with the other hand. Then I put the sheet on top of the page I just scanned, thrusting its end against the inner part of the book spine, and the sheet changes thumbs and hands as the new page comes over it, leaving hand #1 free to move the scanner. Now hand #2 tightens and flattens the new page against the sheet (the thumb further presses the sheet edge towards the inner side of the page, while the fingers pull the page edge and press it against the sheet to increase the friction) while the whole thing is turned down, leaving space for hand #1 to put the scanner in place again. I hope you don't need a demonstration video, because I don't feel like putting one together. : )
easel+sheet+scanner.jpg (57.24 KiB) Viewed 20206 times
These above are a couple of pictures of a cheap paperback (which typically exhibit this problem) being scanned using the metal sheet -- I'm attaching the scan sample now to illustrate that the real bitch here can be *inner* narrow margins - the upper pages held by the clamp may protrude too much and keep the scanner from reaching far enough:
Cool, no margin pains any more... Oops!

Now, this is typical with thick paperbacks and the like, so we can take advantage of their spines being somewhat flexible to address this in a rather direct way. You just need to put a thick ruler or something that separates the book a bit from the easel back and lets the book spine be bent over it when the scanner is pressed against the inner part of each page. Remember the clamp is simply held to the shaft with elastic bands, so you should be able to adjust everything to work without damaging your books. I just made a sketch, because I couldn't take a good-looking picture:
easel supplement.png (2.23 KiB) Viewed 20206 times
The whole lower book part should be skewed but still flat (and the metal sheet only helps with that) so scanning is possible most of the time, even in these conditions.

An alternate way to deal with both problems (inner and outer narrow margins) at once would involve the surgery I alluded to in earlier posts: take the upper part off the scanner (batteries and circuitry) and put it side by side with the lower third (optics and rollers), giving the unit a more wedge-shaped (rather than square) section. While at it, my scanner's optics have enough room to one side of their part to insert a supplementary trigger roller, which would allow for ditching the metal sheet, but I'm still pondering the details.

Also, I tried a thin plexiglass sheet first. Being on top of the scanned page, it allowed for more scanning speed, but the particles that stick to the rollers over time quickly damaged it, and it noticeably affected the scan quality anyway -- it could still be good enough for OCR but not for anything else. You may have better luck with other plastics.

Last but not least, one little detail I forgot to mention, which helps immensely to scan straight (that is, perpendicular to the text lines, parallel to page borders, or both if the book was well printed) as you go, provided the axis side wobbling is fixed: white marks on the scanner side. If your unit doesn't have them painted, use some correction fluid -- and if it doesn't have them at all, make them yourself and paint them!
It's like car lights - isn't it nice, seeing where you're heading to?
scanner_closeup.jpg (47.32 KiB) Viewed 20206 times
recaptcha
Posts: 64
Joined: 03 Sep 2010, 13:23
Number of books owned: 0
Location: Calgary, Alberta, Canada

Re: DIY Scanner for the poor, lazy, and space-challenged

Post by recaptcha »

What's your scanning speed like? It seems the act of sweeping a page would be infinitely slower. How fast does the scanner allow you to move it?
mrwarper

Re: DIY Scanner for the poor, lazy, and space-challenged

Post by mrwarper »

recaptcha wrote:What's your scanning speed like? It seems the act of sweeping a page would be infinitely slower. How fast does the scanner allow you to move it?
Infinitely slower than... what exactly?

How long does it take to scan ONE page from a book? A little bit more than two seconds, maybe. Most books I own are best scanned 'sideways' so wider pages take a little longer to be scanned (how much more, half a second?) but that's really nothing when compared to the fact that waving your arms around to turn pages, etc. takes roughly another 3 seconds or more if you do it with any care, so it only makes sense to measure speed in batches.

I mentioned above 'roughly 600 pages per hour' (10 per minute), which I think is pretty good. After checking my logs (yes -- I keep logs I later analyse) I see it's a bit but not much less -- I work consistently at over 8.5, typically oscillating from 8.5 to 9.5 pages per minute, which would amount to more than 500 pph one way or another. But as I mentioned, I don't want to be tired or stressed after scanning a book, nor to need redoing more than a couple of pages. Granted, two thirds of that if I have to use the metal sheet, and one half if it's one of those books you need to keep pressing hard, but is anyone not having their own problems with those?

Given the amount of attended time getting a 'proper' final document takes, even without other image post-processing than proper cropping (see my thread 1 of 3 about it), I'm all for inventing a working auto-page turner and whatnot, but if we keep talking attended scans, I'd say we're pretty much approaching their physical speed limit, and scanning is the shortest stage anyway. If it were possible to 'scan in no time' (ideal scanning device of the same type, perfect eye-hand coordination) it would still take 3-4 seconds to turn each page and get everything back into place. If you do the math, that amounts to saving a mere 20 minutes per 600 pages (a really thick book) you decide to scan.
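For anyone who wants to redo that math with their own timings, here is a small sketch. The 2 seconds per swipe and the 3-4 seconds per page turn are the rough figures quoted in this post (3.5 s is simply their midpoint); the function names are made up for illustration, and none of this comes from the actual scan logs.

```python
def attended_minutes(pages, scan_s, turn_s):
    """Total attended time in minutes when each page costs scan_s + turn_s seconds."""
    return pages * (scan_s + turn_s) / 60

def pages_per_hour(pages_per_minute):
    """Convert a pages-per-minute rate into pages per hour."""
    return pages_per_minute * 60

PAGES = 600    # "a really thick book"
SCAN_S = 2.0   # roughly two seconds per swipe
TURN_S = 3.5   # midpoint of the 3-4 seconds needed to turn a page

current = attended_minutes(PAGES, SCAN_S, TURN_S)
ideal = attended_minutes(PAGES, 0.0, TURN_S)   # hypothetical zero-time scanner
saved = current - ideal                        # time an ideal scanner would save

if __name__ == "__main__":
    print("minutes saved per %d pages: %.0f" % (PAGES, saved))
    print("8.5-9.5 pages/min is %.0f-%.0f pages/hour"
          % (pages_per_hour(8.5), pages_per_hour(9.5)))
```

Note that the saving depends only on the swipe time (600 pages x 2 s = 20 minutes), so it stays the same whatever figure you plug in for page turning.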
Andrius
Posts: 5
Joined: 05 Mar 2013, 08:42
Number of books owned: 0
Country: Lithuania

Re: DIY Scanner for the poor, lazy, and space-challenged

Post by Andrius »

I am really interested in your scanning method.
Could you explain in more detail how to modify those cheap scanners (the "trivial surgery" you referred to)? With some photos, if you can.
Could you post a short video of your scanning process in action?
What script would combine odd and even pages so that I can feed them to ScanTailor already in the correct order?