DIY Scanner for the poor, lazy, and space-challenged
Moderator: peterZ
-
- Posts: 18
- Joined: 29 Dec 2012, 21:50
- E-book readers owned: 10x iRex DR1000, 15x iRex DR800
- Number of books owned: 10000
- Country: Spain
- Contact:
DIY Scanner for the poor, lazy, and space-challenged
Hi everyone,
Having known this website for a couple of years, I have long reflected on the designs shown here, and decided to try my own non-canonical (as per this website ; ) approach to DIY scanning. I share the same goals as most here (good quality, speedy book scanning) but I differ on a number of other considerations: I wanted my setup to be as cheap and as easy to build as possible, so that it can be extremely portable and/or nearly disposable, and easy to replace with minimal effort. I'm here now to share my results with you and see what you have to say about them.
My setup is based on a hand-held, ruler-like scanner running on AA batteries as the capture unit. These beasts can be bought new on eBay for $45 or less (see this one, for example), and give you color or grayscale page scans at either 300 or 600 dpi with a simple swipe movement. Images are stored directly on a microSD memory card (add a few more bucks), so the thing operates without a computer, just like most cameras do, and integrates nicely with any software processing we concoct. Downsides / constraints imposed by these units:
-Images are stored directly as JPG with possibly non-optimal parameters that can't be changed. However, I can barely notice compression artifacts at 300 dpi, and I have to hunt them down at 600 dpi.
-One dimension of the scan is limited to 22 cm (about 9 inches), and the maximal area is limited to about Letter/A4 size. 99% of my books have either a spine or a page content area that's shorter than those 22 cm, though.
-It eats batteries like they were popcorn, so add a couple of batteries ($0.50) for every 1500 scans, give or take.
-Upper and lower margins (usually the left and right margins of the page) must be > 1 cm. This can be overcome with some DIY surgery on the unit, which can be discussed later on.
Higher-end units can do 900 dpi and feature a rechargeable Li-po battery and a few other goodies I don't need (Bluetooth coupling with smartphones for previews, for example).
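For a rough feel of what the size and resolution limits above mean in practice, here is a minimal sketch. The 22 cm limit and the 300/600 dpi settings come from the post; the function names and the sample page sizes are illustrative assumptions, not anything the scanner itself exposes.

```python
CM_PER_INCH = 2.54
SENSOR_LIMIT_CM = 22.0  # scan width limit quoted in the post


def fits_sensor(width_cm, height_cm, limit_cm=SENSOR_LIMIT_CM):
    """Return True if at least one page dimension fits across the scanner.

    The scanner can be swiped along either dimension, so only the
    shorter side of the page has to be within the limit.
    """
    return min(width_cm, height_cm) <= limit_cm


def pixels(length_cm, dpi):
    """Approximate number of pixels along a length scanned at the given dpi."""
    return round(length_cm / CM_PER_INCH * dpi)


if __name__ == "__main__":
    # A4 page (21.0 x 29.7 cm): the short side fits, so it can be scanned.
    print("A4 fits:", fits_sensor(21.0, 29.7))
    for dpi in (300, 600):
        print(dpi, "dpi ->", pixels(SENSOR_LIMIT_CM, dpi), "px across the sensor")
```

This only checks page dimensions; it does not account for the > 1 cm margin requirement mentioned in the list.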
What really turns this type of unit into a worthy DIY scanner (for a total budget of maybe $50!) is the addition of an easel that's really easy to build and makes it comfortable, convenient and speedy to go through a whole book open at 90 degrees at nearly 600 pages per hour without breaking a sweat (I started at little more than 300 before getting some practice and extensively tweaking my initial setups). It's basically a big hard-cardboard L made rigid by two small steel Ls I took from a router wall mount kit. It can be glued or held in place with bits of string, so it can be dismantled and rebuilt in less than one minute. Behind the upright side of the L I fixed a hard-cardboard shaft. This, along with one last cardboard bit held in place with a simple rubber band, holds any book open and the pages out of the way while you scan. The rubber band and the lengthy shaft and holder arm let you easily calibrate the force with which the book is held, while pages can still be turned speedily and with ease.
Now place the easel in an accessible corner (I use the one that my monitor base forms against my desktop, so I don't need extra space) and just natural pressure and motions keep everything in place -- you're ready to start turning pages down from the upright side. As you swipe the scanner over them, they are kept flat by the desktop or the book cover itself at first, and by the previously scanned pages underneath the current one after a while. You get nearly perfect, non-distorted scans of every odd or even page, and it's a matter of repeating until you're done.
If you swipe the scanner with one hand and turn pages with the other, it's really fast -- it actually takes me about as long (3 seconds?) to ensure the scanner edge is properly aligned with the content (thus avoiding issues with poorly guillotined pages, which are not rare). Then turn the book around and do the same again for the remaining pages.
OK, I've left a few details out because this already looks like a long post and I'm sure they'll pop up in the discussion anyway if there's any interest in this. I'm attaching a couple of pics below to show how well this works before any post-processing (these haven't even been cropped).
Re: DIY Scanner for the poor, lazy, and space-challenged
That looks like just the thing for a vacuum pickup scanner! Much cheaper than cannibalizing a flatbed scanner
- daniel_reetz
- Posts: 2812
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
- Contact:
Re: DIY Scanner for the poor, lazy, and space-challenged
Really cool. Thanks for sharing your setup. Can you possibly share a brief outline of your postprocessing steps?
-
- Posts: 18
- Joined: 29 Dec 2012, 21:50
- E-book readers owned: 10x iRex DR1000, 15x iRex DR800
- Number of books owned: 10000
- Country: Spain
- Contact:
Re: DIY Scanner for the poor, lazy, and space-challenged
daniel_reetz wrote: Really cool. Thanks for sharing your setup.
Thanks! Given the current price of ruler scanners*, I considered it a must to have a go at them before trying any 'canonical' DIY scanning. Before I had actually tried mine myself, my cousin grabbed it along with a few books he needed, and he gave up scanning after trying for a couple of days! Now, the kid is not dumb, so I figured it wouldn't be completely obvious to everyone how to make the thing productive. After seeing how I could also boost my own productivity by nearly a factor of two with some tweaking, I concluded the results might be worth sharing. Especially when the result is oh so KISS : )
*I always thought 'ruler scanners' were an inevitable invention, so I had been waiting for ages for someone to actually make a market-viable form (remember, this is DIY scanning for the lazy). All the better if you can buy it WITHOUT paying for unnecessary stuff like scanning software. The design still has some room for improvement, but what DIY is left is now at a level that doesn't scare me off. Tweaking the optical unit is another matter, though.
daniel_reetz wrote: Can you possibly share a brief outline of your postprocessing steps?
Well, no; I'm afraid you'll have to endure the full version ; )
Seriously, I'm not completely settled on some aspects of post-processing, so I'll probably post 3 threads on this (slightly different depending on book contents, and increasingly open to discussion) in software and processing how-tos. Just give me a little time to figure out how to write it legibly. I'll post a link here when the first one's ready.
ai4px wrote: That looks like just the thing for a vacuum pickup scanner! Much cheaper than cannibalizing a flatbed scanner
Sure, but I'm not sure about the 'vacuum pickup' (perhaps I'm misunderstanding it). Another reason why I wanted to try ruler scanners was that, judging from my readings here, pulling up the platen, reaching for the book, turning one page and getting the platen back into place was an important and inevitable time sink (and possibly a source of back problems). My setup, besides practically forcing you to stay comfortably seated, would for a roughly equal 'hand-scan' time yield images with minimal skewing or rotation problems (if any at all) every time, and certainly no de-warping or other nasty and annoying stuff to correct, without the need to readjust a single thing.
Now, if you devise anything that can automatically (and highly reliably) turn pages, why would I want to spend time swiping the scanner over the book pages? I'd certainly go for unattended picture taking even if it meant a bit more time processing the images (but certainly less than that used to scan). I could save that time for the other inevitable sinks anyway, like checking all pages are in place, tweaking the final mark-up, etc.
-
- Posts: 18
- Joined: 29 Dec 2012, 21:50
- E-book readers owned: 10x iRex DR1000, 15x iRex DR800
- Number of books owned: 10000
- Country: Spain
- Contact:
Re: DIY Scanner for the poor, lazy, and space-challenged
Just some quick notes:
1) I just found out I can't edit my previous posts any more (I'm not sure if per site policy or due to some web / browser glitch).
2) Because of 1 I can't add proper links to my first post detailing my post-processing; I'm doing it here just in case it went unnoticed by anyone interested.
3) Because of 1 I can't reword my post above. What I meant is this: I spend roughly as much time on manual or otherwise attended labour with my easel as I would with the canonical designs, which don't make unattended page scanning a reality yet, so I stick with my ruler scanner + easel build because of its (IMHO) better output quality. However, I'll be more than happy to sacrifice this so-called 'superior quality' (mostly the lack of need for much post-processing) the moment I can be spared the manual labour, either via a vacuum pickup device or however else, for I doubt any subsequently necessary post-processing will take an amount of time even in the same order of magnitude as that of attending the book scan process.
And now proper material for a new post.
Those inclined to run and acquire a ruler scanner like mine must be aware that out-of-the-box units (at least the cheap ones, but the expensive ones featuring software and stuff look identical) must undergo some almost trivial surgery before scans like the first two samples I posted are the norm and not the exception. The optical unit of the scanner is triggered by the motion of several rubber rollers as the scanner is swept over pages. The axle these rollers revolve around has some unnecessary sideways freedom of movement that must be blocked. If you don't block it, most of your scans will look like the one below, or worse. While that's still mostly OK for OCR, I would consider such images 'bad quality' and definitely not suitable for long-term storage, so I thought I'd warn you (this is a re-sampled, lower-quality image meant merely to illustrate the distortion).
More to come soon.
- Attachments
-
- sample_distortion.png (126.72 KiB) Viewed 20564 times
-
- Posts: 1
- Joined: 21 Nov 2010, 07:45
Re: DIY Scanner for the poor, lazy, and space-challenged
How do you deal with books that have rather narrow margins?
I own a similar device, and my main problem with this workflow is that the space between the scanning optics and the wheel that records the scanning speed is too narrow.
Often the wheel will be in mid-air while the scanner is still at the edge of the text, meaning that the characters on the border either get skewed beyond recognition or are not scanned at all.
It kind of works if you put some heavier paper under the page, but changing this every 5 pages (beyond that, the problem reappears) makes for a very cumbersome and frustrating workflow. Borders will still often be distorted or cut off despite the paper underneath (usually around 20%), and you first have to transfer all images to your computer, rotate them, check for erroneous images, then scan again, replace the bad images, etc.
Did you encounter that problem as well, and if so, how do you deal with it?
I attached a sample, so you can see what I mean.
-
- Posts: 18
- Joined: 29 Dec 2012, 21:50
- E-book readers owned: 10x iRex DR1000, 15x iRex DR800
- Number of books owned: 10000
- Country: Spain
- Contact:
How to deal with rather narrow margins
KenoshaKid wrote: [...]my main problem [...] the space between the scanning optics and the wheel [...] is too narrow.
Often the wheel will be in mid-air, while the scanner is still at the edge of the text[...]
I think you mean the opposite: the optical unit stops scanning before getting all the content in if the roller reaches the page border beforehand -- that may happen if the rollers are too far away from the optical unit (further than the margin width, anyway).
I just use an aluminium sheet which is very rigid in spite of its thinness (about 0.5mm) and bigger than most books. I got it cut for free from a scrap at a window shop, so you shouldn't have problems getting one either. Just don't forget to file the borders...
This may sound cumbersome but it isn't. Of course it slows you down, but you can still work at ~2/3 of your normal speed without too many problems. After each sweep, the scanner rests near the sheet edge -- I hold the sheet with my thumb, pull scanner and sheet away, and start turning the page with the other hand. Then I put the sheet on top of the page I just scanned, thrusting its end against the inner part of the book spine, and the sheet changes thumbs and hands as the new page comes over it, leaving hand #1 free to move the scanner. Now hand #2 tightens and flattens the new page against the sheet (the thumb further presses the sheet edge towards the inner side of the page, the fingers pull the page edge while pressing it against the sheet to increase the friction) while the whole thing is turned down, leaving space for hand #1 to put the scanner in place again. I hope you don't need a demonstration video, because I don't feel like putting one together. : )
These above are a couple of pictures of a cheap paperback (which typically exhibit this problem) being scanned using the metal sheet -- I'm attaching the scan sample now to illustrate that the real bitch here can be *inner* narrow margins - the upper pages held by the clamp may protrude too much and keep the scanner from reaching far enough:
Now, this is typical with thick paperbacks and the like, so we can take advantage of their spines being somewhat flexible to address this in a rather direct way. You just need to put a thick ruler or something that separates the book a bit from the easel back and lets the book spine be bent over it when the scanner is pressed against the inner part of each page. Remember the clamp is simply held to the shaft with elastic bands, so you should be able to adjust everything to work without damaging your books. I just made a sketch, because I couldn't take a good-looking picture:
The whole lower book part should be skewed but still flat (and the metal sheet only helps with that) so scanning is possible most of the time, even in these conditions.
An alternate way to deal with both problems (inner and outer narrow margins) at once would involve the surgery I alluded to in earlier posts: take the upper part off the scanner (batteries and circuitry) and place it side by side with the lower third (optics and rollers), giving the unit a more wedge-shaped (rather than square) section. While at it, my scanner optics have enough room to one side in their part to insert a supplementary trigger roller, which would allow for ditching the metal sheet, but I'm still pondering the details.
Also, I tried a thin plexiglass sheet first. Being on top of the scanned page, it allowed for more scanning speed, but the particles that stick to the rollers over time quickly damaged it, and it noticeably affected the scan quality anyway -- it could still be good enough for OCR but not for anything else. You may have more luck with other plastics.
Last but not least, one little detail I forgot to mention, which helps immensely to scan straight (that is, perpendicular to the text lines, parallel to page borders, or both if the book was well printed) as you go, provided the axle's sideways wobbling is fixed: white marks on the scanner side. If your unit doesn't come with them painted, make them yourself with some correction fluid!
-
- Posts: 64
- Joined: 03 Sep 2010, 13:23
- Number of books owned: 0
- Location: Calgary, Alberta, Canada
Re: DIY Scanner for the poor, lazy, and space-challenged
What's your scanning speed like? It seems the act of sweeping a page would be infinitely slower. How fast does the scanner allow you to move it?
-
- Posts: 18
- Joined: 29 Dec 2012, 21:50
- E-book readers owned: 10x iRex DR1000, 15x iRex DR800
- Number of books owned: 10000
- Country: Spain
- Contact:
Re: DIY Scanner for the poor, lazy, and space-challenged
recaptcha wrote: What's your scanning speed like? It seems the act of sweeping a page would be infinitely slower. How fast does the scanner allow you to move it?
Infinitely slower than... what exactly?
How long does it take to scan ONE page from a book? A little bit more than two seconds, maybe. Most books I own are best scanned 'sideways' so wider pages take a little longer to be scanned (how much more, half a second?) but that's really nothing when compared to the fact that waving your arms around to turn pages, etc. takes roughly another 3 seconds or more if you do it with any care, so it only makes sense to measure speed in batches.
I mentioned above 'roughly 600 pages per hour' (10 per minute), which I think is pretty good. After checking my logs (yes -- I keep logs I later analyse) I see it's a bit less, but not much -- I work consistently at over 8.5, typically oscillating between 8.5 and 9.5 pages per minute, which amounts to more than 500 pph one way or another. But as I mentioned, I don't want to be tired or stressed after scanning a book, nor to need to redo more than a couple of pages. Granted, it's two thirds of that if I have to use the metal sheet, and one half if it's one of those books you need to keep pressing hard, but is anyone not having their own problems with those?
Given the amount of attended time that getting a 'proper' final document takes, even without any image post-processing other than proper cropping (see my thread 1 of 3 about it), I'm all for inventing a working auto page-turner and whatnot, but if we keep talking attended scans, I'd say we're pretty much approaching their physical speed limit, and scanning is the shortest stage anyway. If it were possible to 'scan in no time' (ideal scanning device of the same type, perfect eye-hand coordination) it would still take 3-4 seconds to turn each page and get everything back into place. If you do the math, that amounts to saving a mere 20 minutes per 600 pages (a really thick book) you decide to scan.
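The arithmetic behind these figures can be sketched in a few lines. The rates (8.5 to 9.5 pages per minute, about 2 seconds of actual swiping per page, the two-thirds and one-half slowdowns) are taken from the post; the function names are only illustrative.

```python
def pages_per_hour(pages_per_minute, slowdown=1.0):
    """Convert a pages-per-minute rate to pages per hour.

    `slowdown` scales the rate, e.g. 2/3 when the metal sheet is needed
    or 1/2 for books that must be pressed hard (figures from the post).
    """
    return pages_per_minute * 60 * slowdown


def minutes_saved(pages, scan_seconds_per_page):
    """Minutes saved over a whole book if the swipe itself took no time."""
    return pages * scan_seconds_per_page / 60


if __name__ == "__main__":
    for ppm in (8.5, 9.5):
        print(f"{ppm} ppm: {pages_per_hour(ppm):.0f} pph, "
              f"{pages_per_hour(ppm, 2 / 3):.0f} pph with the metal sheet, "
              f"{pages_per_hour(ppm, 1 / 2):.0f} pph for hard-to-press books")
    # 600 pages at roughly 2 seconds of swiping each
    print(f"Time spent swiping 600 pages: {minutes_saved(600, 2.0):.0f} minutes")
```

With about 2 seconds of swiping per page, a 600-page book spends 20 minutes in the swipe itself, which is where the 'mere 20 minutes' comes from.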
Re: DIY Scanner for the poor, lazy, and space-challenged
I am really interested in your scanning method.
Could you explain in more detail how to modify those cheap scanners (the "trivial surgery" you referred to)? With some photos if you can.
Would it be possible for you to post a short video of your scanning process in action?
Is there a script to combine odd and even pages so that I can feed them to ScanTailor already in the correct order?
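On the last question, a minimal sketch of such a script in Python might look like the following. It is not the original poster's script, and it rests on several assumptions: each pass over the book was copied into its own folder, the file names sort in the order the pages were scanned, the first pass holds pages 1, 3, 5, ... and the second pass (after turning the book around) was scanned from the back of the book towards the front. The folder names and the `page_0001.jpg` naming are made up for illustration; pass `reverse_second=False` if your second pass is already in front-to-back order.

```python
from pathlib import Path
import shutil


def interleave(first_pass, second_pass, reverse_second=True):
    """Merge two scan passes into reading order.

    first_pass:  items for pages 1, 3, 5, ... in scan order.
    second_pass: items for the other side of each leaf; assumed to be
                 scanned back to front unless reverse_second is False.
    """
    second = list(reversed(second_pass)) if reverse_second else list(second_pass)
    if len(first_pass) != len(second):
        raise ValueError("both passes must contain the same number of scans")
    merged = []
    for odd, even in zip(first_pass, second):
        merged.extend([odd, even])
    return merged


def list_scans(folder):
    """Return the JPEG files in a folder, sorted by name (assumed scan order)."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.suffix.lower() in {".jpg", ".jpeg"})


def copy_in_order(first_dir, second_dir, out_dir, reverse_second=True):
    """Copy both passes into out_dir under names that sort in reading order."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    pages = interleave(list_scans(first_dir), list_scans(second_dir), reverse_second)
    for number, src in enumerate(pages, start=1):
        shutil.copy2(src, out / f"page_{number:04d}{src.suffix.lower()}")


# Example (hypothetical folder names):
# copy_in_order("pass1_odd", "pass2_even", "book_in_order")
```

The originals are copied rather than renamed, so a wrong guess about the scan order of the second pass can be fixed by simply running it again with the other setting.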