Sheet Fed Scanner for Destructive Scanning

Book scanning methods that involve taking books apart.

Moderator: peterZ

TooMuchToDo
Posts: 3
Joined: 10 May 2010, 02:13

Sheet Fed Scanner for Destructive Scanning

Post by TooMuchToDo »

Hello all!

I built a traditional DIY book scanning cradle several months back to scan some century-old books that a local non-profit needed digitized, but I'm now looking to digitize my entire bookcase of several hundred paperback books.

For the sake of speed, I've decided to forgo the DIY scanner and purchase a sheet fed scanner to feed the books through after using a table saw (with a fine blade) to take the bindings off. Can anyone who has used one for this sort of project recommend a duplex sheet fed scanner that can handle 100-200 pages at a time? I'm looking to scan at between 300-400 DPI (other forum members have mentioned anything above that is fairly useless).
seltzered

Re: Sheet Fed Scanner for Destructive Scanning

Post by seltzered »

Hope I'm not criticizing, but could an office scanner with auto page-feed support work for this task? I know people at my office have just taken their college notebooks, ripped the spiral bindings off, and let our office scanner (it's a SHARP I think) take care of capturing the pages and converting them to PDF.

Otherwise, you may want to look into the ScanSnaps with duplex support (not the portable ones; there are some which I think will scan a certain number of pages).
reggilbert
Posts: 49
Joined: 28 Sep 2010, 19:57
Number of books owned: 3000
Location: Buffalo, New York

Re: Sheet Fed Scanner for Destructive Scanning

Post by reggilbert »

Dear TooMuchToDo,

I recently happened on some YouTube videos on this topic --

1) A guy doing exactly what you plan to do, with the post-chop scanning done on an unspecified Fujitsu ScanSnap, at what I count as 28 duplex pages per minute:

http://www.youtube.com/watch?v=-M95Ob4k ... re=related

2) This guy also uses a Fujitsu ScanSnap, the S1500, which appears to scan both sides of just 22 sheets each minute. However, he gets rid of the binding using a $700 paper cutter, much preferable to my squeamish mind (my guess is that resale on these cutters is pretty good):

http://www.youtube.com/watch?v=uP4NbFaYZVI

Finally, company info on the Fujitsu ScanSnap scanners, which all seem to be 50-sheet ADF / 20 ppm:

http://www.fujitsu.com/us/services/comp ... /scansnap/
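
For a rough sense of what these speeds mean for a single book, here is a minimal Python sketch. The 300-page paperback is an assumed example, not a figure from the videos; the 22 sheets per minute and the 50-sheet feeder are the figures quoted above, and the estimate ignores time spent reloading or clearing jams.

```python
import math

def scan_estimate(pages, sheets_per_minute, adf_capacity):
    """Return (sheets, feeder_loads, minutes) for one duplex-scanned book."""
    sheets = math.ceil(pages / 2)              # duplex: two pages per sheet
    loads = math.ceil(sheets / adf_capacity)   # how often the feeder is filled
    minutes = sheets / sheets_per_minute       # feed time only
    return sheets, loads, minutes

# Assumed example: a 300-page paperback on an S1500-class scanner.
sheets, loads, minutes = scan_estimate(pages=300, sheets_per_minute=22, adf_capacity=50)
print(f"{sheets} sheets, {loads} feeder loads, about {minutes:.1f} minutes")
```

Swap in your own page count and the speed you actually observe; rated speeds usually assume ideal paper.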
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Sheet Fed Scanner for Destructive Scanning

Post by daniel_reetz »

Really great answers here. If you search around these forums for "scansnap" you can find some of the other users that use them (we have a few). If you can destroy the book, it is the fastest way to go and probably worth it.
TooMuchToDo
Posts: 3
Joined: 10 May 2010, 02:13

Re: Sheet Fed Scanner for Destructive Scanning

Post by TooMuchToDo »

Excellent! Thanks for the advice all, I'll let you know how the scanning goes.
daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Re: Sheet Fed Scanner for Destructive Scanning

Post by daniel_reetz »

It would be really helpful if you could post your process when you get it figured out (or even as you figure it out)... as you have seen, there is no complete tutorial here on the forums...
TooMuchToDo
Posts: 3
Joined: 10 May 2010, 02:13

Re: Sheet Fed Scanner for Destructive Scanning

Post by TooMuchToDo »

I'd be more than happy to detail the process as I work on this. A rising tide raises all boats.
LA2
Posts: 23
Joined: 01 Feb 2012, 16:43
Number of books owned: 1000
Country: Sweden
Location: Linköping, Sweden

Re: Sheet Fed Scanner for Destructive Scanning

Post by LA2 »

To this thread, I can report that the Canon ImageFormula DR-M160 is a good buy. Compared to the Canon DR-2080C and DR-2050C that I have used earlier, this one is about twice as expensive (around $1100), but a lot faster, scanning 60 sheets (120 pages, front and back) per minute. That is a speed range where serious office scanners used to cost in the range of $4000 or more. This scanner uses a normal USB 2.0 connector with supplied software for Windows only, and the specifications say: "An Intel Core i7 2.8GHz processor with at least 4GB of RAM is recommended" (so it gives you a good reason to upgrade to a good, fast gaming computer as well). The documentation also says: "Suggested Daily Volume: 7,000 scans." I haven't kept mine busy full time, but only scanned 38,500 sheets (77,000 pages) in the first 90 days.
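
To put those numbers side by side, here is a small Python sketch of the arithmetic. It treats one "scan" in the suggested daily volume as one sheet, which is an assumption; if Canon counts each page image as a scan, the share of the rating doubles.

```python
def workload_summary(sheets, days, sheets_per_minute, daily_rating):
    """Summarise how hard a scanner was worked over a period."""
    avg_per_day = sheets / days                    # average sheets per calendar day
    feed_hours = sheets / sheets_per_minute / 60   # pure feeding time at rated speed
    share_of_rating = avg_per_day / daily_rating   # fraction of suggested daily volume
    return avg_per_day, feed_hours, share_of_rating

# Figures from the post: 38,500 sheets in 90 days, 60 sheets/min, 7,000 scans/day.
avg, hours, share = workload_summary(38_500, 90, 60, 7_000)
print(f"{avg:.0f} sheets/day, {hours:.1f} h of feeding, {share:.0%} of the daily rating")
```

In other words, the whole 90-day workload amounts to well under one working week of continuous feeding at the rated speed.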
deklerkt
Posts: 11
Joined: 26 Aug 2012, 06:19
E-book readers owned: BE Book White, iPad
Number of books owned: 2000
Country: Netherlands

ScanSnap S1500 procedure

Post by deklerkt »

I've used a Fujitsu ScanSnap S1500 duplex scanner that accepts approximately 50 sheets of paper, scans both sides, and outputs 150, 300 or 600 dpi JPG images in b/w or color, or collects them in a single PDF document. The sheet feeder can be kept full by adding new sheets while it works. The scanner automatically adjusts to the size of the scanned page, so no cropping is needed afterwards.
If the loose sheets occasionally have some left-over glue, this may smudge the scanning window strips (it only shows on photographs, not on printed text), but these glass windows are easily cleaned using a little window cleaner on a tissue, or with the tissues you use to clean your glasses. All parts of the scanner are easily accessible, which is also handy when a page gets stuck somehow.

I use JPG output if I'm scanning yellowed pages of old paperbacks. Using Photoshop in batch mode I can then apply some levels and contrast settings to make the pages virgin white with ink-black sentences.
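
For anyone curious what a levels adjustment does to the pixels, here is a minimal pure-Python sketch of a linear levels stretch on 8-bit grayscale values. It is not Photoshop's implementation, and the black and white points (60 and 200) are made-up values; in practice you would pick them from the histogram of your own scans.

```python
def apply_levels(pixels, black_point, white_point):
    """Stretch 8-bit gray values so black_point maps to 0 and white_point to 255."""
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    span = white_point - black_point
    result = []
    for value in pixels:
        stretched = round((value - black_point) * 255 / span)
        result.append(max(0, min(255, stretched)))  # clamp to the 8-bit range
    return result

# One row of a yellowed page: dark ink (40, 60), mid gray (128), paper (200, 215).
row = [40, 60, 128, 200, 215]
print(apply_levels(row, black_point=60, white_point=200))
# -> [0, 0, 124, 255, 255]
```

Everything at or above the white point becomes pure white, which is what removes the yellowed paper tone, and everything at or below the black point becomes pure black.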

Whether JPG or PDF output, next the file(s) go into Abbyy Finereader V11 for final processing. It allows me to add a cover (if it did not fit the ScanSnap scanner) and perform text recognition. Finereader automatically takes care of headers and footers (but not always correctly) and a "verify" cycle quickly uncovers many but not all misrecognized characters.
If you scanned a signature (magazine folded and with staple) obviously the scans contain pages out of order and 2 pages in a single scan, but Finereader allows you to split pages and re-order them the way they should.
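
To see why the scans come out in that order, here is a small Python sketch of the page layout of a saddle-stitched signature. It is only an illustration of the arithmetic, not what Finereader does internally, and it assumes left-to-right reading, sheets fed outermost first, and the outside face of each sheet scanned before the inside face.

```python
def signature_layout(total_pages):
    """List the (left, right) page numbers on each scan of a stapled signature."""
    if total_pages <= 0 or total_pages % 4:
        raise ValueError("a folded signature has a multiple of 4 pages")
    scans = []
    for sheet in range(total_pages // 4):          # 0 = outermost sheet
        outside = (total_pages - 2 * sheet, 2 * sheet + 1)
        inside = (2 * sheet + 2, total_pages - 2 * sheet - 1)
        scans.extend([outside, inside])
    return scans

def reading_order(total_pages):
    """For pages 1..N, give (scan index, half) telling where each page sits."""
    location = {}
    for index, (left, right) in enumerate(signature_layout(total_pages)):
        location[left] = (index, "left")
        location[right] = (index, "right")
    return [location[page] for page in range(1, total_pages + 1)]

print(signature_layout(8))   # -> [(8, 1), (2, 7), (6, 3), (4, 5)]
print(reading_order(8)[:3])  # -> [(0, 'right'), (1, 'left'), (2, 'right')]
```

So for an 8-page signature, page 1 is the right half of the first scan and page 8 is its left half, which is the split and re-order you do by hand in Finereader.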

As the final result, I output to both PDF (keeping the page image layout, with the recognized text below the image) and ePub format for books that are suitable for an eReader. Those are mostly plain-text books with no illustrations and certainly no mathematical formulas or fancy fonts.

The ScanSnap comes with a version of Acrobat that lets you enhance the PDF with bookmarks (for the table of contents) and links (for linking a page with its table of contents entry). It also offers automated OCR, but you never get to see the recognized text, so you never know how good it really is (it is good, as long as no fancy fonts or formulas are found). I find Acrobat very useful because it lets you build embedded indexes in a PDF file and also an index catalog spanning many books. That is very useful for magazines and multi-book texts, where you can now quickly search for any word in any document. Acrobat does this for PDF files produced by Finereader too; you then skip Acrobat's OCR phase.
The enhanced PDF works fine on an iPad, but the catalog spanning multiple files does not. That requires a proper Acrobat Reader on a PC.

The few paperbacks I do want to keep can easily be glued back together, either with some whitish hobby glue that you smear over the gutter side of the book pages and the spine of the cover and let set for a day or so, or with some thermobinding.
To be able to do this, you need to "destroy" the book neatly, not with a circular saw. It works fine for me to cut the book pages out of the cover and then gently pull a set of pages loose, as you would with a notepad that is "perfect bound" with glue. If the book has signatures and is stitched, simply cut the stitches with a sharp hobby knife and pull the pages loose. The stack of pages then goes under a guillotine cutter, about 20 pages at a time. This makes a nice clean cut that allows new glue to sink in if you decide to remake the book. The result is not entirely equivalent to the original book, but it is as close as you get if you need to "destroy" it.
mellow-yellow
Posts: 46
Joined: 28 Jun 2010, 13:33
Number of books owned: 1
Country: USA
Location: Portland, OR, USA

Re: Sheet Fed Scanner for Destructive Scanning

Post by mellow-yellow »

Here is the YouTube video: http://www.youtube.com/watch?feature=fv ... 4kIak&NR=1

I recently scanned 158 books using the ScanSnap S1500, after Minuteman Press, a print shop chain with a franchise in Portland, OR, removed the bindings with their ream cutter for under $2 per book. As Daniel says, if you CAN destroy the book, do it. You eliminate ALL KINDS of issues that book scanners introduce:

1. uneven lighting
2. unusual DPI ratings
3. Scan Tailor mistakes (including shots of your rig that should have been cropped out)
4. slowness due to CHDK install, camera setup, transfer of images, manual tweaking, etc.
5. expense, since a ScanSnap at $500 is, in many cases, cheaper than the rigs on this forum with cameras + ABBYY Finereader
6. space and weight, since the ScanSnap is about the size of a shoebox

I scan to B&W or possibly color, but always at 300 DPI with high compression, then run Acrobat X's ClearScan (obviating ABBYY Finereader Pro).
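
As a rough guide to what 300 DPI means in pixels, here is a minimal Python sketch. The 4.25 x 7 inch page is an assumed mass-market paperback size, not something measured from my books, and the 400 DPI line is included only for comparison.

```python
def page_pixels(width_in, height_in, dpi):
    """Return (width_px, height_px, megapixels) for one scanned page side."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    megapixels = width_px * height_px / 1_000_000
    return width_px, height_px, megapixels

# Assumed page size: 4.25 x 7 inches.
for dpi in (300, 400):
    w, h, mp = page_pixels(4.25, 7.0, dpi)
    print(f"{dpi} DPI: {w} x {h} px, {mp:.1f} MP per page side")
```

Going from 300 to 400 DPI multiplies the pixel count by about 1.8, which is worth weighing against any gain in OCR quality.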

Downsides to the ScanSnap S1500:
1. destroying your book (although the print shop can spiral bind them together again)
2. correcting overlapping pages
3. loading paper manually at intervals