book scanning...

bokks
Posts: 10
Joined: 22 Oct 2019, 00:23
E-book readers owned: Kindle
Number of books owned: 1000
Country: Australia

book scanning...

Post by bokks »

Hi All,

I am planning to digitize (scan) my own collection of around 10,000 books. About 20% of the books are 50-100 years old and the rest are 1-50 years old. Most are in good condition, but the pages of the older books have turned brownish/yellow with age. Since this is a personal project, I am not planning to invest in any of the commercial book scanners on the market, as they are very expensive. Around 60% of the books can be scanned by laying them flat on a table and using either a less expensive Czur scanner or a high-quality DSLR camera, then post-processing the images with Scan Tailor. The remaining 40%, which are hard to open fully, can be scanned with a DIY V-shaped book scanner built from one of the many designs available in this forum. Since I don't have any previous experience with large-scale scanning work, I would ask experienced members to share their thoughts and suggestions.

Thanks,
Bokks
BillGill
Posts: 139
Joined: 18 Dec 2016, 17:13
E-book readers owned: Calibre, FBReader
Number of books owned: 7000
Country: USA

Re: book scanning...

Post by BillGill »

Since you are interested in how to scan books, I thought this might interest you. I have been doing just what you want to do for several years now, and this is my story.
To start out, I should explain that I have been scanning mostly fiction, not technical manuals or textbooks, which have lots of illustrations. My setup works just fine for my purposes. This is about my scanner. What you do with the scans you get from this setup will depend on what you want to do with the final eBooks.
I started out by building a standard scanner on the basic principles shown on the home page of the DIY Book Scanning site.
OldScanner.JPG
This worked pretty well, but it was cumbersome and I had problems with some of my books, which have almost invisible gutter margins (the margin on the inside edge of the book). So I tried a different approach based on a style described in a post on this DIY Book Scanning site.
TeeterTotter.JPG
I called this one a teeter-totter scanner. This was simpler, but was still rather bulky and complex. It did do better on the gutter margin problem.
Then I looked at it and started trying to simplify the design. That led to what I now call my KISS (Keep It Simple Stupid) scanner.
NewScanner (3).JPG
This is about as simple as I could get it. At the top is the platen. Below this is the camera. It is positioned so that its field of view is centered on the platen and just covers the platen's width.
Below the camera is a mirror which I use to make sure that the page of the book is properly registered in the camera screen.
In use I open the book to the first page I want to scan. I place that page face down on the platen and hold it flat with my hand. I press the shutter button to take the picture. Then I pick the book up and slide it across the platen so that the opposite page is on the platen and repeat. Then I turn the page and keep this up until I finish taking the pictures.
When I finish scanning the book I move the scans to my computer to be converted to the final form. At this point I do one extra thing. I flip through the pictures on the computer and check the page numbers to make sure I didn’t skip any pages. I frequently find that I did.
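The skipped-page check described above could also be done on a list of page numbers rather than by eye. The following is only a minimal sketch: it assumes the printed page numbers have already been noted down (by hand or from OCR), and the function name and the sample list are hypothetical.

```python
def find_missing_pages(page_numbers):
    """Return the page numbers absent between the lowest and highest seen."""
    seen = set(page_numbers)
    if not seen:
        return []
    # Walk the full expected range and keep whatever was never photographed.
    return [p for p in range(min(seen), max(seen) + 1) if p not in seen]


# Hypothetical example: page numbers read off the scans, in shooting order.
scanned = [1, 2, 3, 5, 6, 9]
print(find_missing_pages(scanned))  # prints [4, 7, 8]
```

Unnumbered pages (plates, blank versos) would show up as false gaps, so the result is a list of places to look, not a verdict.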
I generally take about 30 to 45 minutes to scan a book up to about 350 pages using this system.
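For a rough sense of scale, the timing above can be turned into a rate and a total. This is only back-of-the-envelope arithmetic: the 10,000-book figure comes from the first post, and treating every book as a 350-page book is an assumption made purely for illustration.

```python
def pages_per_hour(pages, minutes):
    """Scanning rate implied by finishing `pages` in `minutes`."""
    return pages * 60 / minutes


def total_hours(num_books, minutes_per_book):
    """Total scanning time, ignoring setup, breaks and post-processing."""
    return num_books * minutes_per_book / 60


# 350 pages in 30 to 45 minutes, as quoted in the post.
print(f"{pages_per_hour(350, 45):.0f} to {pages_per_hour(350, 30):.0f} pages per hour")

# Assumed collection: 10,000 books, each taking as long as a 350-page book.
print(f"{total_hours(10_000, 30):.0f} to {total_hours(10_000, 45):.0f} hours in total")
```

Under those assumptions the capture step alone comes to thousands of hours, which is why the per-page handling time matters so much for a large collection.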
bokks
Posts: 10
Joined: 22 Oct 2019, 00:23
E-book readers owned: Kindle
Number of books owned: 1000
Country: Australia

Re: book scanning...

Post by bokks »

Thanks, BillGill, for the detailed response. I have a few questions for you. From your post I understand that you have been scanning for a few years, and I assume you have done hundreds of books. Doesn't it hurt your arms to keep moving the platen up and down for each page? In my case I am talking about thousands of books, so I plan to automate this effort. Also, the camera in your picture is a point-and-shoot. Are you getting pictures from it that are good enough for OCR processing? What software are you using for post-processing?
Konos93a
Posts: 200
Joined: 19 Sep 2016, 10:00
E-book readers owned: kobo aura,kindle 1,kindle pw3,pocketbook inkpad 2
Number of books owned: 3000
Country: greece

Re: book scanning...

Post by Konos93a »

https://www.youtube.com/watch?v=mR2TQOHEDYc max 1100 pages per hour
https://www.youtube.com/watch?v=vYIL-p9ET4k max 1800 pages per hour
https://www.youtube.com/watch?v=XCBiFAXXq80
https://www.youtube.com/watch?v=l_wxUJFEZLI
https://www.youtube.com/watch?v=Rkf49KPIPf0
https://www.youtube.com/watch?v=dtM5ljDN9so

Here are some guides and tutorials I made about it. Check them out, especially the first two.

Try scanning an ordinary book with your smartphone and a pane of glass in a wooden window frame before you build a two-camera scanner.

The biggest issue with scanning nowadays is the camera. Most people around here use Canon compacts running CHDK, which is a way to "hack" certain cameras, but you can't find those cameras anymore, even second-hand. So you have to use your smartphone or a DSLR camera.

I am trying to use the Raspberry Pi camera v3 to make something good that could replace the compacts in a DIY book scanner. I don't know if it will work, but on the Raspberry Pi forum they told me it will.
Lastly, use Library Genesis for your books.
BillGill
Posts: 139
Joined: 18 Dec 2016, 17:13
E-book readers owned: Calibre, FBReader
Number of books owned: 7000
Country: USA

Re: book scanning...

Post by BillGill »

You wanted to know if it would hurt my arms moving the platen up and down for a whole book. I never had a problem when I was using either the traditional or the teeter-totter scanners. Of course with the KISS scanner I don't move the platen, all I move is the book. It doesn't bother my arms, all it bothers is my back. Sitting up straight for 30 to 45 minutes causes it to ache. And that is more my age than the scanner.

The point-and-shoot camera is quite good enough. Resolution is the important factor in the choice of a camera. Almost any camera with a resolution greater than 1.5 megapixels will be good enough. Most modern point-and-shoot cameras have a resolution of over 2 megapixels. Once the resolution is good enough, the processing of the images is affected more by the quality of the page than by the resolution of the camera. The camera in most phones is plenty good enough. I used my 'good' camera once when I had a problem with the point-and-shoot camera and didn't get significantly better results.
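One rough way to see what a given camera resolution means on the page is to convert it to dots per inch. The sketch below assumes the page exactly fills the frame with the sensor's long side along the page height. The pixel counts and the 6 x 9 inch page are illustrative values, not measurements from this thread.

```python
def megapixels(px_long, px_short):
    """Sensor resolution in megapixels."""
    return px_long * px_short / 1_000_000


def effective_dpi(px_long, px_short, page_height_in, page_width_in):
    """Pixels per inch on the page, limited by the tighter of the two axes."""
    return min(px_long / page_height_in, px_short / page_width_in)


# Illustrative sensors photographing an assumed 6 x 9 inch page.
for px_long, px_short in [(1600, 1200), (4000, 3000)]:
    mp = megapixels(px_long, px_short)
    dpi = effective_dpi(px_long, px_short, page_height_in=9, page_width_in=6)
    print(f"{mp:.1f} MP -> about {dpi:.0f} DPI")
```

In practice the page rarely fills the whole frame, so the real figure will be somewhat lower than this estimate.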

I am using ABBYY FineReader 14 for OCR. It works quite well. However, version 14 is no longer available and the latest version is sold only on a subscription basis. There are a number of free OCR products, but most of them are slower.

After the OCR process I use Word 365 to correct the text. You don't need to buy Microsoft Office, there are a couple of free office suites that are almost as good. This is actually the long pole in the creation of an eBook. There are always a lot of errors in the text, because no OCR software will get it perfect. The number of errors varies with the quality of the book being scanned.

After that I load the text document into Calibre. https://calibre-ebook.com/ This is a program described as eBook management software, however it has capabilities far beyond simple management. It can be used to convert a text document into many other formats, and can be used to edit the books if they are in .epub or .azw3 format.
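The conversion step can also be scripted, since Calibre ships a command-line tool called ebook-convert. This is a minimal sketch and not necessarily how the conversion is done in the workflow described here: the file names and the title are hypothetical, and it assumes Calibre is installed with ebook-convert on the PATH.

```python
import shutil
import subprocess


def build_convert_command(source, target, title=None, authors=None):
    """Build an ebook-convert command line; the output format follows the target extension."""
    cmd = ["ebook-convert", source, target]
    if title:
        cmd += ["--title", title]
    if authors:
        cmd += ["--authors", authors]
    return cmd


def convert(source, target, **metadata):
    """Run the conversion, failing early if Calibre's CLI is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed and on PATH?")
    subprocess.run(build_convert_command(source, target, **metadata), check=True)


# Hypothetical file names; this only prints the command, it does not run it.
print(build_convert_command("book.docx", "book.epub", title="My Book"))
```

Calling convert("book.docx", "book.epub") would run the same command for a single corrected document, and looping over a folder would handle a batch.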

Hope this helps.
Bill
TDavLinguist
Posts: 5
Joined: 10 Jan 2024, 11:21
E-book readers owned: Kindle Fire, Galaxy Tab
Number of books owned: 800
Country: Thailand
Contact:

Re: book scanning...

Post by TDavLinguist »

It's been a long time since I've used a DIY book scanner and I'm getting ready to get back into it. However, your post inspired me to dig up some files of the first book I scanned. It was a rare book (at least to an American with little access to Burmese books at the time) on loan from a local Burmese monastery. A grammar of Pāli in Burmese. I scanned it and uploaded it to Internet Archive and it was promptly removed for violating Burmese copyright law (lol). The scanner I used was the simple Cardboard Box scanner that you can find on Instructables. Here are some photos:
The raw capture:
Raw capture
The ScanTailor output (TIFF):
100_1699.tif
The ScanTailor output
(66.46 KiB) Not downloaded yet
For the box, I used a USPS mailing box and cut and scored it as per the instructions. I bought the piece of glass at Lowe's Home Improvement and used painter's tape to protect myself (and the books) from the rough edges.

Knowing what I know now, I probably could have done a better job with the capture, but I'm satisfied with the results.
Good luck to you!
cday
Posts: 456
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Re: book scanning...

Post by cday »

@TDavLinguist: I have downloaded your TIFF file attachment. Good result (300 DPI, black-and-white image with LZW compression), but for some unexplained reason the download count still shows that the file hasn't been downloaded yet?