Help...novice scanner using iOCHOW camera scanner


cday
Posts: 451
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Re: Help...novice scanner using iOCHOW camera scanner

Post by cday »

The VFlat (I presume) page images in the PDF file have very good quality text. Zooming in, it looks as if the bitmap image captured by the smartphone may have been vectorised, so that characters scale without becoming pixelated. There is no sign of fingers. It is a pity that some pages aren't flattened better, but possibly that is something you could work on.

I don't have a recent version of Abbyy Finereader, but recall that the earlier version I have has an 'optimise' page option. I wonder if the version you have has a similar option, and whether that operation has been applied before the recognition stage? Depending on how far the software has progressed, that operation might reduce the curvature in the affected images and so improve recognition accuracy. Otherwise, personally I don't relate well to 'ribbon' interfaces, and Finereader has always had its quirks where some results seem rather inexplicable!

I'll try to test in my (now fairly old) version of Adobe Acrobat tomorrow, but wouldn't expect it to cope well with images with significant distortion, and if the images are indeed vectorised text it might possibly not accept them as input.
BruceG
Posts: 99
Joined: 14 May 2014, 23:17
Number of books owned: 500
Country: Australia

Re: Help...novice scanner using iOCHOW camera scanner

Post by BruceG »

The easy way to fix things would be to do one page at a time, even if it means the pages are upside down. The right hand pages are fine; the left hand (even) pages are the problem. The extra time spent getting a good image saves far more time in editing.
You have not said what editing has been done. The attached Zip file contains an Abbyy Finereader Project file, plus Epub, Docx and PDF (Text on Top) files.
The only editing has been on the even page numbers, using Image Editor - Straighten Text Lines.
Attachments
DIY Testing.7z
(69.03 MiB) Downloaded 117 times
BruceG
Posts: 99
Joined: 14 May 2014, 23:17
Number of books owned: 500
Country: Australia

Re: Help...novice scanner using iOCHOW camera scanner

Post by BruceG »

It may be worth looking at a program called YASW. I think David Landin did a video on using it. Keystoning, which the even pages suffer from, is one of the things it corrects. The YASW DIY output file is the result of keystone correction and cropping to make a rectangle. The 2 Abbyy files used the YASW output file as a source and applied Straighten Text Lines.
Although the pages can be fixed a little in software, doing for the even pages what you did for the odd pages will produce better results far more quickly.
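
As a rough illustration of what keystone correction involves (this is not YASW's actual code, only a minimal sketch in plain Python), the four corners of the photographed page can be mapped onto the corners of a rectangle with a perspective transform. The function names and the corner coordinates below are hypothetical and chosen only for the example.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corner points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 8 perspective coefficients mapping four src points to four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (a*x + b*y + c) / (g*x + k*y + 1), rearranged to be linear in a..k
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        # v = (d*x + e*y + f) / (g*x + k*y + 1)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def apply_homography(h, x, y):
    """Map one point (x, y) through the perspective transform h."""
    a, b, c, d, e, f, g, k = h
    w = g * x + k * y + 1.0
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)


# Hypothetical corners of a keystoned page in the photo, clockwise from top-left
page_corners = [(120, 80), (900, 40), (940, 1300), (100, 1240)]
# The rectangle we want the page to become (800 x 1200 pixels)
rectangle = [(0, 0), (800, 0), (800, 1200), (0, 1200)]

h = homography(page_corners, rectangle)
mapped = [apply_homography(h, x, y) for x, y in page_corners]
```

A real tool would then resample every pixel, usually by computing the transform in the opposite direction (rectangle to photo) and looking up the source pixel for each output pixel.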
Attachments
YASW DIY output.pdf
(12.08 MiB) Downloaded 114 times
YASW DIY output Abbyy.pdf
(304.03 KiB) Downloaded 111 times
YASW DIY output Abbyy.docx
(20.76 KiB) Downloaded 104 times
cday
Posts: 451
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Re: Help...novice scanner using iOCHOW camera scanner

Post by cday »

In addition to BruceG's useful suggestions above:

VFlat has an option to output a PDF file containing searchable text, so if you haven't already committed to buying Abbyy Finereader you might be able to manage without it. The idea would be to display the searchable PDF file in a PDF viewer in one window, where you would expect to see an accurate image of the page, and in a second window alongside display the actual searchable text from the PDF file, in software that enables it to be edited and then saved as an ePUB file. That would be easier with a reasonably large screen, and the details of how the editing operation might be performed can perhaps be deferred until you have decided whether to buy Finereader.

As BruceG says, clearly life would be much easier if you are able to start with good quality images. I don't know anything about the algorithm VFlat uses for curve flattening, but whether it uses a single assessment of the amount of curvature or different assessments for alternative zones, positioning the open book so that curvature is as far as possible even should be beneficial. I have seen suggestions on the forum for quite simple aids that people have used to assist in flattening an open book, one of them I think recently, but can't point you to one.
glenleslie
Posts: 30
Joined: 13 Aug 2012, 09:08
E-book readers owned: Kindle - multiple platforms
Number of books owned: 1000
Country: United States

Re: Help...novice scanner using iOCHOW camera scanner

Post by glenleslie »

Abbyy Finereader really should help you with your challenges. I manually scanned a 125 page hard-to-find book two years ago using my Brother printer at 300 DPI gray scale (it has a scan button which can FTP the resulting scan as a JPEG to an FTP server... for a 200 page book, it's doable). The pages were 2 page scans (2-up) - you can see the originals in the attached (jpgs are the original pages). I didn't have as much problem with page curving because the printer/scanner has a lid to squash the book down. The problem that introduced was page skewing as I could not easily tell if I had mashed the page completely flat. The early pages skewed pretty badly as you can see in my samples.

Abbyy 12 Finereader Pro (which is now quite old; I have a licensed version, so maybe the trial version doesn't do this stuff?) automatically recognized the 2-up pages from a directory of JPEG files, split them, auto-rotated them to portrait, straightened the page skew, and straightened the lines at the center of each JPG where the page divide introduced a little bit of curve on the pages. I spent about 2 hours editing pages further, deleting junk on the edges and manually correcting some words which it really missed. Separator pages and different fonts in Chapter/Section headings threw it off quite a bit for OCR.
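
For anyone curious what splitting a 2-up scan involves, here is a minimal sketch in plain Python. It is not how Abbyy does it (I don't know its method). It simply assumes the gutter shows up as the darkest column near the middle of a grayscale scan, which is typical of flatbed scans but not guaranteed. The function names, the `search_fraction` parameter and the tiny test image are all made up for the example.

```python
def find_gutter(image, search_fraction=0.2):
    """Return the index of the darkest column within a band around the centre.

    image is a list of rows, each row a list of grey values
    from 0 (black) to 255 (white).
    """
    width = len(image[0])
    half_band = int(width * search_fraction / 2)
    centre = width // 2
    first = max(0, centre - half_band)
    last = min(width, centre + half_band + 1)
    # The column with the smallest total brightness is taken to be the gutter shadow
    return min(range(first, last), key=lambda x: sum(row[x] for row in image))


def split_two_up(image):
    """Split a 2-up scan into (left_page, right_page) at the detected gutter."""
    gutter = find_gutter(image)
    left = [row[:gutter] for row in image]
    right = [row[gutter:] for row in image]
    return left, right


# Tiny synthetic "scan": 4 rows x 10 columns of white paper
scan = [[255] * 10 for _ in range(4)]
for row in scan:
    row[1] = 0      # dark text near the left edge, outside the search band
    row[6] = 100    # gutter shadow slightly right of centre

gutter_column = find_gutter(scan)
left_page, right_page = split_two_up(scan)
```

Restricting the search to a band around the centre keeps dark text or page edges elsewhere on the scan from being mistaken for the gutter.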

So... bottom line is that even with low graphic quality, Abbyy did a good job. v12 is missing two key features for this sort of work:

1. Page edge deletion just doesn't seem to work -- that was the bulk of what I had to delete myself. I gave up and just left it messy on a lot of pages in my final document.
2. The pages do not auto-center in split page mode... once it splits pages, you're in some sort of IIWII (ee-wee)... It Is What It Is mode regarding page alignment of the split pages. Someone here on the forum checked into it for me in 2021 with a newer version of FR and didn't see any newer options, although they didn't say which version of FR they were referring to.

I think Abbyy autozoomed the pages to a pre-defined page size. I messed with Scantailor with the same book so I may be mis-remembering that bit.

I'm super interested in vFlat as one of your earlier commenters indicated: if you can reduce the problems with the original scan, then it's much less of a job when you import all the graphics into Abbyy (or Scantailor if you want to go that route). If vFlat really could reduce page curve then it would be relatively easy to find a camera stand and just use a phone with a high MP camera to do a much better job than my Brother printer's scanner feature.
test.pdf
(123.42 KiB) Downloaded 112 times
InventingFlatEarth1 (9).jpg
InventingFlatEarth1 (8).jpg
BruceG
Posts: 99
Joined: 14 May 2014, 23:17
Number of books owned: 500
Country: Australia

Re: Help...novice scanner using iOCHOW camera scanner

Post by BruceG »

Trying to capture 2 pages at a time is difficult. That is why people have designed many different scanners to reduce the problems. Problems are more than halved by capturing one page at a time. One would think using a flatbed scanner would do away with these problems. As one can see in the sample provided, this is not true. The biggest problem is the binding of the book: it is not the same from top to bottom of a page, let alone consistent through the whole book. In theory, lines should extend straight across both pages when the book is opened up, but this does not often happen. Abbyy does not like this situation, as it tries to straighten the pages before doing OCR. When reading, our eyes make all the adjustments for us on the go.
Although Abbyy can auto split pages, I usually split 2 page scans manually.
My version of Abbyy Finereader is v15. I am happy to do any testing for anyone.
I am not sure what 'Page edge deletion' is. If it means removing everything past the text, e.g. the edges of other pages or the cover, I have not looked for it and do not see how to do it automatically. If I wanted to do it, I would crop.
The trouble with auto splitting is that you cannot undo it, other than by starting again and splitting manually.
aku
Posts: 54
Joined: 02 Jan 2010, 08:38
Number of books owned: 0
Country: Germany
Location: Willich, Germany

Re: Help...novice scanner using iOCHOW camera scanner

Post by aku »

BruceG wrote: 04 Feb 2022, 22:54 It may be worth looking at a program called YASW.
See https://sourceforge.net/projects/yascanw/
glenleslie
Posts: 30
Joined: 13 Aug 2012, 09:08
E-book readers owned: Kindle - multiple platforms
Number of books owned: 1000
Country: United States

Re: Help...novice scanner using iOCHOW camera scanner

Post by glenleslie »

BruceG wrote: 05 Feb 2022, 23:29 One would think using a flat bed scanner would do away with these problems. As one can see in the sample provided, this is not true.
Agreed, this is why Daniel Reetz chose the V-shaped cradle: to avoid this exact problem, all the way back to the Dumpster Dive Scanner.
GemmaGem
Posts: 9
Joined: 04 Jan 2022, 21:25
Number of books owned: 300
Country: United States

Re: Help...novice scanner using iOCHOW camera scanner

Post by GemmaGem »

I'm not sure if it was just because it was a free trial, but Abbyy was limiting how many pages I could OCR. That makes me nervous, because OCR is one of the reasons I needed it. Trying to get clarification from the website did not prove that helpful, because they only answered one part of my question and spent the whole email explaining the pricing options. I think I need to go back to the beginning with the scanner and with VFlat. My main issue with both was getting the pages to flatten enough to get a good photo. Either I have too much curve showing or my fingers cover too much of the page. I don't seem to be able to find an in-between. My Pixel does take nice photos and the stand I have works well, so I think I just need to get more creative with the book positioning to get the right angle, which is hard to do with just the tips of my fingers. The press glass might help, but I'm afraid it might create too much glare. Like I said before, every time I fix one problem another one seems to pop up. I'm at the point where I'm really reluctant to take a book apart just to go digital. I had really hoped to donate the books when I was done with them.
glenleslie
Posts: 30
Joined: 13 Aug 2012, 09:08
E-book readers owned: Kindle - multiple platforms
Number of books owned: 1000
Country: United States

Re: Help...novice scanner using iOCHOW camera scanner

Post by glenleslie »

GemmaGem wrote: 07 Feb 2022, 20:08 I'm not sure if it was just because it was a free trial but Abbyy was limiting me on how many pages I can OCR.
This is definitely a trial issue. My FineReader Pro 12 is only limited by system resources... but even a several hundred page book is handled fine on a very old quad-core AMD system with 24GB RAM.
GemmaGem wrote: 07 Feb 2022, 20:08 I'm at the point where I'm really reluctant to take a book apart, just to go digital. I had really hoped to donate the books when I was done with them.
The reason Daniel Reetz invented the first DIY scanner machine was to avoid "destructive scanning" (cutting the binding off a book and scanning the loose pages). It's the first Q in the FAQ: https://diybookscanner.org/en/faq.html