Examples from my postprocessor software

Share your software workflow. Write up your tips and tricks on how to scan, digitize, OCR, and bind ebooks.

Moderator: peterZ


Examples from my postprocessor software

Post by rob »

I don't have a name for it... I guess "Rob's Book Postprocessing Software" will do for now. The software is written in Java and makes heavy use of the Leptonica C library.

First, we have the original image straight from the camera: 285 ppi, 24 bpp as shot. The sample page below has been reduced to fit within the forum's file-size restrictions. I derived the ppi by measuring the size of the book and then measuring its image.
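As a worked example of that derivation, with invented numbers: if the page measures 6.0 inches across and that same span covers 1710 pixels in the photo, the resolution is 1710 / 6.0 = 285 ppi.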
[Attachment: IMG_0020_small.jpg (original image from camera)]
Note that it seems dark. In actuality, there is 400 W of halogen light streaming onto the page. It's bright: when I stop using my scanner, I have to let my eyes adjust for a minute or so before I can see again! The camera is set to ISO 100, and it selects a shutter speed of 1/200 s and an aperture of f/10. Despite that, the image still looks dark; the highest pixel level of any significance is 189 (on a scale from 0 to 255). In any case, this is the image we have to work with.

The software requires several important features in the input image:

1. A dark area surrounds the top, bottom, and edge of the page.
2. The spine (i.e. the join in the middle of the platen) appears in the image.
3. The other page also appears in the image. Reflections on the other page are OK.

I have to tell the software that this is a left-hand page so that it knows which end is up. The software then (a code sketch follows the list):

1. rotates the page to the proper orientation,
2. converts to gray,
3. generates a histogram,
4. computes a threshold (based on the Otsu method),
5. thresholds the image,
6. deskews the image (based on the Postl method),
7. dekeystones the image (based on making all text lines horizontal).
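Rob's actual code is Java calling Leptonica; purely as illustration, here is roughly what steps 2 through 6 look like using the Leptonica C API directly. The file names and parameter values are mine, and the dekeystone step is his own algorithm with no stock Leptonica equivalent, so it is only stubbed:

```c
/* Illustrative sketch only, not rob's actual (Java) code. */
#include "allheaders.h"   /* Leptonica */

int main(void)
{
    PIX *pixs, *pixg, *pixb = NULL, *pixd;

    pixs = pixRead("IMG_0020.jpg");    /* 24 bpp image from the camera */

    /* 2. convert to gray */
    pixg = pixConvertRGBToGray(pixs, 0.30f, 0.59f, 0.11f);

    /* 3-5. histogram, Otsu threshold, binarize; one full-image tile
     * makes the adaptive version behave as a global Otsu threshold */
    pixOtsuAdaptiveThreshold(pixg, pixGetWidth(pixg), pixGetHeight(pixg),
                             0, 0, 0.0f, NULL, &pixb);

    /* 6. deskew; Leptonica's skew finder scores row sums, the same
     * idea as the Postl method mentioned above */
    pixd = pixDeskew(pixb, 0);

    /* 7. dekeystone (making all text lines horizontal) is rob's own
     * step; no single stock Leptonica call does it */

    pixWrite("IMG_0020w.png", pixd, IFF_PNG);
    pixDestroy(&pixs); pixDestroy(&pixg);
    pixDestroy(&pixb); pixDestroy(&pixd);
    return 0;
}
```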

Here is the result on the sample page:
[Attachment: IMG_0020w_small.png (deskewed, dekeystoned binary image)]
The next step is to denoise and "blockify" the text. I do this with three morphological operations (sketched in code after the list):

1. Closure, using a sel of 25x1
2. Erode, using a sel of 3x1
3. Erode, using a sel of 3x3
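In Leptonica these three operations map directly onto brick Sels. A minimal sketch, with the sel sizes taken from the list above; the width-before-height order is my assumption, and pixb stands for the 1 bpp deskewed page:

```c
/* Denoise/"blockify" as Leptonica brick morphology; illustrative only. */
PIX *pix1 = pixCloseBrick(NULL, pixb, 25, 1);  /* 1. closure, 25x1 sel */
PIX *pix2 = pixErodeBrick(NULL, pix1, 3, 1);   /* 2. erode,   3x1 sel */
PIX *pix3 = pixErodeBrick(NULL, pix2, 3, 3);   /* 3. erode,   3x3 sel */
```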

Here is the result:
[Attachment: IMG_0020wd_small.png (blockified image)]
Note that a lot of the white specks in the black background were removed.

The next step is to find the page outline in this image by locating the spine and the outer edge. For each pixel column, we find the top and bottom of its white area. Using that profile, we start in the black area and find the "cliff" at the outer edge of the page, then march along the flat of the page until we reach the next "hill", which we declare to be the spine.

To determine the top and bottom of the page, we add up the white pixels in each pixel row. Then, from the top and bottom of the image, we march down and up, respectively, until we find a row in which more than 50% of the pixels are white; those rows become the top and bottom of the page.
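Here is a sketch of that row-marching step using Leptonica's per-row pixel counts. The 50% rule is from the text above; the function and variable names are my own illustration:

```c
/* Find page top/bottom: march inward from each end until a row is
 * more than 50% white.  In a 1 bpp Leptonica PIX the ON pixels are
 * black, so the white count is the width minus the ON count per row. */
#include "allheaders.h"

void findTopBottom(PIX *pixb, l_int32 *ptop, l_int32 *pbot)
{
    l_int32 w = pixGetWidth(pixb);
    l_int32 h = pixGetHeight(pixb);
    NUMA   *na = pixCountPixelsByRow(pixb, NULL);  /* ON (black) per row */
    l_int32 row, black;

    for (row = 0; row < h; row++) {                /* march down */
        numaGetIValue(na, row, &black);
        if (w - black > w / 2) break;              /* >50% white */
    }
    *ptop = row;

    for (row = h - 1; row >= 0; row--) {           /* march up */
        numaGetIValue(na, row, &black);
        if (w - black > w / 2) break;
    }
    *pbot = row;
    numaDestroy(&na);
}
```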

Knowing the edges of the page, we can apply the measured deskew and dekeystone to the original image, and clip to the page limits. Here is what a closeup view of this looks like:
[Attachment: cIMG_0020_small.png (zoom of corrected image)]
It's not too bad. Notice the particularly bad vertical fracturing in the letter 'e' in 'ranked' in the first line, and the awful horizontal fracturing through the third line.

The software then binarizes this by upscaling by a factor of 4 and thresholding via Otsu again. Specks are removed by morphological closing with a sel of 3x3. The image is kept at this upscaled resolution because I found it looks much better both viewed and printed this way. Here is a part of the resulting image:
[Attachment: bIMG_0020_small.png (zoom of binarized, upscaled corrected image)]
While there are still fractures, they are barely noticeable due to the increased ppi. In addition, this image does the most justice to the original font used in the book. Keeping the original ppi (i.e. not upscaling) results in terrible quality.
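The final binarization pass, sketched the same way. Rob doesn't say which interpolation the upscale uses, so linear interpolation is assumed here; pixg stands for the clipped, corrected gray page:

```c
/* Final binarization: upscale 4x, Otsu threshold again, then close
 * with a 3x3 sel to remove specks.  Illustrative only. */
PIX *pixBinarizeUpscaled(PIX *pixg)
{
    PIX *pixup, *pixbin = NULL, *pixclean;

    pixup = pixScaleGrayLI(pixg, 4.0f, 4.0f);     /* 4x linear upscale */
    pixOtsuAdaptiveThreshold(pixup, pixGetWidth(pixup),
                             pixGetHeight(pixup), 0, 0, 0.0f,
                             NULL, &pixbin);
    pixclean = pixCloseBrick(NULL, pixbin, 3, 3); /* 3x3 speck removal */

    pixDestroy(&pixup);
    pixDestroy(&pixbin);
    return pixclean;   /* kept at 4x resolution for viewing/printing */
}
```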
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.

Re: Examples from my postprocessor software

Post by spamsickle »

The reason your original image is dark despite being blasted by halogens is that the automatic exposure function of your camera is calibrated to expect 18% grey. This is a useful compromise in most real-world shooting situations, and will produce exposures for most scenes which are neither too light nor too dark. Unfortunately, when shooting scenes which are naturally brighter than 18% grey (like snowdrifts, or black text on white pages), the exposure will typically be too dark.

In my setup, I'm using "aperture priority" on my cameras and setting the aperture to the highest number available. This gives me the most depth of field, which is important because my book cradle doesn't slide from side to side; the pages move a bit closer or a bit farther away as I work from the front to the back of the book. Going for the maximum depth of field allows me to use manual focus and not have to worry about refocusing the camera as I move through the book, or depend on autofocus and worry about pages with no text in the focus area. It's set it and forget it on the focus. Those using a sliding cradle that keeps each page the same distance from its camera could go shutter priority or full automatic on the exposure, because they won't need to worry about depth of field.
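To put rough numbers on the depth-of-field tradeoff, the usual hyperfocal approximation shows why a small aperture lets you focus once and forget it. The focal length, f-number, and circle of confusion below are invented for illustration, not taken from spamsickle's setup:

```latex
% Focusing at the hyperfocal distance H renders everything from H/2
% to infinity acceptably sharp.  f = focal length, N = f-number,
% c = circle of confusion (all values illustrative):
H \approx \frac{f^2}{N c} + f
  = \frac{(7\,\mathrm{mm})^2}{8 \times 0.005\,\mathrm{mm}} + 7\,\mathrm{mm}
  \approx 1.2\,\mathrm{m}
```

Doubling the f-number roughly halves H, which is the sense in which stopping down buys depth of field.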

To compensate for the 18% grey effect, I use the EV setting on my cameras, which lets me vary the exposure by a set amount. For my cameras, I find that +0.7 or +1.0 on the exposure gives me white pages without washing out the text. This might enable you to skip some of the processing you're doing, which looks to me like it's chipping away at your fonts. That may be intentional, and you may be getting better compression on your PDFs by generating that kind of high-contrast black-and-white image, but I'm not too worried about the sizes I end up with, as long as they'll fit on a DVD. Most of my PDFs currently come in at under a gigabyte, about the same size as the original collection of JPEGs. That's fine with me for now; at some point I'll probably look into making the PDFs more compact. Right now I'm still saving the original JPEGs (4 books per DVD) and making separate PDF DVDs (again, 4 books per DVD).
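For anyone wondering how much +0.7 or +1.0 EV actually changes the picture: exposure compensation is measured in stops, and each +1 stop doubles the light. A sketch with illustrative numbers, not spamsickle's actual settings:

```latex
% Each +1 EV of compensation doubles the exposure.  At a fixed
% aperture and ISO the camera gets there by halving the shutter speed:
+1\,\mathrm{EV}:\quad t = \tfrac{1}{200}\,\mathrm{s}
  \;\longrightarrow\; \tfrac{1}{100}\,\mathrm{s}
% +0.7 EV admits a factor of 2^{0.7} \approx 1.6 more light.
```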

Re: Examples from my postprocessor software

Post by jradi »

What cameras are you using? Those sound like DSLR features...

Re: Examples from my postprocessor software

Post by daniel_reetz »

With CHDK (http://chdk.wikia.com/), our compact Canons have all the features of SLRs (and sometimes more).

Spamsickle, while you're pretty much right about metering, I'd like to mention a few things here.

The reason the images are grey is that Rob's camera is set to auto-expose with a center-weighted average (which follows the "gray world assumption", similar but not quite identical to the grey card you mention; sorry for the super-lame pedantry). For our purposes this is less than optimal: it leads to a relatively gray image because it "tries" to move all the bits to the center of the histogram. It's better to set the exposure manually and push the white pixels up near the right of the on-screen histogram (but not let them clip). Crudely put, the reason is that the bits at the left of the histogram represent the dark (and therefore noisy) pixels, which are somewhat "worse" at representing our data than pixels at the right of the histogram... right up until things start clipping, which is Pixel Death.
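A quick way to see why the left end of the histogram is "worse": photon arrival is Poisson-distributed, so a pixel's signal-to-noise ratio grows with the square root of the light it collects:

```latex
% Shot noise is Poisson: for a signal of S photons the noise is
% sqrt(S), so
\mathrm{SNR} = \frac{S}{\sqrt{S}} = \sqrt{S}
% e.g. a pixel collecting 10000 photons has SNR 100, while a pixel
% collecting 100 photons has SNR 10: dark pixels are noisier.
```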

You might want to reconsider using the smallest possible aperture. I need to confirm this experimentally, but with all the cameras I have calibrated at work there's a point where a smaller aperture stops buying you depth of field and starts degrading the image ever so slightly (or at least doesn't improve things). For example, on my Tokina 12-24 lens on my Nikon D200, I gain nothing after F7.6, though I can go all the way to F22. At some point the system is limited by diffraction around the aperture entrance itself and not the glass elements anymore. Making the aperture smaller after that will just reduce the amount of light coming in. Guessing you already know this, but I thought it was worth a mention. And I guess if you're already getting good results it's just more lame pedantry, but it has the potential benefit of getting more out of your existing light.
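The diffraction limit described here can be estimated with the Airy disk diameter. A sketch with generic values (green light at f/10; not measured from any camera in this thread):

```latex
% Airy disk diameter for wavelength lambda at f-number N:
d \approx 2.44\,\lambda N
  = 2.44 \times 0.55\,\mu\mathrm{m} \times 10
  \approx 13\,\mu\mathrm{m}
% Once d spans several pixel pitches (compact-camera pixels are on
% the order of 2 um), stopping down further costs sharpness and light.
```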

Personally, I set everything but focus manually, and I'll set that manually too on the final scanner. It makes things fast and reliable.

Re: Examples from my postprocessor software

Post by daniel_reetz »

And thanks for coming over here -- really glad to have you!

Re: Examples from my postprocessor software

Post by spamsickle »

jradi wrote:What cameras are you using? Those sound like DSLR features...
One of my cameras is a DSLR, but the other is a cheap Casio. Even the cheap camera offers a choice of shutter or aperture priority metering, white balance, and an EV adjustment. I don't know what EV stands for; I assume the E is for exposure and the V is for value or something, but with white pages and black text, I get good results by setting it to +0.7 or +1.0. I assume this skews things somewhat on pages with illustrations rather than text, so Daniel's suggestion to set the exposure completely manually is probably better than trying to go automatic with a fudge factor. If I had a grey card, I'd use it. Come to think of it, I should probably print one, or cut the reflective vinyl off a grey notebook and see how close that comes. Or just stop by a camera store and buy one; they can't be more than a couple of bucks, can they?

Re: Examples from my postprocessor software

Post by spamsickle »

daniel_reetz wrote:You might want to reconsider using the smallest possible aperture. I need to confirm this experimentally, but with all the cameras I have calibrated at work there's a point where a smaller aperture stops buying you depth of field and starts degrading the image ever so slightly (or at least doesn't improve things). For example, on my Tokina 12-24 lens on my Nikon D200, I gain nothing after F7.6, though I can go all the way to F22. At some point the system is limited by diffraction around the aperture entrance itself and not the glass elements anymore. Making the aperture smaller after that will just reduce the amount of light coming in. Guessing you already know this, but I thought it was worth a mention. And I guess if you're already getting good results it's just more lame pedantry, but it has the potential benefit of getting more out of your existing light.
On my Casio, I have my choice of f/2.8 or f/4.0, so I'm not exactly doing pinhole photography. The DSLR has more range, but the images coming out of that camera seem a bit better than those from the pocket camera, so if the small aperture is introducing degradation, the better-quality camera is more than making up for it.

And, really, I only need a couple of inches of depth of field to keep things focused. I do my manual focusing in the center of the book, preferably on a page with a bold black line somewhere. I really miss my old SLR with its split-image viewfinder; trying to focus on text with a 2-inch LCD is nowhere near as effective as that was. I know you don't have that problem with your live monitor feed, but I'm still working with my "do it wrong quickly" setup, which doesn't have that feature.

The eventual solution will be to give up my static setup and get that rolling book cradle the rest of you have, but for now I'll work with what I have and think about what I'll have next. I'll hang out here and nick some more ideas from the other people who are playing with this, and some day I'll get around to adding the bells and whistles or building a new scanner.

Re: Examples from my postprocessor software

Post by daniel_reetz »

Ack, sorry. I forgot that you had different cameras. Yeah, f/4.0 is definitely a better bet. ;)

One thing they get right on the Canon compacts is that they give you a small section of the screen with 100% magnification, so you can do manual focusing pretty easily. Still not nearly as nice as a rangefinder.

I'm pretty excited about your "do it wrong quickly" setup. Knowing that it works well enough as-is is super cool. And as you saw from the Instructable, I'm all about being field-expedient and crude when the situation demands it. Great work.

Re: Examples from my postprocessor software

Post by daniel_reetz »

Rob, thanks for the pointer to Otsu's method. That's a pretty sweet method of thresholding.

To what degree is your processing resolution-independent? Don't you have to choose a radius for erosion and dilation to work?

Re: Examples from my postprocessor software

Post by spamsickle »

daniel_reetz wrote:One thing they get right on the Canon compacts is that they give you a small section of the screen with 100% magnification, so you can do manual focusing pretty easily. Still not nearly as nice as a rangefinder.
The Casio gives me a magnified area when I'm doing the manual focus. It might be 100%, I don't know, but it's not nearly enough for these old eyes. Having done a few books now, I kind of know where the pointer on the "distance slider" should be. The best way for me to make absolutely sure the focus is right is just to take a test picture. I can blow that image up several times in-camera, to the point that a single word fills the viewfinder. If it isn't quite right, I can switch back to camera mode, pop another step one way or the other on the focus, and check it again. You're right, it's not nearly as nice as a rangefinder, but since I don't have to finish the roll and wait for development to know if I got it right, all things considered, I'll take digital, shortcomings and all.