I decided to develop a new application for book scanning post-processing. It's not quite ready for a beta, but it's getting close.
What I wanted was a tool where I could define certain actions to be performed, like deskewing, correcting keystone distortion, and cropping, and have them automatically applied to the entire batch. I wanted a tool that could be run interactively to set up the job, with the option of then running the full processing without user intervention. So I decided to come up with a new tool, which I'm calling BSW (Book Scan Wizard): http://bookscanwizard.sourceforge.net
I am releasing it as open source, under the GPL license.
It's a bit of a different animal from Scan Tailor in that you define exactly what you want done to the pages. So while Scan Tailor will try to figure out the margins by examining the pages, with BSW you click on the image corners and add a crop operation.
This works on the premise that the book scanner keeps the pages more or less in the same position from one scan to the next, so that once the operations are defined (with separate configuration for the left and right pages), they can be applied to the remaining pages. The goal is to be able to set up the configuration of an entire book in less than 5 minutes, and to be able to set up a bunch of books and convert them all without any user intervention.

Features

Optionally works with separate left and right image folders:
This will match left and right images by their timestamps, but will use the last images from the two directories to sync the remainder. That way, even if the cameras do not have their clocks synchronized, it can still match up the images correctly. If an image doesn't have a matching image, it gets flagged and put at the end of the list. So if a camera doesn't fire, or if you take some test shots, it won't throw off the placement of the remaining images.

Rotating, fixing keystone distortion (perspective):
This is done by bringing up an image of the page and selecting 4 corners that should be straightened out to a rectangle. It can also crop the image using the same selected corners. Or, if just the rotation needs to be fixed, it can be specified by clicking two points that should be horizontal. Barrel or pincushion distortion can also be corrected.

Performs basic color corrections such as "auto levels", "levels", and gamma adjustments.

Converts to grayscale or black & white.

Calling external scripts:
If in the middle of the processing you want to call an ImageMagick script to do something fancy, that can be defined as part of the process. Or, if you know Java and want to add a new operation, it can be integrated easily into the program.

Define operations for certain pages:
For any operation, you can specify which pages the command should be run for. You can choose to do an operation on the left side pages, or on a certain page or range of pages. For example, if you want to make everything black and white with the exception of a few photo pages in the center of the book, it can be done. Or you can leave the cover in color, cropped less so that the whole cover is visible, while cropping the internal pages tighter. You can also indicate certain pages that shouldn't be included in the output.

Optionally estimates the source dpi by examining the focal length from the jpeg metadata:
The way this works is you take two pictures, one zoomed a bit out and another zoomed a bit in, then measure the dpi of those two images. Using that information, the program will interpolate to find the source dpi of other images. Assuming you keep the camera at roughly the same distance whenever you scan, and just change the zoom, you only have to do this step once. Accurately indicating the source dpi of an image will help with OCR tasks and will ensure that if you print from the scan, it will be about the same size as the original.

Scales the image to the desired dpi:
It will output a scaled version of each page. If your two cameras had two different zoom settings, it can adjust for that so each page comes out the same size.

Fast:
Depending on the size of the images and the speed of the computer, it can process each one in less than a second. If your computer has multiple cores, it will make use of them.

Will run anywhere that Java runs:
It is written in Java with the JAI toolkit, so it will run on many platforms, including Windows, Linux, and Mac. Note that the Mac version will run slower, because the JAI toolkit doesn't have a native library for the Mac.

Easy to rerun the process:
Because the configuration is saved, if a mistake is made in cropping, or if a page was missing in the initial conversion, it is easy to correct it and rerun the process. Also, because the resulting tiff files can be easily regenerated, there is no need to hold onto them after creating the final pdfs or other files. Just save your source images and the configuration file, and if you ever need them, they can be regenerated.

Status of the project
It's pretty much working, but it is still a bit rough: I don't have a good install process, and there is not much in the way of documentation. But if you know Java, are comfortable using svn to download Java source code, and can use ant or NetBeans to compile it, feel free to check it out.
http://sourceforge.net/projects/bookscanwizard/
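
To give a rough idea of the dpi estimation feature described above, here is a minimal sketch of the interpolation step. It assumes plain linear interpolation between the two calibration shots, which seems reasonable when the camera distance stays fixed and only the zoom changes. The class name, method names, and sample numbers are made up for this example and are not taken from the BSW source, and reading the focal length out of the jpeg metadata is left out.

```java
/**
 * Hypothetical sketch: estimates source dpi from focal length using two
 * calibration shots. Not BSW's actual code; linear interpolation is an
 * assumption made for illustration.
 */
public class DpiEstimator {
    private final double focalA; // focal length of the zoomed-out calibration shot
    private final double dpiA;   // dpi measured on that shot
    private final double focalB; // focal length of the zoomed-in calibration shot
    private final double dpiB;   // dpi measured on that shot

    public DpiEstimator(double focalA, double dpiA, double focalB, double dpiB) {
        if (focalA == focalB) {
            throw new IllegalArgumentException(
                "Calibration shots must use two different focal lengths");
        }
        this.focalA = focalA;
        this.dpiA = dpiA;
        this.focalB = focalB;
        this.dpiB = dpiB;
    }

    /** Linearly interpolates (or extrapolates) the dpi for a focal length. */
    public double estimateDpi(double focalLength) {
        double slope = (dpiB - dpiA) / (focalB - focalA);
        return dpiA + slope * (focalLength - focalA);
    }

    public static void main(String[] args) {
        // Made-up calibration: 10 mm measured at 200 dpi, 20 mm at 400 dpi.
        DpiEstimator estimator = new DpiEstimator(10.0, 200.0, 20.0, 400.0);
        System.out.println(estimator.estimateDpi(15.0)); // prints 300.0
    }
}
```

With the calibration done once, each new image only needs its focal length looked up to get an estimated dpi, which is what lets the step be skipped for later books scanned from the same camera distance.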