Scan Tailor

Scan Tailor specific announcements, releases, workflows, tips, etc. NO FEATURE REQUESTS IN THIS FORUM, please.

Moderator: peterZ

daniel_reetz
Posts: 2812
Joined: 03 Jun 2009, 13:56
E-book readers owned: Used to have a PRS-500
Number of books owned: 600
Country: United States

Scan Tailor

Post by daniel_reetz »

My friend Mary M. just pointed me to this very interesting software package, Scan Tailor.
Scan Tailor is an interactive post-processing tool for scanned pages. It performs operations such as page splitting, deskewing, adding/removing borders, and others. You give it raw scans, and you get pages ready to be printed or assembled into a PDF or DJVU file. Scanning, optical character recognition, and assembling multi-page documents are out of scope of this project.
spamsickle
Posts: 596
Joined: 06 Jun 2009, 23:57

Re: Scan Tailor

Post by spamsickle »

This looks good. A bit slow, but most of the pages are adequately recognized in automatic mode -- problems with some page numbers being clipped out, a few pages "recognizing" bits of the facing page, and a problem with blank pages that can be made moot by running Scan Tailor after merging the left and right views -- but it's sufficiently robust and sufficiently flexible that I'll probably stop using YAPP. I'll still be using ImageMagick -- Scan Tailor, as far as I can tell, only puts out TIFF files, and I still need to convert them to PDFs. Also, it's possible to bloat the original images by 10-20 times by choosing output parameters poorly -- color and 600 DPI takes a 1.5 MB JPEG and turns it into a 25 MB TIFF -- but the "mixed" mode does a good job of putting out crisp text and still preserving greyscale images.
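The bloat from choosing output parameters poorly follows from plain raster arithmetic. Here is a rough sketch in Python, assuming a hypothetical 6 x 9 inch page and uncompressed output (neither figure comes from the post above; real TIFFs are usually compressed, so actual files will be smaller):

```python
def raw_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed raster size in bytes for a page of the given physical size."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical page dimensions in inches, chosen only for illustration.
PAGE_W_IN, PAGE_H_IN = 6.0, 9.0

sizes = {
    "color, 24-bit": raw_size_bytes(PAGE_W_IN, PAGE_H_IN, 600, 24),
    "greyscale, 8-bit": raw_size_bytes(PAGE_W_IN, PAGE_H_IN, 600, 8),
    "black and white, 1-bit": raw_size_bytes(PAGE_W_IN, PAGE_H_IN, 600, 1),
}

for mode, size in sizes.items():
    print(f"{mode}: {size / 1e6:.1f} MB uncompressed")
```

Under these assumptions, a color page carries 24 times as much raw data as a black and white one at the same DPI, which is why a mode that keeps text bitonal and reserves greyscale for pictures stays so much smaller.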

I need to play with it some more, but I think this is going to become my main post-processing engine, at least until something better comes along. Thanks for the tip.
Turtle
Posts: 40
Joined: 04 Mar 2014, 00:53

Re: Scan Tailor

Post by Turtle »

That's a great find. It splits pages automatically pretty well. Too bad there's no option to use only one feature, like page splitting, on its own. You have to run your pages through the whole process, which is very time-consuming.
daniel_reetz

Re: Scan Tailor

Post by daniel_reetz »

Mary's pointed me to a few interesting things now. Thanks for taking the time to check this out and come back with your experiences.

You know, if Scan Tailor had a few extra features, and especially if it had a "camera model" (in other words, if it took focal length and lens distortion into account), it could really be a killer processor. You could probably get this done with Fulla from the Hugin suite, or some other panotools prog.
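For a sense of what the lens distortion part of a camera model involves, here is a minimal sketch in Python of a radial polynomial correction in the style used by panotools. The function names, the coefficient values, and the image size are assumptions made up for illustration; nothing here is Scan Tailor's or Fulla's actual code, and real coefficients would have to be calibrated for a specific lens.

```python
import math


def radial_scale(r, a, b, c):
    """Polynomial scale factor at normalized radius r.

    d is chosen as 1 - (a + b + c) so that the scale is exactly 1 at r = 1,
    which keeps the overall image size roughly unchanged.
    """
    d = 1.0 - (a + b + c)
    return a * r**3 + b * r**2 + c * r + d


def source_position(x, y, width, height, a, b, c):
    """For a pixel (x, y) in the corrected image, return where to sample
    the distorted source image."""
    cx, cy = width / 2.0, height / 2.0
    norm = min(width, height) / 2.0  # radius 1.0 = half the shorter side
    dx, dy = (x - cx) / norm, (y - cy) / norm
    scale = radial_scale(math.hypot(dx, dy), a, b, c)
    return cx + dx * scale * norm, cy + dy * scale * norm


# Hypothetical coefficients and image size, not calibrated values.
A, B, C = 0.0, -0.05, 0.0
print(source_position(1500, 1500, 2000, 3000, A, B, C))
```

A full correction pass would call `source_position` for every output pixel and interpolate the source image at the returned coordinates.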

His page says he's looking for developer help. If only I had any worthwhile programming skillz...
rob
Posts: 773
Joined: 03 Jun 2009, 13:50
E-book readers owned: iRex iLiad, Kindle 2
Number of books owned: 4000
Country: United States
Location: Maryland, United States

Re: Scan Tailor

Post by rob »

Fascinating... I'm going to take a look!
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.
Mandor
Posts: 24
Joined: 28 Jul 2009, 01:27
E-book readers owned: lBook V8, lBook V3
Number of books owned: 0
Location: Sofia, Bulgaria

Re: Scan Tailor

Post by Mandor »

Maybe you don't know, but Scan Tailor was written as a "reply" to Scan Kromsator, a very powerful but very sophisticated and not well documented program. Many users in Russia have used SK for post-scan image processing.
daniel_reetz

Re: Scan Tailor

Post by daniel_reetz »

That is super-interesting, Mandor. I just found the abbreviated guide to Kromsator. I speak enough Russian to understand the instructions, but I don't recognize or understand the word "kromsator". Does it sound like anything to you?
Mandor

Re: Scan Tailor

Post by Mandor »

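Well, you can use the Толковый словарь русского языка (Explanatory Dictionary of the Russian Language):
КРОМСАТЬ, аю, аешь; кромсанный; несов., что (разг.). Грубо, неаккуратно резать на части. К. хлеб
In English: КРОМСАТЬ (kromsat'), imperfective verb, colloquial: to cut something into pieces roughly and carelessly. Example: to hack up bread.
So it sounds like "to cut roughly and carelessly into pieces".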
daniel_reetz

Re: Scan Tailor

Post by daniel_reetz »

Thanks for the link and explanation. I usually use Katzner's dictionary, but it's in a box out in my workshop. I'll use the Толковый словарь from now on... it certainly looks more complete than the Promt online engine...

I love Russian for words like "неаккурат()".
rob

Re: Scan Tailor

Post by rob »

Ha, the only Russian I know is боже мой! ("my God!")

Anyway, I compiled Scan Tailor on OS X, and it seems pretty interesting, but it does not seem to take care of the two major problems with using cameras: splitting the page properly (almost always chooses the wrong side for the page [EDIT: I misinterpreted Scan Tailor's output, and found it was actually selecting the proper side]) and keystone correction (as in, there is none). Here's an example of the auto-deskewed version of a page. Notice that there is no fixed skew amount that will correct a keystoned image.
scantailor-nokeystoning.png
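The point about there being no fixed skew amount can be checked numerically. Below is a minimal sketch in Python, assuming a hypothetical keystoned page outline whose top edge is narrower than its bottom edge (the corner coordinates are invented for illustration, not measured from the attachment). A rotation changes the angle of both side edges by the same amount, so the tilt between them never goes away.

```python
import math


def edge_angle_deg(p, q):
    """Angle of the edge p -> q measured from the vertical axis, in degrees."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.degrees(math.atan2(dx, dy))


def rotate(point, degrees):
    """Rotate a point about the origin by the given angle."""
    t = math.radians(degrees)
    x, y = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))


def side_angles(corners):
    """Angles of the left and right page edges from vertical."""
    tl, tr, br, bl = corners
    return edge_angle_deg(tl, bl), edge_angle_deg(tr, br)


def residual_after_deskew(corners, degrees):
    """Worst remaining side-edge tilt after rotating by `degrees`."""
    left, right = side_angles([rotate(p, degrees) for p in corners])
    return max(abs(left), abs(right))


# Hypothetical corners: top-left, top-right, bottom-right, bottom-left.
page = [(100.0, 0.0), (900.0, 0.0), (1000.0, 1400.0), (0.0, 1400.0)]

# Try every deskew angle from -10 to +10 degrees in 0.1 degree steps.
best = min((residual_after_deskew(page, d / 10.0), d / 10.0)
           for d in range(-100, 101))
print(f"best deskew angle: {best[1]} deg, remaining tilt: {best[0]:.2f} deg")
```

Even the best angle in the search leaves both side edges tilted, because straightening one edge tilts the other further. Removing keystoning takes a perspective (projective) transform that maps the four page corners onto a rectangle, which a plain rotation cannot do.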
I really should work on PostProcessor again...

--Rob