Scan Tailor

Scan Tailor-specific announcements, releases, workflows, tips, etc. NO FEATURE REQUESTS IN THIS FORUM, please.

Moderator: peterZ

Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK
Contact:

Re: Scan Tailor

Post by Tulon »

Eventually Scan Tailor will make use of OpenCL, though it's not a high priority right now.
Actually, if you profile ST, the algorithms that take most of the time are easily parallelizable. The heaviest one is probably building Euclidean distance maps; it's in imageproc/SEDM.{h,cpp}. It's actually pretty fast by itself, but the "Select Content" stage calls it too many times. Then we have Savitzky-Golay smoothing, which is nothing more than a simple convolution, with the only complexity being the kernel calculation. Then we have polynomial surface generation, used in illumination equalization, which is implemented using QR decomposition by Givens rotations. That should be parallelizable as well, though not trivially. If those three things were implemented in OpenCL, performance should go up at least 3 times.
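Just to illustrate how simple the convolution part is, here is a rough C++ sketch of Savitzky-Golay smoothing using the classic hard-coded 5-point quadratic kernel (-3, 12, 17, 12, -3)/35. The function name and signature are made up for the example; this is not Scan Tailor's actual code, which computes its kernels rather than hard-coding one.

```cpp
// Minimal sketch of Savitzky-Golay smoothing as a plain 1-D convolution.
// The 5-point quadratic kernel (-3, 12, 17, 12, -3)/35 is hard-coded here;
// computing kernels for other window sizes/degrees is the "complex" part
// mentioned above.
#include <algorithm>
#include <vector>

std::vector<double> savitzkyGolay5(const std::vector<double>& signal)
{
    static const double kernel[5] = {-3.0, 12.0, 17.0, 12.0, -3.0};
    const int n = static_cast<int>(signal.size());
    std::vector<double> out(signal.size());
    for (int i = 0; i < n; ++i) {
        double acc = 0.0;
        for (int j = -2; j <= 2; ++j) {
            const int idx = std::clamp(i + j, 0, n - 1); // replicate edges
            acc += kernel[j + 2] * signal[idx];
        }
        out[i] = acc / 35.0;
    }
    return out;
}
```

Each output sample depends only on a small, fixed neighbourhood of the input, which is why this kind of filter maps so naturally onto OpenCL work items.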
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
spamsickle
Posts: 596
Joined: 06 Jun 2009, 23:57

Re: Scan Tailor

Post by spamsickle »

Tulon, I saw that guy's review of your software on your page, complaining that you didn't have a university degree and weren't providing UML diagrams. I didn't know he'd launched a vendetta, but I was thinking what an ass he was just on the basis of those comments. Who cares about that fluff, when you have an application that works well and is easy to use?

I'm still debulking my library mostly by culling the cheap books I can cut up and feed through ScanSnap, but in a few weeks or months I should be using my DIY snapshot scanner more frequently. As I said, keystoning doesn't really bother me, and while I noticed the frequent loss of page numbers, I can live with that too. Unlike Rob, I haven't even compiled your source code yet, but when I do, I'll start looking into how you're doing the selection. If you haven't already solved the page number problem by then, maybe I'll be able to help.

I've been following OpenGL for some time, but I didn't even know there was an OpenCL project. It sounds promising, and I'll be watching for further developments. Won't be buying a Mac just to play with it, though...

Thanks for making this useful application, Scan Tailor, available, and welcome to the DIY Scanner forum.
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK
Contact:

Re: Scan Tailor

Post by Tulon »

spamsickle wrote:If you haven't already solved the page number problem by then, maybe I'll be able to help.
I haven't. Well, there are actually several possible reasons for page numbers to be cut off, some of them all but unsolvable. Basically, if the page number is far away from the rest of the text, its only chance of not being cut off is to be classified as text. One reason that may fail is poor contrast and poor resolution of the input images. Text detection happens on binarized images. I can't use my illumination equalization code here, because it works well only if the content area is already known. So I use Wolf binarization (the link to the relevant paper is in the source code). I chose Wolf because it tends to produce overly thick letters, which is exactly what I want, or rather, it's the opposite of what I want to avoid, namely letters breaking up into separate fragments. Text detection involves collecting statistics on cavities in a possible text line; not enough cavities means it's not text. So, when we have poor image contrast or smooth edges (JPEG compression?), even Wolf binarization produces disconnected letters. That leads to few cavities, which prevents them from being classified as text.
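For anyone curious what Wolf binarization does: roughly, each pixel gets its own threshold T = m - k * (1 - s/R) * (m - M), where m and s are the mean and standard deviation in a local window, M is the darkest gray level in the whole image, R is the maximum of s over the image, and k is around 0.5. Below is a naive, self-contained C++ sketch of that idea, assuming an 8-bit grayscale buffer; it is only an illustration, not Scan Tailor's actual implementation, and the function name and parameters are made up for the example.

```cpp
// Naive sketch of Wolf-style adaptive binarization: threshold each pixel
// with T = m - k * (1 - s/R) * (m - M). Not Scan Tailor's code.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

std::vector<uint8_t> wolfBinarize(const std::vector<uint8_t>& gray,
                                  int width, int height,
                                  int window = 31, double k = 0.5)
{
    const int r = window / 2;

    // M: darkest gray level in the whole image.
    const double M = *std::min_element(gray.begin(), gray.end());

    // First pass: local mean and standard deviation for every pixel
    // (naive O(n * window^2) loops, fine for a sketch).
    std::vector<double> mean(gray.size()), stddev(gray.size());
    double R = 1e-6; // maximum local stddev over the image
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double sum = 0.0, sumSq = 0.0;
            int n = 0;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int yy = std::clamp(y + dy, 0, height - 1);
                    const int xx = std::clamp(x + dx, 0, width - 1);
                    const double v = gray[yy * width + xx];
                    sum += v;
                    sumSq += v * v;
                    ++n;
                }
            }
            const double m = sum / n;
            const double s = std::sqrt(std::max(0.0, sumSq / n - m * m));
            mean[y * width + x] = m;
            stddev[y * width + x] = s;
            R = std::max(R, s);
        }
    }

    // Second pass: apply the per-pixel threshold.
    std::vector<uint8_t> out(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        const double T = mean[i] - k * (1.0 - stddev[i] / R) * (mean[i] - M);
        out[i] = (gray[i] <= T) ? 0 : 255; // 0 = ink, 255 = background
    }
    return out;
}
```

The double loop over the window is O(n * window^2); a real implementation would use integral images for the local mean and variance, but that would obscure the formula here.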

BTW, you can activate the debugging mode from Tools -> Debugging and see the intermediate results each stage produces. For example, you can find the visualized cavities in a tab called "ueps" (ultimate eroded points) at the Select Content stage. Note that aggressive caching sometimes defeats the debugging mode :) In such cases you need to do something to make it re-evaluate the page, for example switching to manual mode and back.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK
Contact:

Re: Scan Tailor

Post by Tulon »

spamsickle wrote: I've been following OpenGL for some time, but I didn't even know there was an OpenCL project. It sounds promising, and I'll be watching for further developments. Won't be buying a Mac just to play with it, though...
You don't have to. NVidia supports OpenCL on all platforms. AMD and Intel are expected to support it soon as well.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
StevePoling
Posts: 290
Joined: 20 Jun 2009, 12:19
E-book readers owned: SONY PRS-505, Kindle DX
Number of books owned: 9999
Location: Grand Rapids, MI
Contact:

Re: Scan Tailor

Post by StevePoling »

Tulon wrote:Eventually Scan Tailor will make use of OpenCL, though it's not a high priority right now.
Actually, if you profile ST, the algorithms that take most of the time are easily parallelizable. The heaviest one is probably building Euclidean distance maps; it's in imageproc/SEDM.{h,cpp}. It's actually pretty fast by itself, but the "Select Content" stage calls it too many times. Then we have Savitzky-Golay smoothing, which is nothing more than a simple convolution, with the only complexity being the kernel calculation. Then we have polynomial surface generation, used in illumination equalization, which is implemented using QR decomposition by Givens rotations. That should be parallelizable as well, though not trivially. If those three things were implemented in OpenCL, performance should go up at least 3 times.
Tulon, for someone who says he's no good at math, you certainly talk a good game. I haven't heard of QR decompositions since I was in grad school. (Yeah, I need to get out more.) You're definitely on the right path leveraging GPU hardware to the max. OpenCL sounds like the way to go for anyone with the hardware.

Off Topic: way back when, I had a freshly minted Master's and got a job at a place that had a lot of very fast computers. I figured I'd use the sexy cool matrix theory I'd learnt in grad school on the Cray (a particularly fast supercomputer at the time). One of the older programmers shook his head and said that simple row-column operations (the stuff they teach in first-year Linear Algebra) were faster because they all vectorized. I was crushed.
phaedrus
Posts: 56
Joined: 04 Mar 2014, 00:52

Re: Scan Tailor

Post by phaedrus »

Tulon, I was one of the people who left a [positive!] English comment on SF regarding ScanTailor - very impressive work! I was directed to it circuitously from Slashdot :-)

As someone else has already commented here, this community is making effective use of cameras to snap the pages of a book rather than using a flat-bed scanner, which has been the norm in the past. I'm sure you've read all about it!

What I'm surprised to see is that, while everyone has been working on some pretty impressive setups, no one has really mentioned using cameras to record articles/books in the libraries themselves. As an amateur historian, I often take snaps of pages from reference-only books in the local library and historical archive center. Given that they're not keen on anything other than a camera (i.e. not even a tripod), the end results can be a bit dubious. It was on some of these snaps that I tried ScanTailor, and I was very pleased with the results.

That said, I'd like to put in my vote for some sort of keystone/skew correction at some stage - when you're just hand-holding a camera it's almost impossible to get things straight, and this would be a good thing to have. I've had a brief try of ScanKromasator, which seemed to do a reasonable job of straightening things up but didn't clean up the image anywhere near as well as ScanTailor, so perhaps in the interim an amalgam of the two may do the best job. I'll have more of a play at some stage and report on the results if anyone's interested.

So good work Tulon, thanks for offering ScanTailor for us to use!

Cheers, P.
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK
Contact:

Re: Scan Tailor

Post by Tulon »

StevePoling wrote: Tulon, For someone who says he's no good at math, you certainly talk a good game. I haven't heard of QR decompositions since I was in grad school.
Well, it's not that I really understand what I'm talking about. For the illumination equalization algorithm, I took the idea from this paper: http://www.comp.nus.edu.sg/~tancl/Paper ... zation.pdf
The original formulas were quite simple, so I managed to implement that method. Later I found some source code that did least squares fitting using QR decompositions, which is supposedly a better way of doing that, so I adapted that code for Scan Tailor.
My current approach is not limited to polynomial surface generation, but also involves some morphology, which allows it to handle complex cases, like a page with a picture covering most of it. Its biggest limitation is an inability to cope with artificial white areas around the image. Such areas may be the result of rotating the image in another program. If they are black, it's not a problem, though.
For those interested, the source code can be found in EstimateBackground.{cpp,h}, imageproc/PolynomialSurface.{cpp,h}, imageproc/LeastSquaresFit.{cpp,h}
As usual, debugging mode is useful for learning how it works.
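For those who just want the gist without reading those files, here is a rough, self-contained C++ sketch of the least-squares idea: fit a low-degree polynomial surface z ~ c0 + c1*x + c2*y + c3*x^2 + c4*x*y + c5*y^2 to sampled background gray levels by triangularizing the design matrix with Givens rotations and back-substituting. The struct and function names are made up for illustration, and the real code differs in the details (polynomial degree, sampling, and so on).

```cpp
// Rough sketch of least-squares polynomial surface fitting via QR
// decomposition with Givens rotations. Illustrative only; the real code
// lives in imageproc/PolynomialSurface.{cpp,h} and
// imageproc/LeastSquaresFit.{cpp,h}.
#include <cmath>
#include <vector>

struct Sample { double x, y, z; }; // (x, y) position, z = gray level

// Fit z ~ c0 + c1*x + c2*y + c3*x^2 + c4*x*y + c5*y^2 and return the six
// coefficients. Assumes samples.size() >= 6.
std::vector<double> fitQuadraticSurface(const std::vector<Sample>& samples)
{
    const int rows = static_cast<int>(samples.size());
    const int cols = 6;

    // Build the design matrix A and the right-hand side b.
    std::vector<std::vector<double>> A(rows, std::vector<double>(cols));
    std::vector<double> b(rows);
    for (int i = 0; i < rows; ++i) {
        const double x = samples[i].x, y = samples[i].y;
        A[i] = {1.0, x, y, x * x, x * y, y * y};
        b[i] = samples[i].z;
    }

    // Triangularize [A | b] with Givens rotations: for each column, rotate
    // every lower row against the pivot row to zero the sub-diagonal entry.
    for (int j = 0; j < cols; ++j) {
        for (int i = j + 1; i < rows; ++i) {
            const double r = std::hypot(A[j][j], A[i][j]);
            if (r == 0.0) continue;
            const double c = A[j][j] / r, s = A[i][j] / r;
            for (int k = j; k < cols; ++k) {
                const double ajk = A[j][k], aik = A[i][k];
                A[j][k] = c * ajk + s * aik;
                A[i][k] = -s * ajk + c * aik;
            }
            const double bj = b[j], bi = b[i];
            b[j] = c * bj + s * bi;
            b[i] = -s * bj + c * bi;
        }
    }

    // Back-substitution on the upper-triangular system R * coeffs = Q^T b.
    std::vector<double> coeffs(cols, 0.0);
    for (int j = cols - 1; j >= 0; --j) {
        double acc = b[j];
        for (int k = j + 1; k < cols; ++k) acc -= A[j][k] * coeffs[k];
        coeffs[j] = (A[j][j] != 0.0) ? acc / A[j][j] : 0.0;
    }
    return coeffs;
}
```

Evaluating the fitted surface at every pixel gives an estimate of the background illumination, which can then be used to normalize the image.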
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
Tulon
Posts: 687
Joined: 03 Oct 2009, 06:13
Number of books owned: 0
Location: London, UK
Contact:

Re: Scan Tailor

Post by Tulon »

phaedrus wrote: That said, I'd like to put in my vote for some sort of keystone/skew correction at some stage - when you're just hand-holding a camera it's almost impossible to get things straight, and this would be a good thing to have. I've had a brief try of ScanKromasator, which seemed to do a reasonable job of straightening things up but didn't clean up the image anywhere near as well as ScanTailor, so perhaps in the interim an amalgam of the two may do the best job. I'll have more of a play at some stage and report on the results if anyone's interested.
I agree that keystone correction is a high-priority task. There are a few problems, though:
1. I am currently the only one working on Scan Tailor, and there are actually some higher-priority tasks right now, like finishing manual picture zones and getting despeckling working right. Currently it's too aggressive and can easily eat things like the equals sign in formulas, provided there is enough space around it.
2. I will probably need assistance with the maths, and ideally with code as well. I posted a link to a paper that proposes a keystone correction approach, so interested people may actually try to implement it.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
rob
Posts: 773
Joined: 03 Jun 2009, 13:50
E-book readers owned: iRex iLiad, Kindle 2
Number of books owned: 4000
Country: United States
Location: Maryland, United States
Contact:

Re: Scan Tailor

Post by rob »

Before I gave up C++ for Java, I was pretty good at it. If you can point me to the right places in the code to try implementing a dekeystoning algorithm, I can maybe work on it. Simple at first (like Leptonica's dekeystoning), but then moving to that paper. I have a Bachelor's degree in mathematics, but my eyes sort of glaze over if an image processing paper becomes too abstract with lemmas and things. Just give me the damned equations and algorithms :)
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.
rob
Posts: 773
Joined: 03 Jun 2009, 13:50
E-book readers owned: iRex iLiad, Kindle 2
Number of books owned: 4000
Country: United States
Location: Maryland, United States
Contact:

Re: Scan Tailor

Post by rob »

Working on that curled-snakes thing. Ohhhh my head hurts :( Hessian matrices, scale-space, and parameterized norms. But I'm working through it...
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.