alternative to abbyy?



Postby builderstudent2 » 23 Jul 2011, 00:15

This is what I have been doing:
http://www.megaupload.com/?d=YHHK35PN

So, I'm thinking about giving up on Abbyy FineReader. But what is the alternative? I tried using IrfanView for the steps shown in the video above, but it just doesn't work right.

So, I'll explain what I'm doing in the video. I have about 2600 images that I have already put through this process. I scan the images, transfer them to my computer, rotate them, deskew them, crop them, and finally save them as a PDF. As you see in the video, some steps can be performed on all the images at once, namely batch rotating and batch deskewing. When I open images in Abbyy, it automatically corrects their orientation. For batch deskewing, I just click on "apply to all images" and deskew them all. Cropping is the slow part: I have to crop each of the 2600 files manually, going through all the odd pages first and then starting over for the even pages. As you see in the video, though, Abbyy keeps the crop parameters the same every time I move forward or backward to the next image. That way, if the area I'm interested in falls outside the crop parameters I previously set, I notice it easily and can widen or restrict them as needed. After all that, I save it as a PDF and I'm done.
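For anyone who wants to script the odd/even cropping step instead of doing it by hand, here is a minimal Python sketch. It assumes the Pillow library is installed, that the file names sort in page order (for example zero-padded numbers), and that the book did not move during scanning. The crop boxes, the `box_for_page` and `crop_folder` names, and the `*.jpg` pattern are all made up for illustration and do not come from any tool mentioned in this thread.

```python
from pathlib import Path

# Hypothetical crop boxes as (left, upper, right, lower) in pixels.
ODD_BOX = (300, 200, 2700, 3800)
EVEN_BOX = (150, 200, 2550, 3800)


def box_for_page(page_number, odd_box=ODD_BOX, even_box=EVEN_BOX):
    """Return the crop box for a 1-based page number."""
    return odd_box if page_number % 2 == 1 else even_box


def crop_folder(src_dir, dst_dir, pattern="*.jpg"):
    """Crop every matching image in src_dir, alternating boxes by page parity."""
    # Pillow is imported here so box_for_page stays usable without it.
    from PIL import Image

    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    # Sorted file order is assumed to be page order.
    pages = sorted(Path(src_dir).glob(pattern))
    for page_number, path in enumerate(pages, start=1):
        with Image.open(path) as img:
            img.crop(box_for_page(page_number)).save(dst / path.name)
```

Something like `crop_folder("raw", "cropped")` would write the cropped copies to a new folder and leave the originals untouched, so pages that drifted outside the box can still be redone by hand.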

Any ideas on which image editor can do everything that I used to do with Abbyy? It doesn't matter if it is paid or free. The thing about Abbyy that really frustrates me is that all of the images absolutely have to be "read" before I can save them to a PDF. That obligatory step makes me very angry, and it takes a very, very long time. It just does NOT work for large numbers of image files, unfortunately...
builderstudent2
 

Re: alternative to abbyy?

Postby snaguy » 23 Jul 2011, 02:09

A paid solution that can do all that you want.

http://www.fujixerox.com.au/products/production/production-workflow-software/FreeFlow-Makeready/ffmaker

It probably will not be cheap and you will need a decent workstation to run it.

There is no free package that can do all you require. A combination of open source software can achieve the result you want but you would have to use individual programs.

You could also bring the source images directly into Acrobat Pro and do the post processing.
snaguy
 
Posts: 54
Joined: 14 Mar 2011, 04:28

Re: alternative to abbyy?

Postby snaguy » 23 Jul 2011, 03:26

What did you use to make the video?

Can you zip 20 images and make them available for download?
snaguy
 
Posts: 54
Joined: 14 Mar 2011, 04:28

Re: alternative to abbyy?

Postby builderstudent2 » 23 Jul 2011, 06:04

You know, I really want to use Acrobat Pro, and it is almost exactly what I need. BUT it has one flaw (so far): it does NOT keep the crop parameters visible from one image to the next, the way Abbyy does in the video I linked to above. This feature is very important because, as you saw in the video, when moving from image file to image file, the crop parameters sometimes need to be narrowed a little and sometimes widened a tiny bit. I love that Acrobat lets you crop only the odd-numbered pages, or in fact whichever pages you want. That is very convenient. I may have missed something in Acrobat, but I don't see where it lets the crop parameters stay visible on screen from image to image.

Snaguy: I used Snagit to record the video. What do you mean zip 20 images? You want 20 unedited raw source images or 20 fully edited images...?

Edit:

Also, since Acrobat doesn't truly crop images, what I would have to do is turn the raw images into a PDF, edit/crop them, save them as JPEG/PNG/etc. image files, and then finally convert them back to a PDF file. What a hassle.
Last edited by builderstudent2 on 23 Jul 2011, 06:19, edited 1 time in total.
builderstudent2
 

Re: alternative to abbyy?

Postby snaguy » 23 Jul 2011, 06:17

RAW

What operating system and Acrobat version are you using?

This may also be possible in GIMP via command-line batch processing and a GIMP deskew plugin, using file renaming software to differentiate odd and even pages.
snaguy
 
Posts: 54
Joined: 14 Mar 2011, 04:28

Re: alternative to abbyy?

Postby snaguy » 23 Jul 2011, 06:35

http://sourceforge.net/projects/briss/

Briss is a cross-platform Java application which makes visual cropping of PDF pages much faster and easier. You just load a pdf file, draw one or more rectangles with a mouse to crop the regions you want from a page and then save them as another file. If you load a PDF file with multiple pages, you can see the odd and even numbered pages overlaid and crop them easily in one go.
snaguy
 
Posts: 54
Joined: 14 Mar 2011, 04:28

Re: alternative to abbyy?

Postby the.traveller » 24 Jul 2011, 08:59

builderstudent2 wrote:Snaguy: I used Snagit to record the video.

Don't you mean Camtasia from TechSmith, who also make Snagit?

I have never worked with Abbyy FineReader, but it sure looks very nice. Do you have to purchase a special edition of Abbyy, or is this possible even in the cheaper versions? (I see you have the Corporate Edition.)

I actually don't see what is the problem (or didn't read carefully)
Why don't you split your left and right images into two separate folders, as advised here on the forum, using the free software BookScanWizard and/or Scan Tailor, which can be found through
http://diybookscanner.org/wiki/index.php?title=Introduction_to_Software

Because if you have been able to keep your book in exactly the same place on the scanner, the coordinates will be the same for all the left pictures and for all the right pictures. This way you deskew and crop all the left images in one go, since Abbyy remembers the coordinates. Then you do your right-side pictures with the new coordinates you enter, which Abbyy again remembers.
People on this forum have already mentioned software that can batch-rename pictures with even and odd numbers, so that later you can put them back together in one folder and make a PDF from it.
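The left/right split described above can also be done with a few lines of Python instead of a renaming tool. This is only a sketch: it assumes the file names sort in page order and that odd positions are one side of the book and even positions the other. The function name `split_odd_even`, the `odd` and `even` folder names, and the `*.jpg` pattern are made up for the example.

```python
import shutil
from pathlib import Path


def split_odd_even(src_dir, pattern="*.jpg"):
    """Copy images into 'odd' and 'even' subfolders by position in sorted order."""
    src = Path(src_dir)
    targets = {"odd": src / "odd", "even": src / "even"}
    for folder in targets.values():
        folder.mkdir(exist_ok=True)

    counts = {"odd": 0, "even": 0}
    # glob() is not recursive, so files already copied into the
    # subfolders are not picked up again.
    for index, path in enumerate(sorted(src.glob(pattern)), start=1):
        side = "odd" if index % 2 == 1 else "even"
        # copy2 keeps the original file and its timestamps intact.
        shutil.copy2(path, targets[side] / path.name)
        counts[side] += 1
    return counts
```

Because the files are copied rather than moved and keep their original names, the two folders can be merged back into one after cropping and will sort back into page order.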

I am curious, however, why you don't OCR your dictionary and make it into an EPUB file. It would be easier to have an index and search capabilities. As long as your files have a high DPI, it wouldn't be a problem to OCR the entire book.

One of the other members (Rob) already showed us how to calculate the dpi from a picture. See:
http://diybookscanner.org/forum/viewtopic.php?f=1&t=934&p=9998&hilit=+dpi+calculate#p9998
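The linked post is not reproduced here, but the usual arithmetic is simply pixels divided by physical size: measure the page (or a ruler placed in the shot) and count how many pixels it spans in the image. A small sketch, with hypothetical function names:

```python
MM_PER_INCH = 25.4


def effective_dpi(pixels, inches):
    """Dots per inch when a length of `inches` spans `pixels` in the image."""
    if inches <= 0:
        raise ValueError("inches must be positive")
    return pixels / inches


def effective_dpi_from_mm(pixels, millimetres):
    """Same calculation for a length measured in millimetres."""
    return effective_dpi(pixels, millimetres / MM_PER_INCH)
```

For example, a page edge that is 10 inches long and spans 3000 pixels works out to 300 DPI.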

To improve your pictures even more, and in batch, you could try Adobe Lightroom to even out the brightness of the left and right pictures and convert them to B+W. If all the pages are text, you could sharpen them to get a more distinct difference between black and white.

I hope that this small sidestep (splitting the right and left pictures into separate folders) will improve the speed of your workflow.

But to answer your question: sorry, I don't have a better alternative. Just have a look at the software page mentioned above or at Scan Tailor and BookScanWizard on the forum.
the.traveller
 
Posts: 70
Joined: 22 Sep 2010, 03:58
Location: Rotterdam, Netherlands

Re: alternative to abbyy?

Postby snaguy » 25 Jul 2011, 03:06

I think builderstudent2's problem is movement during the scanning process. Acrobat's false crop may be an advantage here, since you can use the image select tool to make manual adjustments where necessary. The Quite Imposing Plus plugin also has a trim and shift feature where you can select a page range to batch shift. Then re-PDF to make the crop permanent.

the.traveller's points are all valid. Processing the left and right images before combining them will take some problems out of the process. Quite Imposing Plus can also reorder pages. I can elaborate further if you choose to go down this path.
snaguy
 
Posts: 54
Joined: 14 Mar 2011, 04:28

Re: alternative to abbyy?

Postby builderstudent2 » 25 Jul 2011, 18:31

Snaguy, Here are the 20 raw pics you requested:
http://www.megaupload.com/?d=NINPMHWC

I am on Windows XP SP3.

snaguy wrote:you can see the odd and even numbered pages overlaid and crop them easily in one go.

I'm not sure what they meant by that. Hmm...
the.traveller wrote:Don't you mean Camtasia from Techsmith, also the makers of Snagit.

No, I used Snagit, not Camtasia. I have Camtasia but I don't really know how to use it, so I chose to use Snagit to make the video.

The.traveller, I'm not sure if what I do is possible in the free or "cheaper" versions.
the.traveller wrote:I actually don't see what is the problem (or didn't read carefully)

My problem is specifically batch cropping. It takes me a very long time since I have to crop each individual page. But as you can see from the other thread that I created, I decided that there is no other way than to crop each page individually. Why? Because during my scans the books do NOT stay in the same position, and I have to reposition them at times since they are very big, thick books. I just have to accept that there is no alternative to sucking it up and going through the pages one by one, no matter how boring it is or how long it takes. :)

the.traveller wrote:I am curious, however, why you don't OCR your dictionary and make it into an EPUB file. It would be easier to have an index and search capabilities. As long as your files have a high DPI, it wouldn't be a problem to OCR the entire book.

No way. If cropping my scans already takes forever, can you imagine how long it would take to OCR them? What I am thinking about doing instead is setting up the resulting PDF so that I can click on "A" for all the words that begin with "A", "B" for all the words that begin with "B", and so on. (I scanned a number of dictionaries.)
builderstudent2
 

Re: alternative to abbyy?

Postby builderstudent2 » 25 Jul 2011, 20:26

OMG!!! Another user just asked me why I am doing any post-processing in the first place with my custom scanner setup. He said that I should just set my camera's zoom properly for each book according to its size, so that I capture only the area I would otherwise be cropping to in my former super time-consuming process. I just tried it and I feel so stupid that I didn't know about this. :oops: :evil: I wasted almost one month on nothing but editing when in actuality all 20 books that I scanned should have been done in about two days. Ugh, this is extremely depressing.

I always assumed that the more one zooms in when taking a photo, the worse the quality of the resulting photo will be. In other words, that it is better to zoom in afterwards on a photo taken without zoom than on a photo that was already zoomed in when it was taken. Is that correct or am I just completely off?
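A rough way to look at this, assuming the camera uses optical zoom (digital zoom only crops and enlarges the same pixels): the sensor has a fixed number of pixels, so the more of the frame the page fills, the more pixels land on the page. The sketch below just does that arithmetic. The function name and the example numbers are assumptions for illustration, and lens sharpness at different zoom settings is not modelled.

```python
def page_dpi(sensor_pixels, frame_fraction, page_inches):
    """Pixels per inch on the page when it spans frame_fraction of the frame.

    sensor_pixels:  image width (or height) in pixels
    frame_fraction: share of that dimension covered by the page, in (0, 1]
    page_inches:    physical size of the page along the same dimension
    """
    if not 0 < frame_fraction <= 1:
        raise ValueError("frame_fraction must be in (0, 1]")
    if page_inches <= 0:
        raise ValueError("page_inches must be positive")
    return sensor_pixels * frame_fraction / page_inches
```

With a hypothetical 4000-pixel-wide frame and an 8-inch-wide page, a page filling half the frame gets 250 DPI, while one filling the whole frame gets 500 DPI. Cropping afterwards cannot add back the pixels that were spent on the background.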
builderstudent2
 
