Alternative Software Workflow
Moderator: peterZ
Re: Alternative Software Workflow
I still got the error. Strangely, it doesn't happen to all photos; sometimes it only affects 2 or 3 in a batch of 600. The most annoying thing about my workflow is that I have to wait until ABBYY starts OCR'ing (which can take quite a long time) before I can leave the process alone. Until then, there's a chance that several of the pages won't be accepted and I'll be forced to recrop the photos.
The strange thing is that once a jpg is corrupted, if that's what happens, then no amount of tweaking and resaving the image will recover it. The only solution is to go back to the original jpg and crop/save from there.
Maybe PageBuilder will be better...
Re: Alternative Software Workflow
I was approaching the problem from the perspective that ABBYY was at fault, not JPEGcrops.
-
- Posts: 63
- Joined: 04 Mar 2014, 00:52
Re: Alternative Software Workflow
daniel_reetz wrote:
Surya is employing your method now (I hear from him via email much more often than the forum).
I know you've got a method worked out and everything, but PageBuilder will do your first two steps -- crop and rotate. From those JPGs you could use ABBYY. All you need is PageBuilder 2 and the Matlab Component Runtime. Apologies if you've already tried it. Just be sure to check the JPG output radio button.
The second link is dead; here is one:
Matlab Component Runtime installer: http://www.sccn.ucsd.edu/~arno/download ... taller.exe
- daniel_reetz
- Posts: 2812
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
- Contact:
Re: Alternative Software Workflow
PageBuilder is presently unmaintained; I wouldn't waste time on it.
-
- Posts: 63
- Joined: 04 Mar 2014, 00:52
Re: Alternative Software Workflow
What are you currently using then? I have Left and Right folders full of pictures for a book I scanned, and I am not sure what to do from here.
I need to get my images rotated, cropped, and into PDF format with the least amount of effort possible. (They do not need to be OCR'd; I am fine with just having the images in a PDF.)
edit: Metamorphose is a very powerful file renamer... it is also open source, I have put in a request for a feature so that you could easily rename all the files in one swoop, instead of having to do the left pages, and then the right.
- daniel_reetz
- Posts: 2812
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
- Contact:
Re: Alternative Software Workflow
I am using Scan Tailor.
-
- Posts: 596
- Joined: 06 Jun 2009, 23:57
Re: Alternative Software Workflow
jakegaisser wrote:
Metamorphose is a very powerful file renamer... it is also open source, I have put in a request for a feature so that you could easily rename all the files in one swoop, instead of having to do the left pages, and then the right.
If you have Perl installed on Windows, you can do what I do. I start each new book in its own directory, within which I create subdirectories L, R, and Both.
Then, I have a little script called DIYmerge.cmd:
Code:
cd L
perl f:\Scripts\Perl\DIYrename0.plx 0 > DIYrename.cmd
call DIYrename.cmd
move *.jpg ..\Both
cd ..\R
perl f:\Scripts\Perl\DIYrename0.plx 1 > DIYrename.cmd
call DIYrename.cmd
move *.jpg ..\Both
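The Perl script it calls, DIYrename0.plx:
Code:
use strict;
use warnings;
# glob an array of all the JPG files in the current directory
my @files = <*.jpg>;
# get starting page number from command line (0 for L, 1 for R)
my $page = $ARGV[0];
# print "ren file.jpg NNNN.jpg" for each file in array
foreach my $file (@files) {
    # quote the old name in case it contains spaces
    printf("ren \"%s\" %04d.jpg\n", $file, $page);
    # left and right pages alternate, so step by 2
    $page += 2;
}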
ren 0* 5*
ren 9* 4*
DIYmerge
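For anyone without Perl, the same numbering scheme can be sketched in Python. This is only an illustration under assumptions: the function name `plan_renames` is made up here, and it only computes the old-to-new name pairs rather than touching any files. Left pages get 0000, 0002, ... and right pages get 0001, 0003, ..., matching the 0 and 1 start arguments passed to DIYrename0.plx above.

```python
def plan_renames(left_files, right_files):
    """Return (old_name, new_name) pairs that interleave left and right pages.

    Hypothetical helper, not part of the scripts above. Left pages start at 0
    and right pages at 1; both step by 2, as in DIYrename0.plx.
    """
    plan = []
    for start, folder, files in ((0, "L", left_files), (1, "R", right_files)):
        page = start
        # sort so pages keep the order in which the camera numbered them
        for name in sorted(files):
            plan.append((f"{folder}/{name}", f"Both/{page:04d}.jpg"))
            page += 2
    return plan


if __name__ == "__main__":
    # made-up file names, purely for demonstration
    for old, new in plan_renames(["IMG_01.jpg", "IMG_02.jpg"], ["IMG_51.jpg"]):
        print(old, "->", new)
```

Printing the plan first, instead of renaming straight away, gives you a chance to eyeball the mapping before anything on disk changes.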
-
- Posts: 63
- Joined: 04 Mar 2014, 00:52
Re: Alternative Software Workflow
I wrote a script to rename, merge, and rotate. It requires ImageMagick to be installed.
I have:
D:\Book\L
D:\Book\R
I place this Windows batch script into D:\Book:
makebook.batch:
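Code:
@echo off
rem Run from the book directory that contains the L and R folders.
set count=0
set PREVDIR=
FOR /r %%A IN (*.jpg) DO CALL :NUMBER "%%A"
goto :EOF

:NUMBER
rem First file starts at page 1; when the folder changes, switch to the other parity.
IF "%count%"=="0" (
    set count=1
    set odd=1
) ELSE (
    IF NOT "%~p1"=="%PREVDIR%" (
        IF "%odd%"=="1" (
            set count=2
            set odd=0
        ) ELSE (
            set count=1
            set odd=1
        )
    )
)
rem Zero-pad the page number to four digits.
set NUM=000%count%
set NUM=%NUM:~-4%
rem ren cannot move a file to another directory, so use move to bring it up into the book directory.
move "%~1" "%NUM%.JPG"
rem With ImageMagick 7 the command is magick rather than convert.
IF "%odd%"=="1" (
    convert %NUM%.JPG -rotate 90 %NUM%.JPG
) ELSE (
    convert %NUM%.JPG -rotate 270 %NUM%.JPG
)
set /a count+=2
set PREVDIR=%~p1
goto :EOF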
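The rotation rule in the script (odd pages turned 90 degrees, even pages 270) can also be written as a small Python sketch. The helper name `rotate_command` is hypothetical, and it only builds the ImageMagick argument list; it does not run anything.

```python
def rotate_command(page_number):
    """Build an ImageMagick argument list for one page.

    Hypothetical helper. Odd pages are rotated 90 degrees and even pages 270,
    mirroring the two branches of the batch script.
    """
    name = f"{page_number:04d}.JPG"
    angle = 90 if page_number % 2 == 1 else 270
    # same argument order as the batch script: input, -rotate, angle, output
    return ["convert", name, "-rotate", str(angle), name]


if __name__ == "__main__":
    for page in (1, 2, 3):
        print(" ".join(rotate_command(page)))
```

Each list could be handed to `subprocess.run` once the printed commands look right.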
-
- Posts: 63
- Joined: 04 Mar 2014, 00:52
Re: Alternative Software Workflow
I am now trying to see if I can automate cropping. So far it does not look like I can. Maybe I can find a way to crop all the images in both the L and R folders in one pass, before renaming and rotating.
-
- Posts: 596
- Joined: 06 Jun 2009, 23:57
Re: Alternative Software Workflow
I think your way of doing the rename/merge is better than mine, because it doesn't require Perl. Old habits...
I wouldn't bother doing the rotates in pre-processing though, if you're using Scan Tailor. Scan Tailor will rotate the images faster than ImageMagick, with less wear and tear on your hard drive. Before I became aware of Scan Tailor, I was doing exactly what you're trying to do, using ImageMagick to rotate and crop. The "cropping" was hit and miss, because there was a bit of jitter in the scanning process, so I'd typically have to include a bit of slop in the dimensions to make sure I wasn't cropping content. For a while, I was using JPEGcrops to do the cropping without the slop, but now I'm doing all of my content-selection tweaking in Scan Tailor. Cropping as a pre-processing step doesn't really guarantee that Scan Tailor's content selection won't still need to be adjusted, and I'd rather do something once than twice.
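To make the "slop" idea concrete, here is a rough Python sketch. The function name and the pixel values are made up for illustration: it widens a nominal crop box by a fixed margin on every side so that scan-to-scan jitter doesn't cut into content, and clamps the result to the image bounds.

```python
def crop_box_with_slop(box, slop, image_size):
    """Grow a crop box by `slop` pixels on every side, clamped to the image.

    Hypothetical helper. box is (left, top, right, bottom) in pixels and
    image_size is (width, height).
    """
    left, top, right, bottom = box
    width, height = image_size
    return (
        max(0, left - slop),
        max(0, top - slop),
        min(width, right + slop),
        min(height, bottom + slop),
    )


if __name__ == "__main__":
    # made-up numbers: a 1000x1420 photo with content roughly at (100, 50)-(900, 1400)
    print(crop_box_with_slop((100, 50, 900, 1400), 40, (1000, 1420)))
```

The trade-off is the one described above: a bigger margin is safer against jitter but leaves more border for a later content-selection step to clean up.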
I am still using ImageMagick on the back end, to convert Scan Tailor's TIFFs to PDFs:
mogrify -format PDF *.TIFF
then pdftk to merge the individual pages into the final book:
pdftk 0*.pdf cat output finalbookname.pdf
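A wildcard like 0*.pdf only gives the right page order if the file names sort correctly as plain strings, which is what the four-digit zero padding is for. A quick Python sketch of the difference, assuming the wildcard is expanded in name order (the helper name is hypothetical):

```python
def page_order(names):
    """Return names in plain string-sort order (assumed wildcard expansion order)."""
    return sorted(names)


if __name__ == "__main__":
    padded = [f"{n:04d}.pdf" for n in (10, 2, 1)]
    unpadded = [f"{n}.pdf" for n in (10, 2, 1)]
    print(page_order(padded))    # zero-padded names come out in page order
    print(page_order(unpadded))  # unpadded names put page 10 before page 2
```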