Newbie with Spreadpi and Spreads Questions

Johannes Baiter's Spreads and SpreadPi are the latest control systems and postprocessors for DIY scanning. http://spreads.readthedocs.org

Moderator: peterZ

Leo
Posts: 6
Joined: 27 Sep 2014, 10:18
E-book readers owned: Kindle Fire
Number of books owned: 0
Country: USA

Newbie with Spreadpi and Spreads Questions

Post by Leo »

Greetings,
I've been following the forums off and on for a couple of years now. I purchased one of Daniel's scanner kits back in 2012. I had some spare cash at the time, but was far too optimistic about the time I could devote to the project. I finally got around to putting the kit together earlier this year and then started reading about Spreads and SpreadPi. I decided that is the route I wanted to take for image capture and post processing. I'm using two Canon A2200 cameras with CHDK 1.3 installed and a RaspberryPi B+.

I installed spreadpi this week and, after some assistance on IRC from duerig, was able to run a capture test. I feel I should mention that I encountered a lot of problems using the web interface. The basic navigation in the interface was fine, but if I attempted to use the "crop" feature or edit the "preferences" for a workflow, the pop-up (overlay) page presented problems. The overlay page was so oversized that I wasn't able to use the crop or edit preferences features at all on my Ubuntu laptop running Firefox 30 at 1280 x 768 resolution. I went to my Windows 7 desktop and had the same problem with Firefox and IE. Using ctrl - to reduce the page size would only reduce the size slightly. However, the Chrome browser worked quite well with the web interface. The Chrome browser on my Nexus 7 worked pretty well too, actually better than Firefox on the other two devices.

So, I was able to set the crop setting for a test scan using Chrome on my Win7 machine, then scanned a 228 page book in about 20 minutes. Spreadpi must have crashed about halfway through, but I removed power from the RPi and then powered it back on. Spreadpi started right back up and I was able to finish scanning the book. I then exported the workflow project to my laptop and took a look at the images. They look pretty good to me, but I don't have much experience to gauge the quality confidently.

I have several questions I'm hoping someone can answer in order to progress in my book scanning project. First, in regards to image capture. Below is the output of the yaml file included in the workflow manifest.

device:
  shutter_speed: 1/25
  upside_down: no
  flip_target_pages: no
  focus_distance:
  sensitivity: 80
  shoot_raw: no
  zoom_level: 3
  chdkptp_path: /usr/lib/chdkptp
  parallel_capture: yes
  monochrome: no
  focus_mode: autofocus_all
  whitebalance: Auto
  dpi: 300
plugins: [hidtrigger]

Do any of the settings need to be changed to improve the scan quality and if so, how is that accomplished? Also, are there particular CHDK settings that should be set up on the cameras prior to launching spreadpi?

I would like to use Spreads for the post processing, but I'm unclear about just what I need to do to get it set up. I installed python-pip on my Ubuntu laptop and then installed spreads using pip. I also installed scantailor. I then ran spread configure, but got an error telling me that there were missing plugins. I assumed that I needed to install chdkptp even though I don't plan to run image capture using Spreads on the laptop. After installing chdkptp-r650-Linux_x86_64 I ran spread configure again but got the same error.

~$ spread configure
There is a problem with your configuration file(s):
plugins not found

The documentation at http://spreads.readthedocs.org states that it is recommended to install Spreads in a virtualenv, but I neglected to do that, so now I'm wondering if I need to start over with the installation. Also, the Spreads documentation refers to python 2.7, but I thought I would install pdfbeads and noted that one of the dependencies for pdfbeads is python 3. I realize some of the Spreads documentation is out of date, so I thought I would check before installing a bunch of packages only to run into further problems. (Edit: I have now installed all the packages listed in the Spreads docs with the exception of rubygems and libhpricot-ruby, which were not available through apt-get. I did install ruby though. I also installed djvubind_1.2.1.deb with all its dependencies. I then ran:

$ virtualenv ~/.spreads
$ source ~/.spreads/bin/activate
$ pip install spreads
$ spread configure

I'm still getting the same error about missing plugins.)

~Leo

Re: Newbie with Spreadpi and Spreads Questions

Post by Leo »

Just wanted to update my original question. Some of my confusion about changing settings for the image capture was resolved after installing a more stable version of spreadpi. I also found out that by using the Chromium browser on my Ubuntu laptop I could actually use the crop feature as well as modify the capture settings.

I'm still stuck on how to get Spreads working for post-processing though.

~Leo
elwi
Posts: 11
Joined: 19 Jul 2014, 06:33
E-book readers owned: kindle
Number of books owned: 30
Country: Germany
Location: Heidelberg, Germany

Re: Newbie with Spreadpi and Spreads Questions

Post by elwi »

Hi!

See jbaiter's last comment on this page for a simple Ubuntu installation (it did not work 100% for me, as there were still some dependency issues):
https://github.com/DIYBookScanner/spreads/issues/126

Also, for an installation from scratch on Ubuntu, see:
https://github.com/OliPelz/spreads-ubuntu-trusty

hope this helps
jbaiter
Posts: 98
Joined: 17 Jun 2013, 16:42
E-book readers owned: 2
Number of books owned: 0
Country: Germany
Location: Munich, Germany

Re: Newbie with Spreadpi and Spreads Questions

Post by jbaiter »

I'm really sorry about the sizing issues. I've developed mostly with a 1680x1050 display, a Nexus 7 tablet and a Nexus 4 phone, so resolutions outside of these might be a bit wonky. Please file an issue on GitHub so I won't forget about it!

The spreads documentation is currently also woefully out of date, i.e. some things might not work exactly as described, some steps have become unnecessary, some dependencies are not listed... Any contributions to ameliorate this are welcome; it's been neglected lately, unfortunately. But the links posted by @elwi should walk you through the installation of the current development version without many problems.

The installation from pip currently does not work, please use the nightly builds instead: http://buildbot.diybookscanner.org/nightly.

It should be as easy as this:

Code: Select all

$ pip install cffi jpegtran-cffi
$ wget http://buildbot.diybookscanner.org/nightly/spreads-latest.tar.gz
$ tar xf spreads-latest.tar.gz
$ cd spreads-someversion
$ pip install .
$ pip install -e ".[web]"
Be warned though, that the postprocessing-server might still be a bit buggy, so please file an issue on GitHub if you encounter any problems :-)
spreads: Command-line workflow assistant

Re: Newbie with Spreadpi and Spreads Questions

Post by Leo »

Jbaiter & elwi, thanks for the suggestions on how to get this set up. I decided to do a clean Ubuntu 14.04 install on my laptop and use OliPelz's instructions at the link in elwi's post. Everything seemed to be going fine and the spreads GUI is up and running. Now, I'm still a bit lost on how to do the post processing with a scan job exported from SpreadsPi.

I unzipped a copy of a scan job I exported from SpreadsPi.
I ran the spread gui command to launch the GUI and selected the folder created by SpreadsPi and clicked Next. Nothing happened in the GUI, but in the terminal window there was a message telling me that no driver was found. When I initially ran spread configure, I left the driver set at none since I wasn't going to be using Spread for the image capture.

I ran spread configure again and selected chdkptp as the driver. This time when I ran spread gui, the GUI screen opened up in a larger window than before, with the bottom of the window below the bottom of my screen, so I could not see the Back, Cancel, Next buttons. Since the Spread GUI window cannot be resized, I just had to use the Tab key to navigate to the command options. After selecting the SpreadsPi folder that I want to process and selecting Next, again nothing happens. Checking the terminal window, I see this:
Traceback (most recent call last):
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreadsplug/gui/gui.py", line 235, in validatePage
    wizard.workflow.prepare_capture()
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/workflow.py", line 839, in prepare_capture
    if any(dev.target_page is None for dev in self.devices):
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/workflow.py", line 508, in devices
    self._devices = plugin.get_devices(self.config, force_reload=True)
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/plugin.py", line 477, in get_devices
    "Could not find any compatible devices!\n"
spreads.util.DeviceException: Could not find any compatible devices!
Make sure your devices are turned on and properly connected to the machine.
Traceback (most recent call last):
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreadsplug/gui/gui.py", line 235, in validatePage
    wizard.workflow.prepare_capture()
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/workflow.py", line 839, in prepare_capture
    if any(dev.target_page is None for dev in self.devices):
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/workflow.py", line 508, in devices
    self._devices = plugin.get_devices(self.config, force_reload=True)
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/plugin.py", line 477, in get_devices
    "Could not find any compatible devices!\n"
spreads.util.DeviceException: Could not find any compatible devices!
Make sure your devices are turned on and properly connected to the machine.
So, I'm much closer to the goal, but still need some help to move forward. I'm pretty confident that I don't have to connect my cameras to my laptop just to get Spreads to do the post-processing.
duerig
Posts: 388
Joined: 01 Jun 2014, 17:04
Number of books owned: 1000
Country: United States of America

Re: Newbie with Spreadpi and Spreads Questions

Post by duerig »

I haven't used the gui before, just the web interface for capture and the command line for postprocessing. You might try using the command line postprocess command instead of the GUI and see what happens.

Re: Newbie with Spreadpi and Spreads Questions

Post by jbaiter »

First of all, sorry for answering so late!
The GUI currently cannot be used for post-processing alone; it has to be used for the whole capture->process->output chain.
It is very easy to do post-processing on the command line, though: just run "spread postprocess --help" to see all available options, then run it with "spread postprocess [options] PROJECT_DIR".

Re: Newbie with Spreadpi and Spreads Questions

Post by Leo »

Johannes, thanks for the information about the GUI.
I ran a postprocess job as suggested by duerig using the output from a spreadPi workflow. I didn't specify any options for the job, but that ended up with all the .tiff files that scantailor created being rotated 90 degrees clockwise and the hOCR pages were just garbled characters. I assume the OCR was thrown off by the pages being rotated.

I guess I don't understand how to properly set the job options for the command line.
Could I see an example of how to start a postprocess job that doesn't rotate the pages and OCRs the output using tesseract?

~Leo

Re: Newbie with Spreadpi and Spreads Questions

Post by jbaiter »

Leo, can you confirm that the images were not rotated before and have valid EXIF orientation tags? You can also send me some sample images from your project and I can try to reproduce locally.

If this is the case, here is a minimal example configuration and a command-line that I used:

Code: Select all

plugins:
- autorotate
- scantailor
- tesseract
- gui
- web
- djvubind
- pdfbeads
core:
    verbose: no
    loglevel: info
    capture_keys: [' ', b]
    logfile: ~/.config/spreads/spreads.log
The command-line:

Code: Select all

spread --verbose postprocess --no-autopilot --language fra $PROJECT_DIR
This command will apply autorotation, ScanTailor postprocessing and tesseract OCR to the images. During the ScanTailor step, the ScanTailor GUI will pop up to allow you to make changes to the autogenerated settings. At the end you should have *_rotated.jpg, *.hocr and *.tif files in the data/done directory.
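
To check whether a run produced the files described above, a small script along these lines could help. It is only a sketch: the three suffix patterns and the data/done location are taken from the post, while the function names are made up for illustration and are not part of spreads.

```python
from pathlib import Path

# Patterns and the data/done location come from the post above.
# The helper names are hypothetical and not part of spreads.
EXPECTED_PATTERNS = ("*_rotated.jpg", "*.hocr", "*.tif")


def count_outputs(project_dir):
    """Return {pattern: number of matching files} for <project_dir>/data/done."""
    done_dir = Path(project_dir) / "data" / "done"
    return {pattern: len(list(done_dir.glob(pattern)))
            for pattern in EXPECTED_PATTERNS}


def missing_outputs(project_dir):
    """Return the patterns for which no file was found."""
    return [pattern for pattern, count in count_outputs(project_dir).items()
            if count == 0]
```

Calling missing_outputs("Documents/epic") would list the kinds of output that are absent, which narrows down which postprocessing step did not finish.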

Re: Newbie with Spreadpi and Spreads Questions

Post by Leo »

Johannes,
The jpg images in the data/raw folder are in the proper orientation. Looking at the EXIF data I see this line

Code: Select all

exif:Orientation: 8
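
For reference, an Orientation value of 8 means the stored pixels have to be turned 270 degrees clockwise to be shown upright. A tool that ignores the tag would therefore show the page rotated 90 degrees clockwise, which would be consistent with the rotated .tif files from the earlier run, although nothing in this thread confirms that this is the cause. A minimal sketch of the mapping, with hypothetical helper names and covering only the four non-mirrored values:

```python
# Clockwise rotation (degrees) a viewer applies to show the stored pixels
# upright, for the four non-mirrored EXIF Orientation values.
ROTATION_CW = {1: 0, 3: 180, 6: 90, 8: 270}


def display_rotation(orientation):
    """Return the clockwise rotation needed to display the image upright."""
    try:
        return ROTATION_CW[orientation]
    except KeyError:
        raise ValueError("mirrored or invalid orientation: %r" % (orientation,))


def stored_offset(orientation):
    """Clockwise offset of the raw pixels from upright when the tag is ignored."""
    return (360 - display_rotation(orientation)) % 360
```

Here stored_offset(8) evaluates to 90, i.e. the raw pixel data sits a quarter turn clockwise from the upright page.
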
I have attempted to run the postprocess again using the example you provided, but it errors out:

Code: Select all

(.spreads)leo@leo-ThinkPad:~$ spread --verbose postprocess --no-autopilot Documents/epic
Workflow: Initializing workflow Documents/epic
bagit: Adding path /home/leo/Documents/epic/bag-info.txt to payload
bagit: Adding path Documents/epic/config.yml to payload
Workflow: Starting postprocessing...%
Workflow: Running 'process' hooks
spreadsplug.autorotate: Rotating images
spreads encountered an error:
Traceback (most recent call last):
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/main.py", line 318, in main
    run()
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/main.py", line 308, in run
    args.subcommand(config)
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/cli.py", line 356, in postprocess
    workflow.process()
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/workflow.py", line 938, in process
    self._run_hook('process', self.pages, processed_path)
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/workflow.py", line 801, in _run_hook
    getattr(plug, hook_name)(*args)
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreadsplug/autorotate.py", line 154, in process
    in_path = page.get_latest_processed(image_only=True)
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/workflow.py", line 191, in get_latest_processed
    key=lambda p: p.stat().st_mtime, reverse=True)[0]
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/workflow.py", line 191, in <lambda>
    key=lambda p: p.stat().st_mtime, reverse=True)[0]
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/vendor/pathlib.py", line 1050, in stat
    return self._accessor.stat(self)
  File "/home/leo/.spreads/local/lib/python2.7/site-packages/spreads/vendor/pathlib.py", line 341, in wrapped
    return strfunc(str(pathobj), *args)
OSError: [Errno 2] No such file or directory: 'Documents/epic/data/done/000.tif'
I didn't have autorotate and djvubind configured previously. I now have those options selected, but I still get the above error.
~Leo
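
The traceback above ends in a sort keyed on p.stat().st_mtime, and stat() raises OSError as soon as one of the listed paths (here data/done/000.tif) is not on disk. The sketch below illustrates that failure mode with a defensive variant that skips missing files. It is not the spreads implementation, the helper name is hypothetical, and it does not explain why the workflow expects 000.tif in the first place.

```python
from pathlib import Path


def latest_existing(paths):
    """Return the most recently modified path that exists on disk, or None.

    Illustrative helper only, not spreads code: paths that are missing are
    skipped instead of being passed to stat(), which would raise OSError.
    """
    existing = [Path(p) for p in paths if Path(p).exists()]
    if not existing:
        return None
    return max(existing, key=lambda p: p.stat().st_mtime)
```

With a list that contains only a missing file, latest_existing returns None rather than raising, so the caller can fall back to the raw image.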