the future of ocr ?

Convert page images into searchable text. Talk about software, techniques, and new developments here.

Moderator: peterZ

StevePoling
Posts: 290
Joined: 20 Jun 2009, 12:19
E-book readers owned: SONY PRS-505, Kindle DX
Number of books owned: 9999
Location: Grand Rapids, MI

Re: the future of ocr ?

Post by StevePoling »

iwenttoofast wrote: OCR is a tough computer science problem. Think about it -- true on-the-fly OCR of cursive (no training necessary -- just like our eyes and brain work) remains one of the toughest unsolved problems out there.
I recall reading that one of the Apple engineers in charge of the Newton's cursive handwriting code committed suicide.

If memory serves, signature capture systems retain biometric information such as how hard one presses on the paper when writing. I think cursive handwriting recognition might benefit from the addition of such biometric info. Handwriting and speech both consist of a structured sequence of gestures, by the fingers or the vocal tract, that we cannot observe directly; what we do observe are the marks on paper or the sounds those gestures give rise to. (Hidden Markov models are well suited to this kind of problem.) Typeset or typewritten text has far more consistent observable features, so I'm not surprised handwriting recognition lags in performance.
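To make the hidden Markov model idea a bit more concrete, here is a toy sketch in Python of Viterbi decoding, the standard way to recover the most likely hidden sequence from what was observed. Everything in it is made up for illustration: the two hidden "gestures" (`down` and `up` pen strokes), the two observable marks (`thick` and `thin`, standing in for pressure), and all of the probabilities. It is not how any real recognizer is built, and real systems typically work in log probabilities to avoid underflow on long sequences.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely hidden state path, joint probability of that path)."""
    # best[t][s]: probability of the best path that ends in state s at step t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    # back[t][s]: the previous state on that best path
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] * trans_p[p][s])
            best[t][s] = best[t - 1][prev] * trans_p[prev][s] * emit_p[s][observations[t]]
            back[t][s] = prev
    # Pick the best final state, then walk the back-pointers to the start.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical model: hidden pen gestures and the marks they tend to leave.
STATES = ["down", "up"]
START_P = {"down": 0.6, "up": 0.4}
TRANS_P = {"down": {"down": 0.3, "up": 0.7},
           "up": {"down": 0.7, "up": 0.3}}
# Assumption: downstrokes press harder, so they mostly leave thick marks.
EMIT_P = {"down": {"thick": 0.8, "thin": 0.2},
          "up": {"thick": 0.1, "thin": 0.9}}

path, prob = viterbi(["thick", "thin", "thick"], STATES, START_P, TRANS_P, EMIT_P)
print(path)  # -> ['down', 'up', 'down']
```

The pressure reading is what lets the decoder tell the strokes apart here, which is the sort of help I'd expect biometric info to give; with typeset text the observations are consistent enough that much less inference of this kind is needed.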