L.Willms wrote: ↑28 Jun 2018, 01:53
zbgns wrote: ↑
Newer versions of Acrobat Pro join lines into paragraphs and generally try to preserve the layout of the original document (as far as I remember, Acrobat 11 Pro does). So usually no additional conversion of line breaks into spaces is necessary (apart from fixing typical OCR errors).
Why do you think that this is so hard to do?
Well, I do not find this hard. It is quite easy to join lines into paragraphs using regular expressions (wildcards) in MS Word or LibreOffice Writer. I guess you also had the PepitoCleaner extension in mind when you asked this question. The tricky part is lines that end with a full stop (or question mark) but are not the end of a paragraph. It is difficult to join such lines properly with this approach.
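To illustrate the idea outside a word processor, here is a minimal Python sketch of the same regex heuristic. It is only my assumption of how such a rule might look, not what Word, Writer or PepitoCleaner actually do; the function name `join_lines` is made up for this example. A line break is turned into a space unless the line ends with a full stop, question mark or exclamation mark, or the break is part of a blank line.

```python
import re

def join_lines(text: str) -> str:
    """Join hard-wrapped lines into paragraphs (illustrative heuristic only).

    A newline is replaced by a space unless:
      - the character before it is '.', '?' or '!' (assumed paragraph end), or
      - it is part of a blank line (two consecutive newlines).
    """
    return re.sub(r"(?<![.?!\n])\n(?!\n)", " ", text)


# Works as intended: the first paragraph is joined, the break after "here." stays.
sample = "This is a line\nthat continues here.\nNew paragraph starts\nand ends?"
joined = join_lines(sample)

# The tricky case: the line ends with a full stop that is NOT a paragraph end,
# so the heuristic wrongly keeps the line break after "p.m."
tricky = "He arrived at 5 p.m.\nand left soon after."
not_joined = join_lines(tricky)
```

The second example shows exactly the problem described above: a pattern that looks only at the last character of a line cannot tell an abbreviation or a mid-paragraph sentence end from a real paragraph end, so those lines have to be fixed by hand or with extra rules.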