Changing line spacing to correctly run OCR text recognition on double-spaced documents on the Mac

Head Owl

The line spacing setting in OwlOCR controls how the OCR output is divided into paragraphs.

In this example I have captured a paragraph of text from Microsoft Word. The text is formatted with double line spacing (line spacing 2.0) in Word. OwlOCR, on the other hand, defaults to a line spacing setting of 1.0. Because of this mismatch, OwlOCR treats almost every line as a new paragraph, which results in a less than desirable output.

By changing the line spacing setting in OwlOCR to 2.0 and then running the OCR processing again on the same input, the paragraph breaks are corrected to match the input.

If you are getting too few or too many paragraph breaks in your output, it can be a good idea to try changing the line spacing setting to something slightly smaller or larger, even if it does not exactly match what you have set in your input document.
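
To make the effect of the setting more concrete, here is a minimal Python sketch of how a line spacing value could drive paragraph detection. This is not OwlOCR's actual implementation, which is not described here. The `OcrLine` type, the `group_paragraphs` function, the `tolerance` factor and the sample coordinates are all hypothetical and only illustrate the general idea: a gap between lines that is larger than the expected spacing is treated as a paragraph break.

```python
from dataclasses import dataclass


@dataclass
class OcrLine:
    """One recognized line of text (hypothetical structure for illustration)."""
    text: str
    top: float     # distance of the line's top edge from the top of the page, in points
    height: float  # height of the line's text, in points


def group_paragraphs(lines, line_spacing=1.0, tolerance=1.5):
    """Join consecutive lines into paragraphs.

    A new paragraph starts when the vertical step between two lines is
    larger than the expected step (text height * line_spacing) multiplied
    by an assumed tolerance factor.
    """
    paragraphs = []
    current = []
    previous = None
    for line in lines:
        if previous is not None:
            step = line.top - previous.top
            expected = previous.height * line_spacing
            if step > expected * tolerance:
                # Gap is larger than expected: close the current paragraph.
                paragraphs.append(" ".join(current))
                current = []
        current.append(line.text)
        previous = line
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


# Made-up double-spaced sample: 12 pt text with 28 pt between lines,
# and a larger 56 pt step before the last line (a real paragraph break).
sample = [
    OcrLine("The quick brown fox", top=0, height=12),
    OcrLine("jumps over the lazy dog", top=28, height=12),
    OcrLine("and keeps running.", top=56, height=12),
    OcrLine("A new paragraph starts here.", top=112, height=12),
]

for spacing in (1.0, 2.0):
    result = group_paragraphs(sample, line_spacing=spacing)
    print(f"line_spacing={spacing}: {len(result)} paragraph(s)")
    for paragraph in result:
        print("   ", paragraph)
```

In this sketch, a line spacing of 1.0 splits every line of the double-spaced sample into its own paragraph, while 2.0 keeps the first three lines together and breaks only at the larger gap. It also shows why a value slightly above or below the document's real spacing can still help: it moves the threshold that decides what counts as a paragraph break.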

