Automatically correct skew in images containing text

I wanted to crop the same rectangle out of a few dozen scanned documents with ImageMagick, like this:

convert -crop 1600x1880+100+420 image.nrm.png cropped.png

Scanning often introduces a slight skew, so the same coordinates would land on a slightly different region in every image. Automatically correcting this skew appears to be a common preprocessing step for OCR. The Python toolset ocropy (https://github.com/tmbdev/ocropy), for example, contains a tool for this: ./ocropus-nlbin image.jpg creates the file image.nrm.png, which is optimized for OCR and has its skew corrected.
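For a few dozen scans, the two steps can be chained in a small script. The following is only a rough sketch: the helper names build_commands and process and the .cropped.png output name are made up for illustration, and it assumes that ocropus-nlbin writes image.nrm.png next to the input, as described above. Depending on the ocropy version, the output location may differ, so check where the normalized file actually ends up.

```python
import subprocess
from pathlib import Path

# Crop rectangle taken from the convert call shown earlier.
CROP_GEOMETRY = "1600x1880+100+420"


def build_commands(scan, geometry=CROP_GEOMETRY):
    """Return the two command lines for one scan: deskew, then crop.

    Hypothetical helper. Assumes ocropus-nlbin writes <name>.nrm.png
    next to the input file, as described in the text.
    """
    scan = Path(scan)
    normalized = scan.with_suffix(".nrm.png")   # image.jpg -> image.nrm.png
    cropped = scan.with_suffix(".cropped.png")  # illustrative output name
    return [
        ["./ocropus-nlbin", str(scan)],
        ["convert", "-crop", geometry, str(normalized), str(cropped)],
    ]


def process(scans, dry_run=True):
    """Print the commands for every scan, or run them if dry_run is False."""
    for scan in scans:
        for cmd in build_commands(scan):
            if dry_run:
                print(" ".join(cmd))
            else:
                # check=True stops the batch at the first failing command.
                subprocess.run(cmd, check=True)
```

Calling process(sorted(Path(".").glob("*.jpg"))) only prints the commands, which makes it easy to check the file names first; passing dry_run=False actually runs them.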
