Textract on 400-year-old print


I am using Textract to extract text from some 400-year-old books. Unfortunately, Textract cannot distinguish between "s" and "f" in the print. The two look very similar in printing from that era, where the long s ("ſ") closely resembles "f". However, there is a subtle difference. Can Textract be configured to distinguish between the two?

Fred
asked 2 years ago, 241 views
1 Answer

At the moment, Textract does not offer a custom training option for OCR. However, we did see success reading old newspapers when we applied segmentation, as described in this blog post: https://aws.amazon.com/blogs/machine-learning/improve-newspaper-digitalization-efficacy-with-a-generic-document-segmentation-tool-using-amazon-textract/
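
Because the OCR model itself cannot be retrained, one possible workaround is to post-correct the text that Textract returns. The sketch below is only an illustration, not something the blog post describes: it takes the `WORD` blocks from a response shaped like Textract's `DetectDocumentText` output, and for any word that contains "f" and is not in a word list, it tries "s" in place of each "f" and accepts the change only when exactly one known word results. The word list, the sample response, and all function names are hypothetical; a real run would need a vocabulary suited to the spelling of the period.

```python
import string
from itertools import product

# Hypothetical stand-in vocabulary; a real run would load a period-appropriate word list.
KNOWN_WORDS = {"the", "first", "master", "himself", "wife", "wise", "of", "staff"}


def words_from_response(response):
    """Collect the text of every WORD block from a Textract-style response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "WORD"
    ]


def candidates(word):
    """Yield every spelling obtained by swapping any subset of lowercase 'f' for 's'.

    The number of candidates doubles with each 'f', which is fine for ordinary words.
    """
    positions = [i for i, ch in enumerate(word) if ch == "f"]
    for choice in product("fs", repeat=len(positions)):
        chars = list(word)
        for pos, ch in zip(positions, choice):
            chars[pos] = ch
        yield "".join(chars)


def correct_word(word, vocabulary=KNOWN_WORDS):
    """Replace a misread long s only when exactly one known word results."""
    core = word.strip(string.punctuation)  # ignore surrounding punctuation
    if "f" not in core or core.lower() in vocabulary:
        return word  # nothing to fix, or already a known word
    matches = [c for c in candidates(core) if c.lower() in vocabulary]
    if len(matches) == 1:
        return word.replace(core, matches[0], 1)
    return word  # ambiguous or unknown: leave the OCR output untouched


# Hypothetical response fragment with the same keys Textract uses for blocks.
SAMPLE_RESPONSE = {
    "Blocks": [
        {"BlockType": "LINE", "Text": "The firft mafter himfelf, wife"},
        {"BlockType": "WORD", "Text": "The"},
        {"BlockType": "WORD", "Text": "firft"},
        {"BlockType": "WORD", "Text": "mafter"},
        {"BlockType": "WORD", "Text": "himfelf,"},
        {"BlockType": "WORD", "Text": "wife"},
    ]
}

corrected = [correct_word(w) for w in words_from_response(SAMPLE_RESPONSE)]
```

Note the limitation: "wife" is left unchanged because it is itself a valid word, so a dictionary check alone cannot tell whether the printer meant "wife" or "wise". Cases like that would need surrounding context or manual review.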

Hope this is useful.

AWS
answered a year ago
