Textract does not recognise simple table


I am using "aws textract analyze-document" to OCR the attached table. However, neither text nor numbers are recognised. Other packages (like Adobe Acrobat Pro) are able to OCR the image. Can somebody help out? This is an example of an image that Textract did not recognise at all.

  • Hi, can you fix your question so that we can see your image? Currently, it doesn't show.

asked 2 months ago, 78 views
1 Answer

There are a few things to consider when using AWS Textract for document analysis:

Image Quality: Textract works best with high-quality, clear images. If the image is blurry, low-resolution, or has other issues, Textract may have trouble accurately recognizing the text and content.

Document Structure: Textract is generally better at extracting text from simpler documents, like forms or invoices. Tables and complex layouts can be more challenging for Textract to analyze accurately.

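Textract Configuration: "analyze-document" requires the "FeatureTypes" parameter ("--feature-types" in the CLI), which selects what Textract analyzes beyond plain text. For a table, include "TABLES"; the other accepted values are "FORMS", "QUERIES", "SIGNATURES", and "LAYOUT". Detected text is always returned as LINE and WORD blocks and does not need a feature type of its own.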
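As a rough illustration of that configuration, here is a minimal sketch in Python, assuming boto3 is installed and AWS credentials are already configured. The helper names build_request and analyze_table_image are hypothetical, made up for this example. Only boto3.client("textract") and its analyze_document call with the Document and FeatureTypes arguments come from the AWS SDK.

```python
# Feature types accepted by the AnalyzeDocument operation.
VALID_FEATURE_TYPES = {"TABLES", "FORMS", "QUERIES", "SIGNATURES", "LAYOUT"}


def build_request(image_bytes, feature_types=("TABLES",)):
    """Build keyword arguments for analyze_document (hypothetical helper)."""
    unknown = set(feature_types) - VALID_FEATURE_TYPES
    if unknown:
        raise ValueError(f"Unsupported feature types: {sorted(unknown)}")
    return {
        "Document": {"Bytes": image_bytes},
        "FeatureTypes": list(feature_types),
    }


def analyze_table_image(path, region_name=None):
    """Send a local image to Textract and return the raw response."""
    import boto3  # imported lazily so build_request works without boto3

    with open(path, "rb") as handle:
        request = build_request(handle.read())
    client = boto3.client("textract", region_name=region_name)
    return client.analyze_document(**request)
```

Calling analyze_table_image("table.png") (the file name is a placeholder) returns a dictionary whose "Blocks" list holds everything Textract detected. If that list contains no LINE or WORD blocks at all, the problem is more likely the image itself than the feature types.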

Post-processing: Even if Textract is able to extract the text and data, you may need to do some additional post-processing to clean up the output and structure the data in a way that's useful for your application.
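The post-processing step can be sketched as follows, assuming the response came from a call with "TABLES" in FeatureTypes. The function name tables_from_blocks is hypothetical. It relies only on the documented block fields (Id, BlockType, Relationships, RowIndex, ColumnIndex, Text) and, as a simplification, ignores merged cells and selection elements such as checkboxes.

```python
def tables_from_blocks(blocks):
    """Rebuild each TABLE block as a list of rows of cell text."""
    by_id = {block["Id"]: block for block in blocks}

    def child_ids(block):
        # Collect the Ids listed under CHILD relationships, if any.
        ids = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "CHILD":
                ids.extend(rel["Ids"])
        return ids

    def cell_text(cell):
        # Join the WORD children of a cell into one string.
        words = []
        for cid in child_ids(cell):
            child = by_id.get(cid)
            if child and child["BlockType"] == "WORD":
                words.append(child["Text"])
        return " ".join(words)

    tables = []
    for block in blocks:
        if block["BlockType"] != "TABLE":
            continue
        cells = [
            by_id[cid]
            for cid in child_ids(block)
            if cid in by_id and by_id[cid]["BlockType"] == "CELL"
        ]
        if not cells:
            continue
        n_rows = max(cell["RowIndex"] for cell in cells)
        n_cols = max(cell["ColumnIndex"] for cell in cells)
        grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
        for cell in cells:
            # RowIndex and ColumnIndex are 1-based in the response.
            grid[cell["RowIndex"] - 1][cell["ColumnIndex"] - 1] = cell_text(cell)
        tables.append(grid)
    return tables
```

Passing response["Blocks"] to this function gives one grid per detected table. An empty result means Textract returned no TABLE blocks for the image.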
answered 2 months ago
