Persistent Text Detection Issue in Documents

Question

In several documents that we have attempted to process using Amazon's OCR feature, we have found that the system does not correctly detect text in some areas, despite it being clearly legible. We have conducted numerous tests and have exhausted multiple approaches to resolve this issue. We have even compared the results with other OCR services, such as RapidOCR, which do not exhibit this problem.

The measures we have taken so far include:

* Sending the document in PDF format 
* Try sending a high-resolution image.
* Resizing the image to improve detection.
* Adjusting contrast levels to optimize text readability.
* Try cut/remove the white border for only focus on text for sugestion othere similas issues

However, regardless of these efforts, the problem persists, particularly at the top of the documents.

Attached to this message, you will find a sample of the original document and a screenshot illustrating the text that was not detected correctly.

We appreciate your attention and action on this matter.

![Enter image description here](/media/postImages/original/IMUsZsX6_FRCa-vTzx3zYgZw)

![Enter image description here](/media/postImages/original/IMEKIuVeLRSmqaZXwo9zNm1g)

Answer

Hi thank you for using textract. We are sorry that you're facing facing regarding accuracy of detection. Would you be able to share document through some medium so we can help furhter?

Persistent Text Detection Issue in Documents

관련 콘텐츠