While AWS doesn't have any dedicated tools for watermark removal or advanced document quality enhancement, there are ways to improve the documents you feed into Textract. First, make sure you are using high-quality images (a resolution of at least 150 DPI). Second, use a supported format: PDF, TIFF, JPEG, or PNG. Third, try to avoid converting or downsampling the documents before uploading them to Amazon Textract.
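If it helps, the format and resolution checks can be run as a quick pre-upload step. This is only a rough sketch: `preflight` is a hypothetical helper, not part of any AWS SDK, and it assumes you have already read the DPI value from the file yourself (for example with an imaging library).

```python
from pathlib import Path
from typing import List

# Formats and minimum resolution taken from the advice above.
SUPPORTED_EXTENSIONS = {".pdf", ".tif", ".tiff", ".jpg", ".jpeg", ".png"}
MIN_DPI = 150


def preflight(filename: str, dpi: float) -> List[str]:
    """Return a list of problems found; an empty list means the file looks OK.

    Hypothetical helper: `dpi` must be supplied by the caller, since this
    sketch does not open or inspect the file itself.
    """
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if dpi < MIN_DPI:
        problems.append(f"resolution {dpi} DPI is below {MIN_DPI} DPI")
    return problems
```

Calling `preflight("scan.png", 300)` returns an empty list, while a 72 DPI `.bmp` file would be flagged on both counts.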
For extracting text from tables in these documents, make sure the tables are clearly separated from the surrounding content and that the text in them is upright rather than rotated. You may also run into challenges with merged cells that span multiple columns, or with tables that have inconsistent cell structures. In these cases, consider using Textract's text detection as a workaround. For more information, check out the best practices for queries in Textract.
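To make the workaround a bit more concrete, here is a minimal sketch of reading tables out of a Textract response and falling back to plain lines of text when the table structure is unreliable. It assumes `response` is the dict returned by boto3's `analyze_document` (with `FeatureTypes=["TABLES"]`) or `detect_document_text`. The helper names `tables_from_response` and `lines_from_response` are my own, and merged cells are handled naively (text is placed at each cell's row and column index only).

```python
from typing import Dict, List


def _cell_text(cell: Dict, by_id: Dict[str, Dict]) -> str:
    """Join the WORD children of a CELL block into one string."""
    words = []
    for rel in cell.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def tables_from_response(response: Dict) -> List[List[List[str]]]:
    """Return each TABLE block as a grid (list of rows) of cell strings."""
    blocks = response.get("Blocks", [])
    by_id = {b["Id"]: b for b in blocks}
    tables = []
    for table in blocks:
        if table["BlockType"] != "TABLE":
            continue
        # Collect the CELL blocks that belong to this table.
        cells = [
            by_id[i]
            for rel in table.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for i in rel["Ids"]
            if by_id[i]["BlockType"] == "CELL"
        ]
        if not cells:
            continue
        n_rows = max(c["RowIndex"] for c in cells)
        n_cols = max(c["ColumnIndex"] for c in cells)
        grid = [[""] * n_cols for _ in range(n_rows)]
        for c in cells:
            # RowIndex and ColumnIndex are 1-based in Textract responses.
            grid[c["RowIndex"] - 1][c["ColumnIndex"] - 1] = _cell_text(c, by_id)
        tables.append(grid)
    return tables


def lines_from_response(response: Dict) -> List[str]:
    """Fallback: return the text of every LINE block, ignoring table structure."""
    return [b["Text"] for b in response.get("Blocks", []) if b["BlockType"] == "LINE"]
```

If `tables_from_response` comes back empty or the grid looks wrong for a given page, `lines_from_response` at least gives you the raw text to post-process yourself.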
Thanks for the help