Questions tagged with Amazon Textract

Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents.

344 results
I use Textract to read tables that have been filled in with handwriting. In general it works great, but there is a recurring issue of Textract not recognizing '1' or interpreting it as a column separa...
1 answer, 0 votes, 263 views, asked 7 months ago
Hi, I have trained a custom query adapter in Textract and I get as far as Trying the Adapter, but when I upload an image to try the adapter I just get a spinning confidence marker on the image and a...
1 answer, 0 votes, 218 views, asked 7 months ago
I am using Textract's asynchronous functions StartDocumentAnalysis and GetDocumentAnalysis to detect signatures on a document using the AWS SDK for Python. The JSON data I receive is correct from GetDocumentA...
1 answer, 0 votes, 207 views, asked 8 months ago
I was trying to extract the invoice number from a PDF file (using Amazon Textract - Analyze Expense). I uploaded the PDF file and ran the analysis, but it returned the error UnsupportedDocumentException. Then I con...
2 answers, 0 votes, 411 views, asked 8 months ago
In several documents that we have attempted to process using Amazon's OCR feature, we have found that the system does not correctly detect text in some areas, despite it being clearly legible. We have...
1 answer, 0 votes, 212 views, asked 8 months ago
Hello all, I am using Textract async StartDocumentAnalysis and GetDocumentAnalysis for detecting signatures. However, when I test the code with a PDF document the job status of GetDocumentAnalysis is...
1 answer, 0 votes, 200 views, asked 8 months ago
I currently have two Lambda functions: one to specify the queries and start the document analysis. The second function is triggered by an SNS topic and retrieves the document analysis. The problem is t...
1 answer, 0 votes, 236 views, asked 8 months ago
I'm trying to use Textract to extract the product descriptions from our PDF catalogs in page order. The Textract analysis picks up the descriptions as text blocks, but how do I go about training Textr...
1 answer, 0 votes, 223 views, asked 8 months ago
I am extracting data from documents that include tables and other text that is not in table format (the documents do not include figures). I would like to separate table data from non-table data beca...
1 answer, 0 votes, 494 views, asked 8 months ago
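One possible way to approach the table versus non-table split described in the question above, offered as a minimal sketch rather than a confirmed answer: it assumes the documented Textract block layout, in which TABLE blocks list CELL blocks as CHILD relationships, and CELL and LINE blocks list WORD blocks as CHILD relationships. The helper names `child_ids` and `split_table_and_free_text` are hypothetical, and treating a line as table text when any of its words sits in a table cell is an assumption you may want to change.

```python
from typing import Dict, List, Set, Tuple


def child_ids(block: dict) -> List[str]:
    """Return the Ids listed under a block's CHILD relationships."""
    ids: List[str] = []
    for rel in block.get("Relationships") or []:
        if rel.get("Type") == "CHILD":
            ids.extend(rel.get("Ids", []))
    return ids


def split_table_and_free_text(blocks: List[dict]) -> Tuple[List[str], List[str]]:
    """Split LINE texts into (inside a table, outside any table).

    `blocks` is the "Blocks" list of a Textract analysis response.
    """
    by_id: Dict[str, dict] = {b["Id"]: b for b in blocks}

    # Collect the Ids of every WORD that belongs to a table cell.
    table_word_ids: Set[str] = set()
    for table in (b for b in blocks if b["BlockType"] == "TABLE"):
        for cell_id in child_ids(table):
            cell = by_id.get(cell_id)
            if cell is not None:
                table_word_ids.update(child_ids(cell))

    table_lines: List[str] = []
    free_lines: List[str] = []
    for line in (b for b in blocks if b["BlockType"] == "LINE"):
        # Assumption: a line is table text if any of its words is in a cell.
        in_table = any(w in table_word_ids for w in child_ids(line))
        (table_lines if in_table else free_lines).append(line.get("Text", ""))
    return table_lines, free_lines
```

To use it, pass the combined `Blocks` list from an analysis run with the TABLES feature enabled. For asynchronous jobs, gather the blocks from every page of results first, because a table and its cells can only be matched by Id when both are present in the list.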
Background: I am using the Textract AnalyzeDocument API to detect Layout response objects in a PDF page. The page has Page Headers, Title, Sub-headers, tables, figures, and text. The page is divided into...
1 answer, 0 votes, 301 views, asked 8 months ago
Is there any way to use Analyze Expense when the receipt or bill is split into multiple images? I have tried combining the images into a single image but this didn't work as expected. I was getting du...
1 answer, 0 votes, 214 views, asked 8 months ago
Hi, can I know what code is behind pulling the text of the document in the demo console? https://us-west-2.console.aws.amazon.com/textract/home?region=us-west-2#/demo The "rawText.txt" in the zip ...
1 answer, 0 votes, 289 views, asked 9 months ago