Questions tagged with Amazon Textract
Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents.
329 results
I am very new to Textract, so apologies if this is a basic question. I have a document that mixes tables and text throughout, and I want to extract both at once. However, I find that in the "layout"...
Using Textract for a table of contents where each line has **TITLE . . . . Author PageNo.**
The resulting table merges Title and Author into one column, ignoring the dot leader, and page numbers has 2nd...
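One possible workaround for the dot-leader layout described above is to skip table detection and split each detected line yourself. The sketch below assumes the response comes from text detection (blocks with `BlockType` of `LINE` and a `Text` field), that the leader is a run of at least three dots, and that the page number is written in digits. The names `TOC_LINE`, `split_toc_line`, and `toc_rows` are hypothetical helpers, not part of any AWS SDK.

```python
import re

# Assumed line shape: "<title> . . . . <author> <page>".
# The leader is three or more dots, optionally separated by spaces.
TOC_LINE = re.compile(
    r"^(?P<title>.+?)\s*(?:\.\s*){3,}(?P<author>.+?)\s+(?P<page>\d+)\s*$"
)


def split_toc_line(text):
    """Return (title, author, page) for one line, or None if it does not match."""
    match = TOC_LINE.match(text)
    if match is None:
        return None
    return match.group("title"), match.group("author"), int(match.group("page"))


def toc_rows(response):
    """Collect parsed rows from the LINE blocks of a Textract-style response dict."""
    rows = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue
        row = split_toc_line(block.get("Text", ""))
        if row is not None:
            rows.append(row)
    return rows
```

Lines that do not contain a dot leader are skipped rather than guessed at, so headings and page furniture do not end up in the result.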
I have set up a user in the IAM Identity Center console that is assigned to a group that I'd like to only have access to a few select S3 buckets and the AWS Textract service. I've created a group with...
I have been using AWS Textract to scan forms and invoices. Previously, I trained the adapter after auto-labelling it and reviewing annotations.
I wanted to prepare my own data components using already...
Hi,
I have a multi-page PDF document which I can process fine and extract key-value pairs from in the Amazon Textract web interface. However, when I try to extract key-value pairs in my Python code, it returns...
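For reference, key-value pairs in a forms analysis response are spread across linked blocks: a `KEY_VALUE_SET` block with entity type `KEY` points to its `VALUE` block, and both point to `WORD` children that hold the text. The following is a minimal sketch of walking those links, assuming `blocks` is the combined `Blocks` list for every page of the document. The helper names `_text_of` and `key_value_pairs` are hypothetical.

```python
def _text_of(block, by_id):
    """Join the text of a block's CHILD words (or selection status for checkboxes)."""
    parts = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                parts.append(child["Text"])
            elif child["BlockType"] == "SELECTION_ELEMENT":
                parts.append(child["SelectionStatus"])
    return " ".join(parts)


def key_value_pairs(blocks):
    """Map each KEY block's text to the text of its linked VALUE block(s)."""
    by_id = {b["Id"]: b for b in blocks}
    pairs = {}
    for block in blocks:
        if block["BlockType"] != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        values = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                values.extend(_text_of(by_id[v], by_id) for v in rel["Ids"])
        # Note: a repeated key label overwrites the earlier entry in this sketch.
        pairs[_text_of(block, by_id)] = " ".join(values)
    return pairs
```

If only part of the block list is passed in (for example, one page of a paginated result), lookups in `by_id` can fail or pairs can be missing, so it may be worth checking that all pages were collected first.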
Hey guys, just wondering if it is possible to train Textract to return meaningful results. I am trying to use Textract to read some handwritten forms, but sometimes it gives me results that are...
We are using the AWS Textract Analyze Document API to extract data from enrollment forms in PDF and JPEG format. We observed that for PDF forms filled in online, Textract is giving incorrect...
Hello there,
I am working with serverless-offline, so I run my project with sls offline, all good with that. Here is my serverless.yml:
```
service: ${env:APP_SERVICE_NAME}
useDotenv:...
```
Hello,
My code breaks every time I attempt to analyze a PDF with more than one page. It displays the following error:
```
UnsupportedDocumentException: Request has unsupported document format
...
```
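A common cause of this error, assuming the code calls the synchronous analysis operation, is that synchronous calls accept only single-page documents, while multi-page PDFs go through the asynchronous operations with the file stored in S3. The sketch below shows that flow using the boto3 Textract client methods `start_document_analysis` and `get_document_analysis`. The function name `analyze_multipage`, the polling interval, and the bucket and key arguments are illustrative assumptions, and simple polling is used here in place of an SNS notification.

```python
import time


def analyze_multipage(client, bucket, key, feature_types=("FORMS",), poll_seconds=5):
    """Run asynchronous analysis on an S3 document and return all result blocks.

    `client` is expected to be a boto3 Textract client, e.g. boto3.client("textract").
    """
    job = client.start_document_analysis(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=list(feature_types),
    )
    job_id = job["JobId"]

    # Poll until the job leaves the IN_PROGRESS state.
    while True:
        result = client.get_document_analysis(JobId=job_id)
        if result["JobStatus"] != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)

    if result["JobStatus"] != "SUCCEEDED":
        raise RuntimeError(f"Textract job {job_id} ended with {result['JobStatus']}")

    # Results are paginated; follow NextToken until it is absent.
    blocks = list(result.get("Blocks", []))
    while "NextToken" in result:
        result = client.get_document_analysis(
            JobId=job_id, NextToken=result["NextToken"]
        )
        blocks.extend(result.get("Blocks", []))
    return blocks
```

Passing the client in as an argument keeps the function easy to exercise with a stub, and the caller stays in control of region and credentials.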
Hi everyone, this is my first question, so I'm sorry if I broke any rules 😅.
I am trying to extract some text from a PDF file, and when the script runs I get this error message with the queries method analyze...
We use a custom tag (cost-center) to better understand our AWS expenses. Is there a way to add a tag to a Textract detect_document_text() call? The AWS Cost Explorer included Textract as a service...
I have a Textract/A2I process set up and it works as expected. However, I need to change the workflow and am looking for suggestions.
Context: we are using Textract/A2I to process historical...