There are 2 ways you can integrate Amazon Textract with Amazon A2I for human review:
- The standard instructions in the Amazon Textract developer guide use the pre-built task template for Textract K-V review, with direct integration (specifying the human loop directly in the AnalyzeDocument call).
- A2I also supports custom task templates, which let you customize the review UI and controls. With these, the human loop is started by an explicit API request.
As you saw, the pre-built/direct UI integration is currently focussed on K-V pairs, and I believe it doesn't support reviewing Queries results.
What I would recommend is to use a custom integration (2) instead of the direct/built-in one (1), which will allow you to customize the UI and/or the data structure that the UI receives. The general flow would be:
- Call Amazon Textract without HumanLoopConfig.
- When the result is ready (either synchronous API response, or async SNS callback gets triggered), use a Lambda function or similar to transform the Textract JSON first and then start a human review.
- Listen to the S3 output location for your human loop, to detect the upload of a result object and resume the process flow.
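As a rough illustration of the middle step, here is a minimal Python sketch that pairs each QUERY block with its QUERY_RESULT and starts a human loop through the A2I runtime API. It assumes `a2i_client` is `boto3.client("sagemaker-a2i-runtime")` and that you already have a flow definition ARN. The function names and the `documentUri` / `queries` input fields are hypothetical: your custom task template would need to be written to read whatever structure you choose.

```python
import json
import uuid


def extract_query_answers(textract_response):
    """Pair each QUERY block with the first QUERY_RESULT block it points at."""
    blocks_by_id = {b["Id"]: b for b in textract_response.get("Blocks", [])}
    pairs = []
    for block in blocks_by_id.values():
        if block.get("BlockType") != "QUERY":
            continue
        # QUERY blocks link to their answers via an ANSWER relationship.
        answer_ids = [
            block_id
            for rel in block.get("Relationships", [])
            if rel.get("Type") == "ANSWER"
            for block_id in rel.get("Ids", [])
        ]
        answers = [blocks_by_id[i] for i in answer_ids if i in blocks_by_id]
        top = answers[0] if answers else {}  # unanswered queries get an empty answer
        pairs.append({
            "question": block["Query"]["Text"],
            "alias": block["Query"].get("Alias"),
            "answer": top.get("Text", ""),
            "confidence": top.get("Confidence"),
        })
    return pairs


def start_query_review(a2i_client, flow_definition_arn, document_s3_uri, textract_response):
    """Start a human loop whose input is the simplified query/answer list.

    a2i_client is assumed to be boto3.client("sagemaker-a2i-runtime").
    """
    loop_name = f"queries-review-{uuid.uuid4()}"
    input_content = {
        "documentUri": document_s3_uri,  # hypothetical field name
        "queries": extract_query_answers(textract_response),  # hypothetical field name
    }
    a2i_client.start_human_loop(
        HumanLoopName=loop_name,
        FlowDefinitionArn=flow_definition_arn,
        HumanLoopInput={"InputContent": json.dumps(input_content)},
    )
    return loop_name
```

Passing the client in as a parameter keeps the transform easy to test locally without AWS credentials. In the task template, the fields would then be available under `task.input`.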
If you'd like to re-use the existing UI template, you could use the pre-A2I Lambda to transform the Amazon Textract payload before forwarding it to the A2I service, perhaps by editing the JSON to turn the query response blocks into KEY_VALUE_SET blocks that the existing template can render. Alternatively, you could create a new task template using Liquid HTML (which supports embedded JavaScript).
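I haven't tested this against the built-in template, but the block-rewriting idea could look something like the sketch below. It follows the general shape of Textract form output (a KEY block with VALUE and CHILD relationships, a VALUE block with a CHILD relationship, and WORD blocks carrying the text). The function name is hypothetical, and reusing the answer's geometry, page and confidence for the synthetic key is an assumption, since QUERY blocks have no location on the page.

```python
import copy
import uuid


def queries_to_key_value_sets(textract_response):
    """Return a copy of the response with each answered QUERY also
    expressed as synthetic KEY_VALUE_SET and WORD blocks."""
    result = copy.deepcopy(textract_response)
    blocks = result.get("Blocks", [])
    by_id = {b["Id"]: b for b in blocks}
    synthetic = []

    for query in [b for b in blocks if b.get("BlockType") == "QUERY"]:
        answer_ids = [
            i
            for rel in query.get("Relationships", [])
            if rel.get("Type") == "ANSWER"
            for i in rel.get("Ids", [])
        ]
        if not answer_ids or answer_ids[0] not in by_id:
            continue  # nothing to review for unanswered queries
        answer = by_id[answer_ids[0]]

        # Assumption: borrow the answer's location and confidence for every synthetic block.
        shared = {k: answer[k] for k in ("Geometry", "Page", "Confidence") if k in answer}
        label = query["Query"].get("Alias") or query["Query"]["Text"]

        key_word = {"BlockType": "WORD", "Id": str(uuid.uuid4()), "Text": label, **shared}
        value_word = {"BlockType": "WORD", "Id": str(uuid.uuid4()),
                      "Text": answer.get("Text", ""), **shared}
        value_block = {
            "BlockType": "KEY_VALUE_SET",
            "Id": str(uuid.uuid4()),
            "EntityTypes": ["VALUE"],
            "Relationships": [{"Type": "CHILD", "Ids": [value_word["Id"]]}],
            **shared,
        }
        key_block = {
            "BlockType": "KEY_VALUE_SET",
            "Id": str(uuid.uuid4()),
            "EntityTypes": ["KEY"],
            "Relationships": [
                {"Type": "VALUE", "Ids": [value_block["Id"]]},
                {"Type": "CHILD", "Ids": [key_word["Id"]]},
            ],
            **shared,
        }
        synthetic.extend([key_block, value_block, key_word, value_word])

    blocks.extend(synthetic)
    return result
```

The original QUERY and QUERY_RESULT blocks are left in place here. Whether the existing template tolerates them, or needs the synthetic WORD blocks linked from a PAGE or LINE block as well, is something you'd need to check by experiment.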
I don't have an example for queries in particular, but would recommend referring to:
- If you select "Create a task template" in the A2I Console, you can select "Textract-Form extraction" task type from the drop-down to get an insight into the source code of the standard built-in template.
- The official sample repositories for A2I task templates and SageMaker Ground Truth task templates (which share essentially the same front-end system of Liquid HTML templating). There's also a collection of A2I example Python notebooks.
- The Amazon-Textract-Transformer-Pipeline sample is quite complex, but includes a custom PDF document A2I template written with VueJS as part of a serverless, Step Functions-orchestrated flow. There's also an older template which uses just a single HTML file instead of relying on the JavaScript/NPM build toolchain.
- The SageMaker-GroundTruth-Custom-Angular-Template sample demonstrates using a custom template written with Angular, which should be easily portable to A2I.
Thanks Alex for the great references. I will get to work trying them out!
Hi Nunz - I have a similar requirement. Were you able to set up your Human Review Workflows using QUERY_RESULT?