Amazon Textract Queries: trouble when querying for an empty value


Hello everyone,

I am trying to extract data from financial documents using the Queries feature of Amazon Textract. The documents I need to analyse have the format below: [image of a sample document]

I query for a value that can be empty (e.g. "What is the value of AA?"). When the value is empty, the answer returned for my query is the value of another label that is not empty (e.g. the value of AF, "12345", instead of ""). Does anybody know how I can get the correct value, or at least how I can tell that a value is empty?

Thank you in advance for your help.

Cherrygolo.

Asked 1 year ago, 301 views
1 answer

Textract Queries may not be the best fit here given the structure of the document: you will end up with inconsistent results. Here are the results of my tests so far with Queries:

  • What is the value of AB? --> <empty>
  • What is the value of AF? --> 12345
  • What is the value of AA? --> Capital souscrit non appelé
  • What is the value of Capital souscrit non appelé? --> LLES
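To see which queries came back empty, one option is to walk the QUERY blocks in the response and follow their ANSWER relationships to the QUERY_RESULT blocks. The sketch below assumes the standard AnalyzeDocument response shape; the function name `query_answers`, the 90.0 threshold and the hand-built `sample_blocks` are illustrative only, not values returned by Textract for this document. Note that a confidence threshold is only a heuristic: it will not catch a wrong answer that Textract returns with high confidence.

```python
from typing import Dict, List, Optional


def query_answers(blocks: List[dict], min_confidence: float = 90.0) -> Dict[str, Optional[str]]:
    """Map each query alias (or query text) to its best answer, or None if empty.

    A QUERY block with no ANSWER relationship, or whose answers all fall
    below min_confidence (an arbitrary, assumed threshold), is reported as None.
    """
    by_id = {b["Id"]: b for b in blocks}
    answers: Dict[str, Optional[str]] = {}
    for block in blocks:
        if block.get("BlockType") != "QUERY":
            continue
        query = block.get("Query", {})
        key = query.get("Alias") or query.get("Text", "")
        best = None
        for rel in block.get("Relationships", []):
            if rel.get("Type") != "ANSWER":
                continue
            for rid in rel.get("Ids", []):
                result = by_id.get(rid)
                if result is None or result.get("BlockType") != "QUERY_RESULT":
                    continue
                conf = result.get("Confidence", 0.0)
                if conf < min_confidence:
                    continue
                if best is None or conf > best.get("Confidence", 0.0):
                    best = result
        answers[key] = best.get("Text") if best else None
    return answers


# Hand-built stand-in for response["Blocks"]; not real Textract output.
sample_blocks = [
    {"Id": "q1", "BlockType": "QUERY",
     "Query": {"Text": "What is the value of AA?", "Alias": "AA"}},
    {"Id": "q2", "BlockType": "QUERY",
     "Query": {"Text": "What is the value of AF?", "Alias": "AF"},
     "Relationships": [{"Type": "ANSWER", "Ids": ["r2"]}]},
    {"Id": "r2", "BlockType": "QUERY_RESULT", "Text": "12345", "Confidence": 98.0},
    {"Id": "q3", "BlockType": "QUERY",
     "Query": {"Text": "What is the value of AB?", "Alias": "AB"},
     "Relationships": [{"Type": "ANSWER", "Ids": ["r3"]}]},
    {"Id": "r3", "BlockType": "QUERY_RESULT", "Text": "LLES", "Confidence": 41.0},
]

if __name__ == "__main__":
    for alias, text in query_answers(sample_blocks).items():
        print(alias, "->", "<empty>" if text is None else text)
```

With a real document, `blocks` would be `response["Blocks"]` from a boto3 `analyze_document` call made with `FeatureTypes=["QUERIES"]` and a `QueriesConfig` listing the queries.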

One approach could be to use the Tables feature together with its merged cell detection, which identifies cells that are merged horizontally or vertically. The screenshot below shows what I was able to get when testing the sample in the demo console with Tables.

Demo results with Tables feature

Please see the blog post below for an example of how to use the merged cell construct in the AnalyzeDocument API response: https://aws.amazon.com/blogs/machine-learning/merge-cells-and-column-headers-in-amazon-textract-tables/
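As a rough illustration of the Tables approach, the sketch below rebuilds a table grid from the CELL blocks, copies the text of each MERGED_CELL onto every cell it covers, and then reads the cell to the right of a given code. It assumes that the value sits in the column immediately after the code (a guess about the layout, not something confirmed for this document); the helper names and the toy `sample` blocks are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Grid = Dict[Tuple[int, int], str]


def _child_ids(block: dict, rel_type: str) -> List[str]:
    """Collect the Ids of all relationships of the given type."""
    return [i for r in block.get("Relationships", [])
            if r.get("Type") == rel_type for i in r.get("Ids", [])]


def cell_text(cell: dict, by_id: Dict[str, dict]) -> str:
    """Join the WORD children of a CELL; an empty cell yields ''."""
    words = [by_id[i]["Text"] for i in _child_ids(cell, "CHILD")
             if by_id.get(i, {}).get("BlockType") == "WORD"]
    return " ".join(words)


def table_grid(table: dict, by_id: Dict[str, dict]) -> Grid:
    """Return {(row, column): text}, with merged cell text propagated."""
    grid: Grid = {}
    for cid in _child_ids(table, "CHILD"):
        cell = by_id[cid]
        if cell.get("BlockType") == "CELL":
            grid[(cell["RowIndex"], cell["ColumnIndex"])] = cell_text(cell, by_id)
    for mid in _child_ids(table, "MERGED_CELL"):
        children = [by_id[i] for i in _child_ids(by_id[mid], "CHILD")]
        texts = [cell_text(c, by_id) for c in children]
        merged = " ".join(t for t in texts if t)
        for c in children:
            grid[(c["RowIndex"], c["ColumnIndex"])] = merged
    return grid


def value_for_code(grid: Grid, code: str) -> Optional[str]:
    """Text of the cell right of `code` ('' if empty), None if code is absent."""
    for (row, col), text in grid.items():
        if text.strip() == code:
            return grid.get((row, col + 1), "")
    return None


def _cell(cid: str, row: int, col: int, words: Tuple[str, ...] = ()) -> dict:
    block = {"Id": cid, "BlockType": "CELL", "RowIndex": row, "ColumnIndex": col}
    if words:
        block["Relationships"] = [{"Type": "CHILD", "Ids": list(words)}]
    return block


# Toy 2x3 table (not real Textract output): column 1 is merged over both rows.
sample = [
    {"Id": "w1", "BlockType": "WORD", "Text": "Section"},
    {"Id": "w2", "BlockType": "WORD", "Text": "AA"},
    {"Id": "w3", "BlockType": "WORD", "Text": "AF"},
    {"Id": "w4", "BlockType": "WORD", "Text": "12345"},
    _cell("c11", 1, 1, ("w1",)), _cell("c12", 1, 2, ("w2",)), _cell("c13", 1, 3),
    _cell("c21", 2, 1), _cell("c22", 2, 2, ("w3",)), _cell("c23", 2, 3, ("w4",)),
    {"Id": "m1", "BlockType": "MERGED_CELL", "RowIndex": 1, "ColumnIndex": 1,
     "RowSpan": 2, "ColumnSpan": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["c11", "c21"]}]},
    {"Id": "t1", "BlockType": "TABLE", "Relationships": [
        {"Type": "CHILD", "Ids": ["c11", "c12", "c13", "c21", "c22", "c23"]},
        {"Type": "MERGED_CELL", "Ids": ["m1"]}]},
]
sample_by_id = {b["Id"]: b for b in sample}
sample_grid = table_grid(sample_by_id["t1"], sample_by_id)

if __name__ == "__main__":
    print(repr(value_for_code(sample_grid, "AA")))
    print(repr(value_for_code(sample_grid, "AF")))
```

Because every cell of the table appears in the grid, an empty value shows up as an empty string rather than being replaced by a neighbouring value, and a code that is not found at all is reported separately as `None`.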

AWS
NZ
Answered 1 year ago
