Hello, I would like to know whether anything has changed in the way Textract returns answers to queries.
Specifically:
If I ask "What is the title of this doc?" and restrict the query to page 1, I get an answer with text and coordinates.
However, for 'interpreted' answers, e.g. "What are the standards of this doc?" with the same lookup on page 1, the geometry comes back as None:
query is TBlock(geometry=None, id='d1a1bac6-8c00-4b8b-91ef-72ff7d3398d9', block_type='QUERY', relationships=[TRelationship(type='ANSWER', ids=['d3c0611d-a7ba-48ed-9d4a-031e64a3d4f3'])], confidence=None, text=None, column_index=None, column_span=None, entity_types=None, page=1,
row_index=None, row_span=None, selection_status=None, text_type=None, custom=None, query=TQuery(text='what are the standards of the certified weight?', alias='tc_certified_shipping_standards'))
rels is TRelationship(type='ANSWER', ids=['d3c0611d-a7ba-48ed-9d4a-031e64a3d4f3'])
[TBlock(geometry=None, id='d3c0611d-a7ba-48ed-9d4a-031e64a3d4f3', block_type='QUERY_RESULT', relationships=None, confidence=43.0, text='GRS, GRS', column_index=None, column_span=None, entity_types=None, page=1, row_index=None, row_span=None, selection_status=None, text_type=None, custom=None, query=None)]
I have quite a big chunk of code that depends on the coordinates, and for 5 months straight I had no issue. I also checked by pinning the other Textract-related libraries to their old versions and by testing on old git branches.
So, is this a new way for Textract to answer queries?
Please and thank you!
Were you seeing a bounding box on interpreted answers previously with the same document?
To be honest, I inherited a small piece of code, grew it from there, and didn't have to look into this part because it was running smoothly. So I assume the geometry was there before, since the app didn't crash at this step.
I use the polygon coordinates. Here is what I get from Textract without polygon or geometry, which is where it now fails: TBlock(geometry=None, id='6e5deb40-4c90-47e7-b99d-933ac8c73231', block_type='QUERY_RESULT', relationships=None, confidence=43.0, text='GRS, GRS', column_index=None, column_span=None, entity_types=None, page=1, row_index=None, row_span=None, selection_status=None, text_type=None, custom=None, query=None), TBlock(geometry=None, id='d84596f0-3e59-4279-b907-f8f39a3b49dd', block_type='QUERY', relationships=[TRelationship(type='ANSWER', ids=['6e5deb40-4c90-47e7-b99d-933ac8c73231'])], confidence=None, text=None, column_index=None, column_span=None, entity_types=None, page=1, row_index=None, row_span=None, selection_status=None, text_type=None, custom=None, query=TQuery(text='what are the standards of the certified weight?', alias='tc_certified_shipping_standards')),
This result has no TPoints with coordinates. Maybe this helps.
Answer from Textract with coordinates: TBlock(geometry=TGeometry(bounding_box=TBoundingBox(width=0.061864323914051056, height=0.010403391905128956, left=0.5233812928199768, top=0.3567923903465271), polygon=[TPoint(x=0.5233926177024841, y=0.3567923903465271), TPoint(x=0.5852456092834473, y=0.35685184597969055), TPoint(x=0.585235059261322, y=0.3671957850456238), TPoint(x=0.5233812928199768, y=0.367136150598526)]), id='d7fe92f2-c1d0-4298-857a-77cfd5d95c8e', block_type='QUERY_RESULT', relationships=None, confidence=94.0, text='803.28 kg', column_index=None, column_span=None, entity_types=None, page=1, row_index=None, row_span=None, selection_status=None, text_type=None, custom=None, query=None), TBlock(geometry=None, id='2aaa4f9f-6f0e-4ba0-8973-6a4462ca9bce', block_type='QUERY', relationships=[TRelationship(type='ANSWER', ids=['d7fe92f2-c1d0-4298-857a-77cfd5d95c8e'])], confidence=None, text=None, column_index=None, column_span=None, entity_types=None, page=1, row_index=None, row_span=None, selection_status=None, text_type=None, custom=None, query=TQuery(text='what is the net shipping weight?', alias='tc_net_shipping_weight')),
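Whatever the cause turns out to be, it may be worth guarding the coordinate-dependent code so that a QUERY_RESULT without geometry does not crash it. Below is a minimal sketch of that idea. It assumes the raw Textract JSON response (a list of block dicts with "BlockType", "Relationships", "Geometry" and so on) rather than the TBlock objects shown above, and the function name query_answers is made up for illustration. The sample values are copied from the two dumps in this thread.

```python
def query_answers(blocks):
    """Yield (alias, text, confidence, polygon) for every query answer.

    polygon is a list of (x, y) tuples, or None when the QUERY_RESULT
    block carries no geometry (as in the 'GRS, GRS' answer above).
    """
    by_id = {block["Id"]: block for block in blocks}
    for block in blocks:
        if block.get("BlockType") != "QUERY":
            continue
        alias = (block.get("Query") or {}).get("Alias")
        for rel in block.get("Relationships") or []:
            if rel.get("Type") != "ANSWER":
                continue
            for answer_id in rel.get("Ids") or []:
                answer = by_id.get(answer_id)
                if answer is None:
                    continue
                geometry = answer.get("Geometry")
                polygon = None
                # Only build the polygon when geometry is actually present.
                if geometry and geometry.get("Polygon"):
                    polygon = [(p["X"], p["Y"]) for p in geometry["Polygon"]]
                yield alias, answer.get("Text"), answer.get("Confidence"), polygon


# Sample blocks mirroring the two answers pasted in this thread.
sample_blocks = [
    {
        "Id": "d84596f0-3e59-4279-b907-f8f39a3b49dd",
        "BlockType": "QUERY",
        "Relationships": [
            {"Type": "ANSWER", "Ids": ["6e5deb40-4c90-47e7-b99d-933ac8c73231"]}
        ],
        "Query": {
            "Text": "what are the standards of the certified weight?",
            "Alias": "tc_certified_shipping_standards",
        },
    },
    {
        "Id": "6e5deb40-4c90-47e7-b99d-933ac8c73231",
        "BlockType": "QUERY_RESULT",
        "Confidence": 43.0,
        "Text": "GRS, GRS",
        # No "Geometry" key: this is the failing case.
    },
    {
        "Id": "2aaa4f9f-6f0e-4ba0-8973-6a4462ca9bce",
        "BlockType": "QUERY",
        "Relationships": [
            {"Type": "ANSWER", "Ids": ["d7fe92f2-c1d0-4298-857a-77cfd5d95c8e"]}
        ],
        "Query": {
            "Text": "what is the net shipping weight?",
            "Alias": "tc_net_shipping_weight",
        },
    },
    {
        "Id": "d7fe92f2-c1d0-4298-857a-77cfd5d95c8e",
        "BlockType": "QUERY_RESULT",
        "Confidence": 94.0,
        "Text": "803.28 kg",
        "Geometry": {
            "Polygon": [
                {"X": 0.5233926177024841, "Y": 0.3567923903465271},
                {"X": 0.5852456092834473, "Y": 0.35685184597969055},
                {"X": 0.585235059261322, "Y": 0.3671957850456238},
                {"X": 0.5233812928199768, "Y": 0.367136150598526},
            ]
        },
    },
]

for alias, text, confidence, polygon in query_answers(sample_blocks):
    if polygon is None:
        print(f"{alias}: {text!r} (confidence {confidence}), no coordinates")
    else:
        print(f"{alias}: {text!r} (confidence {confidence}), {len(polygon)} points")
```

With a guard like this, the downstream code can decide per answer whether to skip the coordinate step or fall back to something else, instead of failing on the first answer that has no polygon.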