Hello,
A bounding box (BoundingBox) has the following properties:
- Height – The height of the bounding box as a ratio of the overall document page height.
- Left – The X coordinate of the top-left point of the bounding box as a ratio of the overall document page width.
- Top – The Y coordinate of the top-left point of the bounding box as a ratio of the overall document page height.
- Width – The width of the bounding box as a ratio of the overall document page width.
Each BoundingBox property has a value between 0 and 1. The value is a ratio of the overall image width (applies to Left and Width) or height (applies to Height and Top). For example, if the input image is 700 x 200 pixels, and the top-left coordinate of the bounding box is (350,50) pixels, the API returns a Left value of 0.5 (350/700) and a Top value of 0.25 (50/200).
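As a rough illustration of the arithmetic above, here is a minimal Python sketch that scales a BoundingBox back to pixels. The function name `bbox_to_pixels` is made up for this example, and the Width and Height values in the sample are invented; only the Left and Top values come from the 700 x 200 example.

```python
def bbox_to_pixels(bbox, image_width, image_height):
    """Scale a normalized BoundingBox dict to pixel units.

    Left and Width are ratios of the image width; Top and Height are
    ratios of the image height. Returns (left, top, width, height).
    """
    left = bbox["Left"] * image_width
    top = bbox["Top"] * image_height
    width = bbox["Width"] * image_width
    height = bbox["Height"] * image_height
    return left, top, width, height


# Left and Top match the 700 x 200 example; Width and Height are invented.
sample_bbox = {"Left": 0.5, "Top": 0.25, "Width": 0.25, "Height": 0.5}
print(bbox_to_pixels(sample_bbox, 700, 200))
```

For the sample values, the first two numbers come back as 350 and 50 pixels, which are the top-left coordinates the example started from.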
The polygon returned by AnalyzeDocument is an array of Point objects. Each Point has an X and Y coordinate for a specific location on the document page. Like the BoundingBox coordinates, the polygon coordinates are normalized to the document width and height, and are between 0 and 1.
You can use points in the polygon array to display a finer-grain bounding box around a Block object. You calculate the position of each polygon point on the document page by using the same technique used for BoundingBoxes. Multiply the X coordinate by the document page width, and multiply the Y coordinate by the document page height.
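The same multiplication can be sketched for polygon points. This is only an illustration: `polygon_to_pixels` is a hypothetical helper name, and the sample polygon values are invented rather than taken from a real response.

```python
def polygon_to_pixels(polygon, page_width, page_height):
    """Convert normalized polygon points to (x, y) pixel tuples.

    Each point is a dict with "X" and "Y" keys holding values between 0 and 1.
    """
    return [(point["X"] * page_width, point["Y"] * page_height) for point in polygon]


# Invented four-point polygon on a 700 x 200 pixel page.
sample_polygon = [
    {"X": 0.5, "Y": 0.25},
    {"X": 0.75, "Y": 0.25},
    {"X": 0.75, "Y": 0.75},
    {"X": 0.5, "Y": 0.75},
]
print(polygon_to_pixels(sample_polygon, 700, 200))
```

The resulting list of pixel tuples can be passed to most drawing libraries as the outline of the block.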
We have scripts that show how to use these exact values for geometry and polygon:
This document [1] shows how to calculate the bounding box, and this document [2] shows how to use the polygon values.
Please follow this guide [3] for a full working example: it uploads an image for text detection, uses Python code to draw bounding boxes around the detected text, and then displays the image in a browser.
References:
[1]. https://docs.aws.amazon.com/textract/latest/dg/text-location.html#bounding-box
[2]. https://docs.aws.amazon.com/textract/latest/dg/text-location.html#polygon
[3]. https://docs.aws.amazon.com/textract/latest/dg/detecting-document-text.html
Hi, you will need to compute the bounding box coordinates for the selection of words you are annotating, using the coordinates of the corresponding OCR words.
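One way to do this, as a sketch only, is to take the smallest box that encloses all the selected words. The helper name `merge_bounding_boxes` is hypothetical, and it assumes each word's box is a dict with the normalized Left, Top, Width and Height keys.

```python
def merge_bounding_boxes(boxes):
    """Return the smallest normalized box that encloses every box in `boxes`.

    Each box is a dict with "Left", "Top", "Width" and "Height" ratios.
    """
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one bounding box is required")
    left = min(box["Left"] for box in boxes)
    top = min(box["Top"] for box in boxes)
    right = max(box["Left"] + box["Width"] for box in boxes)
    bottom = max(box["Top"] + box["Height"] for box in boxes)
    return {"Left": left, "Top": top, "Width": right - left, "Height": bottom - top}


# Invented boxes for two neighbouring words on the same line.
selected_words = [
    {"Left": 0.125, "Top": 0.25, "Width": 0.125, "Height": 0.0625},
    {"Left": 0.3125, "Top": 0.25, "Width": 0.25, "Height": 0.125},
]
print(merge_bounding_boxes(selected_words))
```

The merged box stays normalized, so it can be scaled to pixels the same way as any single word's box. An axis-aligned union like this is a simplification and may be too loose for rotated or skewed text.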