Bedrock versus Textract for document text/meaning extraction

0

I'm looking to start a project focused on text/meaning extraction from semi-structured to unstructured documents. There are specific categories of data that I'm looking for, but they may be presented in different formats in the document.

Which would be the right technology to use for such a use case? Would it be Bedrock or Textract? What is the difference between these two services in the context of document text/meaning extraction? What types of use cases are better on Textract versus Bedrock?

Thank you for the help!

Scott

Scott
gefragt vor 7 Monaten887 Aufrufe
2 Antworten
0
Akzeptierte Antwort

Textract is trained on a wide variety of documents and is great for extracting the data. Amazon Bedrock Large Language Models (LLMs) are great at both classification with few shot learning and summarisation tasks. Depending on your usecase textract will have low latency, relaitively lower costs (are you dealing with 100's or millions of documents), and offer integrations with other AWS servies such as Amazon Comprehend.

I would recommend a hybrid approach using textract to extract the data and then either comprehend or Bedrock LLMs for classification, LLMs for summarisation, and LLMs for any Q&A type tasks.

The blog post Enhancing AWS intelligent document processing with generative AI offers methods to include generative AI LLMs in an intelligent document processing pipeline.

Also review the following workshop, Intelligent Document Processing with AWS AI Services , and the companion github Intelligent Document Processing with AWS AI Services

AWS
EXPERTE
beantwortet vor 7 Monaten
0

Hi Scott,

I would like to share with you some of my understanding and thoughts on Textract and Bedrock. First of all, I believe that Textract and Bedrock are not contradictory, in fact, they can work well together. We can use Textract to recognize unstructured data and then use Bedrock to organize and process the data in a structured way. I believe the following two examples will be helpful to you:

  1. serverless-pdf-chat: This sample application allows you to ask natural language questions about any PDF document you upload.
  2. Amazon Bedrock: This blog introduces how to use Amazon Bedrock to accelerate accurate extraction in OCR scenes, and it is also a good reference resource.

I hope this information is helpful to you!

Best regards, Keith

AWS
keithyu
beantwortet vor 6 Monaten

Du bist nicht angemeldet. Anmelden um eine Antwort zu veröffentlichen.

Eine gute Antwort beantwortet die Frage klar, gibt konstruktives Feedback und fördert die berufliche Weiterentwicklung des Fragenstellers.

Richtlinien für die Beantwortung von Fragen