Bedrock versus Textract for document text/meaning extraction

0

I'm looking to start a project focused on text/meaning extraction from semi-structured to unstructured documents. There are specific categories of data that I'm looking for, but they may be presented in different formats in the document.

Which would be the right technology to use for such a use case? Would it be Bedrock or Textract? What is the difference between these two services in the context of document text/meaning extraction? What types of use cases are better on Textract versus Bedrock?

Thank you for the help!

Scott

Scott
질문됨 7달 전881회 조회
2개 답변
0
수락된 답변

Textract is trained on a wide variety of documents and is great for extracting the data. Amazon Bedrock Large Language Models (LLMs) are great at both classification with few shot learning and summarisation tasks. Depending on your usecase textract will have low latency, relaitively lower costs (are you dealing with 100's or millions of documents), and offer integrations with other AWS servies such as Amazon Comprehend.

I would recommend a hybrid approach using textract to extract the data and then either comprehend or Bedrock LLMs for classification, LLMs for summarisation, and LLMs for any Q&A type tasks.

The blog post Enhancing AWS intelligent document processing with generative AI offers methods to include generative AI LLMs in an intelligent document processing pipeline.

Also review the following workshop, Intelligent Document Processing with AWS AI Services , and the companion github Intelligent Document Processing with AWS AI Services

AWS
전문가
답변함 7달 전
0

Hi Scott,

I would like to share with you some of my understanding and thoughts on Textract and Bedrock. First of all, I believe that Textract and Bedrock are not contradictory, in fact, they can work well together. We can use Textract to recognize unstructured data and then use Bedrock to organize and process the data in a structured way. I believe the following two examples will be helpful to you:

  1. serverless-pdf-chat: This sample application allows you to ask natural language questions about any PDF document you upload.
  2. Amazon Bedrock: This blog introduces how to use Amazon Bedrock to accelerate accurate extraction in OCR scenes, and it is also a good reference resource.

I hope this information is helpful to you!

Best regards, Keith

AWS
keithyu
답변함 6달 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠