"Failed to create the Amazon Opensearch Serverless collection. Failed to fetch" when creating bedrock knowledge base with web crawler


Data source -> web crawler

Regex include pattern -> none

Content chunking and parsing -> default

Embeddings model -> Embed English v3

Vector database -> quick create a new vector store

Policies attached to the user:

  1. AmazonBedrockFullAccess

  2. Custom policies:

A. { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "aoss:CreateAccessPolicy", "aoss:CreateSecurityPolicy", "aoss:CreateCollection", "aoss:ListCollections", "aoss:BatchGetCollection", "aoss:UpdateCollection", "aoss:DeleteCollection", "aoss:ListAccessPolicies", "aoss:ListSecurityPolicies", "aoss:ListTagsForResource", "aoss:UpdateAccessPolicy", "aoss:GetSecurityPolicy", "aoss:UpdateSecurityPolicy", "iam:ListUsers", "iam:ListRoles" ], "Resource": "*" } ] }

B.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:ListFoundationModels",
        "bedrock:ListCustomModels"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": [
        "arn:aws:bedrock:ap-southeast-1::foundation-model/amazon.titan-embed-text-v1",
        "arn:aws:bedrock:ap-southeast-1::foundation-model/cohere.embed-english-v3",
        "arn:aws:bedrock:ap-southeast-1::foundation-model/cohere.embed-multilingual-v3"
      ]
    }
  ]
}

C. { "Version": "2012-10-17", "Statement": [ { "Sid": "MarketplaceBedrock", "Effect": "Allow", "Action": [ "aws-marketplace:ViewSubscriptions", "aws-marketplace:Unsubscribe", "aws-marketplace:Subscribe" ], "Resource": "*" } ] }

Please, I badly need help from an expert. I'm stuck on this one and don't know what to do anymore. What am I missing with the policies?

The console shows 'Preparing vector database in Amazon OpenSearch Serverless. This process may take several minutes to complete.' and then fails with: 'Failed to create the Amazon OpenSearch Serverless collection. Failed to fetch'

But when checking the collections in the OpenSearch Service console, the collection is actually there; the only problem is that the knowledge base creation does not succeed. I'm just following this video -> https://www.youtube.com/watch?v=oSnFZhHuIgg
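For reference, the collection status can also be confirmed outside the console. This is only a minimal sketch with boto3, assuming the same ap-southeast-1 region as the ARNs above:

import boto3

# Assumes credentials for the same IAM user and region used in the Bedrock console.
aoss = boto3.client("opensearchserverless", region_name="ap-southeast-1")

# Each collection summary carries a status; a collection stuck in CREATING
# (rather than ACTIVE) might explain why the knowledge base step never completes.
for summary in aoss.list_collections()["collectionSummaries"]:
    print(summary["name"], summary["status"])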

1 Answer

Hi,

To simplify your debugging, I suggest starting with a Bedrock KB that uses S3 content as its first data source (a bucket with 2-3 manually created files will be enough). That will allow you to configure all the Bedrock KB parameters properly. Then you can add the web crawler as a second, direct data source.
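If you want to script that first S3-based setup instead of clicking through the console, the rough shape with boto3 is sketched below. The knowledge base is assumed to already exist (created via the console), and the KB ID, data source name and bucket ARN are placeholders, not values from your account:

import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="ap-southeast-1")

# Placeholder values -- replace with the IDs/ARNs from your own account.
KB_ID = "KBXXXXXXXX"                           # knowledge base created in the console
BUCKET_ARN = "arn:aws:s3:::my-kb-test-bucket"  # bucket holding the 2-3 test files

# 1) Attach the S3 bucket as a data source of the knowledge base.
ds = bedrock_agent.create_data_source(
    knowledgeBaseId=KB_ID,
    name="s3-test-source",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": BUCKET_ARN},
    },
)

# 2) Start a sync (ingestion) job so the test files get parsed, chunked and embedded.
job = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=KB_ID,
    dataSourceId=ds["dataSource"]["dataSourceId"],
)
print(job["ingestionJob"]["status"])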

You can even use your S3 bucket as the place where the crawler stores the crawled content, which is then loaded into the KB via a recurring sync job. Storing the content in S3 will make your observability much better: if the KB sync job complains about content to be parsed and vectorized, it will be much simpler to analyze.
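Once the S3 source syncs cleanly, the web crawler can be added as a second data source on the same knowledge base. Again just a sketch; the seed URL and crawler scope below are illustrative assumptions:

import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="ap-southeast-1")

KB_ID = "KBXXXXXXXX"   # same placeholder knowledge base ID as above

# Add a web crawler data source alongside the S3 one. With no inclusion
# filters (as in your setup), the crawler stays on the seed URL's host.
crawler_ds = bedrock_agent.create_data_source(
    knowledgeBaseId=KB_ID,
    name="web-crawler-source",
    dataSourceConfiguration={
        "type": "WEB",
        "webConfiguration": {
            "sourceConfiguration": {
                "urlConfiguration": {
                    "seedUrls": [{"url": "https://example.com"}]  # illustrative seed URL
                }
            },
            "crawlerConfiguration": {"scope": "HOST_ONLY"},
        },
    },
)
print(crawler_ds["dataSource"]["status"])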

A step-by-step description of KB creation with S3 is here: https://medium.com/@saikatm.courses/implementing-rag-app-using-knowledge-base-from-amazon-bedrock-and-streamlit-e52f8300f01d

Best,

Didier

