
Bedrock Knowledgebase Provisioning Issue with OpenSearch Serverless Collection as the Vector Store


We are working on ingesting websites into an OpenSearch Serverless Collection, with a requirement to provision the infrastructure at runtime and destroy it once vector generation is complete. We are using boto3 APIs for provisioning. Below is an outline of our process:

Process Overview:

  1. When a website ingestion job starts, it provisions the following resources via boto3:
    • A VECTOR-SEARCH OpenSearch Serverless collection.
    • A vector index within that collection.
    • A Bedrock Knowledgebase with storage configuration pointing to the above OpenSearch collection.
    • A new data source in the knowledgebase, with the incoming URL attached for ingestion.
  2. Once the infrastructure is provisioned, the job initiates the Bedrock Knowledgebase Sync.
    • After the sync completes, the vectors are migrated from the OpenSearch Serverless collection to our internal PostgreSQL database.
  3. Finally, the job tears down the infrastructure.
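
The provision, sync, and tear-down flow above can be sketched so that teardown always runs, even when a later step fails. This is only an illustration: `run_ingestion_job`, `provision_steps`, and `sync_and_migrate` are hypothetical names, and the create/destroy callables stand in for the real boto3 calls.

```python
from contextlib import ExitStack


def run_ingestion_job(provision_steps, sync_and_migrate):
    """Provision resources in order, run the job, then tear down in reverse.

    provision_steps: list of (name, create, destroy) tuples. `create` receives
    the dict of resources created so far and returns the new resource;
    `destroy` receives that resource.
    """
    with ExitStack() as stack:
        resources = {}
        for name, create, destroy in provision_steps:
            resources[name] = create(resources)
            # Registered callbacks run in reverse order when the block exits,
            # whether it exits normally or through an exception.
            stack.callback(destroy, resources[name])
        return sync_and_migrate(resources)
```

Because each destroy callback is registered immediately after its resource is created, a failure partway through provisioning still cleans up whatever already exists.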

Issue:

We are encountering an error during the Bedrock Knowledgebase provisioning step. The OpenSearch Serverless collection and its index are created successfully, but the create_knowledge_base API call fails with the following error:

An error occurred (ValidationException) when calling the CreateKnowledgeBase operation: The knowledge base storage configuration provided is invalid... no such index [index-name]

It appears that the Bedrock service cannot locate the newly created index. We added a 60-second delay before creating the knowledge base so that the index would be available, but that did not help. We then increased the delay to 10 minutes, still with no luck.

We have used the OpenSearch CAT API to list the created indices, and while the index appears in the list, its health status is empty. We are unsure if this is related to the issue.

Request for Help:

We need assistance in identifying why Bedrock is unable to find the newly created index. Is this related to index propagation within OpenSearch, or are we missing something in the settings? Below are the settings we use for the relevant boto3 client API calls, as well as the OpenSearch dashboard settings, for better context.


Settings:

We have the default 10 OCUs for indexing and 10 OCUs for search set in our OpenSearch Serverless dashboard.


Boto3 Opensearch-serverless client

create_collection(name='some-name', description='some-desc', type='VECTORSEARCH', standbyReplicas='DISABLED')
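
Before creating the index, it can help to confirm that the collection itself has finished provisioning. A minimal sketch, assuming `aoss_client` is a boto3 `opensearchserverless` client; the helper name and its timeout defaults are illustrative and not from the original post.

```python
import time


def wait_for_collection_active(aoss_client, name, timeout=900.0,
                               poll_interval=10.0, sleep=time.sleep,
                               clock=time.monotonic):
    """Poll batch_get_collection until the collection reports ACTIVE.

    Returns the collection detail dict (which includes the endpoint).
    `sleep` and `clock` are injectable so the loop can be tested offline.
    """
    deadline = clock() + timeout
    while True:
        response = aoss_client.batch_get_collection(names=[name])
        details = response.get("collectionDetails", [])
        status = details[0]["status"] if details else None
        if status == "ACTIVE":
            return details[0]
        if status == "FAILED":
            raise RuntimeError(f"collection {name!r} failed to provision")
        if clock() >= deadline:
            raise TimeoutError(f"collection {name!r} not ACTIVE after {timeout}s")
        sleep(poll_interval)
```

Note that this only covers the collection. An ACTIVE collection does not by itself guarantee that an index created afterwards is already visible to Bedrock.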

Opensearch python client for index creation

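    # Imports used by the helper methods below (an excerpt from a class)
    import boto3
    from opensearchpy import OpenSearch, RequestsHttpConnection
    from requests_aws4auth import AWS4Auth

    def __load_aws_auth(self):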
        service = "aoss"
        credentials = boto3.Session().get_credentials()
        awsauth = AWS4Auth(
            credentials.access_key,
            credentials.secret_key,
            self.region,
            service,
            session_token=credentials.token,
        )

        return awsauth

    def get_opensearch_client(self, collection_endpoint):
        client = OpenSearch(
            hosts=[{"host": collection_endpoint, "port": 443}],
            http_auth=self.__load_aws_auth(),
            use_ssl=True,
            verify_certs=True,
            connection_class=RequestsHttpConnection,
            timeout=300,
        )
        return client

# CREATE VECTOR INDEX
client = get_opensearch_client(collection_endpoint)

index_body = {
    "settings": {
        "index": {
            "knn": True,
            "knn.algo_param.ef_search": 512
        }
    },
    "mappings": {
        "properties": {
            "bedrock-knowledge-base-vector": {
                "type": "knn_vector",
                "dimension": 1024,
                "method": {
                    "engine": "faiss",
                    "name": "hnsw",
                    "parameters": {},
                    "space_type": "l2"
                }
            },
            "AMAZON_BEDROCK_METADATA": {
                "type": "text",
                "index": "false"
            },
            "AMAZON_BEDROCK_TEXT_CHUNK": {
                "type": "text",
                "index": "true"
            }
        }
    }
}

response = client.indices.create(
    index='some-vector-index-name',
    body=index_body,
)

Boto3 Bedrock agent client

create_knowledge_base(
    name='some-name',
    description='some-desc',
    roleArn='bedrock-role-arn',
    knowledgeBaseConfiguration={
        "type": "VECTOR",
        "vectorKnowledgeBaseConfiguration": {
            "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0",
            "embeddingModelConfiguration": {
                "bedrockEmbeddingModelConfiguration": {
                    "dimensions": 1024
                }
            }
        }
    },
    storageConfiguration={
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            "collectionArn": "",
            "fieldMapping": {
                "metadataField": "AMAZON_BEDROCK_METADATA",
                "textField": "AMAZON_BEDROCK_TEXT_CHUNK",
                "vectorField": "bedrock-knowledge-base-vector"
            },
            "vectorIndexName": "some-vector-index-name"
        }
    },
)
1 Answer
Accepted Answer

Based on the information provided, it appears that you're encountering an issue with the creation of a Bedrock Knowledge Base using an OpenSearch Serverless collection as the vector store. The error suggests that the Bedrock service is unable to find the newly created index in the OpenSearch Serverless collection. This could be due to a few reasons:

  1. Propagation Delay: A newly created index in OpenSearch Serverless may take some time to become visible to other services. During that window, Bedrock can report the index as missing even though it already appears in the OpenSearch dashboard.

  2. Index Health: You mentioned that the index's health status is empty when checked via the CAT API. This could indicate that the index is not fully initialized or healthy, which might prevent Bedrock from accessing it.

  3. Permissions: Ensure that the IAM role used for creating the Knowledge Base has the necessary permissions to access the OpenSearch Serverless collection and its index.

To resolve this issue, you can try the following steps:

  1. Increase the delay between creating the OpenSearch index and creating the Bedrock Knowledge Base. Instead of a fixed delay, you could implement a polling mechanism that checks the index health status periodically until it becomes active or healthy.

  2. Verify that you can manually create and access the index in OpenSearch Serverless. This will help confirm if the issue is related to index creation or permissions.

  3. Double-check that the service role you're using for Bedrock has the necessary permissions to access OpenSearch Serverless collections and indexes.

  4. Ensure that the vector index is configured with the 'faiss' engine, as required by Bedrock Knowledge Bases.

  5. If possible, try creating the Knowledge Base through the AWS console to see if the issue persists. This can help determine if it's an API-specific problem or a general configuration issue.

If the issue persists after trying these steps, you may need to contact AWS support for further assistance, as there could be underlying service-specific issues or limitations that are not immediately apparent.
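
One way to replace a fixed delay with polling is to retry the knowledge base creation itself until the "no such index" error stops occurring. The following is a rough sketch, not an official pattern: `create_kb_with_retry` and its defaults are hypothetical, `bedrock_agent` is assumed to be a boto3 `bedrock-agent` client, and the error is inspected through the `response` attribute that botocore's `ClientError` carries.

```python
import time


def is_missing_index_error(exc):
    """True for the ValidationException quoted in the question."""
    error = getattr(exc, "response", {}).get("Error", {})
    return (error.get("Code") == "ValidationException"
            and "no such index" in error.get("Message", ""))


def create_kb_with_retry(bedrock_agent, kb_kwargs, max_attempts=10,
                         base_delay=15.0, max_delay=120.0, sleep=time.sleep):
    """Call create_knowledge_base, backing off while the index is not visible."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return bedrock_agent.create_knowledge_base(**kb_kwargs)
        except Exception as exc:
            # Re-raise anything other than the missing-index error, and give
            # up once the attempt budget is spent.
            if not is_missing_index_error(exc) or attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

Here `kb_kwargs` would hold the same arguments shown in the question (`name`, `roleArn`, `knowledgeBaseConfiguration`, `storageConfiguration`). Retrying on the specific error avoids waiting longer than necessary and still surfaces unrelated failures, such as permission errors, immediately.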

answered 2 months ago
EXPERT
reviewed 2 months ago
  • Our OpenSearch Serverless data access policy and Bedrock Knowledgebase IAM role have the necessary permissions. Let me paste them below for your review.

    Bedrock Knowledgebase IAM role policy for OpenSearch access:

    {
      "Statement": [
        {
          "Action": [ "aoss:APIAccessAll" ],
          "Effect": "Allow",
          "Resource": [ "arn:aws:aoss:us-east-1:account-id:collection/*" ],
          "Sid": "OpenSearchServerlessAPIAccessAllStatement"
        }
      ],
      "Version": "2012-10-17"
    }

    Permissions allowed in the OpenSearch Serverless data access policy:

    Index permissions: aoss:CreateIndex aoss:DeleteIndex aoss:UpdateIndex aoss:DescribeIndex aoss:ReadDocument aoss:WriteDocument

    Collection permissions: aoss:CreateCollectionItems aoss:DeleteCollectionItems aoss:UpdateCollectionItems aoss:DescribeCollectionItems

  • OK, so we added a 5-minute delay, as suggested by the AWS GenAI bot and in the forum post linked below, and it worked. Just wait for some time after creating the vector index. I guess the index takes some time to propagate and become available to other services.

    SEE THE POST HERE
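
For anyone provisioning the data access policy from code as well, the permissions listed in the first comment can be expressed as a policy document. This is a sketch under assumptions: `build_data_access_policy` is a hypothetical helper, and the collection name and principal ARNs are placeholders to be supplied by the caller.

```python
import json


def build_data_access_policy(collection_name, principal_arns):
    """Return the JSON policy string for an OpenSearch Serverless data access policy."""
    policy = [
        {
            "Rules": [
                {
                    "ResourceType": "collection",
                    "Resource": [f"collection/{collection_name}"],
                    "Permission": [
                        "aoss:CreateCollectionItems",
                        "aoss:DeleteCollectionItems",
                        "aoss:UpdateCollectionItems",
                        "aoss:DescribeCollectionItems",
                    ],
                },
                {
                    "ResourceType": "index",
                    "Resource": [f"index/{collection_name}/*"],
                    "Permission": [
                        "aoss:CreateIndex",
                        "aoss:DeleteIndex",
                        "aoss:UpdateIndex",
                        "aoss:DescribeIndex",
                        "aoss:ReadDocument",
                        "aoss:WriteDocument",
                    ],
                },
            ],
            # Placeholder principals supplied by the caller.
            "Principal": list(principal_arns),
        }
    ]
    return json.dumps(policy)
```

The resulting string can be passed as the `policy` argument of the boto3 `opensearchserverless` client's `create_access_policy` call, together with `type="data"` and a policy `name`.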
