
Bedrock knowledge base data source sync is not working


I created a Bedrock knowledge base and specified an S3 bucket containing 5 PDFs as the data source. When I click "Sync" in the knowledge base configuration, nothing happens: there is no indication of a sync starting, failing, etc. The files are valid PDFs with selectable text, and the knowledge base has all S3 permissions enabled.

[Screenshot: Sync button does nothing]

5 Answers

I apologize for the inconvenience you're experiencing with your Bedrock knowledge base data source sync. Based on the information you've provided, there are a few things we can check to troubleshoot this issue:

  1. File size: Ensure that none of your PDF files exceeds the maximum ingestion job file size listed in Amazon Bedrock endpoints and quotas. An oversized file may prevent the sync from starting.

  2. IAM permissions: Double-check that the IAM role associated with your knowledge base has the correct permissions. It should have:

  • s3:GetObject and s3:ListBucket permissions for reading from the S3 bucket
  • s3:PutObject permissions if your configuration writes output data to S3
  • kms:Decrypt permissions for the key, if your S3 bucket is encrypted with a KMS key

  3. Vector index configuration: If you're using Amazon OpenSearch Serverless as your vector store, verify that the vector index is configured with the 'faiss' engine. If it's using 'nmslib', you'll need to create a new knowledge base or reconfigure the vector index.

  4. Region compatibility: Ensure that your S3 bucket and the knowledge base are in the same AWS Region.

  5. Metadata files: If you're using metadata files, make sure they follow the correct naming convention: the full name of the source file, including its extension, with .metadata.json appended (for example, report.pdf.metadata.json for report.pdf).

  6. Console issues: Sometimes the console does not reflect the actual state of operations. Try refreshing the page or waiting a few minutes to see if the sync status updates.

  7. API alternative: If the console sync button is not responding, try starting the sync from the AWS CLI or an SDK with the StartIngestionJob API call. Call UpdateDataSource first only if the data source configuration itself needs to change.
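As a rough illustration of the API route in the last item, here is a minimal sketch that starts a sync and polls it until it finishes. It assumes a boto3 `bedrock-agent` client is passed in (for example `boto3.client("bedrock-agent")`); the function name `start_sync_and_wait` and the polling defaults are made up for this example, not anything from the console or the docs.

```python
import time

# Ingestion job states after which polling can stop.
TERMINAL_STATES = {"COMPLETE", "FAILED", "STOPPED"}


def start_sync_and_wait(client, knowledge_base_id, data_source_id,
                        poll_seconds=10, max_polls=60):
    """Start an ingestion job and poll until it reaches a terminal state.

    `client` is assumed to be a boto3 "bedrock-agent" client.
    Returns the last ingestion job description seen.
    """
    started = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )
    job = started["ingestionJob"]

    for _ in range(max_polls):
        if job["status"] in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job
```

If the returned job has status `FAILED`, its `failureReasons` field (when present) is usually the first thing to read. If `start_ingestion_job` itself raises an error, that error message is often more informative than a silent console button.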

If you've verified all these points and the issue persists, it may be an internal error. In this case, you should contact AWS Support for further assistance. They can help investigate any backend issues that might be preventing the sync process from starting or being displayed correctly in the console.
Sources
Modify a data source for your Amazon Bedrock knowledge base - Amazon Bedrock
Troubleshooting model customization issues - Amazon Bedrock

answered 2 months ago

I just experienced this. In my case, the Data Source showed as available (as your pic shows), but I was trying to sync before the knowledge base itself was fully available.

I'd love to see the data source show as waiting, with the sync button disabled, until the KB is active. According to CloudWatch, no sync was actually initiated any of the times I pushed the button, so I wonder if it is meant to be disabled in the UI. I will report it if I can reproduce it.
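To rule this case out from a script rather than the console, something like the following sketch could wait for the knowledge base to become active before a sync is attempted. It assumes a boto3 `bedrock-agent` client is passed in; `wait_until_kb_active` and its defaults are hypothetical names chosen for this example.

```python
import time

# Knowledge base states from which it will not become ACTIVE by waiting.
DEAD_END_STATES = {"FAILED", "DELETING", "DELETE_UNSUCCESSFUL"}


def wait_until_kb_active(client, knowledge_base_id, poll_seconds=5, max_polls=60):
    """Poll the knowledge base status until it is ACTIVE.

    `client` is assumed to be a boto3 "bedrock-agent" client.
    Returns (is_active, last_status_seen).
    """
    status = None
    for _ in range(max_polls):
        kb = client.get_knowledge_base(knowledgeBaseId=knowledge_base_id)
        status = kb["knowledgeBase"]["status"]
        if status == "ACTIVE":
            return True, status
        if status in DEAD_END_STATES:
            return False, status
        time.sleep(poll_seconds)
    return False, status
```

Only once this returns `True` would it make sense to start the sync.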

AWS
answered 2 months ago

Hello.

Is there a record of "StartIngestionJob" API calls in the CloudTrail event history?
If the event is recorded there, the API call itself is being made when you click Sync.
The details of the event may also tell you more about what went wrong.
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/view-cloudtrail-events-console.html
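The same lookup can be done outside the console. This is a minimal sketch, assuming a boto3 `cloudtrail` client is passed in (for example `boto3.client("cloudtrail")`) and that the calls show up in event history under the event name `StartIngestionJob`; the helper name `recent_sync_events` is made up for this example.

```python
import json
from datetime import datetime, timedelta, timezone


def recent_sync_events(cloudtrail, hours=24, now=None):
    """Return recent StartIngestionJob events from CloudTrail event history.

    `cloudtrail` is assumed to be a boto3 "cloudtrail" client.
    Each result is a dict with the event time, user, and any error fields
    found in the raw event record.
    """
    end = now or datetime.now(timezone.utc)
    response = cloudtrail.lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": "StartIngestionJob"}
        ],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        MaxResults=50,
    )

    results = []
    for event in response.get("Events", []):
        # CloudTrailEvent holds the full record as a JSON string.
        detail = json.loads(event.get("CloudTrailEvent", "{}"))
        results.append({
            "time": event.get("EventTime"),
            "user": event.get("Username"),
            "error_code": detail.get("errorCode"),
            "error_message": detail.get("errorMessage"),
        })
    return results
```

An empty list suggests the button never reached the API at all. An entry with an `error_code` points at a permissions or validation problem instead.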

EXPERT
answered 2 months ago

Click into the data source "knowledge.." on the Data Source tab. It should show either a failed job or a succeeded one, with the number of documents ingested. Any failures or ingested files should also be viewable in CloudWatch.
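The same job history can be read through the API. Below is a minimal sketch, assuming a boto3 `bedrock-agent` client is passed in; `latest_sync_summary` is a hypothetical helper name, and the statistics keys are the ones I understand the ingestion job summaries to carry.

```python
def latest_sync_summary(client, knowledge_base_id, data_source_id):
    """Summarize the most recent ingestion job for a data source.

    `client` is assumed to be a boto3 "bedrock-agent" client.
    Returns None if no ingestion job has ever been started.
    """
    response = client.list_ingestion_jobs(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        sortBy={"attribute": "STARTED_AT", "order": "DESCENDING"},
        maxResults=1,
    )
    jobs = response.get("ingestionJobSummaries", [])
    if not jobs:
        return None

    job = jobs[0]
    stats = job.get("statistics", {})
    return {
        "job_id": job.get("ingestionJobId"),
        "status": job.get("status"),
        "scanned": stats.get("numberOfDocumentsScanned", 0),
        "indexed": stats.get("numberOfNewDocumentsIndexed", 0),
        "failed": stats.get("numberOfDocumentsFailed", 0),
    }
```

A result of `None` matches the symptom in the question: the button was clicked, but no job was ever created.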

AWS
EXPERT
answered 2 months ago

Thanks for posting. I was also experiencing the syncing issue described above. I resolved it once I realized that I was using Titan V2 for the OpenSearch Serverless (OSS) vector store embeddings and had not yet requested access to the model.
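One quick way to test for this cause is to call the embedding model directly and see whether access is denied. This is only a sketch: it assumes a boto3 `bedrock-runtime` client is passed in, that the model ID for Titan Text Embeddings V2 is `amazon.titan-embed-text-v2:0`, and that a missing model grant surfaces as an `AccessDeniedException` error code. The helper name `can_invoke_embedding_model` is made up for this example.

```python
import json


def can_invoke_embedding_model(runtime, model_id="amazon.titan-embed-text-v2:0"):
    """Return True if a tiny embedding request succeeds, False if access is denied.

    `runtime` is assumed to be a boto3 "bedrock-runtime" client.
    Any error other than an access denial is re-raised unchanged.
    """
    try:
        runtime.invoke_model(
            modelId=model_id,
            body=json.dumps({"inputText": "access check"}),
        )
    except Exception as err:
        # botocore ClientError exposes the service error code under .response.
        response = getattr(err, "response", None) or {}
        code = response.get("Error", {}).get("Code", "")
        if code == "AccessDeniedException":
            return False
        raise
    return True
```

Note that this runs with your own credentials rather than the knowledge base's role, so a `True` here shows the account can use the model, not that the role's IAM policy allows it.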

answered 2 months ago
