Hi,
To simplify your debugging, I suggest starting by creating a Bedrock knowledge base with an S3 data source (a bucket with 2-3 manually created files is enough). That will let you configure all the Bedrock KB parameters properly. You can then add the web crawler as a second data source. A rough boto3 sketch of registering the S3 data source is shown below.
You can even use your S3 bucket as the place where the crawler stores the crawled content, which is then loaded into the KB via a recurring sync job (see the sync sketch after this paragraph). Storing the content in S3 makes observability much better: if the KB sync job complains about content that cannot be parsed or vectorized, it will be much simpler to analyze.
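For example, once the knowledge base itself exists, the S3 data source can be attached with a few boto3 calls. This is only a minimal sketch: the knowledge base ID, bucket ARN, and data source name are placeholder values you would replace with your own.

```python
import boto3

# Assumed placeholders: replace with your own knowledge base ID and bucket.
KB_ID = "YOUR_KB_ID"
BUCKET_ARN = "arn:aws:s3:::your-kb-test-bucket"

bedrock_agent = boto3.client("bedrock-agent")

# Register the S3 bucket as the first data source of the knowledge base.
response = bedrock_agent.create_data_source(
    knowledgeBaseId=KB_ID,
    name="s3-test-docs",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": BUCKET_ARN},
    },
)
data_source_id = response["dataSource"]["dataSourceId"]
print("Created data source:", data_source_id)
```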
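Continuing from the snippet above, a sync is just an ingestion job on that data source. The sketch below starts one and polls it until it finishes; any documents the job could not parse or vectorize show up in the job's failure reasons, which is where keeping the content in S3 makes analysis easier. The identifiers are the same placeholders as before.

```python
import time

# Start a sync (ingestion) job for the data source created above.
job = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=KB_ID,
    dataSourceId=data_source_id,
)["ingestionJob"]

# Poll until the job finishes, then inspect any failure reasons.
while job["status"] in ("STARTING", "IN_PROGRESS"):
    time.sleep(15)
    job = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=KB_ID,
        dataSourceId=data_source_id,
        ingestionJobId=job["ingestionJobId"],
    )["ingestionJob"]

print("Sync status:", job["status"])
print("Failure reasons:", job.get("failureReasons", []))
```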
A step by step description of KB creation with S3 is here: https://medium.com/@saikatm.courses/implementing-rag-app-using-knowledge-base-from-amazon-bedrock-and-streamlit-e52f8300f01d
Best,
Didier