Hello!
So this seems to be a prompt engineering problem. In the back end of this architecture, the workflow is orchestrated by an open-source Python library called LangChain. If you take a look at the Python code and the way LangChain orchestrates the RAG architecture, you will see a section called 'Prompt Template'. In this prompt template you will be able to view the background prompt and where the "context" (excerpts and documents from Kendra) and the "question" (user input) go. It can look something like this:
*prompt_template = """ Human: This is a friendly conversation between a human and an AI. The AI is talkative and provides specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Assistant: OK, got it, I'll be a talkative truthful AI assistant.
Human: Here are a few documents in <documents> tags: <documents> {context} </documents> Based on the above documents, provide a detailed answer for, {question} Answer "don't know" if not present in the document. Assistant:""" *
As you can see, the line "Answer "don't know" if not present in the document." is there as a safeguard to ensure the model does not hallucinate. BUT what this also does is ensure that any query that does not pertain to the information stored in your vector store will be met with the answer "don't know". If you would like more general answers to your questions, you can remove that line, along with anywhere else in the prompt that instructs the LLM to answer ONLY based on the documents provided. This will increase your model's hallucinations, but it will also allow you to ask general questions.
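As an illustration, here is a minimal sketch of how such a template can be wired into LangChain with the "don't know" safeguard removed. The exact template text and chain setup in your deployment may differ; the variable names context and question simply match the template shown above.

```python
from langchain.prompts import PromptTemplate

# Sketch: the same background prompt with the "don't know" safeguard removed,
# so the model may fall back on general knowledge when the Kendra excerpts
# do not contain the answer. Adjust to match the template in your own stack.
prompt_template = """Human: This is a friendly conversation between a human and an AI.
The AI is talkative and provides specific details from its context.

Here are a few documents in <documents> tags:
<documents>
{context}
</documents>

Based on the above documents (and your general knowledge if they are not sufficient),
provide a detailed answer for: {question}

Assistant:"""

PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "question"],
)
```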
I hope this helps! -Moh
I was exploring this topic recently and was recommended the following blog post to read more on this.
I was able to resolve this problem by increasing the temperature. The model looks for knowledge from external data as its creativity (temperature) increases.
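If you are using LangChain's Bedrock wrapper, a minimal sketch of where temperature can be raised is below; the model ID and the value are only assumptions for illustration, so adjust them to your own setup.

```python
from langchain.llms import Bedrock

# Sketch: raising temperature on the Bedrock LLM used by the chain.
# Model ID and parameter values are placeholders.
llm = Bedrock(
    model_id="anthropic.claude-v2",   # assumed model, replace with yours
    model_kwargs={
        "temperature": 0.7,           # higher value = more varied output
        "max_tokens_to_sample": 512,
    },
)
```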
I'm glad it is working.
You are right. Kendra is limited to the knowledge of the documents it has indexed. This is why different chatbots or Q&A systems often leverage the power of an LLM to provide more human-like answers. Please see this example. It describes a sample chat that uses both Kendra and an LLM. The LLM is from SageMaker JumpStart, but you can modify the code to work with Bedrock:
https://github.com/aws-samples/generative-ai-on-aws-immersion-day/blob/main/lab4/rag-lab.ipynb
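The general pattern in that lab is to let Kendra retrieve relevant excerpts and let the LLM turn them into a conversational answer. A minimal sketch of that pattern using LangChain is below; the Kendra index ID and model ID are placeholders, and the notebook itself uses a SageMaker JumpStart endpoint instead of Bedrock.

```python
from langchain.llms import Bedrock
from langchain.retrievers import AmazonKendraRetriever
from langchain.chains import RetrievalQA

# Sketch: Kendra retrieves document excerpts, the LLM composes the answer.
retriever = AmazonKendraRetriever(index_id="YOUR-KENDRA-INDEX-ID")  # placeholder
llm = Bedrock(model_id="anthropic.claude-v2")                        # placeholder

qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
print(qa.run("What does our leave policy say about parental leave?"))
```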
If you want to get answers based on the context, your prompt (in Python) might be something like this:
f"""{context}
Answer the following question based on the context above:
{question}"""
However, if you want to ask a question without Retrieval Augmented Generation, then your prompt would just be the question itself:
f"{question}"
Thank you Vijay for the pointer. I ran into errors when trying out the sample notebook. Were you able to import the sagemakerEmbedding library?