Bedrock Llama 3 empty string response problem

0

I'm experimenting with several LLMs in Bedrock. I have no problem with the Claude models, but there is a problem I cannot solve with the Llama 3 instruct models. I'm using the correct prompt format for Llama 3 models, as described in this [Llama 3 guide](https://aws.amazon.com/blogs/aws/metas-llama-3-models-are-now-available-in-amazon-bedrock/#:~:text=Input%3A-,%3C%7Cbegin_of_text%7C%3E%3C%7Cstart_header_id%7C%3Euser%3C%7Cend_header_id,%7Cstart_header_id%7C%3Eassistant%3C%7Cend_header_id%7C%3E%5Cn%5Cn,-Output%3A%20The%20Eiffel). The problem is that I sometimes get an empty string as output:

"output": {
        "outputContentType": "application/json",
        "outputBodyJson": {
            "generation": "",
            "prompt_token_count": 1056,
            "generation_token_count": 1,
            "stop_reason": "stop"
        },
        "outputTokenCount": 1
    }

Example input (only ending part):

<|eot_id|><|start_header_id|>assistant<|end_header_id|>/n

I'm expecting the model to continue with the assistant answer, and I get a meaningful response on about 80% of my calls. On the other 20%, I get this empty response. I suspect the model emits the <|end_of_text|> token right at the start, but that doesn't make sense to me. (I'm not sure about this, since AWS doesn't expose the raw model output.) When I try the same prompt in the playground, it returns correct output. Is there a problem with the Llama 3 API, or is it something else?

This is the input prompt I see in the invocation logs:

"inputBodyJson": {
            "temperature": 0.5,
            "prompt": "[INST] <|begin_of_text|><|start_header_id|>system<|end_header_id|>\n<|eot_id|>
--REST OF THE PROMPT--
<|start_header_id|>assistant<|end_header_id|>/n/n [/INST]"
        },
        "inputTokenCount": 1056

As far as I know, the [INST] [/INST] format was used for Llama 2. The API seems to add these tokens automatically; I did not include them in my prompt. This might be the cause of the problem. What do you think?
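
For reference, based on the guide above, this is how I understand a bare InvokeModel call should look. This is only a minimal sketch with boto3; the model ID, region, and max_gen_len are placeholders rather than my exact setup.

    import json
    import boto3

    # Sketch only: region and model ID are placeholders.
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

    # Llama 3 instruct format from the guide: headers end with "\n\n",
    # each turn ends with <|eot_id|>, and the prompt ends with an open
    # assistant header so the model writes the assistant turn.
    prompt = (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        "You are a helpful assistant.<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        "What is the capital of France?<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

    response = bedrock.invoke_model(
        modelId="meta.llama3-8b-instruct-v1:0",
        body=json.dumps({"prompt": prompt, "temperature": 0.5, "max_gen_len": 512}),
    )
    body = json.loads(response["body"].read())
    print(body["generation"], body["stop_reason"])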

  • Sounds a bit like an intermittent issue. Check out the answer I submitted.

2 Answers
0
Accepted Answer

I found the problem. It stems from LangChain.

Ihsan
answered 20 days ago
  • Hi @Ihsan, would you mind sharing more information on the cause and solution of this problem? I'm stuck on the same issue. Thank you.

  • Do not use LangChain. It changes the prompt internally. @Yuz

  • Thanks @Ihsan. Do you mean the LangChain PromptTemplate rewrites the prompt and causes the problem, since Llama requires a different prompt format? Can you share how it changes it, if possible? Sorry for all the questions.
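
To make the rewriting concrete: the inputBodyJson in the question shows a Llama-2-style [INST] ... [/INST] wrapper placed around the already-formatted Llama 3 special tokens. The snippet below is only an illustration of that kind of rewriting, not LangChain's actual internals:

    # Illustration only: a Llama-2-style chat wrapper applied on top of an
    # already-formatted Llama 3 prompt. This is NOT LangChain's internal code,
    # just a sketch of the rewriting visible in the question's logs.
    def llama2_style_wrap(prompt: str) -> str:
        return f"[INST] {prompt} [/INST]"

    llama3_prompt = (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        "You are a helpful assistant.<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        "Hello<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

    # The model then sees both templates at once, e.g.
    # "[INST] <|begin_of_text|>...<|end_header_id|>\n\n [/INST]",
    # which matches the logged prompt in the question and can make Llama 3
    # stop immediately with an empty generation. Sending the Llama 3 prompt
    # verbatim avoids the extra wrapper.
    print(llama2_style_wrap(llama3_prompt))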

-1

It seems like you're experiencing intermittent issues with receiving empty responses from the Llama 3 model in Bedrock, despite providing correct input prompts. Here are a few possible reasons and troubleshooting steps to consider:

  1. Prompt Format: Ensure that the prompt format follows the guidelines provided for Llama 3 models. Based on the example provided in the AWS blog post you mentioned, the prompt should start with <|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n followed by your actual prompt content, and end with <|start_header_id|>assistant<|end_header_id|>\n\n.

  2. Token Count: Check the token count of your input prompt. In your example it is 1056 tokens. Verify that this matches the prompt you think you are sending; a higher count than expected suggests extra tokens are being added to the request.

  3. Temperature Setting: Experiment with different temperature settings to see if it affects the model's behavior. Higher temperatures can result in more creative but potentially less coherent responses, while lower temperatures can produce more conservative and predictable outputs.

  4. Retry Logic: Implement retry logic in your application to handle cases where you receive an empty response. Retrying the same request may yield a valid response (see the sketch after this list).

  5. Check API Limits: Verify that you're not hitting any API rate limits or quotas that could cause intermittent issues with model responses. AWS provides documentation on API usage limits for Bedrock services that you can review.

  6. Contact Support: If the issue persists and you're unable to determine the cause, consider reaching out to AWS support for assistance. They can investigate further and provide guidance based on the specifics of your situation.
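
For points 4 and 5, a minimal retry sketch is shown below. It assumes boto3, a direct InvokeModel call, and the meta.llama3-70b-instruct-v1:0 model ID; adapt it to your own invocation code.

    import json
    import time

    import boto3
    from botocore.exceptions import ClientError

    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

    def invoke_with_retry(prompt: str, max_attempts: int = 3) -> str:
        """Retry when Bedrock throttles the call or the model returns an empty generation."""
        for attempt in range(1, max_attempts + 1):
            try:
                response = bedrock.invoke_model(
                    modelId="meta.llama3-70b-instruct-v1:0",  # assumed model ID
                    body=json.dumps(
                        {"prompt": prompt, "temperature": 0.5, "max_gen_len": 512}
                    ),
                )
            except ClientError as err:
                # Back off and retry only on throttling; re-raise everything else.
                if err.response["Error"]["Code"] != "ThrottlingException" or attempt == max_attempts:
                    raise
                time.sleep(2 ** attempt)
                continue
            body = json.loads(response["body"].read())
            if body.get("generation", "").strip():
                return body["generation"]
            time.sleep(2 ** attempt)  # empty generation: back off and try again
        return ""  # still empty after all attempts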

Regarding your suspicion about the [INST] tokens, if these tokens are automatically added by the API and are not part of the expected input format for Llama 3 models, they could potentially cause issues. You may want to confirm with AWS support or consult the documentation to ensure that your input prompt format is correct.

Overall, troubleshooting intermittent issues with model responses can be challenging, but by systematically checking and adjusting various factors, you may be able to identify the root cause and resolve the problem.

Mustafa
answered 20 days ago
