
Intermittent "Too many connections" errors with Claude Sonnet 4.5 in eu-central-1


We experienced intermittent ServiceUnavailableException errors on October 16, 2025 (09:00-15:00 UTC) when calling the ConverseStream operation for Claude Sonnet 4.5.

Error Message: "ServiceUnavailableException: Too many connections, please wait before trying again"

Configuration:

  • Model: anthropic.claude-sonnet-4-5-20250929-v1:0
  • Inference Profile: eu.anthropic.claude-sonnet-4-5-20250929-v1:0
  • Region: eu-central-1
  • Account quota: 200 RPM
  • Actual usage: ~10-20 concurrent requests

Behavior:

  • Error persisted for approximately 6 hours
  • Resolved spontaneously without any code changes
  • Same code works fine in us-east-1 region
  • boto3 retries exhausted (4 attempts)

Questions:

  1. Was there a known infrastructure issue with the EU Sonnet 4.5 inference profile on Oct 16?
  2. Are there connection limits at the infrastructure level separate from account quotas?
  3. What's the recommended approach to prevent this in the future?

The issue appears to have been AWS-side. Has anyone else experienced similar issues with cross-region inference profiles?

1 Answer

Other re:Post threads have reported reliability issues with Claude Sonnet 4.5 in the EU region. I don't have specific confirmation of an outage on October 16, 2025, but your symptoms resemble those reports.

Regarding your questions:

  1. There have been previous complete outages of Sonnet 4.5 in the EU region that were confirmed by AWS support. Your experience aligns with these documented issues.

  2. Yes, there appear to be capacity limits at the infrastructure level that are separate from your account quotas. For Claude models on Amazon Bedrock with on-demand inference, account quotas cover both requests per minute and tokens per minute, and exceeding them returns a ThrottlingException. You received a ServiceUnavailableException ("Too many connections") instead, which suggests you hit infrastructure capacity constraints rather than your account quota limits.

  3. Recommended approaches to prevent this in the future:

  • Implement retry logic with exponential backoff in your application
  • Consider using the global inference profile which has been reported as more stable (though potentially slower)
  • Monitor your usage with CloudWatch metrics to better understand if you're consistently hitting limits
  • Consider using provisioned throughput for more reliable access, though currently only older Claude models are available with this option
  • Space out your requests to stay within per-minute limits
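
As a rough illustration of the first recommendation, here is a minimal sketch of retry with exponential backoff and full jitter. It assumes the wrapped call raises a botocore-style error that carries `response["Error"]["Code"]`. The helper names (`call_with_backoff`, `error_code`) and the default attempt count and delays are hypothetical choices for illustration, not AWS recommendations.

```python
import random
import time

# Error codes treated as transient. Both are Bedrock runtime error names;
# retrying exactly these two is an assumed policy, adjust to your needs.
RETRYABLE_CODES = {"ServiceUnavailableException", "ThrottlingException"}


def error_code(exc):
    """Return the AWS error code from a botocore-style error, else None."""
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        return response.get("Error", {}).get("Code")
    return None


def call_with_backoff(fn, max_attempts=6, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(); on a retryable error, wait with jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on non-retryable errors or once attempts are exhausted.
            if error_code(exc) not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # The cap doubles each attempt: base, 2*base, 4*base, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random fraction of the cap to spread out retries.
            sleep(rng() * cap)
```

In practice `fn` would wrap your Bedrock call, for example `lambda: client.converse_stream(modelId=..., messages=...)`. boto3's built-in retries can also be tuned through `botocore.config.Config(retries={"max_attempts": ..., "mode": "standard"})`. Neither approach helps if an outage lasts for hours, as yours did, but both smooth over short capacity dips.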

The global inference profile has been reported to be more stable than the EU-specific one. Your observation that the same code works fine in us-east-1 also points to a problem specific to EU capacity rather than to your code. AWS is reportedly working with Anthropic to improve reliability, but no specific timeline has been provided for when these issues will be fully resolved.
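
If you want to try the global profile only when the EU profile is unavailable, a fallback along these lines might work. This is a sketch under assumptions: the `global.` profile ID below is inferred from the naming pattern of your EU profile and should be checked against the profiles listed in your account, and `converse_with_fallback` and `is_capacity_error` are hypothetical helper names.

```python
# Profile IDs in order of preference. The EU ID comes from the question;
# the global ID is an ASSUMPTION based on the same naming pattern.
PROFILE_IDS = [
    "eu.anthropic.claude-sonnet-4-5-20250929-v1:0",
    "global.anthropic.claude-sonnet-4-5-20250929-v1:0",
]


def is_capacity_error(exc):
    """True for a botocore-style error whose code is ServiceUnavailableException."""
    response = getattr(exc, "response", None)
    code = response.get("Error", {}).get("Code") if isinstance(response, dict) else None
    return code == "ServiceUnavailableException"


def converse_with_fallback(invoke, model_ids=PROFILE_IDS):
    """Try each profile ID in order, moving on only after a capacity error.

    `invoke` is any callable that takes a model ID and performs the request.
    Returns (model_id_used, result).
    """
    last_exc = None
    for model_id in model_ids:
        try:
            return model_id, invoke(model_id)
        except Exception as exc:
            # Any other failure (validation, access denied) is not a reason to fall back.
            if not is_capacity_error(exc):
                raise
            last_exc = exc
    if last_exc is None:
        raise ValueError("model_ids must not be empty")
    raise last_exc
```

Here `invoke` would be something like `lambda mid: client.converse_stream(modelId=mid, messages=messages)`. Keep in mind that a global profile may process requests outside the EU, so check your data residency requirements before enabling a fallback like this.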
Sources
Building Agentic AI app with Strands Agents & MCP | AWS re:Post
Bedrock service unavailable 80% of the time | AWS re:Post

answered 3 months ago
