ElastiCache Serverless (Valkey): Single read replica across AZs causing cross-AZ latency - Can we ensure one replica per AZ?

We’re using ElastiCache Serverless for Valkey in a region with 3 Availability Zones (eu-west-2a, eu-west-2b, eu-west-2c). The cluster was created with 3 subnets, one in each AZ.

However, from observing network traffic and latency patterns, it seems that only one read replica is currently active - located in one of the subnets/AZs.

Our application is deployed on EKS with an equal number of pods in each AZ, so pods outside the replica's AZ make cross-AZ calls to that single replica. As a result, we see higher latencies at the upper percentiles (P95–P99) due to the cross-AZ network hops. Ideally, we would like one read replica per AZ, so that each application pod can read from a local replica and minimize latency.
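One way to check that the tail latency really tracks the client's AZ is to group latency samples by the AZ of the calling pod and compare percentiles. The following is a minimal sketch, assuming you already collect `(client_az, latency_ms)` pairs from your application; the function names and the nearest-rank percentile method are illustrative choices, not something ElastiCache provides.

```python
import math
from collections import defaultdict


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def tail_latency_by_az(records, percentiles=(50, 95, 99)):
    """Group (client_az, latency_ms) records by AZ and compute percentiles.

    Returns a dict such as {"eu-west-2a": {"p50": ..., "p95": ..., "p99": ...}}.
    """
    by_az = defaultdict(list)
    for client_az, latency_ms in records:
        by_az[client_az].append(latency_ms)
    return {
        az: {f"p{p}": percentile(values, p) for p in percentiles}
        for az, values in sorted(by_az.items())
    }
```

If the AZ that hosts the replica shows clearly lower P95/P99 values than the other two, that supports the cross-AZ explanation; if all three look alike, the cause is probably elsewhere.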

My questions are:

  • Is it possible to configure an ElastiCache Serverless (Valkey) cluster so that it maintains at least one read replica per AZ for optimal latency and locality?
  • Our application currently handles around 3K requests per second per region, and the single replica appears to handle all this load.
    • What are the criteria or scaling rules for ElastiCache Serverless to spawn replicas in multiple AZs?
    • Is there any metric or threshold (throughput, connections, CPU load, etc.) that controls when additional replicas are created?

Any clarification on how serverless Valkey manages its replica topology and AZ distribution would be greatly appreciated.

Thanks, Ruslan

asked a month ago · 80 views
1 Answer
ElastiCache Serverless for Valkey automatically manages a replicated Multi-AZ architecture that's designed to handle availability zone failures transparently for your application. However, there are some important points to understand about how it works:

For ElastiCache Serverless caches, the replica distribution across AZs is managed automatically by the service. Unlike self-designed clusters where you can explicitly configure the number and placement of replicas, serverless caches handle this internally based on their own scaling algorithms.

This Multi-AZ architecture protects against availability zone failures, but it doesn't guarantee that a replica will be placed in every AZ where you have subnets defined. The service decides placement using its own internal metrics and algorithms.

Regarding your specific questions:

  • It's not currently possible to explicitly configure ElastiCache Serverless to maintain one read replica per AZ. The replica placement is managed automatically by the service.

  • The exact criteria for when ElastiCache Serverless spawns additional replicas across multiple AZs aren't publicly documented in detail. The service uses internal algorithms to determine when to scale and where to place replicas based on factors like load, traffic patterns, and availability requirements.

  • There are no published thresholds or metrics that you can monitor to predict when additional replicas will be created across AZs. The scaling decisions are handled internally by the service.

For your use case with latency concerns, you might want to consider:

  1. If consistent low latency across all percentiles is critical for your application, consider evaluating a self-designed ElastiCache cluster instead of Serverless, as this gives you more control over replica placement.

  2. With self-designed clusters, you can explicitly configure replicas per shard and ensure they're distributed across multiple AZs to minimize cross-AZ traffic.

  3. For self-designed clusters with Multi-AZ enabled, you need at least one replica per shard, and spreading your nodes across as many Availability Zones as you can helps mitigate AZ failures.

Sources
  • Resilience in Amazon ElastiCache - Amazon ElastiCache
  • Amazon ElastiCache Well-Architected Lens Reliability Pillar - Amazon ElastiCache
  • Creating a cluster for Valkey or Redis OSS - Amazon ElastiCache
  • Valkey or Redis OSS nodes and shards - Amazon ElastiCache
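
To make the self-designed option more concrete, here is a rough sketch of the request parameters for a single-shard (cluster mode disabled) Valkey replication group with one node in each AZ. It assumes boto3's `create_replication_group` call and its `NumCacheClusters` / `PreferredCacheClusterAZs` parameters; the helper name, group ID, node type, and subnet group name are hypothetical placeholders, so check them against the current API reference before use.

```python
def build_replication_group_params(group_id, node_type, subnet_group, azs):
    """Build keyword arguments for ElastiCache create_replication_group.

    One node is requested per AZ in `azs`: the first AZ hosts the primary,
    the remaining AZs each host one read replica.
    """
    if len(azs) < 2:
        raise ValueError("Multi-AZ needs a primary plus at least one replica")
    if len(set(azs)) != len(azs):
        raise ValueError("list each AZ once to get one node per AZ")
    return {
        "ReplicationGroupId": group_id,
        "ReplicationGroupDescription": "Valkey with one node per AZ",
        "Engine": "valkey",
        "CacheNodeType": node_type,            # placeholder, size for your load
        "CacheSubnetGroupName": subnet_group,  # must cover every AZ in `azs`
        "NumCacheClusters": len(azs),          # primary + replicas
        "PreferredCacheClusterAZs": list(azs), # primary goes to the first AZ
        "AutomaticFailoverEnabled": True,
        "MultiAZEnabled": True,
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# params = build_replication_group_params(
#     "my-valkey", "cache.r7g.large", "my-subnet-group",
#     ["eu-west-2a", "eu-west-2b", "eu-west-2c"],
# )
# boto3.client("elasticache").create_replication_group(**params)
```

Note that placement alone does not make reads AZ-local: the application still has to send reads to the replica in its own AZ (for example by choosing among the individual node endpoints), and after a failover the primary may end up in a different AZ than the one listed first.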

answered a month ago