How do I implement disaster recovery or fault tolerance for my ElastiCache for Redis self-designed cluster?

I want to implement disaster recovery or fault tolerance for my Amazon ElastiCache for Redis self-designed cluster.

Resolution

To implement disaster recovery or fault tolerance for your ElastiCache for Redis self-designed cluster, choose one of the following methods based on your use case:

Multi-Availability Zone

If data retention, minimal downtime, and application performance are a priority, then use the Multi-AZ solution. This method offers the following benefits:

  • Low data loss potential - Multi-AZ provides fault tolerance for every scenario, including hardware-related issues.
  • Low performance impact - Multi-AZ provides the fastest recovery time because there's no manual procedure to follow after the process is implemented.
  • Low to high cost - Multi-AZ is the lowest-cost option. Use Multi-AZ when you can't risk data loss from a hardware failure, or when you can't afford the downtime that the other options require during your response to an outage.

For more information, see Minimizing downtime in ElastiCache for Redis with Multi-AZ.
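As a rough illustration of turning on Multi-AZ for an existing replication group, the following Python sketch builds the arguments for the ElastiCache ModifyReplicationGroup API and passes them to boto3. It assumes that the replication group already has at least one replica and that boto3 and AWS credentials are available. The helper names and the group ID "my-redis-group" are hypothetical placeholders, not values from this article.

```python
def multi_az_params(replication_group_id: str) -> dict:
    """Build keyword arguments for ModifyReplicationGroup that turn on Multi-AZ."""
    return {
        "ReplicationGroupId": replication_group_id,
        # Multi-AZ relies on automatic failover, so both flags are set together.
        "AutomaticFailoverEnabled": True,
        "MultiAZEnabled": True,
        "ApplyImmediately": True,
    }


def enable_multi_az(replication_group_id: str, region: str) -> dict:
    """Send the request. Assumes boto3 is installed and credentials are configured."""
    import boto3  # imported lazily so multi_az_params works without the AWS SDK

    client = boto3.client("elasticache", region_name=region)
    return client.modify_replication_group(**multi_az_params(replication_group_id))


if __name__ == "__main__":
    # "my-redis-group" is a hypothetical replication group ID.
    print(multi_az_params("my-redis-group"))
```

Keeping the parameter builder separate from the API call makes it possible to review the request before it is sent.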

Cross-Region

Use Global Datastore for Redis to write to an ElastiCache for Redis self-designed cluster in one AWS Region and read that data from replica clusters in other Regions. This feature allows low latency reads and disaster recovery across Regions. This method offers the following benefits:

  • Medium data loss potential - Cross-Region replication is asynchronous, so writes that haven't reached the secondary cluster when the primary Region fails can be lost. The promotion itself is a manual step that you initiate.
  • Low performance impact - If Regional degradation occurs, then a cross-Region replica cluster in the Global Datastore can be promoted to a primary cluster with full capabilities. This promotion occurs in less than one minute and allows your applications to remain available.
  • Medium to high cost - Global Datastore adds the cost of running the replica clusters in the secondary Regions that provide the cross-Region disaster recovery.

For more information, see Replication across AWS Regions using global datastores.
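The manual promotion described in this section might be scripted along the following lines. This is a minimal sketch that uses the boto3 failover_global_replication_group call. It assumes that the request is sent to the Region that is being promoted, and the helper names and the example IDs are hypothetical placeholders.

```python
def promotion_params(global_group_id: str, new_primary_region: str,
                     new_primary_group_id: str) -> dict:
    """Build keyword arguments for FailoverGlobalReplicationGroup."""
    return {
        "GlobalReplicationGroupId": global_group_id,
        # Region and replication group of the secondary cluster to promote.
        "PrimaryRegion": new_primary_region,
        "PrimaryReplicationGroupId": new_primary_group_id,
    }


def promote_secondary(global_group_id: str, new_primary_region: str,
                      new_primary_group_id: str) -> dict:
    """Send the request. Assumes boto3 is installed and credentials are configured."""
    import boto3  # imported lazily so promotion_params works without the AWS SDK

    # Assumption: the call is made against the Region that is being promoted.
    client = boto3.client("elasticache", region_name=new_primary_region)
    return client.failover_global_replication_group(
        **promotion_params(global_group_id, new_primary_region, new_primary_group_id)
    )


if __name__ == "__main__":
    # All three values below are hypothetical placeholders.
    print(promotion_params("example-global-datastore", "us-west-2", "my-redis-group-west"))
```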

Daily automatic backups

Schedule your daily automatic backups for times when you expect low resource use on your cluster. ElastiCache creates a backup of the cluster by writing the data from the cache to a Redis .rdb file. Redis versions 2.8.22 and later implement a forkless backup that improves performance.

Note: Redis backup and restore aren't supported on cache.t1.micro nodes for clusters with cluster mode turned off.

  • High data loss potential - A restore returns the cluster to its most recent backup, so any writes made after that backup are lost. Daily automatic backups are retained for up to 35 days.
  • Medium to high performance impact - Performance is affected when you run multiple file backups throughout the day. To improve performance, turn on RDB snapshots on a designated persistence-only secondary node. Then, turn off both RDB snapshots and the Redis append-only file (AOF) on the primary node and all other secondary nodes.
  • Low to medium cost - Storage costs increase with the number of backups and the data retention duration.

Note: Before you implement backup and restore, make sure that you review the limitations that are caused by backup constraints. For more information, see Snapshot and restore and Taking manual backups.
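To show how the backup schedule and retention period fit together, the following Python sketch builds ModifyReplicationGroup arguments for daily automatic backups. It assumes a window that starts on the hour and ends on the same UTC day, and it enforces the 35-day retention maximum mentioned above. The helper names, the default values, and the IDs are hypothetical placeholders. The optional SnapshottingClusterId argument names the node to use as the backup source and applies only when cluster mode is turned off.

```python
from typing import Optional


def snapshot_window(start_hour_utc: int, hours: int = 1) -> str:
    """Return a window such as '03:00-05:00' (UTC) that ends on the same day."""
    if hours < 1:
        raise ValueError("the backup window must be at least 1 hour long")
    if start_hour_utc < 0 or start_hour_utc + hours > 23:
        raise ValueError("this sketch only supports windows that end by 23:00 UTC")
    return f"{start_hour_utc:02d}:00-{start_hour_utc + hours:02d}:00"


def backup_params(replication_group_id: str, start_hour_utc: int = 3, hours: int = 2,
                  retention_days: int = 7,
                  snapshotting_cluster_id: Optional[str] = None) -> dict:
    """Build keyword arguments for ModifyReplicationGroup that schedule daily backups."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention must be between 1 and 35 days")
    params = {
        "ReplicationGroupId": replication_group_id,
        "SnapshotWindow": snapshot_window(start_hour_utc, hours),
        "SnapshotRetentionLimit": retention_days,
        "ApplyImmediately": True,
    }
    if snapshotting_cluster_id is not None:
        # Take backups from a chosen node, for example a replica.
        params["SnapshottingClusterId"] = snapshotting_cluster_id
    return params


def schedule_backups(region: str, **kwargs) -> dict:
    """Send the request. Assumes boto3 is installed and credentials are configured."""
    import boto3  # imported lazily so the helpers above work without the AWS SDK

    client = boto3.client("elasticache", region_name=region)
    return client.modify_replication_group(**backup_params(**kwargs))


if __name__ == "__main__":
    # "my-redis-group" is a hypothetical replication group ID.
    print(backup_params("my-redis-group", start_hour_utc=3, hours=2, retention_days=7))
```

Choose the start hour from your own traffic data so that the window falls in a period of low resource use.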

AWS OFFICIAL
Updated 22 days ago