Slight delay before the Aurora reader endpoint round-robins to a different reader instance

When connecting to the Aurora reader endpoint DNS name for the cluster, we understand that by default AWS applies basic round-robin distribution to spread connections across the reader instances in the cluster. However, in testing we noticed that the reader endpoint only switches to a different reader instance after a short wait. For instance, multiple simultaneous queries run within a 4-6 second window can sometimes all end up hitting the same reader/replica instance IP address.

Question: Is there any way to *control* this period, for instance to shorten the switch-over interval, for better distribution/load balancing during connection spikes? Thanks

asked a year ago · 387 views
1 Answer
Hello There,

I understand that when connecting to the Aurora reader DNS endpoint, you have noticed that simultaneous queries run within a few seconds of each other sometimes end up hitting the same reader instance, and that you are looking for a way to shorten the switch-over time for better load balancing during connection spikes.

Firstly, as you correctly stated, load balancing among Aurora readers is done in round-robin fashion [1][2]. The reader endpoint provides DNS-based, round-robin load balancing for new connections: every time you resolve the reader endpoint, you get an instance IP to connect to, chosen in round-robin fashion.

[1] : Load Balancing with Reader endpoint - https://docs.aws.amazon.com/whitepapers/latest/amazon-aurora-mysql-db-admin-handbook/load-balancing-with-the-reader-endpoint.html

[2] : https://aws.amazon.com/premiumsupport/knowledge-center/aurora-mysql-postgresql-reader-nodes/
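
To observe this rotation yourself, a minimal sketch along the following lines may help. It resolves a hostname repeatedly and counts which IPs come back. The function name `sample_resolutions` and the endpoint in the usage comment are hypothetical, and any DNS caching in your OS or resolver can hide the rotation, so treat the output as indicative only.

```python
import socket
import time
from collections import Counter


def sample_resolutions(hostname, samples=10, interval=1.0,
                       resolver=socket.gethostbyname, sleep=time.sleep):
    """Resolve `hostname` `samples` times and count the IPs returned.

    `resolver` and `sleep` are injectable so the function can be tried
    without real DNS lookups or real waiting.
    """
    counts = Counter()
    for i in range(samples):
        counts[resolver(hostname)] += 1
        if i < samples - 1:
            sleep(interval)  # pause between lookups, not after the last one
    return counts


# Usage (hypothetical endpoint name, replace with your cluster's reader endpoint):
#   counts = sample_resolutions(
#       "mycluster.cluster-ro-example.us-east-1.rds.amazonaws.com", samples=15)
#   for ip, hits in counts.items():
#       print(ip, hits)
```

Sampling once per second for longer than the TTL should show more than one IP if the cluster has several readers and nothing in between caches the answer for longer.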

However, unless you are using a smart database driver, you depend on DNS record updates and DNS propagation for failovers, instance scaling, and load balancing across Aurora Replicas. Currently, Aurora DNS zones use a short Time-To-Live (TTL) of five seconds.

[+] : DNS Caching in Aurora - https://docs.aws.amazon.com/whitepapers/latest/amazon-aurora-mysql-db-admin-handbook/dns-caching.html

As you mentioned, you are running simultaneous queries within a 4-6 second window. If that burst of connections is opened at about the same time from the application, it is highly likely that they will all be directed to the same reader instance. This is expected behaviour in Aurora, since the current TTL is five seconds.
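
The effect of the TTL can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not a model of Aurora's actual DNS implementation: `CachingResolver` is a hypothetical class that hands out IPs in round-robin order but keeps returning the cached answer until the TTL expires, and time is passed in explicitly so the behaviour is deterministic.

```python
from itertools import cycle


class CachingResolver:
    """Toy model: round-robin answers upstream, cached for `ttl` seconds."""

    def __init__(self, upstream_ips, ttl=5.0):
        self._upstream = cycle(upstream_ips)  # round-robin source of answers
        self._ttl = ttl
        self._cached_ip = None
        self._expires_at = float("-inf")      # nothing cached yet

    def resolve(self, now):
        """Return the IP a client would see at time `now` (in seconds)."""
        if now >= self._expires_at:
            # Cache expired: take the next answer and cache it for one TTL.
            self._cached_ip = next(self._upstream)
            self._expires_at = now + self._ttl
        return self._cached_ip


resolver = CachingResolver(["10.0.0.1", "10.0.0.2"], ttl=5.0)
burst = [resolver.resolve(t) for t in (0.0, 1.0, 2.5, 4.9)]  # one TTL window
later = resolver.resolve(5.0)                                # next window
```

In this model every lookup in `burst` gets the same IP, and only the lookup at 5.0 seconds moves on to the next reader, which matches the behaviour described in the question.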

For the same reason, you might not be able to reduce the TTL to a much lower value in Aurora. As a workaround, you can use custom endpoints if you want more flexibility in managing the distribution of your workload. This use case is discussed at the end of the following AWS re:Post article.

[+] : How does Aurora MySQL or PostgreSQL distribute workload between reader nodes? - https://repost.aws/knowledge-center/aurora-mysql-postgresql-reader-nodes

[+] : Custom Endpoints - https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.Endpoints.html#Aurora.Endpoints.Custom
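
If a burst of connections must be spread within a single TTL window, one option sometimes used is to rotate over the individual instance endpoints in the application itself. Note that this is a different technique from custom endpoints, which are still DNS-based. The sketch below is an assumption-laden illustration: `EndpointRotator` and the hostnames in the usage comment are hypothetical, and the application becomes responsible for keeping the list current as readers are added, removed, or fail.

```python
import itertools
import threading


class EndpointRotator:
    """Hand out endpoints in round-robin order, safe to call from threads."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)
        self._lock = threading.Lock()

    def next_endpoint(self):
        # The lock keeps concurrent callers from advancing the cycle at once.
        with self._lock:
            return next(self._cycle)


# Usage (hypothetical instance endpoints):
#   rotator = EndpointRotator([
#       "reader-1.example.us-east-1.rds.amazonaws.com",
#       "reader-2.example.us-east-1.rds.amazonaws.com",
#   ])
#   host = rotator.next_endpoint()  # pass `host` to your DB driver's connect call
```

Because the rotation happens per connection attempt rather than per DNS answer, consecutive connections go to different readers regardless of the TTL.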

I hope that the above information was quite helpful. Have a great day ahead!

AWS
answered a year ago
  • Thank you for the very helpful information.

    Can you provide estimates of the delay before DNS host names may be updated on failovers or DNS updates (worst cases)? (i.e. any cases where instance IPs may change internally while DNS names may be stale)

  • Hello There,

    I am glad to hear that the above information was helpful for you. And I can see that you have a quick follow up question as well.

    Regarding your concern: Amazon RDS/Aurora IP addresses do change in some scenarios, such as a start/stop of the instance or a change of the underlying host for unexpected reasons (hardware issues, network issues).

    However, note that the DNS endpoint (hostname) of your Aurora cluster does not change. As mentioned previously, Aurora DNS uses a TTL of 5 seconds, which ensures that clients do not keep an IP address for too long; the DNS record is updated with the new IP address while the DNS endpoint for the Aurora cluster stays the same.

    [+] : https://docs.aws.amazon.com/whitepapers/latest/amazon-aurora-mysql-db-admin-handbook/dns-endpoints.html

    I hope the above information was useful. Have a great day ahead!

  • Thanks again for the useful info.

    One more follow-up question: would using RDS Proxy in front of our Aurora DB clusters avoid the 5s delays seen here? Specifically, if we want more even load balancing across Aurora read replicas within each 5s window (rather than having requests mostly routed to the same read replica instance during each 5s), would RDS Proxy be able to do that? Thanks

  • Hello There,

    Looks like you have another quick follow up question.

    Regarding your query: yes, you can use Amazon RDS Proxy to create additional read-only endpoints for an Aurora cluster. This is also stated in the AWS documentation below.

    [+] : Load Balancing with reader endpoint - https://docs.aws.amazon.com/whitepapers/latest/amazon-aurora-mysql-db-admin-handbook/load-balancing-with-the-reader-endpoint.html

    “You can use Amazon RDS Proxy to create additional read-only endpoints for an Aurora cluster. These endpoints perform the same kind of load-balancing as the Aurora reader endpoint”.

    The additional read-only endpoints would help to balance read requests more evenly if your application opens a burst of many connections within a few seconds.

    You can also refer to the article below for more information: [+] : https://aws.amazon.com/blogs/database/use-amazon-rds-proxy-with-read-only-endpoints/
