We run an ElastiCache Memcached server that stores around 30,000 items and is queried and updated around 50 times per second, with one request per client. During a peak load we started to see Memcached get() calls time out, and set() calls fail as well.
The get() calls failed with error 31, which is RES_TIMEOUT (MEMCACHED_TIMEOUT).
The set() calls then failed with error 47, which is MEMCACHED_SERVER_TEMPORARILY_DISABLED.
Memcached constants are from: https://www.php.net/manual/en/memcached.getresultcode.php
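For context, here is a minimal sketch of how these codes can be surfaced from the PHP memcached extension. The helper names `describe_result_code` and `log_failed_get` are hypothetical and only for illustration; the code-to-name mapping covers just the two values reported above, and this is not our production code.

```php
<?php
// Hypothetical helper: maps the two result codes seen in this incident
// (31 and 47, as reported above) to their constant names.
function describe_result_code(int $code): string
{
    $names = [
        31 => 'MEMCACHED_TIMEOUT',
        47 => 'MEMCACHED_SERVER_TEMPORARILY_DISABLED',
    ];
    return $names[$code] ?? "UNKNOWN($code)";
}

// Hypothetical wrapper: performs a get() and logs the result code when the
// call fails for a reason other than a plain cache miss.
function log_failed_get(Memcached $mc, string $key)
{
    $value = $mc->get($key);
    $code = $mc->getResultCode();
    if ($value === false && $code !== Memcached::RES_NOTFOUND) {
        error_log(sprintf(
            'get(%s) failed: %s (%s)',
            $key,
            describe_result_code($code),
            $mc->getResultMessage()
        ));
    }
    return $value;
}
```

Usage would look like `log_failed_get($mc, 'some-key')`, where `$mc` is a `Memcached` instance already pointed at the cache endpoint.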
Version specifics.
Memcached (v1.6.6) - 1 node in a cluster
Clients: PHP 7, Ubuntu v20, memcached version (v1.5.22)
Now, if we had implemented a multi-node Memcached cluster, used the ElastiCache cluster client with auto discovery, and there was a healthy node in the cluster, would auto discovery fail over to the other nodes in the cluster, or do we have to build that logic into our application?
We cache strings and small images. Alternatively, should we move to a Redis cluster? From what I understand, Redis automatically promotes a replica to a new master in the event of a failure.
Thanks for any suggestions!
Thanks for the links, Didier. I do see some information that is helpful and interesting. What I'm really trying to determine, though, is not so much why we hit some timeouts or how to fix them. Rather: if Memcached returns a timeout, would we fail over to another node in a Memcached cluster, or do we have to build that failover/rerouting into our application? Alternatively, does Redis provide automatic failover in this scenario, either failover or replica replacement?