Timeouts on access to S3 by redshift after maintenance update



Last weekend 20/01/2024, after automatic maintenance upgrade of our provisioned redshift cluster (in eu-west-1a) to version 1.0.61678, all our LOAD commands started to fail. Also queries to "external" tables on S3 time out 100%. Regular queries still work normally, so it really looks to be a problem with the connectivity of redshift to S3 only.

Getting data from S3 from regular EC2 instances in same AZ still work as before, proving that VPC and routing tables are still fine (also: nothing else has changed besides this redshift maintenance update).

In STL_ERROR table, entries as below can be seen:

  • Problem reading manifest file - S3CurlException: Resolving timed out after 50001 milliseconds, CurlE
  • S3CurlException: Resolving timed out after 50001 milliseconds, CurlError 28, multiCurlError 0, CanRe

As there is no further visibility on this managed service, reached out to AWS support for investigation of this issue.

Anybody else experienced the same?

Wkr, Bert

gefragt vor 4 Monaten217 Aufrufe
1 Antwort

It is unfortunate Bert that you have encountered this issue after a routine upgrade and Support should be able to assist you with resolution.

I just wanted to bring the a general recommendation of running the Production cluster on trailing track and have non-prod like UAT, QA, Dev on the latest i.e. current track. This way you catch such issues before they impact Production workloads.

Best of luck!

profile pictureAWS
beantwortet vor 4 Monaten
profile picture
überprüft vor 2 Monaten

Du bist nicht angemeldet. Anmelden um eine Antwort zu veröffentlichen.

Eine gute Antwort beantwortet die Frage klar, gibt konstruktives Feedback und fördert die berufliche Weiterentwicklung des Fragenstellers.

Richtlinien für die Beantwortung von Fragen