Timeouts on access to S3 by Redshift after maintenance update

Hi,

Last weekend (20/01/2024), after an automatic maintenance upgrade of our provisioned Redshift cluster (in eu-west-1a) to version 1.0.61678, all our LOAD commands started to fail. Queries against "external" tables on S3 also time out 100% of the time. Regular queries still work normally, so it really looks like a problem with Redshift's connectivity to S3 only.

Fetching data from S3 from regular EC2 instances in the same AZ still works as before, which indicates that the VPC and routing tables are still fine (also, nothing else has changed besides this Redshift maintenance update).
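The errors quoted further down mention a timeout while resolving, which points at DNS lookup rather than routing. Below is a minimal sketch of a lookup timing check that could be run from one of those EC2 instances for comparison. The endpoint name `s3.eu-west-1.amazonaws.com` is an assumption (the post does not say which endpoint the cluster uses), and `time_dns_lookup` is a hypothetical helper, not part of any AWS tooling.

```python
import socket
import time

# Assumed endpoint: the public regional S3 endpoint for eu-west-1.
S3_ENDPOINT = "s3.eu-west-1.amazonaws.com"


def time_dns_lookup(host, resolver=socket.getaddrinfo):
    """Resolve host and report how long the lookup took (hypothetical helper)."""
    start = time.monotonic()
    try:
        results = resolver(host, 443)
    except socket.gaierror as exc:
        # Name resolution failed; report the failure instead of raising.
        return {
            "host": host,
            "ok": False,
            "seconds": time.monotonic() - start,
            "addresses": [],
            "error": str(exc),
        }
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    addresses = sorted({entry[4][0] for entry in results})
    return {
        "host": host,
        "ok": True,
        "seconds": time.monotonic() - start,
        "addresses": addresses,
        "error": None,
    }
```

Calling `time_dns_lookup(S3_ENDPOINT)` on an EC2 instance only shows what that instance's resolver returns. A fast, successful lookup there does not rule out a resolver problem on the Redshift side.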

In the STL_ERROR table, entries like the following can be seen:

  • Problem reading manifest file - S3CurlException: Resolving timed out after 50001 milliseconds, CurlE
  • S3CurlException: Resolving timed out after 50001 milliseconds, CurlError 28, multiCurlError 0, CanRe
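For anyone triaging similar entries, here is a rough sketch that pulls the relevant fields out of these messages. The regular expression is an assumption based only on the two truncated lines quoted above, not on a documented message format. In libcurl, error 28 is `CURLE_OPERATION_TIMEDOUT`, and the word before "timed out" names the phase that ran out of time (here, name resolution).

```python
import re

# Pattern inferred from the two truncated STL_ERROR lines above (an assumption,
# not a documented format). The CurlError part is optional because the first
# message is cut off before it.
CURL_TIMEOUT = re.compile(
    r"S3CurlException: (?P<phase>\w+) timed out after (?P<ms>\d+) milliseconds"
    r"(?:, CurlError (?P<curl_error>\d+))?"
)


def parse_s3_curl_error(message):
    """Return phase, timeout and curl error code, or None if nothing matches."""
    match = CURL_TIMEOUT.search(message)
    if match is None:
        return None
    curl_error = match.group("curl_error")
    return {
        "phase": match.group("phase"),
        "timeout_ms": int(match.group("ms")),
        "curl_error": int(curl_error) if curl_error is not None else None,
    }


messages = [
    "Problem reading manifest file - S3CurlException: Resolving timed out after 50001 milliseconds, CurlE",
    "S3CurlException: Resolving timed out after 50001 milliseconds, CurlError 28, multiCurlError 0, CanRe",
]
parsed = [parse_s3_curl_error(m) for m in messages]
for entry in parsed:
    print(entry)
```

Both entries report the `Resolving` phase, so the timeout happens before any connection to S3 is attempted.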

As there is no further visibility into this managed service, I have reached out to AWS Support to investigate the issue.

Has anybody else experienced the same?

Wkr, Bert

Asked 4 months ago, 211 views
1 Answer

It is unfortunate, Bert, that you have encountered this issue after a routine upgrade. Support should be able to assist you with a resolution.

I just wanted to bring up a general recommendation: run the Production cluster on the trailing track, and keep non-production clusters such as UAT, QA, and Dev on the latest (i.e., current) track. This way you catch such issues before they impact Production workloads.
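As a rough illustration of that recommendation, the sketch below builds the arguments for the Redshift `ModifyCluster` API, which accepts a `MaintenanceTrackName`. The cluster identifiers are hypothetical, the helper names are made up for this example, and the validation only covers the two tracks named above. The boto3 call assumes boto3 is installed and credentials are configured.

```python
# Tracks named in the recommendation above; this sketch accepts only these two.
KNOWN_TRACKS = ("current", "trailing")


def maintenance_track_request(cluster_id, track):
    """Build keyword arguments for a ModifyCluster call (hypothetical helper)."""
    if track not in KNOWN_TRACKS:
        raise ValueError(f"unexpected maintenance track: {track!r}")
    return {"ClusterIdentifier": cluster_id, "MaintenanceTrackName": track}


def apply_maintenance_track(cluster_id, track, region):
    """Send the change to Redshift; needs boto3 and AWS credentials."""
    import boto3  # third-party; imported lazily so the helper above works without it

    client = boto3.client("redshift", region_name=region)
    return client.modify_cluster(**maintenance_track_request(cluster_id, track))


# Hypothetical cluster identifiers, for illustration only.
plan = [
    maintenance_track_request("prod-cluster", "trailing"),
    maintenance_track_request("uat-cluster", "current"),
]
```

With this split, a release reaches the `current` clusters first, leaving time to spot regressions before the same release arrives on the `trailing` Production cluster.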

Best of luck!

AWS
Answered 4 months ago
Expert
Reviewed 1 month ago
