Random Queries with [ErrorCode: INTERNAL_ERROR_QUERY_ENGINE] Amazon Athena experienced an internal error

0

Hi

I'm testing a lambda function that executes batches of small athena queries against an S3 glue table.

When I execute the function code locally as my admin it schedules the queries and they never error, when the function executes them perhaps 1 in 30 fail (each query is the same with different partition constraints, changing the date range of files to search).

I have even set the code to execute the identical query again if the original fails, in the majority of cases (not 100% of them but most) the 2nd execution succeeds, with no change in permissions or the query itself.

If I select the failed query from the athena console and re-run it, it executes without error.

If anyone from AWS support can look at the error, here are two identical queries in us-east-1 that failed and then succeeded... 042735b7-31e6-4ac4-90ef-913ac8c5201f FAILED 04a415cd-3f01-42b8-a05c-8b344d06085e SUCCEEDED

Regards

Matthew

asked a year ago444 views
1 Answer
0
Accepted Answer

Could you please share the exact error with which the query fails?

based on the few information it could be that when the lambda function executes the query it submits more queries that your quota allow as explained here.

If this is the case you could be able to resolve the issue either:

  1. requesting a quota increase (could not work if later you continue to submit more queries than the quota)
  2. throttling the number of queries that your lambda can run in parallel.
  3. set up a retry mechanism hope this helps
AWS
EXPERT
answered a year ago
  • Thanks, that was more or less the entire message, i had previously hit the quota exceeded messages, when this happens you get a meaningful error. I have asked for a quota increase but this particular message was not quota based (or the error was not saying quota), even when only one process was running I was getting random instances of these. That was on the 10th, however yesterday this behaviour stopped, the same processes running yesterday didn't fail randomly in the same way. My gut is something was failing behind the scenes and AWS have resolved the issue in this region. I did spend a day adding in retry mechanisms and reducing the batch sizes of the queries to make my process more tolerant to things like this.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions