How to solve HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split s3://bucket/raw/schema/table/file.parquet Rate exceeded (Service: AWSLakeFormation; Status Code: 400; Error Code: ThrottlingException...?

I'm trying to run Athena queries that join multiple tables. When I ran several queries simultaneously (8 queries), I got the error in the title. To address this, I reduced the number of files per table, down to a total of 356 files across all tables. Despite this, the issue persists.

What could be causing this problem, and how can I resolve it?

For context, I'm using the Boto3 client to call athena_client.start_query_execution(...) and then checking the status of each query every X seconds with athena_client.get_query_execution(QueryExecutionId=_id).

asked 7 months ago, 342 views
1 Answer
Since running multiple queries simultaneously (8 in your case) leads to throttling, try reducing the number of concurrent queries. Start by running fewer queries at a time, such as 2 or 3, and gradually increase while monitoring for throttling exceptions.

💡 If you need to run multiple queries, batch them to reduce the number of concurrent queries. This can help avoid rate limits.
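As a rough illustration of the batching advice above, here is a minimal sketch that keeps at most `max_concurrent` queries in flight and polls them the way the question describes. The function name `run_queries_bounded` and its parameters are hypothetical, not part of Boto3; it only assumes a client exposing `start_query_execution` and `get_query_execution`, such as the one returned by `boto3.client("athena")`.

```python
import time

# Athena query states that mean the query will not change state again.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_queries_bounded(athena_client, queries, max_concurrent=3,
                        poll_seconds=5.0, **start_kwargs):
    """Run queries with at most max_concurrent in flight.

    Hypothetical helper: returns {QueryExecutionId: final_state}.
    Extra keyword arguments (for example WorkGroup or ResultConfiguration)
    are passed through to start_query_execution.
    """
    pending = list(queries)
    running = set()
    results = {}
    while pending or running:
        # Start new queries until the concurrency cap is reached.
        while pending and len(running) < max_concurrent:
            response = athena_client.start_query_execution(
                QueryString=pending.pop(0), **start_kwargs
            )
            running.add(response["QueryExecutionId"])
        # Poll each in-flight query once and retire the finished ones.
        for query_id in list(running):
            execution = athena_client.get_query_execution(
                QueryExecutionId=query_id
            )
            state = execution["QueryExecution"]["Status"]["State"]
            if state in TERMINAL_STATES:
                results[query_id] = state
                running.discard(query_id)
        # Wait before the next polling round only if something is still running.
        if running:
            time.sleep(poll_seconds)
    return results
```

Starting with `max_concurrent=2` or `3` and raising it while watching for throttling errors matches the suggestion in the answer. A longer `poll_seconds` also reduces the number of `GetQueryExecution` calls.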

EXPERT
answered 7 months ago
  • Okay, thanks for your answer, Osvaldo. Do you know which limit I can check in AWS to see what is raising those throttling exceptions?

  • For Amazon Athena, you can check the service quotas under Per Account API Call Quotas. The default limits are:

    • StartQueryExecution and StopQueryExecution: 20 calls per second
    • GetQueryExecution and GetQueryResults: 100 calls per second

    ℹ️ If you use any of these APIs and exceed the default quota for the number of calls per second, or the burst capacity in your account, the Athena API issues an error similar to the following: "ClientError: An error occurred (ThrottlingException) when calling the <API_name> operation: Rate exceeded." To resolve it, reduce the number of calls per second, or the burst capacity for the API for this account.
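
One way to handle the `ThrottlingException` described above on the client side is to retry the call with exponential backoff and jitter. The sketch below is an assumption-laden example, not an official pattern: `call_with_backoff` and its parameters are hypothetical names, and it relies only on the fact that botocore's `ClientError` carries the service error code under `response["Error"]["Code"]`.

```python
import random
import time


def call_with_backoff(api_call, max_attempts=6, base_delay=0.5,
                      max_delay=20.0, sleep=time.sleep):
    """Call api_call(), retrying only when it raises a ThrottlingException.

    Hypothetical helper: api_call is a zero-argument callable wrapping
    one Athena API request.
    """
    for attempt in range(max_attempts):
        try:
            return api_call()
        except Exception as exc:
            # botocore's ClientError stores the error code in exc.response.
            response = getattr(exc, "response", None)
            code = None
            if isinstance(response, dict):
                code = response.get("Error", {}).get("Code")
            # Re-raise anything that is not throttling, or the final failure.
            if code != "ThrottlingException" or attempt == max_attempts - 1:
                raise
            # Exponential backoff with full jitter, capped at max_delay.
            sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))
```

For example, the status check from the question could be wrapped as `call_with_backoff(lambda: athena_client.get_query_execution(QueryExecutionId=_id))`. This only covers throttling raised to the client by the Athena API calls listed above. The error in the question names AWSLakeFormation as the service and would likely surface as a failed query rather than a client exception, so lowering query concurrency may still be needed.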
