Athena Iceberg table commit error when updating from AWS Lambda

I'm trying to update an Iceberg table with the Athena client in AWS Lambda, but I'm getting a commit error. My code runs about 100 queries at a time, and the interval between generated queries is 1 to 2 seconds.

The error is as follows: ICEBERG_COMMIT_ERROR: Failed to commit Iceberg update to the table: . If a data manifest file was generated at 's3://bucket_name/path/manifest.csv', you may need to manually clean the data from locations specified in the manifest. Athena will not delete data in your account.

Any idea what the issue is?

Topics: Analytics, Serverless, Compute

Tags: Amazon Athena, AWS Lambda

yunbro_lee

asked 23 days ago, 148 views

1 Answer

Accepted Answer

Hello.

I haven't seen the code and query you are using, so I don't know the details, but if you delete 's3://bucket_name/path/manifest.csv' as the error message suggests, does the query run?

If you can share the query and code you are running, could you please do so?

EXPERT

Riku_Kobayashi

answered 23 days ago

EXPERT

Oleksii Bebych

reviewed 22 days ago

EXPERT

Adeleke Adebowale J

reviewed 23 days ago

yunbro_lee

23 days ago

Hello.

When I checked for the "manifest.csv" file after this error, the file did not exist. I also confirmed that the query succeeds when I re-run it.

It is difficult to share all of the code, but the part that runs the query is as follows:

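import boto3

session = boto3.Session()
athena_client = session.client('athena')

# Submit the query; Athena returns immediately with an execution ID.
response = athena_client.start_query_execution(
    QueryString=query_string,
    QueryExecutionContext={
        'Database': database_name
    },
    ResultConfiguration={
        'OutputLocation': 's3://bucket_name/path/'
    },
    WorkGroup='group'
)
query_execution_id = response['QueryExecutionId']

# Status polling is omitted from this excerpt; `status` holds the final query state.
# The lines below run inside a function, hence the `return` statements.
if status == 'SUCCEEDED':
    print(f"[{query_execution_id}] Query SUCCEEDED!")
    results = athena_client.get_query_results(QueryExecutionId=query_execution_id)
    return results['ResultSet']['Rows']
else:
    print(f"[{query_execution_id}] Query failed!")
    return None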

I think the parallel execution is the problem. Is there any way to fix it?

Riku_Kobayashi EXPERT
23 days ago
Thank you for sharing the code. I'm not sure if it will lead to a direct solution, but how about setting the Lambda function's concurrency to 1, as described in the GitHub issue below? https://github.com/aws/aws-sdk-pandas/issues/2651#issuecomment-1955081562
Also, could you try running the queries with fewer parallel executions, such as 10 instead of 100?
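The two suggestions above could be sketched roughly as follows. This is a minimal illustration, not the code from the linked issue: `limit_lambda_concurrency` and `chunked` are hypothetical helper names, `lambda_client` is assumed to be a `boto3.client('lambda')` instance, and the function name and batch size are placeholders.

```python
def limit_lambda_concurrency(lambda_client, function_name, limit=1):
    """Set the function's reserved concurrency so at most `limit` copies run at once."""
    # put_function_concurrency is the boto3 Lambda call for reserved concurrency.
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )


def chunked(items, size):
    """Yield successive batches of at most `size` items (e.g. 10 queries instead of 100)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

With a real client, the call would look like `limit_lambda_concurrency(boto3.client('lambda'), 'my-function-name')`, where the function name is a placeholder. A reserved concurrency of 1 makes invocations run one at a time, which should keep two writers from committing to the same Iceberg table simultaneously, at the cost of throughput.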
yunbro_lee
23 days ago
Thank you for your reply. I will apply the concurrency limit to address the issue.
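Since re-running the failed query worked in this thread, another option alongside a concurrency limit might be to resubmit a query when it fails with ICEBERG_COMMIT_ERROR. The sketch below is an assumption-laden illustration, not a confirmed fix: `wait_for_query` and `run_with_commit_retry` are hypothetical helpers, `athena_client` is assumed to be a `boto3.client('athena')` instance, and it assumes the error code appears in the `StateChangeReason` field returned by `get_query_execution`.

```python
import random
import time


def wait_for_query(athena_client, query_execution_id, poll_seconds=1.0, sleep=time.sleep):
    """Poll Athena until the query leaves the QUEUED/RUNNING states; return its Status dict."""
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_execution_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] not in ("QUEUED", "RUNNING"):
            return status
        sleep(poll_seconds)


def run_with_commit_retry(athena_client, start_kwargs, max_attempts=5,
                          base_delay=1.0, sleep=time.sleep):
    """Run a query, resubmitting it when it fails with ICEBERG_COMMIT_ERROR."""
    for attempt in range(1, max_attempts + 1):
        response = athena_client.start_query_execution(**start_kwargs)
        query_execution_id = response["QueryExecutionId"]
        status = wait_for_query(athena_client, query_execution_id, sleep=sleep)
        if status["State"] == "SUCCEEDED":
            return query_execution_id
        # Assumption: the failure message (including the error code) is in StateChangeReason.
        reason = status.get("StateChangeReason", "")
        if "ICEBERG_COMMIT_ERROR" not in reason or attempt == max_attempts:
            raise RuntimeError(f"[{query_execution_id}] Query {status['State']}: {reason}")
        # Exponential backoff with jitter so concurrent writers do not retry in lockstep.
        sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay))
```

Here `start_kwargs` would be the same keyword arguments passed to `start_query_execution` in the excerpt above (QueryString, QueryExecutionContext, ResultConfiguration, WorkGroup). Retrying is only appropriate when resubmitting the statement is safe for your data, so it is worth checking that a failed commit left nothing behind before relying on it.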