Running MERGE INTO with more than one WHEN condition fails if the number of columns in the target table is > 321

Apache Iceberg version None

Query engine Athena (engine v3)

Hello everyone, today my team ran into a very strange bug using Iceberg via Athena. I'll describe the steps we used to reproduce the error below:

  1. We create an Iceberg table with an "id" column and 321 other columns (322 in total) holding random strings. In the example below we use awswrangler to create the table, but the same happens when the table is created using Athena directly.

EXAMPLE CODE:

import random
import string

import awswrangler as wr
import pandas as pd

NUM_COLS = 322

def get_random_string(length):
    letters = string.ascii_lowercase
    return ''.join(random.choice(letters) for _ in range(length))

# "id" plus NUM_COLS - 1 randomly named columns; the single row reuses the column names as values
columns = ['id'] + [get_random_string(5) for _ in range(NUM_COLS - 1)]
data = pd.DataFrame(data=[columns], columns=columns)

wr.athena.to_iceberg(
    data,
    workgroup="my-workgroup",
    database="my_database",
    table="iceberg_limits_322",
    table_location="s3://my_bucket/iceberg_limits",
)

  2. We then run the following query in Athena to insert an arbitrary value:

EXAMPLE QUERY:

MERGE INTO my_database.iceberg_limits_322 AS existing
USING (
  SELECT 'something' AS id
) AS new
ON existing.id = new.id
WHEN NOT MATCHED THEN INSERT (id) VALUES (new.id)
WHEN MATCHED THEN DELETE

  3. This results in the error:

[ErrorCode: INTERNAL_ERROR_QUERY_ENGINE] Amazon Athena experienced an internal error while executing this query. Please contact AWS support for further assistance. You will not be charged for this query. We apologize for the inconvenience.

Notice that the error only occurs when multiple WHEN clauses are used in the MERGE INTO query. When a single WHEN clause is used (to only insert or only delete records), everything works fine and the table can be used normally.
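For anyone who wants to check the single-WHEN versus multi-WHEN behaviour themselves, here is a minimal sketch that builds the three query variants as strings. The helper name `merge_variants` is made up for illustration, and the snippet only produces SQL text; submitting it to Athena is left out.

```python
def merge_variants(table: str) -> dict[str, str]:
    """Build three MERGE statements that differ only in their WHEN clauses.

    Hypothetical helper: it only returns SQL text and does not run anything.
    """
    head = (
        f"MERGE INTO {table} AS existing\n"
        "USING (SELECT 'something' AS id) AS new\n"
        "ON existing.id = new.id\n"
    )
    insert_clause = "WHEN NOT MATCHED THEN INSERT (id) VALUES (new.id)"
    delete_clause = "WHEN MATCHED THEN DELETE"
    return {
        # One WHEN clause each: reported to work
        "insert_only": head + insert_clause,
        "delete_only": head + delete_clause,
        # Two WHEN clauses: reported to fail on the 322-column table
        "insert_and_delete": head + insert_clause + "\n" + delete_clause,
    }


variants = merge_variants("my_database.iceberg_limits_322")
for name, sql in variants.items():
    print(f"-- {name}\n{sql}\n")
```

Note that the two single-WHEN statements are only a diagnostic here. Running them one after the other is not equivalent to the combined MERGE, because the first statement changes which rows the second one matches.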

We can replicate this behaviour on multiple AWS accounts and with different tables/databases/s3 locations.
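To repeat the test against another database, table, or S3 location without going through awswrangler, something like the following sketch could generate the CREATE TABLE statement for Athena directly. The function name `create_table_ddl` and the `col_001`-style column names are assumptions made for illustration (the original repro used random names), and the statement follows the usual Athena form for Iceberg tables with `'table_type' = 'ICEBERG'`; it has not been run as part of this report.

```python
def create_table_ddl(database: str, table: str, location: str, num_cols: int) -> str:
    """Return an Athena CREATE TABLE statement for an Iceberg table.

    The table gets an "id" column plus num_cols - 1 string columns.
    Column names col_001, col_002, ... are illustrative placeholders.
    """
    cols = ["id string"] + [f"col_{i:03d} string" for i in range(1, num_cols)]
    col_list = ",\n  ".join(cols)
    return (
        f"CREATE TABLE {database}.{table} (\n  {col_list}\n)\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


ddl = create_table_ddl(
    database="my_database",
    table="iceberg_limits_322",
    location="s3://my_bucket/iceberg_limits",
    num_cols=322,
)
print(ddl.splitlines()[0])
```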

After trying different numbers of columns, we consistently found that 321 is the maximum number of columns the table can have before the query fails. Everything works fine at or below this threshold.
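The search for the threshold can be sketched as a simple bisection over the column count. This is an illustration only, not the exact procedure we followed: `merge_succeeds` is a hypothetical callback that would create a table with `n` columns, run the two-WHEN MERGE, and return whether it succeeded. The sketch assumes that the query works for small tables and fails for every table above some size.

```python
from typing import Callable


def max_working_columns(merge_succeeds: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the largest column count for which merge_succeeds is True.

    Assumes merge_succeeds(lo) is True, merge_succeeds(hi) is False, and that
    once the query fails for some count it fails for all larger counts.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if merge_succeeds(mid):
            lo = mid  # still works, so the limit is at mid or above
        else:
            hi = mid  # fails, so the limit is below mid
    return lo


# Stand-in predicate that mimics the behaviour described in this post;
# a real check would have to create the table and run the query in Athena.
print(max_working_columns(lambda n: n <= 321, lo=1, hi=1000))  # 321
```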

Asked 20 days ago, 197 views