Strange Behavior with Athena UPDATE of an Iceberg table


We are seeing strange behavior when updating an Iceberg table that has a mix of 'string' and 'int' columns:

UPDATE table1 set int_col1=0 where key_col = 'key'; -- works fine

UPDATE table1 set int_col2=0 where key_col = 'key'; -- works fine

UPDATE table1 set int_col1=9, int_col2=10 where key_col = 'key'; -- works fine

UPDATE table1 set int_col1=0, int_col2=0 where key_col = 'key'; -- fails with error GENERIC_INTERNAL_ERROR: symbolCounter 2 should be columnValueAndRowIdChannels.size() %s

It seems there is something wonky with the parser when more than one column is SET to 0. Is anyone else seeing this? Is there a workaround, or a problem with our approach? A DELETE followed by an INSERT works fine, but UPDATE is sure handy...

Asked 2 years ago, viewed 301 times
1 answer

As you may already know, Athena uses Presto on the back end. There was a known issue in PrestoSQL (Trino) where an UPDATE query would fail if two columns were set to the same value. See the following GitHub commit for details: https://github.com/trinodb/trino/commit/4e1cc58c2a73129a5d590c01e1e0040755e58248

This was fixed in newer versions of PrestoSQL, but Athena engine version 2 still uses an older version of Presto, which causes the problem above. I tested with Athena engine version 2 and hit the same issue. With Athena engine version 3 the issue appears to be fixed: the UPDATE query works when two columns are set to the same value, such as 0 in your case.
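If you have to stay on engine version 2 for a while, note that the single-column updates in the question succeed, so one stopgap is to issue one UPDATE per column. Below is a minimal Python sketch that builds those statements. The helper name `split_update` is hypothetical, values are assumed to be integers, and table, column, and WHERE text are assumed to be trusted (nothing is quoted or escaped). The statements run as separate queries, so the change is not atomic.

```python
def split_update(table, assignments, where):
    """Build one single-column UPDATE statement per assignment.

    Hypothetical helper: assumes trusted identifiers and integer values only.
    """
    statements = []
    for column, value in assignments.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"only integer values are supported, got {value!r}")
        statements.append(f"UPDATE {table} SET {column} = {value} WHERE {where}")
    return statements
```

Calling `split_update("table1", {"int_col1": 0, "int_col2": 0}, "key_col = 'key'")` returns two statements, one per column, which can then be submitted to Athena one after the other.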

To resolve the issue, please upgrade to Athena engine version 3 and test again. See the Athena engine version 3 documentation: https://docs.aws.amazon.com/athena/latest/ug/engine-versions-reference-0003.html
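The engine version is configured per workgroup. As a rough sketch, it could be checked and switched with boto3 along these lines. The workgroup name `primary` and the helper names are assumptions for illustration, and the Athena client is passed in as an argument so the logic does not depend on how credentials are set up. Please confirm the version string against the documentation above before relying on it.

```python
TARGET_VERSION = "Athena engine version 3"


def effective_engine_version(athena, workgroup):
    """Return the engine version the workgroup is actually running."""
    response = athena.get_work_group(WorkGroup=workgroup)
    engine = response["WorkGroup"]["Configuration"]["EngineVersion"]
    return engine["EffectiveEngineVersion"]


def ensure_engine_v3(athena, workgroup):
    """Switch the workgroup to engine version 3 if needed.

    Returns True if an update request was sent, False if nothing changed.
    """
    if effective_engine_version(athena, workgroup) == TARGET_VERSION:
        return False
    athena.update_work_group(
        WorkGroup=workgroup,
        ConfigurationUpdates={
            "EngineVersion": {"SelectedEngineVersion": TARGET_VERSION}
        },
    )
    return True


# Usage (needs boto3 and AWS credentials; "primary" is an assumed name):
#   import boto3
#   ensure_engine_v3(boto3.client("athena"), "primary")
```

After the switch, re-run the failing two-column UPDATE in that workgroup to confirm the behavior.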

AWS
Support Engineer
Answered 2 years ago
