I'm confused by the AWS documentation regarding compatibility with Delta tables. We need to drop a column, which relies on the "column mapping" feature supported in Delta Lake 1.2.0. We do this through Spark SQL, and it is mandatory to set the following properties on our table in S3 in order to do so:
'delta.columnMapping.mode' = 'name',
'delta.minReaderVersion' = '2',
'delta.minWriterVersion' = '5'
https://docs.delta.io/latest/delta-column-mapping.html
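For reference, the Spark SQL we run looks roughly like the sketch below. The table name `my_db.my_table` and the column `obsolete_col` are placeholders, and the helper function is hypothetical: it only builds the statements, which would then be passed to `spark.sql(...)`. It also assumes a Delta Lake release in which `DROP COLUMN` is available.

```python
# Table properties required for column mapping (copied from the list above).
COLUMN_MAPPING_PROPERTIES = {
    "delta.columnMapping.mode": "name",
    "delta.minReaderVersion": "2",
    "delta.minWriterVersion": "5",
}


def build_drop_column_statements(table: str, column: str) -> list[str]:
    """Return two Spark SQL statements: enable column mapping, then drop the column."""
    props = ",\n  ".join(
        f"'{key}' = '{value}'" for key, value in COLUMN_MAPPING_PROPERTIES.items()
    )
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES (\n  {props}\n)",
        f"ALTER TABLE {table} DROP COLUMN {column}",
    ]


# Placeholder names; with a real SparkSession each statement would go to spark.sql(stmt).
for stmt in build_drop_column_statements("my_db.my_table", "obsolete_col"):
    print(stmt + ";")
```

Setting the properties first matters because the reader and writer protocol versions are upgraded by that statement, and this upgrade is what the Glue crawler and Athena later react to.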
After doing this we want to update our table in Glue, but the crawler fails because it uses Glue version 3 for crawling, which isn't compatible with Delta Lake 1.2.0: https://repost.aws/questions/QUyDYz31OnREGxy7gz2qIeuw/error-internal-service-exception-of-glue-crawler
We then tried to create the table through Athena, whose documentation states:
Column mapping and timestampNtz – Delta column mapping, which allows Delta table columns and the underlying Parquet file columns to use different names, and timestamp without timezone (timestampNtz) are supported.
Delta Lake reader version – Delta Lake reader protocol up to version 3 is supported.
(nothing about the writer though)
https://docs.aws.amazon.com/athena/latest/ug/delta-lake-tables.html
But it fails with:
Delta protocol version is too new for Athena DDL engine
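To double-check which protocol versions the table actually records, the `protocol` action in the Delta transaction log can be inspected. The sketch below assumes the JSON commit files under `_delta_log/` have already been downloaded and that their lines are passed in commit order. The function name and the sample lines are illustrative only, not taken from our real table.

```python
import json


def latest_protocol(log_lines):
    """Return the last `protocol` action found in Delta log JSON lines, or None."""
    protocol = None
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between actions
        action = json.loads(line)
        if "protocol" in action:
            protocol = action["protocol"]  # later commits override earlier ones
    return protocol


# Illustrative sample: a protocol upgrade after setting the table properties.
sample = [
    '{"protocol":{"minReaderVersion":1,"minWriterVersion":2}}',
    '{"commitInfo":{"operation":"SET TBLPROPERTIES"}}',
    '{"protocol":{"minReaderVersion":2,"minWriterVersion":5}}',
]
print(latest_protocol(sample))
```

If the reader version found this way is within what Athena documents as supported, the writer version would be the likely reason for the error, but that is only a guess on our side.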
So what are the options to update our table in Glue to be usable with Athena?
As I mentioned in my initial question, the crawler fails with an error, and there is a known issue:
"update our table in Glue but the crawler fails because it uses Glue version 3 for crawling and isn't compatible with delta lake 1.2.0. https://repost.aws/questions/QUyDYz31OnREGxy7gz2qIeuw/error-internal-service-exception-of-glue-crawler"