AWS Glue Visual ETL Job fails using MERGE statement on Redshift, type mismatch error

Scenario:

Source table: Glue Data Catalog table study crawled from MySQL with columns:

id (int),
code (varchar),
desc (varchar)
and 2 other columns not used in the job.

Target table: Amazon Redshift table dim_study with columns:

id_pk (int; identity, primary key),
original_id (int),
code (varchar),
desc (varchar),
inserted_on (timestamp default getdate()),
updated_on (timestamp default null)

After setting up the Data Source, I added a Change Schema step to map the source and target columns (id to original_id, code to code and desc to desc), change names, drop those not used. The data preview looks fine, it only displays 3 columns, with the target names as headers.

Then I set up the target node, Redshift, pointing to the dimension table. Access type: Direct connection. Handling method: MERGE data into target table. Choose keys and simple actions: Matching keys original_id (as in target) Set up performance and security.

When I run the job, it throws this ERROR: column "original_id" is of type integer but expression is of type character varying. Hint: You will need to rewrite or cast the expression.

Note: If I change the handling method to APPEND, it works fine, BUT it doesn't fill the inserted_on column, which has a default GETDATE() value.

So I need to solve 2 issues: what is causing the type mismatch, and why the timestamp column is filling up with NULL values instead of current datetime (I suspect because the data source has only 3 columns).

トピック

分析データベース

タグ

AWS Glue データの抽出、変換、ロード Amazon Redshift

言語

English

NLopeDeBarrios

質問済み 5ヶ月前96ビュー

回答なし

新しい順
投票が多い順
コメントが多い順

AWS Glue Visual ETL Job fails using MERGE statement on Redshift, type mismatch error

関連するコンテンツ