AWS Glue Visual ETL Job fails using MERGE statement on Redshift, type mismatch error

0

Scenario:

Source table: Glue Data Catalog table study crawled from MySQL with columns:

  • id (int),
  • code (varchar),
  • desc (varchar)
  • and 2 other columns not used in the job.

Target table: Amazon Redshift table dim_study with columns:

  • id_pk (int; identity, primary key),
  • original_id (int),
  • code (varchar),
  • desc (varchar),
  • inserted_on (timestamp default getdate()),
  • updated_on (timestamp default null)

After setting up the Data Source, I added a Change Schema step to map the source and target columns (id to original_id, code to code and desc to desc), change names, drop those not used. The data preview looks fine, it only displays 3 columns, with the target names as headers.

Then I set up the target node, Redshift, pointing to the dimension table. Access type: Direct connection. Handling method: MERGE data into target table. Choose keys and simple actions: Matching keys original_id (as in target) Set up performance and security.

When I run the job, it throws this ERROR: column "original_id" is of type integer but expression is of type character varying. Hint: You will need to rewrite or cast the expression.

Note: If I change the handling method to APPEND, it works fine, BUT it doesn't fill the inserted_on column, which has a default GETDATE() value.

So I need to solve 2 issues: what is causing the type mismatch, and why the timestamp column is filling up with NULL values instead of current datetime (I suspect because the data source has only 3 columns).

asked 5 months ago92 views
No Answers

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions