Redshift UNION ALL handling of re-defined data types


I'm using the following CTAS syntax to create a temp table:

create temp table test as
select 'Value1'::varchar(100) as field1
union all 
select 'Value2'::varchar(100) as field1

I expected field1 to have the data type varchar(100), but it has varchar(6) instead. Running the following, however, does result in field1 being varchar(100):

create temp table test as
select 'Value1'::varchar(100) as field1

Is this expected behaviour, and if so, is there some syntax I can use to have field1 varchar(100) using a union all statement?

Zan
asked 1 year ago, 383 views
2 Answers
Accepted Answer

For UNION ALL and UNION, the final field size is determined by the maximum length of the actual string values present. In the above case it is 6 because both ‘Value1’ and ‘Value2’ have 6 characters.

If you change the example to:

create temp table test3 as
select 'Value12345'::varchar(100) as field1
union  
select 'Value2'::varchar(100) as field1;

then the field1 column will be defined as VARCHAR(10).

To get the desired outcome of varchar(100) for field1, you can do the following:

create temp table test5 as
with cte as (
    select 'Value1'::varchar(100) as field1
    union all
    select 'Value2'::varchar(100) as field1
)
select field1::varchar(100) from cte;
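
One way to confirm the resulting column type is to query PG_TABLE_DEF after running the statement above. This is only a minimal sketch: it assumes the temp table test5 was created in the current session and is visible through the session's search_path. If no rows come back, the search_path is the first thing to check.

```sql
-- Assumption: test5 exists in this session (created by the CTAS above)
-- and is visible through the current search_path.
select "column", type
from pg_table_def
where tablename = 'test5';
```

If the outer cast took effect, the type reported for field1 should be character varying(100) rather than character varying(6).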
AWS
Support Engineer
Regis_M
answered 1 year ago
Expert
reviewed 1 year ago

Redshift decides the column length based on the actual data. If you need a column length longer than the actual data, you can follow the answer above. Another option is to create the table manually first and use the BACKUP NO option. Because this is a permanent table, it has one advantage: the system catalog tables do not need to be updated every time, as they are when a temp table is created. On a busy system where many tables are created and dropped, those updates can lead to performance issues, because the system catalog tables can bloat and need a system restart to be cleaned up.
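
The create-the-table-first option could look roughly like the sketch below. The table name test_fixed is hypothetical, and the sketch assumes the table can be created once and reused; the explicit column definition, not the selected values, fixes the length at varchar(100).

```sql
-- Hypothetical table name. The column length is declared up front,
-- so it does not depend on the values loaded later.
create table test_fixed (
    field1 varchar(100)
)
backup no;  -- exclude the table from snapshots

-- Load the rows with the same UNION ALL as in the question.
insert into test_fixed
select 'Value1'
union all
select 'Value2';
```

Since the table is permanent, it would normally be emptied (for example with truncate) and reloaded between runs rather than dropped and recreated.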

AWS
answered 1 year ago
