What is the table IDs range for the temporary intermediate tables created by redshift during query execution?

0
  • For INSERT, UPDATE, DELETE, and MERGE queries, Redshift creates a bunch of temporary intermediate tables

  • For a given query_id, SYS_QUERY_DETAIL shall contain a lot of steps with a table_id and table_name. These tables at least from their name look like they are a result of temporary intermediate operations. We also noticed empirically that these tables have a very large table ID e.g. 1750937579.

  • We are trying to identify any tables that have an insert or delete step associated with it. This information will be used for internal auditing purposes. We try to identify such tables using the following query:

select table_id,
       table_name
from sys_query_detail
where step_name in ('insert', 'delete')
  and start_time >= DATEADD(DAY, -1, GETDATE())
  and table_name <> ''
  and table_id <> -1
order by table_name desc
  • However, while running such a query, we get a lot of (table_id, table_name) entries that are not user-defined permanent tables.

Is there a range that can be used to exclude such tables?

  • In the following example, we can see that the source and target tables have large table IDs 113490046 and 1750937579 respectively.
CREATE TABLE target (id INT, name CHAR(10));
CREATE TABLE source (id INT, name CHAR(10));

INSERT INTO target VALUES (101, 'Bob'), (102, 'John'), (103, 'Susan');
INSERT INTO source VALUES (102, 'Tony'), (103, 'Alice'), (104, 'Bill');

MERGE INTO target USING source ON target.id = source.id
WHEN MATCHED THEN UPDATE SET id = source.id, name = source.name
WHEN NOT MATCHED THEN INSERT VALUES (source.id, source.name);

select
    *
from sys_query_history
where query_text like '%merge%'
order by start_time desc;

select
    table_id, table_name, *
from sys_query_detail
where query_id = <query id of the above merge command>
order by table_name desc;

+----------+----------------------------+
|table_id  |table_name                  |
+----------+----------------------------+
|1699674   |table_id_repro.public.target|
|113490046 |table_id_repro.public.target|
|1750937579|table_id_repro.public.source|
|1699669   |table_id_repro.public.source|
|1699672   |merge_tt_611712f194a32      |
|1699674   |$target                     |
|1699674   |$target                     |
|1699674   |$target                     |
|1699674   |$target                     |
|90637     |                            |
|-1        |                            |
.
.
.
.
+----------+----------------------------+
Sumeet
已提问 3 个月前112 查看次数
1 回答
0
已接受的回答

In the above context you want to know the temporary table ID range in redshift and also you would like to filter out the temporary tables from user defined tables. when you perform operations like INSERT, UPDATE, DELETE, MERGE, the system creates temporary tables whose table IDs are quite large such as 113490046 and 1750937579 which you got above. Redshift manages these tables behind the scenes, and the IDs assigned are not sequential. Instead ,they are generated based on the systems internal processes and mechanisms. We can't filter out these temporary tables based on their table ID as they are dynamically allocated by redshift ,we can use SVV_TABLE_INFO view filters system tables and shows only user-defined tables. https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLES.html

AWS
Abbas_K
已回答 3 个月前
  • Thanks Abbas for letting us know.

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则