What is the table IDs range for the temporary intermediate tables created by redshift during query execution?

0
  • For INSERT, UPDATE, DELETE, and MERGE queries, Redshift creates a bunch of temporary intermediate tables

  • For a given query_id, SYS_QUERY_DETAIL shall contain a lot of steps with a table_id and table_name. These tables at least from their name look like they are a result of temporary intermediate operations. We also noticed empirically that these tables have a very large table ID e.g. 1750937579.

  • We are trying to identify any tables that have an insert or delete step associated with it. This information will be used for internal auditing purposes. We try to identify such tables using the following query:

select table_id,
       table_name
from sys_query_detail
where step_name in ('insert', 'delete')
  and start_time >= DATEADD(DAY, -1, GETDATE())
  and table_name <> ''
  and table_id <> -1
order by table_name desc
  • However, while running such a query, we get a lot of (table_id, table_name) entries that are not user-defined permanent tables.

Is there a range that can be used to exclude such tables?

  • In the following example, we can see that the source and target tables have large table IDs 113490046 and 1750937579 respectively.
CREATE TABLE target (id INT, name CHAR(10));
CREATE TABLE source (id INT, name CHAR(10));

INSERT INTO target VALUES (101, 'Bob'), (102, 'John'), (103, 'Susan');
INSERT INTO source VALUES (102, 'Tony'), (103, 'Alice'), (104, 'Bill');

MERGE INTO target USING source ON target.id = source.id
WHEN MATCHED THEN UPDATE SET id = source.id, name = source.name
WHEN NOT MATCHED THEN INSERT VALUES (source.id, source.name);

select
    *
from sys_query_history
where query_text like '%merge%'
order by start_time desc;

select
    table_id, table_name, *
from sys_query_detail
where query_id = <query id of the above merge command>
order by table_name desc;

+----------+----------------------------+
|table_id  |table_name                  |
+----------+----------------------------+
|1699674   |table_id_repro.public.target|
|113490046 |table_id_repro.public.target|
|1750937579|table_id_repro.public.source|
|1699669   |table_id_repro.public.source|
|1699672   |merge_tt_611712f194a32      |
|1699674   |$target                     |
|1699674   |$target                     |
|1699674   |$target                     |
|1699674   |$target                     |
|90637     |                            |
|-1        |                            |
.
.
.
.
+----------+----------------------------+
Sumeet
asked 2 months ago103 views
1 Answer
0
Accepted Answer

In the above context you want to know the temporary table ID range in redshift and also you would like to filter out the temporary tables from user defined tables. when you perform operations like INSERT, UPDATE, DELETE, MERGE, the system creates temporary tables whose table IDs are quite large such as 113490046 and 1750937579 which you got above. Redshift manages these tables behind the scenes, and the IDs assigned are not sequential. Instead ,they are generated based on the systems internal processes and mechanisms. We can't filter out these temporary tables based on their table ID as they are dynamically allocated by redshift ,we can use SVV_TABLE_INFO view filters system tables and shows only user-defined tables. https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLES.html

AWS
Abbas_K
answered 2 months ago
  • Thanks Abbas for letting us know.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions