跳至内容

Redshift diagnostics queries to identify the performance bottleneck

3 分钟阅读
内容级别:高级
0

This article provides the tool/queries to identify redshift tables having missing statistics, data skewness and longest running query.

  1. Query to find out tables w/ missing statistics

To find queries impacted by missing statistics in Redshift, you can query stl_explain and look for "missing statistics" in the plannode field. Group by the substring of the plannode, order by the count descending, like this:

select substring(trim(plannode),1,100) as plannode, count(*) 
from stl_explain 
where plannode like '%missing statistics%' 
group by plannode 
order by 2 desc;

This will provide a count of the queries impacted by missing statistics on each table, identified by the plannode value.

  1. Query to identify tables with data skew or unsorted rows in Redshift cluster
select trim(pgn.nspname) as schema, 
trim(a.name) as table, id as tableid, 
decode(pgc.reldiststyle,0, 'even',1,det.distkey ,8,'all') as distkey, dist_ratio.ratio::decimal(10,4) as skew, 
det.head_sort as "sortkey", 
det.n_sortkeys as "#sks", b.mbytes,  
decode(b.mbytes,0,0,((b.mbytes/part.total::decimal)*100)::decimal(5,2)) as pct_of_total, 
decode(det.max_enc,0,'n','y') as enc, a.rows, 
decode( det.n_sortkeys, 0, null, a.unsorted_rows ) as unsorted_rows , 
decode( det.n_sortkeys, 0, null, decode( a.rows,0,0, (a.unsorted_rows::decimal(32)/a.rows)*100) )::decimal(5,2) as pct_unsorted 
from (select db_id, id, name, sum(rows) as rows, 
sum(rows)-sum(sorted_rows) as unsorted_rows 
from stv_tbl_perm a 
group by db_id, id, name) as a 
join pg_class as pgc on pgc.oid = a.id
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
left outer join (select tbl, count(*) as mbytes 
from stv_blocklist group by tbl) b on a.id=b.tbl
inner join (select attrelid, 
min(case attisdistkey when 't' then attname else null end) as "distkey",
min(case attsortkeyord when 1 then attname  else null end ) as head_sort , 
max(attsortkeyord) as n_sortkeys, 
max(attencodingtype) as max_enc 
from pg_attribute group by 1) as det 
on det.attrelid = a.id
inner join ( select tbl, max(mbytes)::decimal(32)/min(mbytes) as ratio 
from (select tbl, trim(name) as name, slice, count(*) as mbytes
from svv_diskusage group by tbl, name, slice ) 
group by tbl, name ) as dist_ratio on a.id = dist_ratio.tbl
join ( select sum(capacity) as  total
from stv_partitions where part_begin=0 ) as part on 1=1
where mbytes is not null 
order by  mbytes desc;
  1. Query to identify longest running queries in your cluster :
select trim(database) as db, count(query) as n_qry, 
max(substring (qrytext,1,80)) as qrytext, 
min(run_minutes) as "min" , 
max(run_minutes) as "max", 
avg(run_minutes) as "avg", sum(run_minutes) as total,  
max(query) as max_query_id, 
max(starttime)::date as last_run, 
sum(alerts) as alerts, aborted
from (select userid, label, stl_query.query, 
trim(database) as database, 
trim(querytxt) as qrytext, 
md5(trim(querytxt)) as qry_md5, 
starttime, endtime, 
(datediff(seconds, starttime,endtime)::numeric(12,2))/60 as run_minutes,     
alrt.num_events as alerts, aborted 
from stl_query 
left outer join 
(select query, 1 as num_events from stl_alert_event_log group by query ) as alrt 
on alrt.query = stl_query.query
where userid <> 1 and starttime >=  dateadd(day, -7, current_date)) 
group by database, label, qry_md5, aborted
order by total desc limit 50;
AWS
专家
已​发布 1 年前708 查看次数