Redshift is a columnar database, so selecting only the columns you need will perform much better than using select * from a table. Based on the query you provided, try creating a sort key on the id column and see if it helps. Redshift typically updates statistics automatically, but you can also update them yourself with the ANALYZE command. https://docs.aws.amazon.com/redshift/latest/dg/t_Analyzing_tables.html
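As a rough sketch of those three suggestions, the small Python helper below only builds the SQL text so you can review it before running it in your own client. The table name `my_table` and the columns `col_a` and `col_b` are hypothetical placeholders, since the original query is not shown in this thread; only the `id` column comes from the suggestion above.

```python
# Hypothetical names: the real table and column list are not shown in this thread.
TABLE = "my_table"                   # assumption
COLUMNS = ["id", "col_a", "col_b"]   # assumption: only the columns the query needs


def build_statements(table, columns, sort_column="id"):
    """Return Redshift SQL strings for the three suggestions, in order."""
    column_list = ", ".join(columns)
    return [
        # Set a single-column sort key on the existing table.
        f"ALTER TABLE {table} ALTER SORTKEY ({sort_column});",
        # Refresh planner statistics manually.
        f"ANALYZE {table};",
        # Read explicit columns instead of SELECT *.
        f"SELECT {column_list} FROM {table};",
    ]


if __name__ == "__main__":
    for statement in build_statements(TABLE, COLUMNS):
        print(statement)
```

The function does not connect to a cluster or validate identifiers, so treat its output as a starting point to paste into a SQL editor rather than something to run unattended.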
Hello, to improve performance for this specific query and table, I would first explore data model optimizations, such as making sure the table has optimal compression and sort keys (e.g. on the id column). Distribution style is also important in most cases, but less so here since this query doesn't involve joins. You can add these characteristics to an existing table with the ALTER TABLE command. Look at the Redshift Advisor recommendations in the Redshift console to see if there are any data model optimizations recommended by the Redshift ML algorithms. Another aspect worth considering is whether your Redshift cluster is underpowered for this workload and/or other concurrent workloads. For example, examine the CPU utilization to see if it is peaking, and experiment with an increased node count to see if it results in improved query runtimes.
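For the compression part, one possible approach is to ask Redshift for encoding suggestions with ANALYZE COMPRESSION and then apply an encoding per column with ALTER TABLE. The sketch below only builds the SQL text. The table name `my_table` and the choice of `az64` for the `id` column are assumptions for illustration (az64 applies to numeric and date/time types), not a recommendation for your data.

```python
# Hypothetical inputs: table name and encodings are placeholders for illustration.
TABLE = "my_table"               # assumption
ENCODINGS = {"id": "az64"}       # assumption: column -> encoding to apply


def build_compression_statements(table, encodings):
    """Return Redshift SQL strings: report suggested encodings, then apply chosen ones."""
    # ANALYZE COMPRESSION reports suggested encodings; it does not change the table.
    statements = [f"ANALYZE COMPRESSION {table};"]
    for column, encoding in encodings.items():
        statements.append(
            f"ALTER TABLE {table} ALTER COLUMN {column} ENCODE {encoding};"
        )
    return statements


if __name__ == "__main__":
    for statement in build_compression_statements(TABLE, ENCODINGS):
        print(statement)
```

It is probably safest to run the ANALYZE COMPRESSION statement first and compare its report with the Advisor recommendations before applying any encoding change.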
Why are you getting 5M records from Redshift at a time? That isn't a typical pattern for Redshift.