Redshift CloudWatch alarms


Hi all, is there any documentation that illustrates the best or must-have CloudWatch alarms for tracking Redshift resource usage, resource contention, and blocking queries?

Background: We have seen instances where our Redshift cluster becomes unresponsive until some of the blocking queries are killed. Handling this is manual and time-consuming. Note that in most of these instances, CPU utilization and the other cluster performance metrics were nominal. The "Average queue wait time by priority" was around 9 minutes, and total data scanned looked nominal too. I am looking for a way to detect such abnormalities via CloudWatch alarms.

Any input in this regard is greatly appreciated.

Asked 2 years ago · 879 views
1 answer
Accepted answer

There should be many in the various books on the tool. For the team I lead, we track the following things:

Monitor cluster CPU utilization: You can set up an alarm to trigger if the average CPU utilization of the cluster exceeds a certain threshold. This can indicate that the cluster is under heavy load and may require additional resources or optimization.

Monitor queue wait time: You can monitor the average queue wait time for queries by priority. If the wait time is consistently high, it may indicate that the cluster is under heavy load or that there are blocking queries.

Monitor data scanned: You can set up an alarm to trigger if the amount of data scanned by queries exceeds a certain threshold. This can indicate that queries are performing full table scans or that the cluster is under heavy load.

Monitor disk space: You can set up an alarm to trigger if the amount of disk space used by the cluster exceeds a certain threshold. This can indicate that the cluster is running out of space and may need to be resized or that data needs to be deleted or moved to cold storage.

Monitor network throughput: You can set up an alarm to trigger if the network throughput of the cluster exceeds a certain threshold. This can indicate that the cluster is experiencing heavy network traffic and may need additional resources or optimization.
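As a rough sketch of how the alarms above could be defined, the snippet below builds the keyword arguments for the CloudWatch put_metric_alarm call for three of them. The metric names (CPUUtilization, PercentageDiskSpaceUsed, WLMQueueWaitTime) are my understanding of what the AWS/Redshift namespace exposes, so check them against the metrics listed for your cluster. The helper names, the cluster identifier, and every threshold are hypothetical placeholders, not recommended values. WLM metrics may also need an extra dimension such as a queue name or priority, which the extra_dimensions parameter is meant to cover.

```python
from typing import Any, Dict, List, Optional


def redshift_alarm(
    cluster_id: str,
    metric_name: str,
    threshold: float,
    *,
    statistic: str = "Average",
    period_seconds: int = 300,
    evaluation_periods: int = 3,
    extra_dimensions: Optional[List[Dict[str, str]]] = None,
    alarm_actions: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's put_metric_alarm call.

    Hypothetical helper: it only assembles a dict and never contacts AWS.
    """
    dimensions = [{"Name": "ClusterIdentifier", "Value": cluster_id}]
    dimensions.extend(extra_dimensions or [])
    return {
        "AlarmName": f"{cluster_id}-{metric_name}-high",
        "Namespace": "AWS/Redshift",
        "MetricName": metric_name,
        "Dimensions": dimensions,
        "Statistic": statistic,
        "Period": period_seconds,                  # seconds per datapoint
        "EvaluationPeriods": evaluation_periods,   # consecutive breaches required
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": alarm_actions or [],       # e.g. an SNS topic ARN
    }


def example_alarms(cluster_id: str) -> List[Dict[str, Any]]:
    """Placeholder thresholds; tune them to your own workload."""
    return [
        redshift_alarm(cluster_id, "CPUUtilization", 90.0),           # percent
        redshift_alarm(cluster_id, "PercentageDiskSpaceUsed", 80.0),  # percent
        # Assumed to be reported in milliseconds: 5 minutes of queue wait.
        redshift_alarm(cluster_id, "WLMQueueWaitTime", 5 * 60 * 1000),
    ]


# To create the alarms for real (requires boto3 and AWS credentials):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   for kwargs in example_alarms("my-cluster"):
#       cloudwatch.put_metric_alarm(**kwargs)
```

Given the background in the question, where CPU looked nominal while queries were stuck waiting, the queue wait time alarm is probably the one most likely to catch that situation. Requiring several consecutive evaluation periods keeps a single slow query from triggering it.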


Answered 2 years ago
  • I built a CloudWatch alarm against the Redshift metric Query_waittime. This helped me monitor the cluster's behaviour and act on it sooner.
