Hi,
I have some doubts about why my EMR clusters are failing. As you can see in the image below, I have a cluster that uses HDFS (with utilization peaking at only 5%), and while it is running a job, the nodes suddenly start being marked as UNHEALTHY.
Given that the cluster has 95% of its HDFS space free, are the nodes becoming UNHEALTHY because they are running out of space on their instance storage volumes (local EBS), for example due to temporary files? Should I add or enlarge the EBS volumes on the cluster's nodes?
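
To test the local-disk hypothesis, here is a minimal sketch of a check that could be run on an affected node. It assumes the nodes are flagged by YARN's disk health checker, whose default per-disk limit is, as far as I know, 90% utilization (`yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage`). The mount points listed and the function names are hypothetical; the real directories are whatever `yarn.nodemanager.local-dirs` and `yarn.nodemanager.log-dirs` point to on the cluster.

```python
import os
import shutil

# Assumption: YARN's default per-disk utilization limit of 90%.
THRESHOLD_PCT = 90.0


def disk_utilization_pct(path):
    """Return used space on the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def disks_over_threshold(paths, threshold_pct=THRESHOLD_PCT):
    """Return {path: percent used} for existing paths above the threshold."""
    result = {}
    for path in paths:
        if not os.path.exists(path):
            continue  # skip mount points that this node does not have
        pct = disk_utilization_pct(path)
        if pct > threshold_pct:
            result[path] = round(pct, 1)
    return result


if __name__ == "__main__":
    # Hypothetical mount points; replace with the node's actual YARN dirs.
    candidates = ["/", "/mnt", "/mnt1"]
    for path, pct in disks_over_threshold(candidates).items():
        print(f"{path}: {pct}% used, above the {THRESHOLD_PCT}% limit")
```

If this prints any mount point while the job is running, the local volumes rather than HDFS would be the likely cause, and larger or additional EBS volumes would be worth trying.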