How do I check the resource utilization for my SageMaker notebook instance?


I started an Amazon SageMaker notebook instance to train models or load large datasets, and the notebook instance appears to be frozen. How do I view the resource utilization of my SageMaker notebook instance?

Resolution

When you use a SageMaker notebook instance to prototype, train models, or work with large datasets, the browser or the notebook instance might appear unresponsive. If that happens, check the resource utilization of the notebook instance to see how much processor, memory, and disk capacity is currently in use.

You can view your SageMaker resource utilization using one of the following approaches:

  • Running Linux-based commands
  • Reviewing Amazon CloudWatch metrics

Viewing SageMaker resource utilization with Linux commands

SageMaker notebook instances are based on Amazon Linux. You can run Linux commands from the SageMaker terminal to view SageMaker resource utilization.

To view your resource utilization with Linux commands, do the following:

  1. Open the SageMaker console.

  2. On the navigation pane, choose Notebook Instances.

  3. Choose Open Jupyter or Open JupyterLab next to the SageMaker notebook instance of your choice.

  4. Open a terminal.

  5. Run the following commands to view your SageMaker resource utilization:

top

The preceding command displays a live view of processor load, system memory (RAM) use, and the running processes. Press q to exit.
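To judge whether the load average that top reports is high, you can compare it with the number of CPUs on the instance. The following is a minimal sketch, assuming a Linux host that exposes /proc/loadavg and provides nproc. The name load_per_cpu is a hypothetical helper, not part of any SageMaker tooling.

```shell
# Hypothetical helper: divide a load average by the CPU count.
# As a rough rule of thumb, a result near or above 1.00 suggests
# that the CPUs are fully busy.
load_per_cpu() {
  # $1: load average (for example, the 1-minute value); $2: number of CPUs
  awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Usage on the notebook instance (assumes /proc/loadavg and nproc exist).
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
  load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
fi
```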

ps -ax

The preceding command lists all running processes, along with the cumulative CPU time that each process has used.
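If the process list is long, it can help to rank processes by memory use. The following is a minimal sketch, assuming a ps implementation that accepts the pid, pcpu, pmem, and comm output fields (as procps on Amazon Linux does). The name rank_by_mem is a hypothetical helper.

```shell
# Hypothetical helper: keep the header row, then show the N rows with the
# highest value in column 3 (%MEM). N defaults to 5.
rank_by_mem() {
  n="${1:-5}"
  {
    IFS= read -r header
    printf '%s\n' "$header"
    LC_ALL=C sort -k3,3nr | head -n "$n"
  }
}

# Usage: the five processes that use the most memory.
if command -v ps >/dev/null 2>&1; then
  ps -eo pid,pcpu,pmem,comm | rank_by_mem 5
fi
```

A Jupyter kernel that holds a large dataset in memory typically appears near the top of this list.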

df -h

The preceding command displays disk space utilization and availability.
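To spot a nearly full volume quickly, you can filter the df output by a usage threshold. The following is a minimal sketch. The name flag_full_filesystems and the 80 percent threshold are illustrative assumptions, not AWS recommendations.

```shell
# Hypothetical helper: read `df -P` output on stdin and report every
# filesystem whose use is at or above the given percentage (default 80).
flag_full_filesystems() {
  awk -v limit="${1:-80}" '
    NR > 1 {
      used = $5
      sub(/%/, "", used)            # "91%" -> "91"
      if (used + 0 >= limit + 0)
        print $6 " is " $5 " full"
    }'
}

# Usage: -P selects the POSIX output format, one line per filesystem.
df -P | flag_full_filesystems 80
```

The sketch assumes that mount points contain no spaces, because it reads the mount point from the sixth whitespace-separated column.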

free -m

The preceding command displays system memory (RAM) and swap utilization and availability, in mebibytes.
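To reduce the memory figures to a single percentage, you can read the same kernel data that free uses. The following is a minimal sketch, assuming a Linux kernel that reports the MemAvailable field in /proc/meminfo. The name mem_used_percent is a hypothetical helper.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print the
# percentage of RAM in use (truncated to a whole number), based on the
# MemTotal and MemAvailable fields.
mem_used_percent() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }'
}

# Usage on the notebook instance.
if [ -r /proc/meminfo ]; then
  mem_used_percent < /proc/meminfo
fi
```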

Viewing SageMaker resource utilization using CloudWatch

You can use CloudWatch to view your SageMaker resource utilization by using a lifecycle configuration script. For example, the publish-instance-metrics script publishes the system-level metrics from the notebook instance into CloudWatch.

To configure your SageMaker notebook instance so that you can view its metrics in CloudWatch, do the following:

  1. Open the SageMaker console.

  2. On the navigation pane, choose Notebook Instances.

  3. Choose Open Jupyter or Open JupyterLab next to the SageMaker notebook instance of your choice.

  4. Open a terminal.

  5. Run the following command to open the amazon-cloudwatch-agent-config-wizard:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

  6. Follow the steps in the wizard. When prompted, do the following:

       • Choose On-premises host
       • Choose no for StatsD Daemon
       • Choose no for CollectD

  7. When the wizard completes, it automatically creates a config.json file. This file is used in the next step.

  8. Start the CloudWatch agent on your notebook instance with the following command:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:///opt/aws/amazon-cloudwatch-agent/bin/config.json -s

  9. In the CloudWatch console, choose Metrics, and then choose CWAgent.

  10. The CWAgent namespace displays your current SageMaker notebook instance metrics.
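If no metrics appear under CWAgent, you can first confirm that the agent is running. The following is a minimal sketch, assuming that the status action of amazon-cloudwatch-agent-ctl prints JSON that contains "status": "running" while the agent is active. The name agent_is_running is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the status JSON on stdin reports a
# running agent. Assumes the output contains "status": "running".
agent_is_running() {
  grep -q '"status": *"running"'
}

# Usage on the notebook instance, after the agent has been started:
#   sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status \
#     | agent_is_running && echo "agent is running"
#
# To list the metrics that the agent has published (requires the AWS CLI
# and permission to call cloudwatch:ListMetrics):
#   aws cloudwatch list-metrics --namespace CWAgent
```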

For more information about example AWS lifecycle configuration scripts for SageMaker notebooks, see amazon-sagemaker-notebook-instance-lifecycle-config-samples.


Related information

Monitor Amazon SageMaker with Amazon CloudWatch

Metrics collected by the CloudWatch agent

Monitor Amazon SageMaker

Terminals - Jupyter Project documentation for Terminals

AWS OFFICIAL, updated 2 years ago