SSM send-command not running on GPU


I have a GPU instance running on EC2 that I am using for an entity classification task. The instance type is a p3.2xlarge running Deep Learning AMI GPU TensorFlow 2.12.0 (Amazon Linux 2).

When I access the instance directly via SSH, I am able to execute the task on the GPU without issue. However, when I attempt to automate the task via the SSM send-command option, the function executes, but it only runs on the CPUs. The same thing happens with other test scripts I have created.

I can see that the script is running on a CPU rather than a GPU by inspecting the GPU load with GPUtil in Python. The time to finish is also considerably longer.
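For reference, a minimal sketch of the kind of GPU-load check described above, using GPUtil (the exact script is an assumption, not the original code):

```python
import GPUtil

# List the GPUs visible to this process and report their current utilisation.
gpus = GPUtil.getGPUs()
if not gpus:
    print("No GPU visible to this process")
for gpu in gpus:
    print(f"GPU {gpu.id} ({gpu.name}): load={gpu.load:.0%}, memory={gpu.memoryUtil:.0%}")
```

If the process environment cannot reach the NVIDIA driver (for example, nvidia-smi is not on the PATH of the user running the command), this check reports no GPUs even though the hardware is present.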

I have tried amending the script with Numba to force it to run on the GPU, but it still defaults to the CPU.

Is there any way to ensure that commands sent via the send-command function are run on the GPU?

2 Answers
Accepted Answer

In case others encounter a similar issue: my solution was that SSM was executing send-command as the root user, which for some reason couldn't find the GPU via Python. Running the command as ec2-user solved the issue.
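As a rough illustration of that fix, here is one way to send the command so the script runs as ec2-user rather than root. The instance ID, region, and script path are placeholders, and using runuser to pick up ec2-user's login environment is an assumption, not the exact command from the answer:

```python
import boto3

ssm = boto3.client("ssm", region_name="us-east-1")  # placeholder region

# Run the task as ec2-user so that user's login environment
# (CUDA libraries, conda environment, PATH) is used instead of root's.
response = ssm.send_command(
    InstanceIds=["i-0123456789abcdef0"],  # placeholder instance ID
    DocumentName="AWS-RunShellScript",
    Parameters={
        "commands": [
            "runuser -l ec2-user -c 'python3 /home/ec2-user/classify.py'"  # placeholder script path
        ]
    },
)
print(response["Command"]["CommandId"])
```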

Jim
answered 10 months ago

Hello,

Thank you for using Systems Manager Service.

Please note that SSM has no limitation that would cause send-command to execute tasks on the CPU only instead of the GPU.

To help you further, we require non-public details that are specific to your AWS account. Please open a support case with AWS using the following link: https://console.aws.amazon.com/support/home#/case/create

Thanks :)

AWS
answered 10 months ago
