SageMaker ml.g5.2xlarge instances failing due to an NVIDIA driver issue

Over the weekend my SageMaker ml.g5.2xlarge instance started failing with the following errors:

-> RuntimeError: No CUDA GPUs are available
-> NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
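To narrow down whether the driver or the framework is at fault, a minimal diagnostic sketch along these lines might help. It assumes Python 3, that `nvidia-smi` would be on the PATH when the driver is installed, and that the failing framework is PyTorch (the "No CUDA GPUs are available" message suggests this, but it is an assumption). The function name `check_gpu` is hypothetical.

```python
import shutil
import subprocess


def check_gpu():
    """Return a dict describing what the driver tool and PyTorch can see."""
    report = {"nvidia_smi_found": False, "nvidia_smi_ok": False, "torch_cuda": None}

    # Step 1: is the nvidia-smi binary on PATH at all?
    smi = shutil.which("nvidia-smi")
    if smi is not None:
        report["nvidia_smi_found"] = True
        # Step 2: a non-zero exit code normally means the tool could not
        # communicate with the driver.
        try:
            result = subprocess.run(
                [smi], capture_output=True, text=True, timeout=30
            )
            report["nvidia_smi_ok"] = result.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            report["nvidia_smi_ok"] = False

    # Step 3: ask PyTorch, if it is installed (assumed framework).
    try:
        import torch
        report["torch_cuda"] = torch.cuda.is_available()
    except ImportError:
        pass  # leave torch_cuda as None when PyTorch is absent

    return report


if __name__ == "__main__":
    print(check_gpu())
```

If `nvidia_smi_ok` is false while the binary is found, the problem most likely sits at the driver level rather than in the Python environment.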

wipwai
asked 2 months ago, 266 views
1 Answer
