I want to troubleshoot issues that occur when I use a custom image to build my Amazon SageMaker Studio JupyterLab environment.
Resolution
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
Check your permissions
You must have the correct AWS Identity and Access Management (IAM) permissions to push images to the Amazon Elastic Container Registry (Amazon ECR) repository.
Your SageMaker Studio users' custom IAM policies must include the sagemaker:ListImages permission so that users can create spaces and view and list images.
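The following sketch shows what a minimal policy statement for that permission might look like. It assumes that the permission corresponds to the sagemaker:ListImages IAM action. The function name and the Sid value are hypothetical, and your users probably need additional actions for their other Studio tasks.

```shell
# Print a minimal identity-based policy document (illustrative only).
# Assumption: sagemaker:ListImages is the action that the text refers to.
# Add the other actions that your users need before you attach the policy.
print_list_images_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowListingStudioImages",
      "Effect": "Allow",
      "Action": ["sagemaker:ListImages"],
      "Resource": "*"
    }
  ]
}
EOF
}
```

For example, you can run print_list_images_policy > policy.json, and then review the file before you attach it to a role.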
Use the WORKDIR Dockerfile instruction
The Amazon Elastic Block Store (Amazon EBS) volume that's associated with SageMaker Studio is mounted at /home/sagemaker-user. You can't change the mount path. To make sure that your custom image operates within /home/sagemaker-user, use the WORKDIR Dockerfile instruction to set your image's working directory to a subfolder within /home/sagemaker-user. For more information, see WORKDIR on the Docker Docs website.
Note: Don't use the /opt/.sagemakerinternal, /opt/ml, or /var/log/studio reserved directories in your custom images. SageMaker AI uses these directories to store its metadata and logs.
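To catch references to the reserved directories before you build, you can scan your Dockerfile. The following is a minimal sketch. The check_reserved_dirs function name is hypothetical, and the pattern matches only literal paths, not paths that are assembled from variables.

```shell
# Print every line of a Dockerfile that mentions a reserved SageMaker directory.
# Returns 0 when the file is clean and 1 when a reserved path is found.
check_reserved_dirs() {
  dockerfile="$1"
  # The trailing group keeps /opt/ml from matching longer names such as /opt/mlflow.
  if grep -n -E '(/opt/\.sagemakerinternal|/opt/ml|/var/log/studio)([/[:space:]"]|$)' "$dockerfile"; then
    echo "Reserved directory referenced in $dockerfile" >&2
    return 1
  fi
  return 0
}
```

For example, run check_reserved_dirs Dockerfile from the folder that contains your Dockerfile.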
Check your Dockerfile configuration
The following example Dockerfile installs Python packages and runs the container as a user without root permissions:
FROM public.ecr.aws/amazonlinux/amazonlinux:2
ARG NB_USER="sagemaker-user"
ARG NB_UID="1000"
ARG NB_GID="100"
RUN yum install --assumeyes python3 shadow-utils && \
    useradd --create-home --shell /bin/bash --gid "${NB_GID}" --uid ${NB_UID} ${NB_USER} && \
    yum clean all && \
    python3 -m pip install jupyterlab
RUN python3 -m pip install --upgrade pip
RUN python3 -m pip install --upgrade urllib3==1.26.6
USER ${NB_UID}
CMD jupyter lab --ip 0.0.0.0 --port 8888 \
    --ServerApp.base_url="/jupyterlab/default" \
    --ServerApp.token='' \
    --ServerApp.allow_origin='*'
Review your logs
If you still experience an issue when you create or launch the custom image, then review the SageMaker AI Studio logs in Amazon CloudWatch Logs. Locate the error messages in the /aws/sagemaker/studio log group and domain-id/app-name/JupyterLab/default log stream.
Troubleshoot your issue based on the error message that you find in the logs.
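The following sketch builds the log stream name in the format that's described above and retrieves recent messages with the AWS CLI. The function names are hypothetical, and the limit of 50 events is an arbitrary choice.

```shell
# Build the log stream name: domain-id/app-name/JupyterLab/default.
studio_log_stream() {
  domain_id="$1"
  app_name="$2"
  printf '%s/%s/JupyterLab/default\n' "$domain_id" "$app_name"
}

# Print recent messages from that stream.
# Requires the AWS CLI and credentials that can read CloudWatch Logs.
fetch_studio_logs() {
  aws logs get-log-events \
    --log-group-name /aws/sagemaker/studio \
    --log-stream-name "$(studio_log_stream "$1" "$2")" \
    --limit 50 \
    --query 'events[].message' \
    --output text
}
```

For example, run fetch_studio_logs DOMAIN-ID APP-NAME, and replace DOMAIN-ID and APP-NAME with your values.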
Resolve the "NotFoundError" error message
When the user ID (UID) and group ID (GID) in your Docker image don't match the values in the custom image's AppImageConfig, you receive the following error message:
"NotFoundError: SageMaker is unable to launch the app using the image [Image ARN]. Ensure that the UID/GID provided in the AppImageConfig matches the default UID/GID defined in the image."
The file system supports the sagemaker-user username. SageMaker supports a UID of 1000 and a GID of 100. When you build the Dockerfile, set UID to 1000 and GID to 100. If needed, grant sudo permission to your users.
To resolve the issue, complete the following steps:
- After you build your image, run the following command to verify the user configuration:
docker run -it --entrypoint id $IMAGE_URI
Expected output (the group name depends on your base image):
uid=1000(sagemaker-user) gid=100(sagemaker-user) groups=100(sagemaker-user)
- Make sure that your Dockerfile includes the necessary configuration to start a JupyterLab server. Add the following CMD Dockerfile instruction to your file:
CMD jupyter lab --ip 0.0.0.0 --port 8888 \
    --ServerApp.base_url="/jupyterlab/default" \
    --ServerApp.token='' \
    --ServerApp.allow_origin='*'
Note: For more information about the CMD instruction, see CMD on the Docker Docs website.
- Run the describe-app-image-config AWS CLI command to verify that your custom image's AppImageConfig has the correct UID and GID:
aws sagemaker describe-app-image-config --app-image-config-name APP-CONFIG-NAME
Note: Replace APP-CONFIG-NAME with your app image configuration name. You can also use the SageMaker AI console to view the app image configuration.
- Rebuild your Docker image, and then add it to Amazon ECR.
- Relaunch your JupyterLab space.
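To automate the user check in the steps above, you can test the output of the id command for the supported values. The following is a minimal sketch. The check_uid_gid function name is hypothetical, and the function checks only the UID and the primary GID, not supplementary groups.

```shell
# Succeed only when the `id` output reports UID 1000 and primary GID 100.
# The uid= and gid= labels are matched case-insensitively.
check_uid_gid() {
  id_output="$1"
  echo "$id_output" | grep -q -i -E '(^|[[:space:]])uid=1000(\(|[[:space:]]|$)' &&
    echo "$id_output" | grep -q -i -E '(^|[[:space:]])gid=100(\(|[[:space:]]|$)'
}
```

For example, run check_uid_gid "$(docker run --rm --entrypoint id "$IMAGE_URI")" && echo "UID and GID match". The --entrypoint option runs id regardless of whether the image starts JupyterLab with ENTRYPOINT or CMD.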
Resolve the "ContainerExecutionFailedError" error message
If your Dockerfile doesn't have the required configurations when you use JupyterLab with SageMaker Studio, then you might receive the following error message:
"Creating JupyterLab application for space: hc-test-new ContainerExecutionFailedError: SageMaker is unable to launch the app because the container entrypoint for image [43xxxxx0.dkr.ecr.eu-central-1.amazonaws.com/jupyterlab-custom-hc@sha256:bae93f0342ba56624b132159354a22d540d65d200a0624dae637a4c005b66789] is not configured correctly. Please change the entrypoint for the image or use a different image to launch the app. Output: [App container prematurely exited with status [exited], exit code [1], and error []."
To resolve the issue, add the following required configurations to your Dockerfile:
- Use a base image that includes Python 3.x.
- Run the following command to install JupyterLab:
pip install jupyterlab
- Set the following required environment variables:
ENV SHELL=/bin/bash
ENV NB_USER=sagemaker-user
ENV NB_UID=1000
ENV NB_GID=100
- If the image doesn't already have the sagemaker-user user, then add the following instruction to create it:
RUN useradd -m -s /bin/bash -N -u ${NB_UID} ${NB_USER}
- Add the following instruction to set the working directory to the user's home directory:
WORKDIR /home/${NB_USER}
- Add the following instruction to set the user:
USER ${NB_USER}
- Add the following instructions to set the entry point to run JupyterLab:
ENTRYPOINT ["jupyter-lab"]
CMD ["--ServerApp.ip=0.0.0.0", "--ServerApp.port=8888", "--ServerApp.allow_origin=*", "--ServerApp.token=''", "--ServerApp.base_url=/jupyterlab/default"]
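Before you rebuild, you can check that your Dockerfile contains the configurations in the preceding list. The following is a rough sketch. The function names are hypothetical, and the patterns are deliberately loose: they confirm only that each item appears in the file, not that the file builds or that the instructions are in a valid order.

```shell
# Helper: succeed when the extended regex $2 matches a line of file $1.
has_line() { grep -q -E "$2" "$1"; }

# Print one "missing: ..." line per absent item; return 1 if any are missing.
check_jupyterlab_dockerfile() {
  f="$1"
  status=0
  has_line "$f" 'pip3? install.*jupyterlab' || { echo "missing: JupyterLab install"; status=1; }
  has_line "$f" 'NB_UID="?1000"?([^0-9]|$)' || { echo "missing: NB_UID=1000"; status=1; }
  has_line "$f" 'NB_GID="?100"?([^0-9]|$)' || { echo "missing: NB_GID=100"; status=1; }
  has_line "$f" '^USER[[:space:]]' || { echo "missing: USER instruction"; status=1; }
  has_line "$f" '^(ENTRYPOINT|CMD).*jupyter' || { echo "missing: JupyterLab ENTRYPOINT or CMD"; status=1; }
  return "$status"
}
```

For example, run check_jupyterlab_dockerfile Dockerfile. No output and an exit status of 0 mean that every item was found.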
Resolve the "400 ResourceNotFound" error message
If there's an image mismatch in SageMaker Studio, then you might receive the following error message:
"[400] ResourceNotFound: Image with ARN arn:aws:sagemaker:us-east-1:123456789:image/codeeditor-image"
The preceding error can occur for the following reasons:
- You deleted the previous image version but didn't detach it from SageMaker Studio first.
- The image Amazon Resource Name (ARN) in the domain settings doesn't match the available image versions.
- The referenced image version no longer exists in your AWS account.
To resolve this issue, take the following actions:
- Verify that the image version exists in your account.
- Make sure that you correctly attached the image to SageMaker Studio.
- Run the following update-domain command to update to the latest SageMaker Studio image:
aws sagemaker update-domain \
    --domain-id example-domain-name \
    --default-user-settings '{
        "JupyterLabAppSettings": {
            "CustomImages": [
                {
                    "ImageName": "example-image-name",
                    "ImageVersionNumber": example-image-version-number,
                    "AppImageConfigName": "example-domain-app-image-config-name"
                }
            ]
        },
        "CodeEditorAppSettings": {
            "CustomImages": [
                {
                    "ImageName": "cbi-ds-sagemaker-base-code-editor-v1-image",
                    "AppImageConfigName": "example-domain-app-image-config-name"
                }
            ]
        }
    }'
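Note: Replace example-domain-name, example-image-name, example-image-version-number, and example-domain-app-image-config-name with the values from your configuration. Also replace cbi-ds-sagemaker-base-code-editor-v1-image with the name of your Code Editor custom image.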
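To reduce quoting mistakes in the update-domain command, you can generate the JupyterLab settings from your values and confirm that the image version exists first. The following is a minimal sketch. Both function names are hypothetical, the generated JSON covers only the JupyterLabAppSettings block, and the values are inserted without JSON escaping.

```shell
# Print the --default-user-settings JSON for one JupyterLab custom image.
# Arguments: image name, image version number, app image config name.
jupyterlab_custom_image_settings() {
  printf '{"JupyterLabAppSettings":{"CustomImages":[{"ImageName":"%s","ImageVersionNumber":%s,"AppImageConfigName":"%s"}]}}\n' \
    "$1" "$2" "$3"
}

# Succeed when the image version exists in the current account and Region.
# Requires the AWS CLI and credentials that can call DescribeImageVersion.
image_version_exists() {
  aws sagemaker describe-image-version \
    --image-name "$1" \
    --version "$2" >/dev/null 2>&1
}
```

For example, run image_version_exists example-image-name 2 to confirm the version, and then pass "$(jupyterlab_custom_image_settings example-image-name 2 example-domain-app-image-config-name)" as the value of --default-user-settings.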
Related information
How to bring your own image