Setting Up a Local Development Environment for SageMaker


Hello everyone,

I'm currently working on a project where I have a set of Python scripts that train a variety of models (including sklearn, xgboost, and catboost) and save the most accurate model. I also have inference scripts that use this model for batch transformations.

I'm not interested in using the full suite of SageMaker Studio features, as I want to set up the development environment locally. However, I do want to leverage SageMaker when it comes to running the code on AWS resources (for model training and inference).

I'm also planning to use GitHub Actions to semi-automate this process. My current plan is to build my own environment using a Docker container. The image built can then be deployed to SageMaker via ECR.

I'm wondering if anyone has come across any resources that could help me achieve this? I'm particularly interested in best practices for setting up a local development environment that can easily transition to SageMaker for training and inference.

Any advice or pointers would be greatly appreciated! Thanks in advance!

2 Answers

There are quite a few different ways to go about this, so I'll try to steer you in the right direction.

For training, taking a look at the SageMaker Python SDK would be a good start. It lets you write code locally but train the model remotely on SageMaker. Note that this creates resources in your SageMaker account (a training job, with the model artifact written to S3). If you don't want that, using a bare VM might be the best choice.

If you are set on using your own custom Docker container, you can still use SageMaker (or something like ECS) for deployment. This page here would be a helpful start, particularly the "Steps for model deployment" and "Bring your own model" sections.
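As a rough sketch of what remote training from a local machine could look like with the SageMaker Python SDK (`pip install sagemaker`): the image URI, role ARN, and S3 paths are placeholders you would supply, and `build_estimator_kwargs` and `launch_training` are hypothetical helper names, not part of the SDK. It assumes your training image is already pushed to ECR and that your local AWS credentials can start training jobs.

```python
def build_estimator_kwargs(image_uri, role_arn, output_s3_uri,
                           instance_type="ml.m5.xlarge"):
    """Collect the arguments passed to sagemaker.estimator.Estimator.

    All values are placeholders supplied by the caller (assumptions,
    not taken from the original question).
    """
    return {
        "image_uri": image_uri,        # custom training image in ECR
        "role": role_arn,              # IAM execution role for SageMaker
        "instance_count": 1,
        "instance_type": instance_type,
        "output_path": output_s3_uri,  # where the model artifact is written
    }


def launch_training(kwargs, train_s3_uri):
    """Start a remote training job and block until it finishes."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.estimator import Estimator

    estimator = Estimator(**kwargs)
    # "train" is the channel name; the container sees the data under
    # /opt/ml/input/data/train.
    estimator.fit({"train": train_s3_uri})
    return estimator
```

Calling `launch_training(build_estimator_kwargs(...), "s3://<your-bucket>/train/")` from a script is also something a GitHub Actions step could run once credentials are configured.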
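For the bring-your-own-container route, a minimal sketch of a batch transform with the SageMaker Python SDK might look like the following. The helper names `ecr_image_uri` and `run_batch_transform` are made up for illustration, and the instance type, content type, and split type are assumptions to adjust to your data. It also assumes the container implements the inference contract SageMaker expects from a serving image.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the URI of an image stored in a private ECR repository."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def run_batch_transform(image_uri, model_data_s3_uri, role_arn,
                        input_s3_uri, output_s3_uri):
    """Run a batch transform job with a custom inference container."""
    # Imported lazily so ecr_image_uri works without the SDK installed.
    from sagemaker.model import Model

    model = Model(
        image_uri=image_uri,
        model_data=model_data_s3_uri,  # model.tar.gz produced by training
        role=role_arn,
    )
    transformer = model.transformer(
        instance_count=1,
        instance_type="ml.m5.xlarge",  # assumed; size to your workload
        output_path=output_s3_uri,
    )
    # Assumes CSV input with one record per line.
    transformer.transform(
        data=input_s3_uri,
        content_type="text/csv",
        split_type="Line",
    )
    return transformer
```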

JamesM
answered 3 months ago

Hi,

Did you try the (very) new local mode of SageMaker Studio, announced in December? https://aws.amazon.com/about-aws/whats-new/2023/12/sagemaker-studio-local-mode-docker/

Studio users can now run SageMaker processing, training, inference and batch 
transform jobs locally on their Studio IDE instance. Users can also build and test 
SageMaker compatible Docker images locally in Studio IDEs. 

Data scientists can iteratively develop ML models and debug code changes quickly 
without leaving their IDE or waiting for remote compute resources. Users can run 
small-scale jobs locally to test implementations and inspect outputs before running 
full jobs in the cloud. This optimizes workflows by providing instant feedback on code changes 
and catching issues early without waiting for cloud resources. 

It seems to match very well what you want to achieve.

Reference documentation is here: https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-local-mode.html
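For illustration, local mode in the SageMaker Python SDK is selected by passing `instance_type="local"`, which runs the container with Docker on the current machine instead of on a remote instance. The sketch below assumes Docker is running and the SDK is installed with its local extras (`pip install 'sagemaker[local]'`); `local_mode_kwargs`, `local_channel`, and `run_local_training` are hypothetical helper names, not SDK functions.

```python
from pathlib import Path


def local_mode_kwargs(image_uri, role_arn):
    """Estimator arguments that select local mode instead of a remote instance."""
    return {
        "image_uri": image_uri,    # image built locally or pulled from ECR
        "role": role_arn,          # still required by the Estimator constructor
        "instance_count": 1,
        "instance_type": "local",  # run in Docker on this machine
    }


def local_channel(path):
    """Turn a local directory into a file:// URI usable as a training channel."""
    return Path(path).resolve().as_uri()


def run_local_training(kwargs, train_dir):
    """Run the training container locally against data on disk."""
    # Imported lazily so the helpers above work without the SDK installed.
    from sagemaker.estimator import Estimator

    estimator = Estimator(**kwargs)
    estimator.fit({"train": local_channel(train_dir)})
    return estimator
```

Once a run works locally, swapping `"local"` for a real instance type such as `ml.m5.xlarge` and the `file://` channel for an S3 URI should be the main change needed to run the same job on SageMaker.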

Best,

Didier

AWS
EXPERT
answered 3 months ago
