Use the latest SageMaker model in batch transform jobs


Hi all,

Hope this message finds you well.

I have a SageMaker model, built from an on-demand notebook. I run batch transform jobs from a Lambda function: it takes the inference input JSON from S3, creates a batch transform job, and finally produces the predictions.

The question: how can I make the Lambda function use the last trained model automatically, instead of hard-coding model_name = 'forecasting-deepar-2022-05-20-22-23-20-225'?

Lambda code:

    from sagemaker.transformer import Transformer

    if 'input_data_4' in file:

        def batch_transform():
            transformer = Transformer(
                model_name='forecasting-deepar-2022-05-20-22-23-20-225',
                instance_count=2,
                instance_type='ml.m5.xlarge',
                assemble_with='Line',
                output_path=output_data_path,
                base_transform_job_name='daily-output-predictions-to-s3',
                accept='application/jsonlines',
            )
            # wait=False returns as soon as the job is created instead of
            # blocking until it finishes; logs are only streamed when wait=True.
            transformer.transform(
                data=input_data_path,
                content_type='application/jsonlines',
                split_type='Line',
                wait=False,
            )
            print('Batch Transform Job created successfully!')

        batch_transform()

Thanks, Basem

  • Hi, how are you triggering that Lambda function that starts the batch transform job?

1 Answer

Accepted Answer

Hi Basem,

If I understood correctly, you'd like your Lambda function to automatically choose the latest SageMaker model when it runs, instead of hard-coding the model name.

Although you could do this simply with boto3.client("sagemaker").list_models(...) (which can sort by creation time), I would not recommend it. The reason is that in general this lists all the models present in SageMaker - which might include some for different use cases in the future, even if you only have the one DeepAR forecasting use case today. You'd have to manually filter after the API call.
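
For illustration, a minimal sketch of that quick-and-dirty approach (assuming your model names all share the 'forecasting-deepar' prefix seen above):

    import boto3

    sm = boto3.client('sagemaker')

    # List models newest-first, filtered to the DeepAR naming prefix,
    # and take the most recently created one.
    response = sm.list_models(
        NameContains='forecasting-deepar',
        SortBy='CreationTime',
        SortOrder='Descending',
        MaxResults=1,
    )
    latest_model_name = response['Models'][0]['ModelName']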

A better approach would probably be to register your forecasting models in SageMaker Model Registry - which will allow you to register different versions and track extra metadata like metrics and approval status for each version if you need to.

  • First (e.g. from your notebook), you can create a model package group to track your forecasting models.
  • Then, when you create your SageMaker Model, you can register it as a new version in the group via Model.register().
  • At the point you want to look up which model to use, you can call list_model_packages, which can filter to your specific group of models, and also by approval status if you like (see the sketch after this list).
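
As a rough sketch of the registration side - assuming model is the SageMaker Python SDK Model object for your trained DeepAR model, and using a made-up group name:

    import boto3

    sm = boto3.client('sagemaker')

    # One-time setup: a group to hold all versions of the forecasting model.
    sm.create_model_package_group(
        ModelPackageGroupName='forecasting-deepar',
        ModelPackageGroupDescription='DeepAR forecasting model versions',
    )

    # After each training run: register the new model as a version in the group.
    model.register(
        model_package_group_name='forecasting-deepar',
        content_types=['application/jsonlines'],
        response_types=['application/jsonlines'],
        inference_instances=['ml.m5.xlarge'],
        transform_instances=['ml.m5.xlarge'],
        approval_status='Approved',
    )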

So, for example, you could set the model package group name as a configuration environment variable on your Lambda function, and have the function dynamically look up the latest version to use from the group when needed.
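
Inside the Lambda handler that could look something like the following - note that MODEL_PACKAGE_GROUP and SAGEMAKER_ROLE_ARN are hypothetical environment variable names, and that a SageMaker Model still has to be created from the package before a Transformer can use it:

    import os
    import time
    import boto3

    sm = boto3.client('sagemaker')

    # Find the newest approved model package in the configured group.
    packages = sm.list_model_packages(
        ModelPackageGroupName=os.environ['MODEL_PACKAGE_GROUP'],  # hypothetical env var
        ModelApprovalStatus='Approved',
        SortBy='CreationTime',
        SortOrder='Descending',
        MaxResults=1,
    )
    package_arn = packages['ModelPackageSummaryList'][0]['ModelPackageArn']

    # Create a SageMaker Model from that package so the Transformer can reference it.
    model_name = 'forecasting-deepar-{}'.format(int(time.time()))
    sm.create_model(
        ModelName=model_name,
        ExecutionRoleArn=os.environ['SAGEMAKER_ROLE_ARN'],  # hypothetical env var
        Containers=[{'ModelPackageName': package_arn}],
    )
    # model_name can now be passed to Transformer(model_name=...) instead of a
    # hard-coded name.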

Of course there are also many more custom ways to do this, such as creating an SSM Parameter to track the name of your current accepted model (sketched below), or building your own model registry on a data store like DynamoDB... But SageMaker Model Registry seems like the most purpose-built tool for the job here to me.
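
If you went the SSM Parameter route instead, the idea would be roughly as follows (the parameter name here is made up):

    import boto3

    ssm = boto3.client('ssm')

    # After training/approval: record the name of the accepted model.
    ssm.put_parameter(
        Name='/forecasting/current-model-name',
        Value='forecasting-deepar-2022-05-20-22-23-20-225',
        Type='String',
        Overwrite=True,
    )

    # In the Lambda function: read it back.
    model_name = ssm.get_parameter(
        Name='/forecasting/current-model-name'
    )['Parameter']['Value']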

AWS
EXPERT
Alex_T
answered 2 years ago
reviewed 25 days ago
