Hello. From your description, I understand that you would like to know the best approach to prepare your trained model for the Triton inference container image.
There are some example notebooks on this topic [1], and I mainly tested this one [2]. In short, the notebook uses a Docker container to compile the model.pt, then runs a few commands to create the folder, move the model file into the folder, and create the tar file. What the notebook does is similar to your steps: you can simply run a few cells and the model is ready to be deployed.
Besides, in SageMaker, "Pipeline" usually refers to Amazon SageMaker Model Building Pipelines [3]; I am not sure whether that is what you mean. If so, you can create a Pipeline that has a step to train your model, followed by one or more Processing steps [4] to accomplish your steps 2-5.
Regarding steps 2-5, they can be grouped into two categories:
- Step 3: I assume you are using Python code like transformers.onnx.export() [5] to compile the model.
- Steps 2, 4, 5: These can be done with bash commands.
With a Processing Job, you can bring your own container. This doc [6] gives an example of a Dockerfile that uses python3 as the entrypoint. You can implement the logic below in processing_script.py (a minimal sketch follows the list):
- Download your trained model to local disk.
- Compile the model.
- Upload the compiled model to an S3 bucket.
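For illustration, here is a minimal sketch of what processing_script.py could look like. The bucket name, keys, and the compile step are assumptions you would replace with your own; the actual export call depends on your framework (e.g. transformers.onnx.export() or torch.jit.trace()).

import boto3

# NOTE: bucket and key names are placeholders (assumptions), not real resources.
S3_BUCKET = "my-bucket"
TRAINED_MODEL_KEY = "models/trained/model.tar.gz"
COMPILED_MODEL_KEY = "models/compiled/model.onnx"

def compile_model(input_path, output_path):
    # Placeholder for your framework-specific export, e.g. transformers.onnx.export()
    # or torch.jit.trace(); fill in according to the Triton backend you target.
    raise NotImplementedError("Add your model compilation code here.")

def main():
    s3 = boto3.client("s3")
    # 1. Download the trained model artifact to local disk.
    s3.download_file(S3_BUCKET, TRAINED_MODEL_KEY, "/tmp/model.tar.gz")
    # 2. Compile the model.
    compile_model("/tmp/model.tar.gz", "/tmp/model.onnx")
    # 3. Upload the compiled model to an S3 bucket.
    s3.upload_file("/tmp/model.onnx", S3_BUCKET, COMPILED_MODEL_KEY)

if __name__ == "__main__":
    main()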
Apart from running a Python script as demoed in [6], you can also use the below Dockerfile to let the Processing Job run a bash script.
FROM python:3.7-slim-buster
# Add a bash script and configure Docker to run it
ADD run.sh /
ENTRYPOINT ["/bin/bash", "/run.sh"]
In the run.sh file, implement the following logic (a rough sketch follows the list):
- (Step 2) Create the Triton model repository folder structure.
- Download your compiled model from S3 into the folder.
- (Step 4) Create a conda environment, install dependencies, run conda-pack, and copy the packed environment into the folder.
- Create config.pbtxt and copy it into the folder.
- (Step 5) Once all files are ready, compress the folder and upload the tar.gz to S3.
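A rough sketch of such a run.sh is below. It assumes the container image has the AWS CLI, conda, and conda-pack installed (the python:3.7-slim-buster base above would need them added), and all names, S3 locations, dependencies, and config.pbtxt fields are placeholders to adapt to your model.

#!/bin/bash
set -e

# Placeholders (assumptions): adjust the model name and S3 locations to your setup.
MODEL_NAME="my_model"
S3_COMPILED_MODEL="s3://my-bucket/models/compiled/model.onnx"
S3_OUTPUT="s3://my-bucket/triton/model.tar.gz"

# (Step 2) Create the Triton model repository folder structure.
mkdir -p ${MODEL_NAME}/1

# Download the compiled model from S3 into the version folder.
aws s3 cp ${S3_COMPILED_MODEL} ${MODEL_NAME}/1/model.onnx

# (Step 4) Create a conda env, install dependencies, pack it, and place it in the folder.
conda create -y -n triton_env python=3.8
conda run -n triton_env pip install numpy  # replace with your model's real dependencies
conda pack -n triton_env -o ${MODEL_NAME}/triton_env.tar.gz

# Create config.pbtxt in the folder (fields shown are illustrative; they depend on your model and backend).
cat > ${MODEL_NAME}/config.pbtxt <<'EOF'
name: "my_model"
backend: "onnxruntime"
max_batch_size: 8
EOF

# (Step 5) Once all files are ready, compress and upload the tar.gz to S3.
tar -czvf model.tar.gz ${MODEL_NAME}
aws s3 cp model.tar.gz ${S3_OUTPUT}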
To use your own container image, follow "Step 5: Push the Container to Amazon Elastic Container Registry (Amazon ECR)" in [7] to build your image and push it to ECR, modifying the code there as needed. A typical build-and-push sequence is sketched below.
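The exact commands are in [7]; roughly, the sequence looks like the following (the account ID, Region, and repository name are example placeholders):

ACCOUNT=123456789012
REGION=us-east-1
REPO=triton-model-packager

# Authenticate Docker against your private ECR registry.
aws ecr get-login-password --region ${REGION} | docker login --username AWS --password-stdin ${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com
# Create the repository if it does not exist yet, then build, tag, and push the image.
aws ecr create-repository --repository-name ${REPO} || true
docker build -t ${REPO} .
docker tag ${REPO}:latest ${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/${REPO}:latest
docker push ${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com/${REPO}:latest

Once the image is in ECR, you can use code like the below to run the Processing Job.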
from sagemaker import get_execution_role
from sagemaker.processing import Processor

role = get_execution_role()
processor = Processor(
    image_uri="<YOUR_ECR_IMAGE>",
    role=role,
    instance_count=1,
    instance_type="<INSTANCE_TYPE>",
)
processor.run()
You can also make it a ProcessingStep as in [4].
In sum, the Pipeline has the following steps (a minimal wiring sketch follows the list):
- A training step to train your model.
- A processing step that runs Python code to compile your model.
- A processing step that runs bash code to prepare the folder structure and create the tar.gz.
- Optionally, other steps.
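To make the wiring concrete, here is a minimal sketch of such a Pipeline with the SageMaker Python SDK. The image URIs, instance types, and estimator settings are placeholders (assumptions), and in practice you would also pass proper inputs and outputs between the steps.

from sagemaker import get_execution_role
from sagemaker.estimator import Estimator
from sagemaker.processing import Processor
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep, ProcessingStep

role = get_execution_role()

# Training step: placeholder Estimator; replace the image and settings with your own.
estimator = Estimator(
    image_uri="<YOUR_TRAINING_IMAGE>",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Processing container that compiles the model (the Python container from [6]).
compile_processor = Processor(
    image_uri="<YOUR_COMPILE_IMAGE>",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Processing container that builds the Triton folder structure and tar.gz (the bash container above).
package_processor = Processor(
    image_uri="<YOUR_PACKAGING_IMAGE>",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

train_step = TrainingStep(name="TrainModel", estimator=estimator)
compile_step = ProcessingStep(name="CompileModel", processor=compile_processor, depends_on=[train_step])
package_step = ProcessingStep(name="PackageTritonModel", processor=package_processor, depends_on=[compile_step])

pipeline = Pipeline(
    name="TritonModelPackagingPipeline",
    steps=[train_step, compile_step, package_step],
)
pipeline.upsert(role_arn=role)
pipeline.start()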
Hope this information helps.
[1] https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-triton
[3] https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines.html
[4] https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-steps.html#step-type-processing
[6] https://docs.aws.amazon.com/sagemaker/latest/dg/build-your-own-processing-container.html
[7] https://docs.aws.amazon.com/sagemaker/latest/dg/prebuilt-containers-extend.html