Hello,
I understand that you would like to know if we can use requirements.txt to install libraries so that the preprocessing script can reference it when using default model monitoring.
The FAQ entry about a 'requirements.txt' file that is referenced through the source_dir parameter of a SageMaker estimator applies to training jobs, not to model monitoring. I understand this may have caused some confusion. Apologies for the inconvenience.
Unfortunately, with default model monitoring it is not possible to install additional libraries through a 'requirements.txt' file, because there is no way to make that file available to the preprocessing script. However, you can install them directly from the preprocessing script.
You can use custom preprocessing and postprocessing Python scripts by uploading them to S3 and referencing them when creating the model monitor. Please refer to [1] for more information on this. For example, the script below installs additional packages from within the preprocessing script:
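import sys
import subprocess

# Install the extra packages before importing them; every pin must use "==" with no spaces.
subprocess.check_output(["/usr/bin/python3", "-m", "pip", "install", "pandas==1.3.5", "numpy==1.21.6", "python-dateutil==2.8.2", "contextvars==2.4"])

import pandas as pd

# Model Monitor calls this handler with each inference record.
def preprocess_handler(inference_record):
    ...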
Please refer to [2] for more details on this and sample preprocessing scripts. Please note that this approach is only suitable when you have a few packages to install; otherwise, it might result in a timeout.
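Since a mistyped pin (for example a single "=" or a stray space) only fails once the monitoring job is already running, it may help to check the package list before calling pip. Below is a minimal sketch of that idea. The names build_pip_command and install, the regular expression, and the 300-second timeout are my own assumptions for illustration and are not part of SageMaker or pip.

```python
import re
import subprocess
import sys

# Hypothetical, deliberately strict check: accepts only "name==version" pins.
_PIN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[A-Za-z0-9][A-Za-z0-9.+!-]*$")


def build_pip_command(requirements, python=sys.executable):
    """Return the pip install argument list, rejecting anything that is not an exact pin."""
    bad = [r for r in requirements if not _PIN.match(r)]
    if bad:
        raise ValueError(f"not an exact 'name==version' pin: {bad}")
    return [python, "-m", "pip", "install", *requirements]


def install(requirements, timeout_seconds=300):
    """Run pip; raises if pip fails or exceeds the (assumed) timeout."""
    subprocess.run(
        build_pip_command(requirements),
        check=True,
        timeout=timeout_seconds,
    )
```

You would call install(["pandas==1.3.5"]) at the top of the preprocessing script, before the corresponding import statements. The timeout only makes the failure explicit; it does not raise the limit of the monitoring job itself.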
However, if you would like to install a large number of packages, you can use Build Your Own Processing Container (BYOC): a Docker image that contains your own code and dependencies to run the data processing and model evaluation workloads. Please refer to [3] for more information on this.
You can also refer to the example Dockerfile given in [3], which can be used for the processing. You can add the required packages to a 'requirements.txt' file or add them manually in the Dockerfile before creating the container image.
I hope this was helpful. If you face any other issues or require further assistance, please reach out to AWS Support [4] along with your use case in detail, and we would be happy to assist you further. Have a great day ahead!
References:
[1] https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-pre-and-post-processing.html
[3] https://docs.aws.amazon.com/sagemaker/latest/dg/build-your-own-processing-container.html
[4] https://console.aws.amazon.com/support/home#/case/create
Thanks for the answer, "subprocess.check_output" will be helpful. What if I have to reference a few custom Python files/classes from the preprocessing script? Is there an option to import them into the preprocessing script, other than BYOC?