That SSH Helper library/sample is likely still the approach you want to take. It may just need some extra hacking, since it's built around the high-level SageMaker Python SDK as you mentioned. The SageMaker Python SDK (SMPySDK) is open source and uses the same service APIs as boto3 does under the hood, so it's just a matter of figuring out how the SSH connectivity works and replicating it in your low-level API calls.
Per the SSH Helper Readme, to use the helper with an endpoint you need to:
- Add the helper library as a dependency to the model, which equates to including it under a /code folder of your model.tar.gz (see re-pack decision here and repack_model implementation here in SMPySDK)
- Edit your inference.py to import and set up the SSH helper library.
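As a rough illustration of the packaging step, here is a minimal sketch that bundles model artifacts and a code folder into a model.tar.gz using only the Python standard library. The function name and the directory paths are hypothetical, and the exact layout of the helper library inside code/ is an assumption: compare against an archive repacked by the SM Python SDK before relying on it.

```python
import tarfile
from pathlib import Path


def build_model_archive(model_dir: Path, code_dir: Path, out_path: Path) -> Path:
    """Bundle model artifacts plus a code/ folder into one .tar.gz file.

    Hypothetical helper: model_dir holds the trained artifacts, code_dir holds
    inference.py and whatever library files it needs to import.
    out_path should live outside both input directories.
    """
    with tarfile.open(out_path, "w:gz") as tar:
        # Model artifacts (weights, config files, ...) go at the archive root.
        for item in sorted(model_dir.iterdir()):
            tar.add(item, arcname=item.name)
        # Script and library files go under code/ (directories are added recursively).
        for item in sorted(code_dir.iterdir()):
            tar.add(item, arcname=f"code/{item.name}")
    return out_path
```

You would then upload the resulting file to S3 and point ModelDataUrl at it.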
On the automated side, from the SSHModelWrapper source code it looks like the only modification the class makes to the Model object is to add some environment variables to the model definition:
    env.update({'START_SSH': str(self.bootstrap_on_start).lower(),
                'SSH_SSM_ROLE': self.ssm_iam_role,
                'SSH_OWNER_TAG': user_id,
                'SSH_LOG_TO_STDOUT': str(self.log_to_stdout).lower(),
                'SSH_WAIT_TIME_SECONDS': f"{self.connection_wait_time_seconds}"})
So if you're creating and deploying your model objects via boto3, I believe you should be able to get an equivalent setup by doing the same steps:
- Ensuring your model.tar.gz contains the SSH helper library code and an appropriate inference.py under the code/ subfolder
- Setting the Environment variables the SSH helper library expects when calling CreateModel.
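To make the second step concrete, here is a sketch of how the CreateModel request could be assembled for boto3. The function and its parameter names are made up for illustration; only the five environment variable keys come from the SSHModelWrapper snippet above, and the values you need for them are not confirmed here. The SM Python SDK usually sets additional framework variables (for example SAGEMAKER_PROGRAM) as well, so check a model created by the SDK for the complete set.

```python
def build_create_model_request(model_name, image_uri, model_data_url,
                               execution_role_arn, ssm_role, owner_tag,
                               start_ssh, log_to_stdout, wait_seconds,
                               extra_env=None):
    """Build keyword arguments for the SageMaker CreateModel API call.

    Hypothetical helper: mirrors the env.update() call from SSHModelWrapper.
    All environment values must be strings.
    """
    env = dict(extra_env or {})  # e.g. framework variables copied from an SDK-created model
    env.update({
        "START_SSH": str(start_ssh).lower(),
        "SSH_SSM_ROLE": ssm_role,
        "SSH_OWNER_TAG": owner_tag,
        "SSH_LOG_TO_STDOUT": str(log_to_stdout).lower(),
        "SSH_WAIT_TIME_SECONDS": str(wait_seconds),
    })
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": execution_role_arn,
        "PrimaryContainer": {
            "Image": image_uri,
            "ModelDataUrl": model_data_url,
            "Environment": env,
        },
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   request = build_create_model_request(...)
#   boto3.client("sagemaker").create_model(**request)
```

Keeping the request as a plain dict makes it easy to diff against the DescribeModel output of a model created through the SDK.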
The fastest way to debug/get this working may be to create a temporary model+endpoint using the SM Python SDK and then inspect the created model.tar.gz and use DescribeModel / DescribeEndpointConfig / DescribeEndpoint to fully understand what configuration you need to replicate. (To clear up a misconception I've heard in the past: Yes, you can import a pre-trained model.tar.gz bundle using the SM Python SDK Model object. You don't need to start from an Estimator and run a training job from scratch.)
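For the inspection step, something like the following sketch could walk from an endpoint down to its model definitions. The function name is hypothetical; it expects a boto3 SageMaker client (or anything exposing the same three describe calls) and assumes each production variant references a model by name.

```python
def describe_deployment(sm_client, endpoint_name):
    """Collect image, model data location and environment for an endpoint's models.

    Hypothetical helper: sm_client is expected to behave like
    boto3.client("sagemaker").
    """
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config = sm_client.describe_endpoint_config(
        EndpointConfigName=endpoint["EndpointConfigName"]
    )
    results = []
    for variant in config["ProductionVariants"]:
        model = sm_client.describe_model(ModelName=variant["ModelName"])
        # A model has either a single PrimaryContainer or a list of Containers.
        if "PrimaryContainer" in model:
            containers = [model["PrimaryContainer"]]
        else:
            containers = model.get("Containers", [])
        for container in containers:
            results.append({
                "ModelName": variant["ModelName"],
                "Image": container.get("Image"),
                "ModelDataUrl": container.get("ModelDataUrl"),
                "Environment": container.get("Environment", {}),
            })
    return results


# Usage (requires boto3 and AWS credentials; the endpoint name is a placeholder):
#   import boto3
#   for entry in describe_deployment(boto3.client("sagemaker"), "my-temp-endpoint"):
#       print(entry)
```

The ModelDataUrl in the output tells you which model.tar.gz to download and unpack, and the Environment dict shows every variable the SDK set.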
If you're not using the ML framework containers (for example you're using a built-in SageMaker algorithm, a JumpStart model, or a from-scratch custom container instead of the AWS-provided framework containers for PyTorch, TensorFlow, etc.), then your serving stack might not support an inference.py script bundle, in which case things are a bit more complicated: you'd need to bake the SSH library into the container image itself and edit the serving stack to make sure it gets initialized.