HuggingFaceModel with fully local mode is still trying to access AWS API


Hello,

I am trying to test a HuggingFaceModel in local mode with SageMaker. I would like to deploy the HF model for inference in my local Docker environment. I have the following code:

from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data="file:///path/to/mymodel.tar.gz",  # path to the trained SageMaker model on disk
    role="SageMakerRole",                         # IAM role with permissions to create an endpoint
    transformers_version="4.26",                  # Transformers version used
    pytorch_version="1.13",                       # PyTorch version used
    py_version="py39",                            # Python version used
)

huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='local'
)

I have the following configuration set in ~/.sagemaker/config.yaml.

local:
    local_code: true
    region_name: "us-west-2"
    container_config:
        shm_size: "128M"
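
For what it's worth, my understanding is that forcing local mode programmatically with a LocalSession (from sagemaker.local) should be equivalent to the config file. A sketch of what I mean, untested:

from sagemaker.huggingface import HuggingFaceModel
from sagemaker.local import LocalSession

# Equivalent to setting local_code: true in ~/.sagemaker/config.yaml
sagemaker_session = LocalSession()
sagemaker_session.config = {"local": {"local_code": True}}

huggingface_model = HuggingFaceModel(
    model_data="file:///path/to/mymodel.tar.gz",
    role="SageMakerRole",
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
    sagemaker_session=sagemaker_session,  # pass the local session explicitly
)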

My understanding is that either setup should be enough to invoke local mode, but when I run the code I get the following trace:

---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
/Users/jacob.windle/Projects/sagemaker_local_testing/TestLocalMode.ipynb Cell 3 line 9
      1 huggingface_model = HuggingFaceModel(
      2    model_data="file:///path/to/my/model.tar.gz",  # path to your trained SageMaker model
      3    role='SageMakerRole',                                            # IAM role with permissions to create an endpoint
   (...)
      6    py_version='py39',
      7 )
----> 9 huggingface_model.deploy(
     10     initial_instance_count=1,
     11     instance_type='local'
     12 )

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/sagemaker/huggingface/model.py:315, in HuggingFaceModel.deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, volume_size, model_data_download_timeout, container_startup_health_check_timeout, inference_recommendation_id, explainer_config, **kwargs)
    308     inference_tool = "neuron" if instance_type.startswith("ml.inf1") else "neuronx"
    309     self.image_uri = self.serving_image_uri(
    310         region_name=self.sagemaker_session.boto_session.region_name,
    311         instance_type=instance_type,
    312         inference_tool=inference_tool,
    313     )
--> 315 return super(HuggingFaceModel, self).deploy(
    316     initial_instance_count,
    317     instance_type,
    318     serializer,
    319     deserializer,
    320     accelerator_type,
    321     endpoint_name,
    322     format_tags(tags),
    323     kms_key,
    324     wait,
    325     data_capture_config,
    326     async_inference_config,
    327     serverless_inference_config,
    328     volume_size=volume_size,
    329     model_data_download_timeout=model_data_download_timeout,
    330     container_startup_health_check_timeout=container_startup_health_check_timeout,
    331     inference_recommendation_id=inference_recommendation_id,
    332     explainer_config=explainer_config,
    333     endpoint_logging=kwargs.get("endpoint_logging", False),
    334     endpoint_type=kwargs.get("endpoint_type", None),
    335     resources=kwargs.get("resources", None),
    336     managed_instance_scaling=kwargs.get("managed_instance_scaling", None),
    337 )

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/sagemaker/model.py:1610, in Model.deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, volume_size, model_data_download_timeout, container_startup_health_check_timeout, inference_recommendation_id, explainer_config, accept_eula, endpoint_logging, resources, endpoint_type, managed_instance_scaling, **kwargs)
   1607     return None
   1609 else:  # existing single model endpoint path
-> 1610     self._create_sagemaker_model(
   1611         instance_type=instance_type,
   1612         accelerator_type=accelerator_type,
   1613         tags=tags,
   1614         serverless_inference_config=serverless_inference_config,
   1615     )
   1616     serverless_inference_config_dict = (
   1617         serverless_inference_config._to_request_dict() if is_serverless else None
   1618     )
   1619     production_variant = sagemaker.production_variant(
   1620         self.name,
   1621         instance_type,
   (...)
   1627         container_startup_health_check_timeout=container_startup_health_check_timeout,
   1628     )

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/sagemaker/model.py:865, in Model._create_sagemaker_model(self, instance_type, accelerator_type, tags, serverless_inference_config, accept_eula)
    863         self.name = model_package.name
    864 else:
--> 865     container_def = self.prepare_container_def(
    866         instance_type,
    867         accelerator_type=accelerator_type,
    868         serverless_inference_config=serverless_inference_config,
    869         accept_eula=accept_eula,
    870     )
    872     if not isinstance(self.sagemaker_session, PipelineSession):
    873         # _base_name, model_name are not needed under PipelineSession.
    874         # the model_data may be Pipeline variable
    875         # which may break the _base_name generation
    876         self._ensure_base_name_if_needed(
    877             image_uri=container_def["Image"],
    878             script_uri=self.source_dir,
    879             model_uri=self._get_model_uri(),
    880         )

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/sagemaker/huggingface/model.py:514, in HuggingFaceModel.prepare_container_def(self, instance_type, accelerator_type, serverless_inference_config, inference_tool, accept_eula)
    505     deploy_image = self.serving_image_uri(
    506         region_name,
    507         instance_type,
   (...)
    510         inference_tool=inference_tool,
    511     )
    513 deploy_key_prefix = model_code_key_prefix(self.key_prefix, self.name, deploy_image)
--> 514 self._upload_code(deploy_key_prefix, repack=True)
    515 deploy_env = dict(self.env)
    516 deploy_env.update(self._script_mode_env_vars())

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/sagemaker/model.py:694, in Model._upload_code(self, key_prefix, repack)
    684 """Uploads code to S3 to be used with script mode with SageMaker inference.
    685 
    686 Args:
   (...)
    690         artifact should be repackaged into a new S3 object. (default: False).
    691 """
    692 local_code = utils.get_config_value("local.local_code", self.sagemaker_session.config)
--> 694 bucket, key_prefix = s3.determine_bucket_and_prefix(
    695     bucket=self.bucket,
    696     key_prefix=key_prefix,
    697     sagemaker_session=self.sagemaker_session,
    698 )
    700 if (self.sagemaker_session.local_mode and local_code) or self.entry_point is None:
    701     self.uploaded_code = None

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/sagemaker/s3_utils.py:147, in determine_bucket_and_prefix(bucket, key_prefix, sagemaker_session)
    145     final_key_prefix = key_prefix
    146 else:
--> 147     final_bucket = sagemaker_session.default_bucket()
    149     # default_bucket_prefix (if it exists) should be appended if (and only if) 'bucket' does not
    150     # exist and we are using the Session's default_bucket.
    151     final_key_prefix = s3_path_join(sagemaker_session.default_bucket_prefix, key_prefix)

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/sagemaker/session.py:586, in Session.default_bucket(self)
    584 default_bucket = self._default_bucket_name_override
    585 if not default_bucket:
--> 586     default_bucket = generate_default_sagemaker_bucket_name(self.boto_session)
    587     self._default_bucket_set_by_sdk = True
    589 self._create_s3_bucket_if_it_does_not_exist(
    590     bucket_name=default_bucket,
    591     region=region,
    592 )

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/sagemaker/session.py:7359, in generate_default_sagemaker_bucket_name(boto_session)
   7351 """Generates a name for the default sagemaker S3 bucket.
   7352 
   7353 Args:
   7354     boto_session (boto3.session.Session): The underlying Boto3 session which AWS service
   7355 """
   7356 region = boto_session.region_name
   7357 account = boto_session.client(
   7358     "sts", region_name=region, endpoint_url=sts_regional_endpoint(region)
-> 7359 ).get_caller_identity()["Account"]
   7360 return "sagemaker-{}-{}".format(region, account)

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/botocore/client.py:565, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
    561     raise TypeError(
    562         f"{py_operation_name}() only accepts keyword arguments."
    563     )
    564 # The "self" in this scope is referring to the BaseClient.
--> 565 return self._make_api_call(operation_name, kwargs)

File ~/Projects/sagemaker_local_testing/.venv/lib/python3.12/site-packages/botocore/client.py:1021, in BaseClient._make_api_call(self, operation_name, api_params)
   1017     error_code = error_info.get("QueryErrorCode") or error_info.get(
   1018         "Code"
   1019     )
   1020     error_class = self.exceptions.from_code(error_code)
-> 1021     raise error_class(parsed_response, operation_name)
   1022 else:
   1023     return parsed_response

It looks like SageMaker still tries to determine an S3 bucket location even with local_code set to true. Did I misunderstand the documentation? I want to use the tarball on disk to test my inference script.

asked a month ago · 74 views
1 Answer

Hi,

As per this, you can either deploy a model from the Hugging Face Hub or deploy a model from an S3 location. Also, as per this documentation, the class sagemaker.huggingface.model.HuggingFaceModel only supports S3 for model_data, as described here: model_data (str or PipelineVariable) – The Amazon S3 location of a SageMaker model data .tar.gz file.
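
For illustration, the two supported paths would look roughly like this (a sketch; the model id, task, and S3 URI below are placeholders, not values from your setup):

from sagemaker.huggingface import HuggingFaceModel

# Option 1: deploy a model straight from the Hugging Face Hub via env vars
hub_model = HuggingFaceModel(
    env={
        "HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",  # placeholder model id
        "HF_TASK": "text-classification",                                  # placeholder task
    },
    role="SageMakerRole",
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
)

# Option 2: point model_data at an S3 object rather than a local file:// path
s3_model = HuggingFaceModel(
    model_data="s3://my-bucket/path/to/mymodel.tar.gz",  # placeholder S3 URI
    role="SageMakerRole",
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
)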

Hope this clarifies.

Thanks, Rama

Rama (AWS)
answered a month ago
