SageMaker Experiment tracking duplication

The detailed StackOverflow question can be found in this link

I would like to start a single training job that is attached to an existing Experiment. As recommended in the best practice guide, I want to initialize the Experiment and the Run in the notebook and run the Training Job remotely, using a SageMaker Estimator in script mode.

The problem is that when I do this, SageMaker creates two separate Runs. One Run is created when calling estimator.fit() from the notebook:

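from sagemaker.experiments.run import Run
from sagemaker.pytorch import PyTorch
# Notebook side: open the Run, then launch the training job inside its context
with Run(experiment_name, run_name, sagemaker_session=sess) as run:
    estimator = PyTorch(entry_point="my_script.py", ...)
    estimator.fit(inputs={"train": s3_input_train, "validation": s3_input_test})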

Another Run is created when I load the Run from my entry_point script "my_script.py":

import boto3
from sagemaker.experiments.run import load_run
from sagemaker.session import Session
sess = Session(boto3.session.Session(region_name=my_region))
with load_run(sagemaker_session=sess) as run:
    ...

I've tried passing experiment_name and run_name to load_run() as well, but nothing seems to work. I still get two separate Runs, with some of the parameters saved to one Run and some to the other.

Also, the Run created through estimator.fit() appears to have "-aws-training-job" appended to the job name, and its Type is "SageMakerTrainingJob".
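To make the symptom easier to spot, something like the following could separate the two kinds of Runs by name. It is only a sketch: it assumes the "-aws-training-job" suffix I observed (not a documented constant), and the helper name and the example run names are hypothetical.

```python
# Suffix observed on the automatically created Run; an assumption, not an official constant.
AUTO_RUN_SUFFIX = "-aws-training-job"


def split_run_names(run_names):
    """Split run names into (explicit, auto_created) lists based on the observed suffix."""
    explicit, auto_created = [], []
    for name in run_names:
        if name.endswith(AUTO_RUN_SUFFIX):
            auto_created.append(name)
        else:
            explicit.append(name)
    return explicit, auto_created


# Hypothetical run names, for illustration only.
names = ["my-run", "my-training-job-aws-training-job"]
explicit, auto_created = split_run_names(names)
print("explicit runs:", explicit)
print("auto-created runs:", auto_created)
```

The list of names would come from wherever the Experiment's Runs are listed, for example the Experiments view in the console.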

Can someone help me with that?

No answers
