All Content tagged with Amazon SageMaker Model Training

Amazon SageMaker reduces the time and cost to train and tune machine learning (ML) models without the need to manage infrastructure. With SageMaker, easily train and tune ML models using built-in tools to manage and track training experiments, automatically choose optimal hyperparameters, debug training jobs, and monitor the utilization of system resources such as GPUs, CPUs, and network bandwidth.

77 results
In CloudWatch, SageMaker training jobs are found in the log group: `/aws/sagemaker/TrainingJobs`. The log stream name has the format: "{sagemaker_training_job_name}/algo-1-..." How can I programmatic...
1 answer | 0 votes | 86 views | AWS | asked 3 months ago
I would like to save the logs from a SageMaker training job, following something similar to the code snippet below. ```python estimator = JumpStartEstimator( model_id = "...", environment = {...
1 answer | 0 votes | 39 views | AWS | asked 3 months ago
Hi everyone. When I run the AutoGluon algorithm, the following error appears after it tries to read the entry_point: ``` UnexpectedStatusException: Error for Training job builtIn-example-autogluon-...
1 answer | 0 votes | 48 views | asked 4 months ago
I am trying to train a SageMaker built-in KMeans model on data stored in RecordIO-Protobuf format, using the Pipe input mode. However, the training job fails with the following error: ``` UnexpectedSt...
1 answer | 0 votes | 50 views | asked 4 months ago
Hi everybody. I'm stuck when calling the describe_auto_ml_job_v2 method. I can't find the best candidate because of a KeyError. It seems that when I print the method's output, the following keys fail after sm.describe...
2 answers | 0 votes | 83 views | asked 6 months ago
Hi, I am using a SageMaker training job, and it fails when it tries to upload the model artifact to a bucket that has Object Lock enabled. It throws this error: ClientError: Artifact upload failed:Error 7:...
1 answer | 0 votes | 162 views | AWS | asked 8 months ago
**I followed the instructions at: https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-walkthrough-3rdgit.html** ![Enter image description here](/media/postImages/original/IMD2dEnWaoRY...
0 answers | 0 votes | 76 views | asked 10 months ago
I want to create a training job on SageMaker and associate both performance metrics and a model artifact with it. However, I have two problems with this: * In the SageMaker "experiments" section, I se...
1 answer | 0 votes | 288 views | asked a year ago
Hello, I have started running a command to train a model using Ultralytics YOLOv8.2.4. Most of the prerequisites should already be installed. However, whenever I run the cell, it gets stuck ...
1 answer | 0 votes | 482 views | asked a year ago
Hi team, I am currently working on developing an AWS application aimed at checking the compliance of identity photos with our organization's rules. This application will be utilized for various purpo...
1 answer | 1 vote | 656 views | asked a year ago
I have the following Q: * I trained a few Llama 2 7B models using the "Train" GUI in Amazon SageMaker Studio * This was around October/November 2023 * Back then, the "Target lora modules" hyperparame...
2 answers | 0 votes | 332 views | asked a year ago
I have searched the re:Post forums as well as the other well-known site that contains answers and solutions to problems. I am using a SageMaker notebook with the Python SDK. The version of the Sagema...
1 answer | 0 votes | 314 views | asked a year ago