The Transformer step completes successfully and its output is baseline.csv.out.
Code for the transformer step and the model_quality_check_step:
transformer = Transformer(
    model_name=create_model_step.properties.ModelName,
    instance_count=processing_instance_count,
    instance_type="ml.m5.xlarge",
    accept="text/csv",
    assemble_with="Line",
    output_path=transformer_output_s3,
    sagemaker_session=pipeline_session,
)

transform_arg = transformer.transform(
    data=processing_step.properties.ProcessingOutputConfig.Outputs["baseline"].S3Output.S3Uri,
    content_type="text/csv",
    split_type="Line",
    join_source="Input",
    input_filter="$[1:]",
    output_filter="$[0,-1]",
)

transform_step = TransformStep(
    name="TransformDataStep",
    step_args=transform_arg,
    cache_config=cache_config,
)
model_quality_check_config = ModelQualityCheckConfig(
    baseline_dataset=transform_step.properties.TransformOutput.S3OutputPath,
    dataset_format=DatasetFormat.csv(header=False),
    output_s3_uri=model_quality_check_step_s3,
    problem_type="BinaryClassification",
    probability_attribute="_c1",
    probability_threshold_attribute="0.5",
    ground_truth_attribute="_c0",
)

model_quality_check_step = QualityCheckStep(
    name="ModelQualityCheckStep",
    skip_check=skip_check_model_quality,
    register_new_baseline=register_new_baseline_model_quality,
    quality_check_config=model_quality_check_config,
    check_job_config=check_job_config,
    model_package_group_name=model_package_group_name,
    supplied_baseline_statistics=supplied_baseline_statistics_model_quality,
    supplied_baseline_constraints=supplied_baseline_constraints_model_quality,
)
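My understanding of the config (a hedged sketch, not SageMaker's actual code): with header=False the columns get Spark's default names _c0, _c1, and probability_threshold_attribute turns the score in _c1 into a predicted label before it is compared against the ground truth in _c0:

```python
# Sketch of the thresholding I believe probability_threshold_attribute="0.5"
# performs (an assumption based on the docs, not SageMaker's implementation).

def predicted_label(score, threshold=0.5):
    # Scores at or above the threshold map to the positive class.
    return 1 if float(score) >= threshold else 0

print(predicted_label("0.92"))  # 1
print(predicted_label("0.11"))  # 0
```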
Error messages in CloudWatch:
2023-12-05 13:50:42,325 ERROR Main: Error: More than two classes are not supported in binary classification
2023-12-05 13:50:42,244 ERROR modelquality.BinaryClassificationAnalyzerImpl$: Binary classification dataset has classes (1.0,0,0.0,1), only up to two classes are supported
2023-12-05 13:50:42,397 - main - ERROR - Exception performing analysis: Command 'bin/spark-submit --master yarn --deploy-mode client --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider --conf spark.serializer=org.apache.spark.serializer.KryoSerializer /opt/amazon/sagemaker-data-analyzer-1.0-jar-with-dependencies.jar --analytics_input /tmp/spark_job_config.json' returned non-zero exit status 1.
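The class list in the second error, (1.0,0,0.0,1), suggests the ground-truth column mixes int- and float-formatted labels, which the analyzer counts as four distinct classes. One quick diagnostic (the file path and two-column layout are assumptions) is to count the distinct values in column 0 of the transform output:

```python
# Diagnostic sketch: count distinct ground-truth values in column 0 of a
# headerless CSV, to see why the analyzer reports more than two classes.
import csv
from collections import Counter

def distinct_ground_truth(lines):
    """Count distinct string values in column 0 of headerless CSV lines."""
    counts = Counter()
    for row in csv.reader(lines):
        if row:
            counts[row[0]] += 1
    return counts

# Hypothetical excerpt mixing int- and float-formatted labels -- this mix
# ("1", "1.0", "0", "0.0") would be read as four distinct classes.
sample = ["1,0.92", "1.0,0.88", "0,0.11", "0.0,0.07"]
print(distinct_ground_truth(sample))  # 4 distinct keys
```

In practice you would run this over baseline.csv.out downloaded from the transformer's output S3 path.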