SageMaker: all metrics in statistics.json from Model Quality Monitor are "0.0 +/- 0.0", but the confusion matrix is built correctly for multiclass classification


I have scheduled an hourly model quality monitoring job in AWS SageMaker. Both jobs, ground-truth-merge and model-quality-monitoring, complete successfully without any errors. However, all the metrics calculated by the job are "0.0 +/- 0.0", while the confusion matrix is calculated as expected.

I followed this notebook for model quality monitoring from sagemaker-examples, with only a few changes:

  1. I replaced the XGBoost churn model with a model trained on my own data.
  2. The input to my endpoint is CSV, as in the example notebook, but the output is JSON.
  3. I changed the problem type from BinaryClassification to MulticlassClassification wherever necessary.

The confusion matrix was built successfully, but all metrics are 0 for some reason. I would like the monitoring job to calculate the multiclass classification metrics on my data properly.

All Logs

Here is the statistics.json file that the model quality monitor saved to S3. The confusion matrix is built, but all of the model's own metrics are 0 (only the *_best_constant_classifier baselines are non-zero):

{
  "version" : 0.0,
  "dataset" : {
    "item_count" : 4432,
    "start_time" : "2022-02-23T03:00:00Z",
    "end_time" : "2022-02-23T04:00:00Z",
    "evaluation_time" : "2022-02-23T04:13:20.193Z"
  },
  "multiclass_classification_metrics" : {
    "confusion_matrix" : {
      "0" : {
        "0" : 709,
        "2" : 530,
        "1" : 247
      },
      "2" : {
        "0" : 718,
        "2" : 497,
        "1" : 265
      },
      "1" : {
        "0" : 700,
        "2" : 509,
        "1" : 257
      }
    },
    "accuracy" : {
      "value" : 0.0,
      "standard_deviation" : 0.0
    },
    "weighted_recall" : {
      "value" : 0.0,
      "standard_deviation" : 0.0
    },
    "weighted_precision" : {
      "value" : 0.0,
      "standard_deviation" : 0.0
    },
    "weighted_f0_5" : {
      "value" : 0.0,
      "standard_deviation" : 0.0
    },
    "weighted_f1" : {
      "value" : 0.0,
      "standard_deviation" : 0.0
    },
    "weighted_f2" : {
      "value" : 0.0,
      "standard_deviation" : 0.0
    },
    "accuracy_best_constant_classifier" : {
      "value" : 0.3352888086642599,
      "standard_deviation" : 0.003252410977346705
    },
    "weighted_recall_best_constant_classifier" : {
      "value" : 0.3352888086642599,
      "standard_deviation" : 0.003252410977346705
    },
    "weighted_precision_best_constant_classifier" : {
      "value" : 0.1124185852154987,
      "standard_deviation" : 0.0021869336610830254
    },
    "weighted_f0_5_best_constant_classifier" : {
      "value" : 0.12965524348784485,
      "standard_deviation" : 0.0024239410000317335
    },
    "weighted_f1_best_constant_classifier" : {
      "value" : 0.16838092925822584,
      "standard_deviation" : 0.0028615098045768348
    },
    "weighted_f2_best_constant_classifier" : {
      "value" : 0.24009212108475822,
      "standard_deviation" : 0.003326031863819311
    }
  }
}
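As a sanity check, the metrics can be recomputed by hand from the confusion matrix above. The sketch below is not the monitor's implementation; it assumes the outer key is the ground-truth label and the inner key is the predicted label, and `weighted_metrics` is a hypothetical helper name. Under that assumption the accuracy comes out at roughly 0.33 rather than 0.0, and the largest row sum divided by `item_count` reproduces the reported `accuracy_best_constant_classifier`, which is at least consistent with the assumed orientation.

```python
# Confusion matrix copied from the statistics.json above.
# Assumption: outer key = ground-truth label, inner key = predicted label.
CONFUSION = {
    "0": {"0": 709, "2": 530, "1": 247},
    "2": {"0": 718, "2": 497, "1": 265},
    "1": {"0": 700, "2": 509, "1": 257},
}


def weighted_metrics(cm):
    """Recompute accuracy and support-weighted precision, recall and F1."""
    labels = sorted(cm)
    # Support = number of ground-truth examples per class (row sums).
    supports = {c: sum(cm[c].values()) for c in labels}
    total = sum(supports.values())
    result = {
        "total": total,
        "accuracy": sum(cm[c].get(c, 0) for c in labels) / total,
        # A constant classifier that always predicts the most frequent class.
        "accuracy_best_constant": max(supports.values()) / total,
        "weighted_precision": 0.0,
        "weighted_recall": 0.0,
        "weighted_f1": 0.0,
    }
    for c in labels:
        tp = cm[c].get(c, 0)
        # Column sum = how often class c was predicted.
        predicted = sum(cm[actual].get(c, 0) for actual in labels)
        precision = tp / predicted if predicted else 0.0
        recall = tp / supports[c] if supports[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = supports[c] / total
        result["weighted_precision"] += weight * precision
        result["weighted_recall"] += weight * recall
        result["weighted_f1"] += weight * f1
    return result


metrics = weighted_metrics(CONFUSION)
for name, value in metrics.items():
    print(name, value)
```

With support weighting, weighted recall equals accuracy by construction, so both should be non-zero for this matrix.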

Here is what a couple of lines of captured data look like (prettified for readability; in the actual file each record is a single line without the indentation shown below):

{
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "0,1,628,210,30",
            "encoding": "CSV"
        },
        "endpointOutput": {
            "observedContentType": "application/json",
            "mode": "OUTPUT",
            "data": "{\"label\":\"Transfer\",\"prediction\":2,\"probabilities\":[0.228256680901919,0.0,0.7717433190980809]}\n",
            "encoding": "JSON"
        }
    },
    "eventMetadata": {
        "eventId": "a7cfba60-39ee-4796-bd85-343dcadef024",
        "inferenceId": "5875",
        "inferenceTime": "2022-02-23T04:12:51Z"
    },
    "eventVersion": "0"
}
{
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "0,3,628,286,240",
            "encoding": "CSV"
        },
        "endpointOutput": {
            "observedContentType": "application/json",
            "mode": "OUTPUT",
            "data": "{\"label\":\"Adoption\",\"prediction\":0,\"probabilities\":[0.99,0.005,0.005]}\n",
            "encoding": "JSON"
        }
    },
    "eventMetadata": {
        "eventId": "7391ac1e-6d27-4f84-a9ad-9fbd6130498a",
        "inferenceId": "5876",
        "inferenceTime": "2022-02-23T04:12:51Z"
    },
    "eventVersion": "0"
}

Here is what a couple of lines from the ground truths that I uploaded to S3 look like (prettified for readability; in the actual file each record is a single line without the indentation shown below):

{
  "groundTruthData": {
    "data": "0",
    "encoding": "CSV"
  },
  "eventMetadata": {
    "eventId": "1"
  },
  "eventVersion": "0"
}
{
  "groundTruthData": {
    "data": "1",
    "encoding": "CSV"
  },
  "eventMetadata": {
    "eventId": "2"
  },
  "eventVersion": "0"
}
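For reference, records in this shape can be generated with a few lines of Python. This is a minimal sketch: `make_ground_truth_line` is a hypothetical helper, and it assumes that `eventId` in the ground-truth record is meant to match the `inferenceId` of the corresponding captured record.

```python
import json


def make_ground_truth_line(inference_id, label):
    """Build one JSON Lines record in the ground-truth format shown above."""
    record = {
        "groundTruthData": {
            "data": str(label),  # the label is stored as a CSV string
            "encoding": "CSV",
        },
        # Assumption: eventId must equal the inferenceId sent with the request.
        "eventMetadata": {"eventId": str(inference_id)},
        "eventVersion": "0",
    }
    # json.dumps emits a single line, one record per line in the uploaded file.
    return json.dumps(record)


lines = [make_ground_truth_line(i, y) for i, y in [(1, 0), (2, 1)]]
print("\n".join(lines))
```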

Here is what a couple of lines from the ground-truth-merged file look like (prettified for readability; in the actual file each record is a single line without the indentation shown below). This file is created by the ground-truth-merge job, which is one of the two jobs that the model-quality-monitoring schedule runs:

{
  "eventVersion": "0",
  "groundTruthData": {
    "data": "2",
    "encoding": "CSV"
  },
  "captureData": {
    "endpointInput": {
      "data": "1,2,1050,37,1095",
      "encoding": "CSV",
      "mode": "INPUT",
      "observedContentType": "text/csv"
    },
    "endpointOutput": {
      "data": "{\"label\":\"Return_to_owner\",\"prediction\":1,\"probabilities\":[0.14512373737373732,0.6597074314574313,0.1951688311688311]}\n",
      "encoding": "JSON",
      "mode": "OUTPUT",
      "observedContentType": "application/json"
    }
  },
  "eventMetadata": {
    "eventId": "c9e21f63-05f0-4dec-8f95-b8a1fa3483c1",
    "inferenceId": "4432",
    "inferenceTime": "2022-02-23T04:00:00Z"
  }
}
{
    "eventVersion": "0",
    "groundTruthData": {
        "data": "1",
        "encoding": "CSV"
    },
    "captureData": {
        "endpointInput": {
            "data": "0,2,628,5,90",
            "encoding": "CSV",
            "mode": "INPUT",
            "observedContentType": "text/csv"
        },
        "endpointOutput": {
            "data": "{\"label\":\"Adoption\",\"prediction\":0,\"probabilities\":[0.7029623691085284,0.0,0.29703763089147156]}\n",
            "encoding": "JSON",
            "mode": "OUTPUT",
            "observedContentType": "application/json"
        }
    },
    "eventMetadata": {
        "eventId": "5f1afc30-2ffd-42cf-8f4b-df97f1c86cb1",
        "inferenceId": "4433",
        "inferenceTime": "2022-02-23T04:00:01Z"
    }
}
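To inspect what the monitor has to compare, the ground truth and the prediction can be pulled out of a merged record. The sketch below is only an illustration, not the monitor's own logic; `extract_pair` and `labels_match` are hypothetical helpers. It uses the second merged record above and shows that the endpoint output is itself a JSON string that needs a second parse, and that the ground truth arrives as a string ("1") while `prediction` is an integer (0).

```python
import json


def extract_pair(merged_line):
    """Return (ground_truth, prediction) from one merged JSON Lines record."""
    record = json.loads(merged_line)
    ground_truth = record["groundTruthData"]["data"]
    # The endpoint output is stored as a JSON string, so it is parsed again.
    output = json.loads(record["captureData"]["endpointOutput"]["data"])
    return ground_truth, output["prediction"]


def labels_match(ground_truth, prediction):
    """Compare labels after normalising both sides to stripped strings."""
    return str(ground_truth).strip() == str(prediction).strip()


# Second merged record from above, rebuilt as a single line.
endpoint_output = {
    "label": "Adoption",
    "prediction": 0,
    "probabilities": [0.7029623691085284, 0.0, 0.29703763089147156],
}
sample_line = json.dumps({
    "eventVersion": "0",
    "groundTruthData": {"data": "1", "encoding": "CSV"},
    "captureData": {
        "endpointInput": {"data": "0,2,628,5,90", "encoding": "CSV",
                          "mode": "INPUT", "observedContentType": "text/csv"},
        "endpointOutput": {"data": json.dumps(endpoint_output) + "\n",
                           "encoding": "JSON", "mode": "OUTPUT",
                           "observedContentType": "application/json"},
    },
    "eventMetadata": {"eventId": "5f1afc30-2ffd-42cf-8f4b-df97f1c86cb1",
                      "inferenceId": "4433",
                      "inferenceTime": "2022-02-23T04:00:01Z"},
})

ground_truth, prediction = extract_pair(sample_line)
print(type(ground_truth).__name__, ground_truth)
print(type(prediction).__name__, prediction)
print(labels_match(ground_truth, prediction))
```

Running this over the whole merged file and counting matches gives an independent accuracy figure to compare against statistics.json.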

Since the confusion matrix was constructed properly, I presume that I fed the data to SageMaker Model Monitor the right way. So why are all the metrics 0.0 while the confusion matrix looks as expected?

EDIT 1:
Logs for the job are available here.

Asked 2 years ago · 127 views
No answers
