In AWS Glue, this error can occur when multiple Glue jobs are running concurrently and attempting to register the same metric name for their Spark accumulators. Accumulators in Spark are variables that are used to aggregate information across tasks. In the context of AWS Glue, which is built on top of Apache Spark, these accumulators might be used for metrics like tracking the number of records written to an S3 bucket.
The error message you're seeing:
ThroughputMetricsSource: Metric: s3://<my-bucket>/orchestration/logs.recordsWritten is already registered by a different accumulator. Retrying with suffix #1
java.lang.IllegalArgumentException: A metric named s3://<my-bucket>/orchestration/logs.recordsWritten already exists.
indicates that the metric named s3://<my-bucket>/orchestration/logs.recordsWritten is being registered more than once, which is not allowed. This can happen when multiple Glue jobs are using the same metric name simultaneously.
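To resolve this issue, you need to ensure that each Glue job run uses a unique name for its metrics. The metric name in the error is built from the S3 output path (s3://<my-bucket>/orchestration/logs) plus the suffix .recordsWritten, so in practice this likely means giving each concurrent run its own output path.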
Thanks for your response. Is there a way to resolve this using the Glue Studio visual editor instead of scripting? All my job properties, including concurrency, are set in the Job details tab, and the job itself is called from Step Functions.
I think you can resolve the issue by having each concurrent run write its output to a unique S3 path, for example by including the job run ID or a timestamp in the path. However, Glue Studio's visual interface alone cannot add dynamic elements like these to the S3 output path. That typically requires scripting or passing dynamic parameters to your job.
In Glue Studio, you can set job parameters and use them in your job script, but the generation of dynamic elements like timestamps would need to be handled within the script itself, rather than through the visual interface.
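As a rough sketch of the scripted approach, the helper below builds a per-run output path from a base path, a run identifier, and a UTC timestamp. The function name unique_output_path, the run_id=/ts= path layout, and the idea of passing the run identifier in as a job parameter (for example from Step Functions) are assumptions made for illustration, not something Glue prescribes.

```python
from datetime import datetime, timezone
from typing import Optional


def unique_output_path(base_path: str, run_id: str,
                       now: Optional[datetime] = None) -> str:
    """Return base_path extended with a per-run suffix.

    Hypothetical helper: run_id is assumed to be a value that differs for
    every concurrent run, such as an ID passed to the job as a parameter.
    """
    # Fall back to the current UTC time when no timestamp is supplied.
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # Strip any trailing slash so the result never contains "//" after the base.
    return f"{base_path.rstrip('/')}/run_id={run_id}/ts={stamp}/"


# Example call (the path and run ID below are placeholders):
# unique_output_path("s3://<my-bucket>/orchestration/logs", "my-run-id")
```

In a job script, the returned string would replace the fixed S3 path given to the sink, so that each concurrent run writes to, and presumably registers its metric under, a different path.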