Use EvaluateDataQuality with PySpark DataFrame instead of Glue DynamicFrame?


Is there a way to use the class https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-transforms-EvaluateDataQuality.html with a PySpark DataFrame instead of a Glue DynamicFrame, and without converting to one? I noticed that working with Spark only is much more stable and significantly faster than working with Glue DynamicFrames, so I would like to drop Glue DynamicFrames from my code completely.

posted 6 months ago · 164 views
1 Answer

No. So far, value-added features such as this one have never been added to the standard DataFrame.
What you can do is convert to a DynamicFrame just to evaluate the data quality and leave the rest of the code unchanged (or convert back to a DataFrame afterwards). In most cases the overhead of converting back and forth is minimal.
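As a rough illustration of that workaround, here is a minimal sketch, assuming the job runs inside the AWS Glue runtime where `awsglue` and `awsgluedq` are available. The helper names `build_ruleset` and `evaluate_quality`, the `dq_check` context name, and the example rules are hypothetical and only for illustration; check the publishing options against the Glue version you use.

```python
def build_ruleset(rules):
    """Join individual DQDL rule strings into one ruleset definition."""
    return "Rules = [ " + ", ".join(rules) + " ]"


def evaluate_quality(df, glue_context, ruleset, context_name="dq_check"):
    """Run a DQDL ruleset against a Spark DataFrame.

    The DataFrame is wrapped in a DynamicFrame only for the check;
    the rule outcomes come back as a plain Spark DataFrame.
    """
    # Imported lazily: these modules only exist inside the AWS Glue runtime.
    from awsglue.dynamicframe import DynamicFrame
    from awsgluedq.transforms import EvaluateDataQuality

    # Temporary conversion, used for the data quality evaluation only.
    dyf = DynamicFrame.fromDF(df, glue_context, context_name)

    outcomes = EvaluateDataQuality.apply(
        frame=dyf,
        ruleset=ruleset,
        publishing_options={
            "dataQualityEvaluationContext": context_name,
            # Assumption: publishing is switched off for this sketch.
            "enableDataQualityCloudWatchMetrics": False,
            "enableDataQualityResultsPublishing": False,
        },
    )

    # Back to a Spark DataFrame so the rest of the job stays DataFrame-only.
    return outcomes.toDF()
```

In a job you would call something like `evaluate_quality(df, glueContext, build_ruleset(['RowCount > 0', 'IsComplete "id"']))` and keep using `df` itself everywhere else, so the DynamicFrame never leaks into the rest of the pipeline.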

AWS
EXPERT
answered 6 months ago
