Use EvaluateDataQuality with PySpark DataFrame instead of Glue DynamicFrame?

Is there a way to use the class https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-transforms-EvaluateDataQuality.html with a PySpark DataFrame instead of a Glue DynamicFrame, and without converting to one? I have noticed that working with Spark alone is much more stable and significantly faster than working with Glue DynamicFrames, so I would like to drop Glue DynamicFrames from my code completely.

asked 6 months ago, 164 views
1 Answer
No. So far, Glue's value-added features have never been made available for the standard Spark DataFrame.
What you can do is convert to a DynamicFrame just to evaluate the data quality and leave the rest of your code unchanged (or convert back to a DataFrame afterwards). In most cases the overhead of converting back and forth is minimal.
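
As a rough sketch of that workaround: the helper below wraps the DataFrame in a DynamicFrame only for the quality check and returns the rule outcomes as a plain DataFrame again. The function name `evaluate_quality`, the `context_name` default, and the sample ruleset are made up for illustration. The `publishing_options` keys follow the example in the EvaluateDataQuality documentation linked in the question, so please verify them against your Glue version. The imports sit inside the function because `awsglue` and `awsgluedq` only exist in the Glue runtime.

```python
# Hypothetical sample ruleset in DQDL; replace with your own rules.
RULESET = """Rules = [ColumnExists "id", IsComplete "id"]"""


def evaluate_quality(df, glue_context, ruleset, context_name="dq_check"):
    """Run EvaluateDataQuality on a Spark DataFrame and return the outcomes as a DataFrame.

    `df` is a pyspark.sql.DataFrame and `glue_context` an awsglue GlueContext.
    The helper name and the `context_name` default are illustrative only.
    """
    # Imported lazily: these modules are only available inside an AWS Glue job.
    from awsglue.dynamicframe import DynamicFrame
    from awsgluedq.transforms import EvaluateDataQuality

    # Convert to a DynamicFrame only for the data quality evaluation.
    dyf = DynamicFrame.fromDF(df, glue_context, context_name)

    results = EvaluateDataQuality.apply(
        frame=dyf,
        ruleset=ruleset,
        publishing_options={
            "dataQualityEvaluationContext": context_name,
            # Assumption: publishing is switched off here to keep the sketch side-effect free.
            "enableDataQualityCloudWatchMetrics": False,
            "enableDataQualityResultsPublishing": False,
        },
    )

    # Convert the rule outcomes back so the rest of the job stays on Spark DataFrames.
    return results.toDF()
```

Inside a Glue job you would create the context with `GlueContext(SparkContext.getOrCreate())`, call `evaluate_quality(df, glue_context, RULESET)`, and inspect the returned DataFrame (for example with `.show()`). Your original `df` is left untouched, so the rest of the pipeline can continue with Spark only.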

AWS EXPERT
answered 6 months ago
