Use EvaluateDataQuality with PySpark DataFrame instead of Glue DynamicFrame?


Is there a way to use the class https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-transforms-EvaluateDataQuality.html with a PySpark DataFrame instead of a Glue DynamicFrame (and also without converting to one)? I have noticed that working with plain Spark is much more stable and significantly faster than working with Glue DynamicFrames, so I would like to drop Glue DynamicFrames from my code completely.

Asked 6 months ago, 164 views
1 Answer

No. So far, Glue's value-added features have never been made available on the standard Spark DataFrame.
What you can do is convert to a DynamicFrame only to evaluate the data quality and leave the rest of the code unchanged (or convert back to a DataFrame afterwards). In most cases the overhead of converting back and forth is minimal.
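
A minimal sketch of that approach, in case it helps: the DataFrame is wrapped in a DynamicFrame only for the quality check, and the results come back as a DataFrame. The helper names `build_ruleset` and `evaluate_quality`, the `dq_check` context name, and the example rules are assumptions made for illustration. The Glue imports are done inside the function because those modules are only available in the Glue job runtime.

```python
def build_ruleset(rules):
    """Join individual DQDL rule strings into one ruleset definition."""
    return "Rules = [ " + ", ".join(rules) + " ]"


def evaluate_quality(df, glue_context, rules, context_name="dq_check"):
    """Run EvaluateDataQuality on a Spark DataFrame and return the results as a DataFrame.

    df           -- pyspark.sql.DataFrame to check
    glue_context -- the job's GlueContext
    rules        -- list of DQDL rule strings (hypothetical examples below)
    """
    # Imported lazily: these modules exist only inside the AWS Glue runtime.
    from awsglue.dynamicframe import DynamicFrame
    from awsgluedq.transforms import EvaluateDataQuality

    # Convert only for the duration of the check.
    dyf = DynamicFrame.fromDF(df, glue_context, context_name)

    results = EvaluateDataQuality.apply(
        frame=dyf,
        ruleset=build_ruleset(rules),
        publishing_options={"dataQualityEvaluationContext": context_name},
    )

    # Back to a plain DataFrame; the original df is untouched and can be
    # used by the rest of the job as before.
    return results.toDF()


# Example rules, assuming the data has an "id" column.
example_rules = ['IsComplete "id"', "RowCount > 0"]
ruleset_text = build_ruleset(example_rules)
```

Inside a Glue job this would be called as `evaluate_quality(df, glueContext, example_rules).show()`, with the rest of the pipeline continuing to operate on `df` directly.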

AWS (Expert)
Answered 6 months ago
