You are correct in your understanding. The transformation_ctx parameter is primarily used to associate bookmark state with the source data, so you do not need to include it in subsequent transformation steps such as joins, selects, and ApplyMapping when using Glue bookmarks. Bookmark state belongs to the source data extraction step.
The purpose of the transformation_ctx parameter is to let Glue track the progress of your ETL job, specifically for source data extraction, so that it knows which data has already been processed and where the next run should start, for example after a job interruption or failure.
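As a rough mental model only, the tracking described above can be pictured as state keyed by the transformation_ctx string. The sketch below is a toy stand-in written for illustration: the BookmarkStore class, its methods, and the file names are all hypothetical, and Glue's actual bookmark logic is more involved than remembering a set of keys.

```python
class BookmarkStore:
    """Toy model of bookmark state; NOT Glue's real implementation."""

    def __init__(self):
        self._committed = {}  # transformation_ctx -> keys processed in committed runs
        self._pending = {}    # transformation_ctx -> keys read in the current run

    def read_new(self, available, transformation_ctx=""):
        # Without a context there is nothing to key the state on,
        # so every item is returned on every run.
        if not transformation_ctx:
            return list(available)
        seen = self._committed.get(transformation_ctx, set())
        new = [key for key in available if key not in seen]
        self._pending.setdefault(transformation_ctx, set()).update(new)
        return new

    def commit(self):
        # State only becomes visible to later runs once the run commits.
        for ctx, keys in self._pending.items():
            self._committed.setdefault(ctx, set()).update(keys)
        self._pending.clear()


store = BookmarkStore()

# First run: both files are new for the "orders_src" context.
first = store.read_new(["a.json", "b.json"], transformation_ctx="orders_src")
store.commit()

# Second run: only the file that arrived since the last commit is returned.
second = store.read_new(["a.json", "b.json", "c.json"], transformation_ctx="orders_src")

# A read without a context is not tracked and sees everything again.
untracked = store.read_new(["a.json", "b.json", "c.json"])
```

The point of the model is that the state hangs off the read, which is why the context matters at the source step.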
This is how it typically works:
Source Data Extraction: In the initial step where you create dynamic frames by reading data from sources (e.g., S3, databases), you use the transformation_ctx parameter to track the state of the source data. This allows Glue to record the last successfully processed data point for bookmarks.
Transformations: For subsequent transformation steps like joins, selects, ApplyMapping, and others, you do not need to include the transformation_ctx. These transformations are applied to the data obtained from the initial source extraction. The bookmark state is associated with the source data itself, not the transformations applied to it.
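The two steps above might look like this in a Glue PySpark script. This is a minimal sketch, not a definitive job: the database, table, column names, S3 path, and the context string "orders_source" are placeholders made up for illustration, and it assumes bookmarks are enabled on the job (the --job-bookmark-option job-bookmark-enable job parameter).

```python
import importlib.util
import sys

# Hypothetical context name; any string that is unique within the script works.
SOURCE_CTX = "orders_source"


def run_job(argv):
    # awsglue and pyspark are supplied by the Glue runtime, so they are
    # imported here rather than at module level.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.transforms import ApplyMapping
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Source extraction: transformation_ctx keys the bookmark state for this read.
    orders = glue_context.create_dynamic_frame.from_catalog(
        database="sales_db",   # placeholder database name
        table_name="orders",   # placeholder table name
        transformation_ctx=SOURCE_CTX,
    )

    # Transformations: no transformation_ctx passed, per the answer above.
    mapped = ApplyMapping.apply(
        frame=orders,
        mappings=[
            ("order_id", "string", "order_id", "string"),
            ("amount", "double", "amount", "double"),
        ],
    )
    selected = mapped.select_fields(["order_id", "amount"])

    glue_context.write_dynamic_frame.from_options(
        frame=selected,
        connection_type="s3",
        connection_options={"path": "s3://example-bucket/orders-out/"},  # placeholder path
        format="parquet",
    )

    # Persists the bookmark state so the next run can pick up from here.
    job.commit()


# Only run where the Glue libraries are importable (i.e. inside a Glue job).
if __name__ == "__main__" and importlib.util.find_spec("awsglue") is not None:
    run_job(sys.argv)
```

The job.init and job.commit calls matter as much as the context string: the bookmark state for the source read is only saved when the job commits.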
Including transformation_ctx in every transformation step is unnecessary and can cause confusion, because it serves no meaningful purpose for bookmarks in those steps. You should include it only in the create_dynamic_frame methods that read data from sources.