AWS Glue Studio node takes a long time to run data preview


Hi, I am using AWS Glue Studio to read from a DynamoDB (DDB) table with a direct DDB connection. So far my visual diagram has two nodes:

  1. Source DDB table node -> The preview takes 5 minutes for a dataset of only 2 rows, but it at least shows a result.
  2. Transform - SelectFields -> The session runs for a long time (>20 minutes) and then fails with a 'session not ready' error.

My DDB table is 691 bytes in size, with provisioned capacity of 5 RCU and 5 WCU. The Glue job details have the following config:

  1. Glue version -> 4.0
  2. Language -> Python 3
  3. Worker type -> G.1X (automatic scaling of the number of workers is enabled)
  4. Max number of workers -> 11
  5. Job timeout -> 2880 minutes

Considering this is such a small dataset, can you please let me know why it is taking so long to run, or where to look for related insights? I am hoping to use this as part of my production data pipeline, which will transform and move data to Redshift for data warehouse purposes. Unfortunately, there isn't much information available for Glue Studio.

asked 3 months ago, 256 views
1 Answer

First of all, I would suggest using on-demand mode in DynamoDB, at least until you get it working correctly. When the table has 5 RCU, Glue takes that number as a limit and rate-limits its requests so as not to exceed it. But I suspect you may have other issues.
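
As a rough check of how much the 5 RCU limit could matter here, the sketch below estimates a lower bound for a rate-limited full scan and builds the connection options a Glue script could pass to the DynamoDB source. It rests on several assumptions: that the read is an eventually consistent scan billed in 4 KB units at half an RCU each, and that Glue uses only a fraction of the provisioned RCU, controlled by `dynamodb.throughput.read.percent` (documented default 0.5). The helper names and the table name `my_table` are hypothetical and are not taken from the job in the question.

```python
import math

# Assumption: DynamoDB bills reads in 4 KB units, and an eventually
# consistent read (the default for Scan) costs half a read capacity unit.
READ_UNIT_BYTES = 4096
RCU_PER_UNIT_EVENTUAL = 0.5


def estimated_scan_seconds(table_bytes, provisioned_rcu, read_percent=0.5):
    """Rough lower bound on the time a rate-limited full scan needs.

    read_percent mirrors the Glue option dynamodb.throughput.read.percent,
    i.e. the fraction of the table's RCU the reader is allowed to consume.
    """
    units = max(1, math.ceil(table_bytes / READ_UNIT_BYTES))
    rcu_needed = units * RCU_PER_UNIT_EVENTUAL
    allowed_rcu_per_second = provisioned_rcu * read_percent
    return rcu_needed / allowed_rcu_per_second


def dynamodb_connection_options(table_name, read_percent="1.0", splits="1"):
    """Build connection options for a Glue DynamoDB source.

    Hypothetical helper: Glue expects the option values as strings.
    """
    return {
        "dynamodb.input.tableName": table_name,
        "dynamodb.throughput.read.percent": read_percent,
        "dynamodb.splits": splits,
    }


if __name__ == "__main__":
    # Figures from the question: 691-byte table, 5 provisioned RCU.
    seconds = estimated_scan_seconds(table_bytes=691, provisioned_rcu=5)
    print(f"estimated scan time: {seconds:.2f} s")  # estimated scan time: 0.20 s
    print(dynamodb_connection_options("my_table"))
```

Under these assumptions the scan itself needs well under a second, which fits the suspicion that the long preview has another cause. In a Glue script, the returned dictionary would be passed as `connection_options` to `glueContext.create_dynamic_frame.from_options(connection_type="dynamodb", ...)`.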

Moreover, DynamoDB is releasing zero-ETL integration with Redshift, which is now in private preview, so perhaps it's advisable not to spend too much time reinventing the wheel.

AWS
answered 3 months ago
  • Hi Leeroy, thanks for the prompt response and for pointing me to the zero-ETL with Redshift blog. While we wait for our account to be allow-listed for the preview, can you please let me know what other parts of the config I should look at to speed up the preview of the sample dataset? I have changed the DDB tables to on-demand mode, but it hasn't really sped up yet.
