1 Answer
Hi,

Did you look at the Spark UI to see where the job is spending most of its time? If you are not processing all the files every 10 minutes, it would help to move the already-processed objects to another bucket to shorten the listing time, or to use partitions to speed up reading those files.
Please read this link ("Handle large number of small files" and "Partition" sections): Glue Best Practices
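A minimal sketch of both suggestions, assuming Python with boto3 and hypothetical bucket/prefix names (`archive_bucket`, `events`): one helper builds a Hive-style partitioned key so Glue/Spark can prune partitions instead of listing everything, and another moves an already-processed object to a separate archive bucket (S3 has no native "move", so it is a copy followed by a delete).

```python
from datetime import datetime, timezone


def partitioned_key(prefix: str, ts: datetime, filename: str) -> str:
    """Build a Hive-style key (year=/month=/day=/) so Glue and Spark can
    prune partitions with a pushed-down filter instead of scanning all keys."""
    return (f"{prefix}/year={ts.year}/month={ts.month:02d}/"
            f"day={ts.day:02d}/{filename}")


def move_processed(source_bucket: str, key: str, archive_bucket: str) -> None:
    """Move a processed object to an archive bucket so the next scheduled
    job lists fewer objects. Requires AWS credentials at runtime."""
    import boto3  # imported here so the module loads without the SDK

    s3 = boto3.client("s3")
    s3.copy_object(
        Bucket=archive_bucket,
        Key=key,
        CopySource={"Bucket": source_bucket, "Key": key},
    )
    s3.delete_object(Bucket=source_bucket, Key=key)


# Example of the key layout the reader would write new files under:
example = partitioned_key(
    "events", datetime(2023, 4, 5, tzinfo=timezone.utc), "batch.json"
)
# example == "events/year=2023/month=04/day=05/batch.json"
```

With that layout, a Glue/Spark read filtered on `year`, `month`, and `day` only lists the matching prefixes, which is where most of the time usually goes with many small files.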
Best regards
answered 2 years ago