1 Answer
Hi, did you look at the Spark UI to see where the job is spending most of its time? If you are not processing all the files every 10 minutes, it would help to move the already-processed objects to another bucket to reduce listing time, or to partition the data to speed up reading those files.
Please read the "Handle a large number of small files" and partitioning sections of this link: Glue Best Practices
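As an illustration of the two suggestions above, here is a minimal sketch. It assumes a hypothetical helper, `partitioned_key`, that maps a flat S3 object key to a Hive-style `year=/month=/day=` partitioned key based on the object's timestamp; the bucket names are placeholders. The commented-out `boto3` calls (`copy_object` plus `delete_object`) show how a processed object could be archived to a separate bucket so it no longer slows down listing.

```python
from datetime import datetime, timezone

def partitioned_key(src_key: str, last_modified: datetime) -> str:
    """Map a flat object key to a Hive-style partitioned key,
    so Spark/Glue can prune partitions instead of scanning everything."""
    name = src_key.rsplit("/", 1)[-1]  # keep only the file name
    return (f"year={last_modified.year}"
            f"/month={last_modified.month:02d}"
            f"/day={last_modified.day:02d}/{name}")

# Example: derive the destination key for an already-processed object.
ts = datetime(2023, 5, 17, tzinfo=timezone.utc)
dest_key = partitioned_key("incoming/events-0001.json", ts)
print(dest_key)  # year=2023/month=05/day=17/events-0001.json

# Archiving the processed object to another bucket would then look like
# (hypothetical bucket names, uncomment with boto3 configured):
# import boto3
# s3 = boto3.client("s3")
# s3.copy_object(
#     Bucket="my-archive-bucket",
#     Key=dest_key,
#     CopySource={"Bucket": "my-incoming-bucket",
#                 "Key": "incoming/events-0001.json"},
# )
# s3.delete_object(Bucket="my-incoming-bucket",
#                  Key="incoming/events-0001.json")
```

With the archive bucket laid out this way, the Glue job only lists the small incoming prefix each run, and any later reads of the archive can target specific `year=/month=/day=` partitions.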
Best regards
answered 2 years ago