Issues with Spark history server and S3


I have set up the Spark history server for AWS Glue as described in the AWS documentation (https://docs.aws.amazon.com/glue/latest/dg/monitor-spark-ui-history.html). However, it does not show any of my completed logs, although it does show all of my incomplete logs. I also set up a Spark history server manually on an EC2 instance and ran into a similar issue. The history server logs contain many entries like the ones below, which I assume is also what the CloudFormation-configured server sees (I've redacted the S3 bucket name, but it is valid in the logs). Any ideas?

23/10/18 05:33:06 INFO FsHistoryProvider: Parsing s3a://*<redacted>*/sparkHistoryLogs/spark-application-1697461405003 for listing data...
23/10/18 05:33:06 INFO FsHistoryProvider: Looking for end event; skipping 83260370 bytes from s3a://*<redacted>*/sparkHistoryLogs/spark-application-1697461405003...
23/10/18 05:33:07 INFO FsHistoryProvider: Finished parsing s3a://*<redacted>*/sparkHistoryLogs/spark-application-1697461405003

Also, I can confirm that the completed logs are valid: if I copy them to the local file system on the history server and point the server at that path, I can see and use the completed logs in the Spark UI.
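For anyone who wants to repeat that local check without starting the UI, here is a minimal sketch that scans a local copy of an event log for the `SparkListenerApplicationEnd` event, which is the "end event" the log messages above refer to. It assumes the log is an uncompressed, one-JSON-object-per-line Spark event log. The function name `has_application_end` and the example path are placeholders I made up for illustration, not part of Spark or Glue.

```python
import json


def has_application_end(log_path):
    """Return True if the event log at log_path contains an application end event.

    Assumption: the file is an uncompressed Spark event log with one JSON
    object per line, each carrying an "Event" field.
    """
    with open(log_path, "r", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                # Skip truncated or non-JSON lines instead of failing.
                continue
            if isinstance(event, dict) and event.get("Event") == "SparkListenerApplicationEnd":
                return True
    return False


if __name__ == "__main__":
    # Hypothetical local path; replace with wherever the log was copied to.
    print(has_application_end("/tmp/sparkHistoryLogs/spark-application-1697461405003"))
```

If this returns True for the local copy, the file itself is complete, which would suggest the problem lies in how the history server reads the same file over s3a rather than in the log content.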

Alex T
asked 7 months ago · 69 views
No answers
