Issues with Spark history server and S3


I have set up and am using the Spark history server for AWS Glue as described here: https://docs.aws.amazon.com/glue/latest/dg/monitor-spark-ui-history.html However, the history server does not show any of my completed logs, although it does show all of my incomplete logs. I also tried setting up a Spark history server manually on an EC2 instance and ran into a similar issue. The history server logs contain many entries like this, which I assume is also what the CloudFormation-configured server is seeing (I've redacted the S3 bucket name, but it is valid in the logs). Any ideas?

23/10/18 05:33:06 INFO FsHistoryProvider: Parsing s3a://*<redacted>*/sparkHistoryLogs/spark-application-1697461405003 for listing data...
23/10/18 05:33:06 INFO FsHistoryProvider: Looking for end event; skipping 83260370 bytes from s3a://*<redacted>*/sparkHistoryLogs/spark-application-1697461405003...
23/10/18 05:33:07 INFO FsHistoryProvider: Finished parsing s3a://*<redacted>*/sparkHistoryLogs/spark-application-1697461405003

I can also confirm that the completed logs are valid: if I copy them to the local file system on the history server and point the history server at that path, I can see and use the completed logs in the Spark UI.
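For anyone trying to reproduce this, here is a minimal sketch of how a local copy of an event log could be checked for an application end event, since the "Looking for end event" message suggests that is what the history server searches for. It assumes the log is an uncompressed file with one JSON event per line; `has_application_end` is a hypothetical helper name, and this does not replicate the history server's own parsing logic.

```python
import json
import sys


def has_application_end(log_path):
    """Return True if the event log contains a SparkListenerApplicationEnd event.

    Assumes an uncompressed event log with one JSON object per line.
    """
    with open(log_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                # Tolerate a truncated or partially written line.
                continue
            if isinstance(event, dict) and event.get("Event") == "SparkListenerApplicationEnd":
                return True
    return False


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_log.py /path/to/local/copy/of/event-log
    print(has_application_end(sys.argv[1]))
```

If this returns True for a log that the history server still lists as incomplete when read from S3, the file contents are probably not the problem, which would match what I see with the local copy.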

Alex T
Asked 7 months ago, 69 views
No answers
