Step logs not available in console/S3 in EMR

Hi,

We have an EMR cluster on which multiple concurrent steps run without problems. I'm not sure exactly what happened, but since yesterday the step logs and application logs are no longer published to S3. They do, however, exist on the primary node under the /emr directory. It would be very difficult to have developers check logs on the primary node for each step, and this cluster executes more than 10 steps per day. Any suggestions for troubleshooting would be greatly appreciated. I couldn't find any relevant troubleshooting steps in the AWS documentation.

Thanks in advance

Scott M
Asked 5 months ago · 368 views
3 Answers
3
Accepted Answer

Hello,

Thanks for sharing the error stack. Please follow the steps below to fix the issue:

  1. Stop the logpusher service on the primary node:
sudo systemctl stop logpusher
  2. Move all the files from /emr/logpusher/db to a different location. You will likely see files whose names start with data*.
  3. Start the logpusher service again:
sudo systemctl start logpusher
  4. Check the latest logpusher log file to confirm that the exception above has disappeared and that logs have started uploading to the S3 bucket. Allow some time for all the logs to become available in S3. Let me know if the issue persists.
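
The steps above can be sketched as a small script. This is only an illustration, not an official AWS procedure: the service name and /emr/logpusher/db come from this answer, while the function names and the backup location under /tmp are assumptions made for the example. It is meant to be run as root on the primary node.

```shell
#!/usr/bin/env bash
# Sketch only: move the logpusher state files aside and restart the service.
# backup_logpusher_db, reset_logpusher and the /tmp backup path are hypothetical.

# backup_logpusher_db SRC_DIR BACKUP_DIR
# Moves every direct child of SRC_DIR into BACKUP_DIR (created if missing).
backup_logpusher_db() {
    local src="$1" dest="$2"
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # -mindepth 1 -maxdepth 1 selects only the entries directly inside SRC_DIR
    find "$src" -mindepth 1 -maxdepth 1 -exec mv -- {} "$dest"/ \;
}

# reset_logpusher
# Stop the service, move the db files aside, then start the service again.
# The service is started again even if the move fails.
reset_logpusher() {
    local db_dir="/emr/logpusher/db"
    local backup_dir
    backup_dir="/tmp/logpusher-db-backup-$(date +%Y%m%d%H%M%S)"  # assumed location
    systemctl stop logpusher || return 1
    backup_logpusher_db "$db_dir" "$backup_dir"
    local rc=$?
    systemctl start logpusher || return 1
    return "$rc"
}
```

After sourcing the file in a root shell, calling reset_logpusher runs the whole sequence. Moving the files instead of deleting them keeps a copy that can be restored or inspected later.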
AWS
Support Engineer
Answered 5 months ago
  • Excellent! This fixed the issue, and I can see that logs are now being published to S3 and the console. Thank you very much!

3

Hello,

It looks like logpusher failed to push the logs to S3. Note that logpusher is a daemon on EMR that publishes the application logs to S3 every 5 minutes. If no files have been pushed for a complete day, logpusher is likely the problem. You can check the service status with the first command below. If it is running, you can restart it with the second command, and after some time you should be able to see the files in S3.

sudo systemctl status logpusher
sudo systemctl restart logpusher

If the above doesn't work and the files are still not pushed to S3, go to /emr/logpusher/log and look at the latest logpusher log file to see whether any issues are explicitly reported.
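
As a rough sketch of that last check, the function below prints the ERROR and WARN lines from the most recently modified file in the log directory. The function name is hypothetical, and it assumes the directory holds only plain log files whose lines carry a level token such as ERROR or WARN surrounded by spaces; adjust the pattern if your log format differs.

```shell
#!/usr/bin/env bash
# Sketch only: show ERROR/WARN lines from the newest logpusher log file.
# show_logpusher_errors is a hypothetical helper, not part of EMR.

# show_logpusher_errors [LOG_DIR]
# LOG_DIR defaults to /emr/logpusher/log (the path given in this answer).
show_logpusher_errors() {
    local log_dir="${1:-/emr/logpusher/log}"
    local latest
    # ls -t lists entries newest first; assumes file names without newlines
    latest=$(ls -t -- "$log_dir" 2>/dev/null | head -n 1)
    [ -n "$latest" ] || { echo "no log files in $log_dir" >&2; return 1; }
    # grep exits with status 1 when the file contains no matching lines
    grep -E ' (ERROR|WARN) ' -- "$log_dir/$latest"
}
```

Run it with sudo if the log directory is not readable by your user. An empty result with exit status 1 means the newest file has no ERROR or WARN lines.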

AWS
Support Engineer
Answered 5 months ago
1

Thanks for the response. I followed your steps to restart logpusher, but it didn't fix the issue. More than 30 minutes after the restart, the logs remain where they were and have not been published to S3 or the console.

I also found the constraint violation exception below several times in the logpusher log file. Could you please let me know whether this is causing the issue?

2024-01-08 21:34:01,048 ERROR logspusher-1: integrity constraint violation: unique constraint or index violation; SYS_PK_10100 table: LOGFILE
2024-01-08 23:34:01,048 WARN logspusher-1: SQLException doing action 'Performing a transaction': java.sql.BatchUpdateException: integrity constraint violation: unique constraint or index violation; SYS_PK_10100 table: LOGFILE
2024-01-08 23:34:01,048 WARN logspusher-1: SQLState: 23505
2024-01-08 23:34:01,048 WARN logspusher-1: VendorError: -104
2024-01-08 23:34:01,048 ERROR logspusher-1: Failed to schedule logs in logpusher in normal phase
org.hibernate.exception.ConstraintViolationException: Could not execute JDBC batch update
    
Scott M
Answered 5 months ago