EC2 freezes at exactly one hour after reboot

I am running an EC2 instance (m6i.large) with Jenkins, PHP, MySQL, and some basic services. It ran fine on Ubuntu 18.x; yesterday I upgraded it to 20.x and then to 22.x. Since then the instance freezes (I cannot SSH in) and the AWS Console reports "instance reachability check failed".

These are the last few lines from dmesg:

[    5.988941] audit: type=1400 audit(1713911132.732:28): apparmor="STATUS" operation="profile_load" profile="unconfined" name="snap.firefox.hook.configure" pid=425 comm="apparmor_parser"
[    5.993150] audit: type=1400 audit(1713911132.736:29): apparmor="STATUS" operation="profile_load" profile="unconfined" name="snap.firefox.hook.disconnect-plug-host-hunspell" pid=428 comm="apparmor_parser"
[    6.002181] audit: type=1400 audit(1713911132.744:30): apparmor="STATUS" operation="profile_load" profile="unconfined" name="snap.firefox.hook.post-refresh" pid=429 comm="apparmor_parser"
[    6.008875] parport_pc 00:03: reported by Plug and Play ACPI
[    6.235431] ppdev: user-space parallel port driver
[    9.785646] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    9.787920] Bridge firewalling registered
[    9.900718] loop8: detected capacity change from 0 to 8
[    9.984377] Initializing XFRM netlink socket

There are no errors in /var/log/*.log before the freeze.

htop shows normal or near zero CPU and memory usage before the freeze. No CPU spikes in the Monitoring tab in AWS Console.

I have disabled all the cron jobs.

This is the current OS version:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy

I have tried stopping all the services such as Jenkins/Docker/PHP/Nginx etc., but the server still freezes at 1 hour exactly (+ 3 to 6 seconds). I looked at ps -eaf and systemctl list-units --type=service --state=running and I don't see anything strange running there.

After I reboot, I can SSH in again, and then the instance freezes again one hour later.

How can I investigate further what is causing this?

Asked 25 days ago, 68 views
1 Answer
It's worth checking the following:

  • Insufficient disk space or inodes.
  • File system errors: Run a file system check (fsck) during a reboot.
  • Hardware issues: Consider changing the instance type temporarily to see if the problem is related to the specific virtual hardware of the m6i.large.
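
For the first item, here is a minimal sketch of a disk space and inode check. It assumes a POSIX shell with df and awk available; the 90% threshold and the helper name over_threshold are illustrative choices, not anything prescribed by AWS or Ubuntu.

```shell
# over_threshold LIMIT
# Reads POSIX-format df output on stdin and prints every mount point
# whose use percentage (column 5) is at or above LIMIT.
over_threshold() {
    awk -v t="$1" 'NR > 1 {
        pct = $5
        sub(/%$/, "", pct)
        # Skip rows where df reports "-" instead of a number.
        if (pct ~ /^[0-9]+$/ && pct + 0 >= t + 0) print $6 ": " pct "%"
    }'
}

# 90 is an assumed cutoff; adjust it to taste.
echo "Disk space at or above 90%:"
df -P | over_threshold 90

echo "Inodes at or above 90%:"
df -Pi | over_threshold 90
```

If nothing is listed under either heading, neither space nor inodes is close to exhausted. Running df -h and df -i directly shows the full tables.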

If that does not help, consider reaching out to AWS Support. They can provide insights from their end that are not visible to you, such as hardware failures or underlying network issues affecting your instance.

Expert, answered 24 days ago
AWS Expert, reviewed 24 days ago
