How do I use the atop and the atopsar tools to get historical usage statistics for processes on my EC2 Linux instance?

I want to learn how to download and use the atop and atopsar tools on my Amazon Elastic Compute Cloud (Amazon EC2) instance. I want to analyze log files to monitor historical CPU, memory, and disk I/O usage for processes that run on my instance.

Short description

You can use atop to record historical resource usage and to view real-time reports of CPU usage, memory consumption, and disk I/O for each process and thread. When atop records statistics, it runs as a background service. Because the tool stores the data for 28 days by default, you can use it for long-term server analysis. You can then use a utility such as atopsar to generate system activity reports from the data that atop collected.

Note: The atop tool logs data only after you install it. You can't retrieve performance data from before you installed atop.

Resolution

Install atop

For installation instructions, see How do I configure the ATOP monitoring and SAR monitoring tools for my EC2 instance that runs Amazon Linux, Red Hat Enterprise Linux (RHEL), CentOS, or Ubuntu?

Create atop historical report logs

The atop tool creates log files with the atop_ccyymmdd name format in /var/log/atop. For example, atop_20210902 is the recording for September 2, 2021.

To access the log file, run the following command:

atop -r /var/log/atop/atop_ccyymmdd

Note: Replace ccyymmdd with the date that you want to review.

Example output:

atop -r /var/log/atop/atop_20210902 
ATOP - ip-172-20-139-91                2021/09/02  17:03:44                ----------------                 3h33m7s elapsed
PRC |  sys    6.51s  |  user   7.85s  |  #proc    103  |  #tslpi    81 |  #tslpu     0  |  #zombie    0  |  #exit      0  |
CPU |  sys     0%  |  user      3%  |  irq       0%  |  idle    197% |  wait      0%  |  ipc notavail  |  curscal   ?%  |
cpu |  sys     0%  |  user      1%  |  irq       0%  |  idle     98% |  cpu000 w  0%  |  ipc notavail  |  curscal   ?%  |
cpu |  sys     0%  |  user      1%  |  irq       0%  |  idle     98% |  cpu001 w  0%  |  ipc notavail  |  curscal   ?%  |

In the preceding example, atop first recorded a snapshot on September 2, 2021 at 17:03:44.
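
If you review recordings regularly, you can derive the log file name from a calendar date instead of typing it by hand. The following is a minimal sketch, not part of atop itself. It assumes GNU date and the default /var/log/atop directory, and atop_log_path is a hypothetical helper name.

```shell
# Hypothetical helper: print the atop log path for a given date.
# Assumes GNU date and the default /var/log/atop log directory.
atop_log_path() {
  # Accept any date string that GNU date understands, for example "yesterday".
  atop_stamp="$(date -d "${1:-today}" +%Y%m%d)" || return 1
  printf '/var/log/atop/atop_%s\n' "$atop_stamp"
}

# Example usage (opens yesterday's recording, if one exists):
# atop -r "$(atop_log_path yesterday)"
```

The helper only builds the path. It doesn't check whether atop recorded a log for that date.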

To move forward to the next snapshot, press the lowercase t key. To return to the previous snapshot, press the uppercase T key. To jump to a specific time slot, press b. At the Enter new time prompt, enter the date and time, and atop moves to the snapshot for that time.

Example:

NET |  lo      ----  |  pcki       2  |  pcko       2  |  sp    0 Mbps |  si    0 Kbps  |  so    0 Kbps  |  erro       0  |
Enter new time (format [YYYYMMDD]hhmm):
  PID              TID              RDDSK              WRDSK             WCANCL              DSK             CMD        1/4

To view different statistics, press the designated shortcut key. For a full list of shortcut keys, see Atop cheatsheet - keys overview on the atoptool website.

To sort the process list, use the following shortcut keys:

  • Press C to sort by CPU activity.
  • Press M to sort by memory consumption.
  • Press D to sort by disk activity.
  • Press N to sort by network activity.
    Note: This key works only if you installed the netatop kernel module.
  • Press A to sort by the most active system resource (auto mode).

To view help documentation, press h.
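
The sort keys in the preceding list can also be passed to atop as command-line flags when you open a log file, so the first screen is already sorted. The following sketch maps a resource name to its flag. The atop_sort_flag name and the resource labels are hypothetical and used only for illustration.

```shell
# Hypothetical helper: map a resource name to the matching atop sort flag.
atop_sort_flag() {
  case "$1" in
    cpu)     printf '%s\n' "-C" ;;
    memory)  printf '%s\n' "-M" ;;
    disk)    printf '%s\n' "-D" ;;
    network) printf '%s\n' "-N" ;;   # requires the netatop kernel module
    auto)    printf '%s\n' "-A" ;;
    *)       echo "unknown resource: $1" >&2; return 1 ;;
  esac
}

# Example usage (opens the recording sorted by disk activity):
# atop -r /var/log/atop/atop_20210902 "$(atop_sort_flag disk)"
```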

Create atop report logs for a specific time period

To access the log file and extract only a specific time period of performance data, run the following command:

atop -r /var/log/atop/atop_ccyymmdd -b starttime -e endtime -M

Note: Replace ccyymmdd with the date that you want to review. Also, replace starttime with the start time and endtime with the end time of the performance period. The -r flag specifies the log file, the -b flag specifies the start time, the -e flag specifies the end time, and the -M flag sorts the process list by memory consumption.

The following example command returns performance data, sorted by memory consumption, for April 22, 2024 between 08:00 and 08:10:

atop -r /var/log/atop/atop_20240422 -b 0800 -e 0810 -M
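
A mistyped date or time is easy to miss in a long command. The following sketch validates the three values and prints the resulting command so that you can review it before you run it. It assumes the default /var/log/atop directory and the hhmm time format shown in the preceding example. The atop_window_cmd name is hypothetical.

```shell
# Hypothetical helper: print an atop command for one day and time window.
# Arguments: date (ccyymmdd), start time (hhmm), end time (hhmm).
atop_window_cmd() {
  # Reject empty or non-numeric arguments.
  for atop_arg in "$1" "$2" "$3"; do
    case "$atop_arg" in
      ''|*[!0-9]*) echo "arguments must be numeric" >&2; return 1 ;;
    esac
  done
  # Check the expected lengths: 8 digits for the date, 4 for each time.
  [ "${#1}" -eq 8 ] && [ "${#2}" -eq 4 ] && [ "${#3}" -eq 4 ] || {
    echo "usage: atop_window_cmd ccyymmdd hhmm hhmm" >&2; return 1; }
  printf 'atop -r /var/log/atop/atop_%s -b %s -e %s -M\n' "$1" "$2" "$3"
}

# Example usage:
# atop_window_cmd 20240422 0800 0810
```

The check covers only the format of the values. It doesn't verify that the date or times are valid or that a log file exists for that day.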

Use atopsar to generate system activity reports

You can use the atopsar command to generate system activity reports. If you use the -c flag without an interval, then atopsar generates a report about the CPU usage of the system for the current day.

Example command:

atopsar -c 

If you add an interval and a number of samples, then atopsar reports live data instead. The following example output shows two samples of CPU usage at one-second intervals:

atopsar -c 1 2
ip-172-20-139-91  4.14.238-182.422.amzn2.x86_64  #1 SMP Tue Jul 20 20:35:54 UTC 2021  x86_64  2021/09/02

-------------------------- analysis date: 2021/09/02 --------------------------

18:50:16  cpu  %usr %nice %sys %irq %softirq  %steal %guest  %wait %idle  _cpu_
18:50:17  all     0     0    0    0        0       0      0      0   200
            0     0     0    0    0        0       0      0      0   100
            1     0     0    0    0        0       0      0      0   100
18:50:18  all     0     0    0    0        0       0      0      0   200
            0     0     0    0    0        0       0      0      0   100
            1     0     0    0    0        0       0      0      0   100

To analyze data within a specific timeframe, run the following command:

atopsar -A -b 13:00 -e 13:35

Note: The -A flag generates all reports that fit the criteria for the current day. Replace 13:00 with your start time and 13:35 with your end time.
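
To run the same kind of report against an earlier day, atopsar can read a saved log file with the -r flag. The following is a minimal sketch of a wrapper that checks that the log file is readable before it calls atopsar. The atopsar_window name is hypothetical, and the example path assumes the default /var/log/atop directory.

```shell
# Hypothetical wrapper: run all atopsar reports for a time window of a saved log.
# Arguments: log file path, start time (hh:mm), end time (hh:mm).
atopsar_window() {
  if [ ! -r "$1" ]; then
    echo "cannot read log file: $1" >&2
    return 1
  fi
  atopsar -A -r "$1" -b "$2" -e "$3"
}

# Example usage:
# atopsar_window /var/log/atop/atop_20240422 13:00 13:35
```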

To retrieve multiple reports, combine the atopsar flags into a single command. The following example command reports CPU utilization (-c), processor load (-p), and processes and threads (-P):

atopsar -cpP

Example output:

ip-172-31-89-231 6.1.84-99.169.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Apr 8 19:19:48 UTC 2024 x86_64 2024/04/22
-------------------------- analysis date: 2024/04/22 --------------------------

07:59:27  cpu  %usr %nice %sys %irq %softirq  %steal %guest  %wait %idle  _cpu_
08:00:27  all     0     0    0    0        0       0      0      4    95
08:01:27  all     0     0    0    0        0       0      0      0   100
08:02:27  all     0     0    0    0        0       0      0      0   100
08:03:27  all     0     0    0    0        0       0      0      0   100

-------------------------- analysis date: 2024/04/22 --------------------------

07:59:27  pswch/s devintr/s  clones/s  loadavg1 loadavg5 loadavg15  _load_
08:00:27      203        70      1.07      0.13     0.29      0.14
08:01:27       53        31      0.07      0.05     0.23      0.13
08:02:27       59        31      0.87      0.02     0.19      0.12
08:03:27       68        35      0.22      0.00     0.15      0.10

-------------------------- analysis date: 2024/04/22 --------------------------

07:59:27  clones/s pexit/s  curproc curzomb   thrrun thrslpi thrslpu  _procthr_
08:00:27      1.07    1.07      114       0        1      83      58
08:01:27      0.07    0.07      114       0        1      83      58
08:02:27      0.87    0.88      109       0        1      83      53
08:03:27      0.22    0.28      105       0        1      76      52

For a detailed list of flags and output values that atopsar retrieves and displays, see atopsar on the Linux man website.
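
When a CPU report covers many intervals, a small filter can help you find the busy ones. The following sketch reads atopsar -c output from standard input and prints the time and %idle value of each "all" row where %idle is below a threshold. It assumes that %idle is the last column of each data row, as in the preceding example output. The low_idle_samples name is hypothetical.

```shell
# Hypothetical filter: print the time and %idle of each "all" row in
# atopsar -c output whose %idle value (last column) is below the threshold.
low_idle_samples() {
  awk -v limit="$1" '$2 == "all" && $NF + 0 < limit + 0 { print $1, $NF }'
}

# Example usage:
# atopsar -c | low_idle_samples 100
```

Note that %idle in the "all" row is the total across all CPUs, so a fully idle instance with two CPUs reports 200. Choose the threshold based on the number of CPUs on your instance.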

Related information

Why is my EC2 Linux instance becoming unresponsive due to over-utilization of resources?

AWS OFFICIAL Updated 10 months ago