How do I troubleshoot a failed Patch Manager (Linux) operation?

I want to troubleshoot my failed Patch Manager (Linux) operation.

Short description

Patching operations can fail for multiple reasons, and how you troubleshoot errors varies based on the operating system (OS). You can find error messages in the AWS Management Console or in the API response. However, console output is truncated at 48,000 characters, which can hide the part of the output that explains the failure. In that case, it's a best practice to review the full output that's stored on the managed node. You can also use the output that's sent to Amazon CloudWatch and Amazon Simple Storage Service (Amazon S3) to further troubleshoot.

Resolution

To troubleshoot error messages that you receive when your Patch Manager (Linux) operation fails, complete the following tasks.

Use the AWSSupport-TroubleshootPatchManagerLinux runbook

Use AWSSupport-TroubleshootPatchManagerLinux to troubleshoot patching failures for Linux-based managed nodes.

Review your partitions and permissions

You receive the following example error message when /var/lib/amazon is mounted with noexec permissions:

"/var/lib/amazon/ssm/<instanceid>/document/orchestration/<commandid>/PatchLinux/_script.sh: Permission denied
failed to run commands: exit status 126"

To resolve this issue, configure exclusive partitions for /var/log/amazon and /var/lib/amazon, and mount them with exec permissions.
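
As a quick check before you remount anything, the following minimal sketch looks up which mount contains a path and whether that mount carries the noexec option. It assumes that you pass it text in the /proc/mounts format. The function names find_mount and is_noexec are illustrative and aren't part of any AWS tool.

```python
def find_mount(path, mounts_text):
    """Return (mount_point, options) for the deepest mount that contains path.

    mounts_text is expected to be in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        # A mount contains the path if it is the path itself or a parent directory.
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties, the later entry wins.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, options)
    return best


def is_noexec(path, mounts_text):
    """Return True if the mount that contains path has the noexec option."""
    mount = find_mount(path, mounts_text)
    return mount is not None and "noexec" in mount[1]
```

On the managed node, you might call is_noexec("/var/lib/amazon", open("/proc/mounts").read()). A result of True suggests that the directory sits on a mount that blocks script execution.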

Review your managed node permissions

You receive the following example error message when the managed node doesn't have the required permissions to access the specified Amazon S3 bucket:

"Unable to download payload: https://s3.DOC-EXAMPLE-BUCKET.region.amazonaws.com/aws-ssm-region/patchbaselineoperations/linux/payloads/patch-baseline-operations-X.XX.tar.gz.
failed to run commands: exit status 156"

To resolve this issue, update your network configuration and make sure that you can reach the AWS Regional Amazon S3 endpoint. For more information, see AWS Systems Manager Agent (SSM Agent) communications with AWS managed S3 buckets.
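
To test basic reachability from the managed node, the following sketch opens a TCP connection to the Regional Amazon S3 endpoint on port 443. It assumes the standard s3.<region>.amazonaws.com hostname form, which doesn't apply to every AWS partition. The helper names are illustrative. A successful connection shows only that the network path is open. It doesn't confirm that the node has permissions to read the bucket.

```python
import socket


def regional_s3_endpoint(region):
    """Build the Regional S3 hostname (assumes the standard AWS partition)."""
    return f"s3.{region}.amazonaws.com"


def can_reach(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_reach(regional_s3_endpoint("us-east-1")) returns False when a security group, network ACL, or missing route blocks the connection.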

Review your Run Command tasks and directory space

You receive the following example error message when two commands simultaneously run AWS-RunPatchBaseline on the same managed node. You can also receive this error message when there's no available disk space on the /var directory.

Example error:

"IOError: [Errno 2] No such file or directory: 'patch-baseline-operations-X.XX.tar.gz'
Unable to extract tar file: /var/log/amazon/ssm/patch-baseline-operations/patch-baseline-operations-1.75.tar.gz.
failed to run commands: exit status 155
Unable to load and extract the content of payload, abort.
failed to run commands: exit status 152"

To resolve this issue, complete the following tasks:

  • Make sure that no maintenance window has two or more Run Command tasks that run AWS-RunPatchBaseline with the same priority level on the same target IDs. Change the priority levels if necessary.
  • Make sure that only one State Manager association runs AWS-RunPatchBaseline on the same schedule and targets the same managed nodes.
  • Make more disk space available under the /var directory.
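
To check the disk space item in the preceding list, the following sketch reports the free space on the file system that holds a directory. The 1 GiB threshold is an arbitrary assumption for illustration and isn't a documented Patch Manager requirement. The function name free_space_report is also illustrative.

```python
import shutil


def free_space_report(path="/var", min_free_bytes=1024**3):
    """Report free space for the file system that contains path.

    min_free_bytes is an illustrative threshold, not an AWS requirement.
    """
    usage = shutil.disk_usage(path)
    return {
        "path": path,
        "free_bytes": usage.free,
        "total_bytes": usage.total,
        "enough": usage.free >= min_free_bytes,
    }
```

If "enough" is False for /var, free up space before you run AWS-RunPatchBaseline again.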

Review the processes that run on a managed node

You receive the following example error message when AWS-RunPatchBaseline runs on a managed node where yum is already running and another process locked the database:

"MM/DD/YYYY HH:MM:SS root [INFO]: another process has acquired yum lock, waiting 2 s and retry."

To resolve this issue, complete the following tasks:

  • Make sure that no State Manager associations, maintenance window tasks, or other configurations that run AWS-RunPatchBaseline on a schedule target the same managed node at the same time.
  • Make sure that no manual yum operations run at the same time.
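
To see whether another process currently holds the lock, the following sketch reads the yum lock file and checks whether the recorded process is still alive. It assumes that the lock file is /var/run/yum.pid, which is the usual location for yum but might differ on your distribution. The function name yum_lock_holder is illustrative.

```python
import os


def yum_lock_holder(pid_file="/var/run/yum.pid"):
    """Return the PID that holds the yum lock, or None if no live holder exists.

    The default pid_file location is an assumption; adjust it for your system.
    """
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None  # no lock file, or the file doesn't contain a PID
    try:
        os.kill(pid, 0)  # signal 0 tests for existence without signaling
    except ProcessLookupError:
        return None  # stale lock file: the process has exited
    except PermissionError:
        pass  # the process exists but belongs to another user
    return pid
```

If the function returns a PID, wait for that process to finish, or investigate it, before you rerun the patching operation.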

Review the server's Python version

You receive the following example error message when a supported version of Python 3 isn't installed on the Red Hat Enterprise Linux (RHEL), Debian Server, Raspberry Pi, or Ubuntu Server instance:

"An unsupported package manager and python version combination was found. Dnf requires Python 2 or Python 3 to be installed."

To resolve this issue, install a version of Python 3 (3.0 - 3.9) on the required server.
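
The version range in the preceding step can be expressed as a small check. The following sketch tests a version tuple against the 3.0 - 3.9 range. By default, it inspects the interpreter that runs the script, which might not be the interpreter that Patch Manager selects on the node. The function name is illustrative.

```python
import sys


def is_supported_python3(version_info=None):
    """Return True if the version falls in the 3.0 - 3.9 range from this article.

    version_info is a (major, minor, ...) tuple; defaults to the running interpreter.
    """
    major, minor = (version_info or sys.version_info)[:2]
    return major == 3 and 0 <= minor <= 9
```

For example, is_supported_python3((3, 6, 8)) passes the check, and is_supported_python3((3, 10, 0)) doesn't.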

Review your OS

You receive the following example error message when an OS isn't supported:

"An error occurred (UnsupportedOperatingSystem) when calling the GetDeployablePatchSnapshotForInstance operation: patch_common.exceptions.PatchManagerError: ('Unsupported Operating System', 146)"

To resolve this issue, use a supported OS for Patch Manager.

Review the output of the instance

When a patching operation fails and the console output is truncated, you might not be able to view the entire output in the console.

To resolve this issue, review the entire output on the instance at the following location:

"/var/lib/amazon/ssm/<example-instance-id>/document/orchestration/<example-command-id>/awsrunShellScript/PatchLinux/stdout"

Note: Replace all example strings with your values.
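
To work with the full output, the following sketch builds the stdout path from an instance ID and a command ID, and then lists the "exit status" codes that appear in the output text. The hint table covers only the exit statuses shown in the example errors in this article. It isn't an official or complete list, and the function names are illustrative.

```python
import re
from pathlib import PurePosixPath

# Hints derived only from the example errors in this article (not an official list).
EXIT_STATUS_HINTS = {
    126: "script couldn't run: review your partitions and mount permissions",
    156: "payload download failed: review access to the Amazon S3 endpoint",
    155: "payload extraction failed: review concurrent tasks and /var disk space",
    152: "payload couldn't be loaded: review concurrent tasks and /var disk space",
}


def stdout_path(instance_id, command_id):
    """Build the on-node path to the full PatchLinux stdout file."""
    return str(PurePosixPath(
        "/var/lib/amazon/ssm", instance_id, "document", "orchestration",
        command_id, "awsrunShellScript", "PatchLinux", "stdout",
    ))


def exit_status_hints(output_text):
    """Return (status, hint) pairs for each distinct exit status, in order seen."""
    seen = []
    for match in re.findall(r"exit status (\d+)", output_text):
        status = int(match)
        if status not in seen:
            seen.append(status)
    return [(s, EXIT_STATUS_HINTS.get(s, "not covered in this article")) for s in seen]
```

On the managed node, you might read the file at stdout_path(...) with your own instance ID and command ID, and then pass its contents to exit_status_hints to decide which section of this article to start with.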

You can also configure the operation to send the Run Command output to Amazon S3 or CloudWatch.

AWS OFFICIAL
Updated 3 months ago