Indeed, I posted just a small excerpt of the script for the sake of simplicity, to focus on what I think may be the problem.
Here is a fully functional excerpt of the script, followed by the errors seen in /var/log/cloud-init-output.log:
Content-Type: multipart/mixed; boundary="//"
MIME-Version: 1.0

--//
Content-Type: text/x-shellscript; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata.txt"

#!/bin/bash
AWS_REGION=us-east-1
echo "AWS_REGION=${!AWS_REGION}"
UPDATE_RELEASE_VERSION=`dnf check-release-update 2>&1 | grep Version | grep -v Available | awk -F"[ :]" '{print $4}'`
if [[ -n "${!UPDATE_RELEASE_VERSION}" ]]
then
    echo "Updating Amazon Linux release to ${!UPDATE_RELEASE_VERSION}"
    sudo dnf --assumeyes --releasever=${!UPDATE_RELEASE_VERSION} update
else
    echo "There is no newer release of Amazon Linux than the current one. Skipping update."
fi
dnf install --assumeyes java-1.8.0-amazon-corretto ruby3.2 jq

# Create user kafka and install Kafka from the Apache repo
adduser kafka
pushd /home/kafka/
curl "https://archive.apache.org/dist/kafka/2.4.1/kafka_2.13-2.4.1.tgz" --create-dirs -o "./downloads/kafka_2.13-2.4.1.tgz"
tar -xvzf ./downloads/kafka_2.13-2.4.1.tgz
mkdir /home/kafka/.aws
DEST=/home/kafka/.aws/config
echo "[default]" >> $DEST
echo "output = json" >> $DEST
echo "region = ${!AWS_REGION}" >> $DEST
chown -R kafka:kafka ./downloads ./kafka_2.13-2.4.1 ./.bash_profile ./.aws
popd

# Install CodeDeploy agent
pushd /home/ec2-user
wget https://aws-codedeploy-us-east-1.s3.us-east-1.amazonaws.com/latest/install
chmod +x ./install
sudo ./install auto
popd
--//
The script finishes running, and inspecting /var/log/cloud-init-output.log shows these errors while the commands at lines 7 and 11 (the dnf update and dnf install, respectively) were running:
Cloud-init v. 22.2.2 running 'init' at Thu, 11 May 2023 17:33:00 +0000. Up 9.44 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++++++++++++
ci-info: +--------+------+----------------------------+---------------+--------+-------------------+
ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
... <log suppressed to avoid large message> ...
Cloud-init v. 22.2.2 running 'modules:config' at Thu, 11 May 2023 17:33:05 +0000. Up 14.36 seconds.
Cloud-init v. 22.2.2 running 'modules:final' at Thu, 11 May 2023 17:33:06 +0000. Up 15.59 seconds.
AWS_REGION=us-east-1
Updating Amazon Linux release to 2023.0.20230503
Amazon Linux 2023 repository 18 MB/s | 13 MB 00:00
Amazon Linux 2023 Kernel Livepatch repository 387 kB/s | 156 kB 00:00
Dependencies resolved.
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
kernel x86_64 6.1.25-37.47.amzn2023 amazonlinux 31 M
Upgrading:
amazon-linux-repo-s3 noarch 2023.0.20230503-0.amzn2023 amazonlinux 18 k
bind-libs x86_64 32:9.16.38-1.amzn2023.0.1 amazonlinux 1.3 M
bind-license noarch 32:9.16.38-1.amzn2023.0.1 amazonlinux 16 k
... <log suppressed to avoid large message> ...
(22/24): bind-license-9.16.38-1.amzn2023.0.1.no 542 kB/s | 16 kB 00:00
(23/24): grub2-common-2.06-61.amzn2023.0.6.noar 25 MB/s | 1.8 MB 00:00
(24/24): grub2-pc-modules-2.06-61.amzn2023.0.6. 9.6 MB/s | 913 kB 00:00
--------------------------------------------------------------------------------
Total 29 MB/s | 42 MB 00:01
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
RPM: error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: Could not run transaction.
... <log suppressed to avoid large message> ...
The error shown above corresponds to the execution of:
sudo dnf --assumeyes --releasever=${!UPDATE_RELEASE_VERSION} update
The other dnf command ran successfully this time.
Doing some tests, I tried adding a sleep 60 at the beginning of the script. The commands then run without problems, which makes me think that something in the bootstrapping process is generating the RPM lock.
Perhaps some AWS initialization takes place at the same time as the user data gets executed? On an older Amazon Linux 2 instance that I set up with a similar script, I saw yum complaining about the lock too, but that version kept trying to acquire the lock until it succeeded. This dnf equivalent does not.
The sleep is not the ideal solution, but it buys some time to focus on other aspects of the project. Nevertheless, any solution to the lock problem will be appreciated.
When using user data via cloud-init it is possible that other processes will be running simultaneously. If those processes are out of your control, it is best to just implement the required retry mechanisms in your own scripts. Here is an example of retrying yum installs:
max_attempts=5
attempt_num=1
success=false

while [ $success = false ] && [ $attempt_num -le $max_attempts ]; do
  echo "Trying yum install"
  yum update -y
  yum install java-1.8.0 java-17-amazon-corretto-devel.x86_64 wget telnet -y
  # Check the exit code of the command
  if [ $? -eq 0 ]; then
    echo "Yum install succeeded"
    success=true
  else
    echo "Attempt $attempt_num failed. Sleeping for 3 seconds and trying again..."
    sleep 3
    ((attempt_num++))
  fi
done
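The same idea can be factored into a small reusable function, so each package transaction in the user data gets its own retry without duplicating the loop. This is only a sketch: the attempt count and the 3-second delay (overridable via RETRY_DELAY) are arbitrary choices, not values recommended by AWS.

```shell
#!/bin/bash
# retry MAX CMD...: run CMD until it succeeds or MAX attempts are used.
# The delay between attempts defaults to 3 seconds (RETRY_DELAY overrides it).
retry() {
  local max=$1 n=1
  shift
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "retry: giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt $n failed, retrying..." >&2
    sleep "${RETRY_DELAY:-3}"
    n=$((n + 1))
  done
}

# Example usage in user data: wrap each package transaction.
# retry 5 yum update -y
# retry 5 yum install -y java-17-amazon-corretto-devel wget telnet
```

Wrapping only the package commands (rather than the whole script) keeps unrelated failures from being masked by the retry loop.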
This isn’t a solution. It’s a workaround at best. The issue has been accepted as a bug by AWS: https://github.com/amazonlinux/amazon-linux-2023/issues/397#issuecomment-1773090340
@TheOtherRob, thank you for sharing that link. I am unclear why you don't consider this a solution: OP was asking for a way to run their commands when RPM was locked, so I provided a method that is tested and working.
This is likely just something that has crept in during a cut & paste, but your command that starts with sudo dnf --assumeyes has a dot at the end of it, which shouldn't be there.
This userdata script won't run anyway, because cloud-init won't recognise it as a script: it needs to start with a shebang, e.g. #!/bin/bash.
I've tried standing up an EC2 instance with that same AMI and userdata, and the presence or absence of a shebang determined whether or not the software was installed/updated. Admittedly, the error in my /var/log/cloud-init-output.log on those occasions was __init__.py[WARNING]: Unhandled non-multipart (text/x-not-multipart) userdata: 'b'## This script is intend'...' and not your one about the .rpm.lock file. So there may be more than just the shebang at play here. Is there more to your userdata script than just the ten lines you posted above?
RWC, I posted a reply as a new answer because this comment space limits the number of characters I can write...
I've encountered the same issue moving some Elastic Beanstalk platform hooks to AL2023. The issue in my case is that you can't run rpm within a script that is itself already being run by rpm, hence the lock file being unavailable.
We use platform hooks to execute arbitrary shell scripts at different points in the EB life cycle, and in AL2023 those life cycle hooks are (I can only guess) actually run by rpm. That immediately prevents us from installing packages with rpm, as opposed to relying on some package or utility already present in the AL2023 repository via dnf/yum.
Hopefully this helps; I found the same explanation of this issue on Fedora, where people run rpm scripts inside rpm scripts.
@localpath I fixed this by moving scripts from prebuild -> predeploy
I have the same issue. I run dnf from a script called from user data and the dnf command fails with:
(43/45): glibc-headers-x86-2.34-52.amzn2023.0.3 24 MB/s | 448 kB 00:00
(44/45): perl-File-Find-1.37-477.amzn2023.0.5.n 795 kB/s | 26 kB 00:00
(45/45): gcc-11.3.1-4.amzn2023.0.3.x86_64.rpm 70 MB/s | 32 MB 00:00
--------------------------------------------------------------------------------
Total 64 MB/s | 59 MB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
RPM: error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: Could not run transaction.
In other words, the dnf command runs for a while, downloads artifacts, and then consistently fails. The AMI is Amazon Linux 2023, al2023-ami-2023.1.20230809.0-kernel-6.1-x86_64.
Here is the source for the script:
#!/bin/env bash
# turn on echoing of commands and exit on errors
set -o xtrace
set -e
set -o pipefail
################### install_httpd.sh: script that will install httpd on the new instance
#===== Some initial output of state for debugging
# echo out all of the ENV variables that are supposed to be already set
echo "install_httpd.sh: Proof of life that the script ran" >> /var/opt/infor-install/userscript.txt
echo "install_httpd.sh: Proof of life that the script ran appended to /var/opt/infor-install/userscript.txt"
# echo "install_httpd.sh: Supposed to be set, INFOR_INSTALL_TOMCAT_VERSION=$INFOR_INSTALL_TOMCAT_VERSION"
#===== Do some package updates and install httpd. Amazon Linux uses "dnf" which is the new "yum"
echo "install_httpd.sh: dnf update"
while pgrep yum || pgrep rpm || pgrep dnf; do sleep 5; echo "sleep"; done
dnf update -y
sleep 10
echo "install_httpd.sh: dnf install httpd"
while pgrep yum || pgrep rpm || pgrep dnf; do sleep 5; echo "sleep"; done
dnf install -y httpd-devel httpd
I added the pgrep commands to try to make sure that yum, rpm, and dnf were not already running, but it does not help. And the word "sleep" never appears in the output.
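One reason a pgrep pre-check can pass while the transaction still fails is that the lock holder may not be named yum, rpm, or dnf at all (other answers point at the SSM agent updater). A small diagnostic that reports who actually has the lock file open can confirm this. This is a sketch: fuser comes from the psmisc package, and the function takes the lock path as a parameter only so it can be exercised outside a real system.

```shell
#!/bin/bash
# show_rpm_lock_holder [LOCKFILE]: report which process, if any, currently
# has the RPM lock file open. A pgrep pre-check misses holders running
# under other process names (e.g. a scriptlet spawned by another agent).
show_rpm_lock_holder() {
  local lock="${1:-/var/lib/rpm/.rpm.lock}"
  if [ -e "$lock" ] && command -v fuser >/dev/null 2>&1; then
    # fuser prints the PIDs holding the file open; nonzero means none found.
    fuser -v "$lock" 2>&1 || echo "no process has $lock open right now"
  else
    echo "lock file absent or fuser unavailable"
  fi
}

# Example: call it just before each dnf transaction to log the culprit.
# show_rpm_lock_holder
```

Logging the holder's PID and name before retrying makes it much easier to tell whether the contention comes from cloud-init itself, the SSM agent, or something else on the AMI.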
Was wondering if there was a solution for this. Having the same issue.
#!/bin/bash
set -e
set -x
sudo dnf update
sudo sleep 5
sudo dnf install docker -y
Output
...
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
RPM: error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: Could not run transaction.
Actually, I may have found a possible solution:
sudo dnf upgrade --refresh rpm glibc
sudo rm /var/lib/rpm/.rpm.lock
dnf -y update
dnf install <MY PACKAGES>
Ref: https://github.com/amazonlinux/amazon-linux-2023/issues/397
I tried this and many other suggested methods, and the only thing that worked for me was to stop the SSM agent as the very first command in userdata, then start the SSM agent again at the end of userdata or in a cfn-init configset's services. BTW, I am using CloudFormation and ami-03e34865d6f563985.
UserData:
  'Fn::Base64': !Sub |
    #!/bin/bash -xe
    systemctl stop amazon-ssm-agent
    dnf update -y aws-cfn-bootstrap
    ...
    dnf install <something>
    ...
    systemctl start amazon-ssm-agent
services:
  sysvinit:
    amazon-ssm-agent:
      enabled: true
      ensureRunning: true
I've been trying to get to the bottom of this issue over the last 48 hours and I think it is caused by the SSM agent updater running at the same time as the user data scripts.
I've disabled this in SSM Fleet Manager and the issue has gone away.
If you do disable the automatic SSM agent updates, it's important that you know the consequences of this and implement the updates in another way!
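One way to keep the agent current after disabling automatic updates is a periodic manual upgrade. This is only a sketch: the weekly cadence is an arbitrary choice, and it assumes the agent is packaged as amazon-ssm-agent (as it is on Amazon Linux 2023); adapt both to your own patching policy.

```shell
#!/bin/bash
# install_ssm_update_cron [DIR]: drop a weekly cron job that upgrades the
# SSM agent via dnf. DIR defaults to /etc/cron.weekly; it is a parameter
# only so the snippet can be exercised without root.
install_ssm_update_cron() {
  local dir="${1:-/etc/cron.weekly}"
  cat > "$dir/update-ssm-agent" <<'EOF'
#!/bin/bash
# Assumes the agent package is named amazon-ssm-agent (Amazon Linux 2023).
dnf upgrade -y amazon-ssm-agent
EOF
  chmod +x "$dir/update-ssm-agent"
}

# Run as root, e.g. at the end of user data:
# install_ssm_update_cron
```

A scheduled dnf upgrade can itself contend for the RPM lock, but since cron runs it long after boot, it avoids the race with cloud-init that causes the original failure.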
I have the same issue with cloud-init.
A simple cloud-config like this:
#cloud-config
repo_update: true
repo_upgrade: true
package_reboot_if_required: true
packages:
- docker
- postgresql15
- python3-boto3
locale: en_AU
timezone: Australia/Brisbane
gives me this error:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
RPM: error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Resource temporarily unavailable)
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: Could not run transaction.
And I have confirmed this is random: sometimes it works. It only became an issue recently. We are always using the latest AL2023.
Disabling repo_upgrade doesn't help.
Disabling the SSM auto-upgrade doesn't help either.
"Perhaps some AWS initialization takes place at the same time as the userdata gets executed"
That could well be it. It shouldn't happen, but maybe on the odd occasion it does.
Could you do the Kafka download & install (which must take a few tens of seconds) before the dnf update? That would give dnf time to straighten out whatever it needs to do in the background. In the other issue, the dnf update should have completely finished and cleaned up after itself before dnf install starts, but it obviously hasn't. If you put a sleep 2 before the dnf install, does the problem persist?
RWC, good suggestion doing the Kafka install first. I will try that and post the result here. Perhaps it gives dnf more time to release the lock.