I am just trying to build a cluster with hadoop, hive and spark preinstalled on emr-5.23.0. Till today morning it was working fine and suddenly it started failing with the below error and the cluster getting terminated: Terminated with errors Bootstrap failure.
I tried to capture the errors before the cluster dies off:
master.log
[ec2-user@ip-00-000-000-00 ~]$ cat /mnt/var/log/bootstrap-actions/master.log
2019-04-16 19:08:42,443 INFO i-078302fec7edd117b: new instance started
2019-04-16 19:08:42,451 ERROR i-078302fec7edd117b: failed to start. bootstrap action 1 failed with non-zero exit code.
2019-04-16 19:09:25,805 INFO i-02fc7b0d9e6815b8d: new instance started
2019-04-16 19:09:27,826 INFO i-02fc7b0d9e6815b8d: all bootstrap actions complete and instance ready
2019-04-16 19:10:00,812 INFO i-0261bda9ec7074d13: new instance started
2019-04-16 19:10:03,482 INFO i-0261bda9ec7074d13: all bootstrap actions complete and instance ready
cloud-init
Apr 16 19:04:23 cloud-init[4713]: util.py[WARNING]: Running module yum-configure (<module 'cloudinit.config.cc_yum_configure' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_yum_configure.pyc'>) failed
Apr 16 19:04:23 cloud-init[4713]: util.py[DEBUG]: Running module yum-configure (<module 'cloudinit.config.cc_yum_configure' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_yum_configure.pyc'>) failed
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/cloudinit/stages.py", line 660, in _run_modules
cc.run(run_name, mod.handle, func_args, freq=freq)
File "/usr/lib/python2.7/dist-packages/cloudinit/cloud.py", line 63, in run
return self._runners.run(name, functor, args, freq, clear_on_fail)
File "/usr/lib/python2.7/dist-packages/cloudinit/helpers.py", line 197, in run
results = functor(args)
File "/usr/lib/python2.7/dist-packages/cloudinit/config/cc_yum_configure.py", line 53, in handle
raise errors[-1]
KeyError: 'regional'
Apr 16 19:04:23 cloud-init[4713]: stages.py[DEBUG]: Running module yum-add-repo (<module 'cloudinit.config.cc_yum_add_repo' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_yum_add_repo.pyc'>) with frequency once-per-instance
Apr 16 19:04:23 cloud-init[4713]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_yum_add_repo - wb: [644] 20 bytes
Apr 16 19:04:23 cloud-init[4713]: helpers.py[DEBUG]: Running config-yum-add-repo using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_yum_add_repo'>)
Apr 16 19:04:23 cloud-init[4713]: cc_yum_add_repo.py[INFO]: Skipping repo emr-platform, file /etc/yum.repos.d/emr-platform.repo already exists!
Apr 16 19:04:23 cloud-init[4713]: cc_yum_add_repo.py[INFO]: Skipping repo emr-apps, file /etc/yum.repos.d/emr-apps.repo already exists!
Apr 16 19:04:23 cloud-init[4713]: stages.py[DEBUG]: Running module package-update-upgrade-install (<module 'cloudinit.config.cc_package_update_upgrade_install' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_package_update_upgrade_install.pyc'>) with frequency once-per-instance
Apr 16 19:04:23 cloud-init[4713]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_package_update_upgrade_install - wb: [644] 20 bytes
Apr 16 19:04:23 cloud-init[4713]: helpers.py[DEBUG]: Running config-package-update-upgrade-install using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_package_update_upgrade_install'>)
Apr 16 19:04:23 cloud-init[4713]: amazon.py[DEBUG]: Upgrade level: security
Apr 16 19:04:23 cloud-init[4713]: util.py[DEBUG]: Running command ['yum', '-t', '-y', '--exclude=kernel', '--exclude=nvidia', '--exclude=cudatoolkit', '--security', '--sec-severity=critical', '--sec-severity=important', 'upgrade'] with allowed return codes [0] (shell=False, capture=False)
Apr 16 19:05:24 cloud-init[4713]: util.py[WARNING]: Package upgrade failed
Apr 16 19:05:24 cloud-init[4713]: util.py[DEBUG]: Package upgrade failed
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/cloudinit/config/cc_package_update_upgrade_install.py", line 81, in handle
cloud.distro.upgrade_packages(upgrade_level, upgrade_exclude)
File "/usr/lib/python2.7/dist-packages/cloudinit/distros/amazon.py", line 50, in upgrade_packages
return self.package_command('upgrade', args=args)
File "/usr/lib/python2.7/dist-packages/cloudinit/distros/rhel.py", line 211, in package_command
util.subp(cmd, capture=False, pipe_cat=True, close_stdin=True)
File "/usr/lib/python2.7/dist-packages/cloudinit/util.py", line 1626, in subp
cmd=args)
ProcessExecutionError: Unexpected error while running command.
Command: ['yum', '-t', '-y', '--exclude=kernel', '--exclude=nvidia*', '--exclude=cudatoolkit', '--security', '--sec-severity=critical', '--sec-severity=important', 'upgrade']
Exit code: 1
Reason: -
Stdout: ''
Stderr: ''
Apr 16 19:05:24 cloud-init[4713]: cc_package_update_upgrade_install.py[WARNING]: 1 failed with exceptions, re-raising the last one
Apr 16 19:05:24 cloud-init[4713]: util.py[WARNING]: Running module package-update-upgrade-install (<module 'cloudinit.config.cc_package_update_upgrade_install' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_package_update_upgrade_install.pyc'>) failed
Apr 16 19:05:24 cloud-init[4713]: util.py[DEBUG]: Running module package-update-upgrade-install (<module 'cloudinit.config.cc_package_update_upgrade_install' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_package_update_upgrade_install.pyc'>) failed
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/cloudinit/stages.py", line 660, in _run_modules
cc.run(run_name, mod.handle, func_args, freq=freq)
File "/usr/lib/python2.7/dist-packages/cloudinit/cloud.py", line 63, in run
return self._runners.run(name, functor, args, freq, clear_on_fail)
File "/usr/lib/python2.7/dist-packages/cloudinit/helpers.py", line 197, in run
results = functor(args)
File "/usr/lib/python2.7/dist-packages/cloudinit/config/cc_package_update_upgrade_install.py", line 111, in handle
raise errors[-1]
ProcessExecutionError: Unexpected error while running command.
Command: ['yum', '-t', '-y', '--exclude=kernel', '--exclude=nvidia', '--exclude=cudatoolkit', '--security', '--sec-severity=critical', '--sec-severity=important', 'upgrade']
Exit code: 1
Reason: -
Stdout: ''
Stderr: ''
Apr 16 19:05:24 cloud-init[4713]: stages.py[DEBUG]: Running module timezone (<module 'cloudinit.config.cc_timezone' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_timezone.pyc'>) with frequency once-per-instance
Apr 16 19:05:24 cloud-init[4713]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_timezone - wb: [644] 20 bytes
Apr 16 19:05:24 cloud-init[4713]: helpers.py[DEBUG]: Running config-timezone using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_timezone'>)
Apr 16 19:05:24 cloud-init[4713]: cc_timezone.py[DEBUG]: Skipping module named timezone, no 'timezone' specified
Apr 16 19:05:24 cloud-init[4713]: stages.py[DEBUG]: Running module puppet (<module 'cloudinit.config.cc_puppet' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_puppet.pyc'>) with frequency once-per-instance
Apr 16 19:05:24 cloud-init[4713]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_puppet - wb: [644] 20 bytes
Apr 16 19:05:24 cloud-init[4713]: helpers.py[DEBUG]: Running config-puppet using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_puppet'>)
Apr 16 19:05:24 cloud-init[4713]: cc_puppet.py[DEBUG]: Skipping module named puppet, no 'puppet' configuration found
Apr 16 19:05:24 cloud-init[4713]: stages.py[DEBUG]: Running module disable-ec2-metadata (<module 'cloudinit.config.cc_disable_ec2_metadata' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_disable_ec2_metadata.pyc'>) with frequency always
Apr 16 19:05:24 cloud-init[4713]: helpers.py[DEBUG]: Running config-disable-ec2-metadata using lock (<cloudinit.helpers.DummyLock object at 0x7fb9c0cad810>)
Apr 16 19:05:24 cloud-init[4713]: cc_disable_ec2_metadata.py[DEBUG]: Skipping module named disable-ec2-metadata, disabling the ec2 route not enabled
Apr 16 19:05:24 cloud-init[4713]: stages.py[DEBUG]: Running module runcmd (<module 'cloudinit.config.cc_runcmd' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_runcmd.pyc'>) with frequency once-per-instance
Apr 16 19:05:24 cloud-init[4713]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_runcmd - wb: [644] 20 bytes
Apr 16 19:05:24 cloud-init[4713]: helpers.py[DEBUG]: Running config-runcmd using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_runcmd'>)
Apr 16 19:05:24 cloud-init[4713]: cc_runcmd.py[DEBUG]: Skipping module named runcmd, no 'runcmd' key in configuration
Apr 16 19:05:24 cloud-init[4713]: cloud-init[DEBUG]: Ran 10 modules with 2 failures
Apr 16 19:05:24 cloud-init[4713]: util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
Apr 16 19:05:24 cloud-init[4713]: util.py[DEBUG]: Read 15 bytes from /proc/uptime
Apr 16 19:05:24 cloud-init[4713]: util.py[DEBUG]: cloud-init mode 'modules' took 140.880 seconds (140.88)
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Cloud-init v. 0.7.6 running 'modules:final' at Tue, 16 Apr 2019 19:08:10 +0000. Up 483.41 seconds.
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Using distro class <class 'cloudinit.distros.amazon.Distro'>
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module scripts-per-once (<module 'cloudinit.config.cc_scripts_per_once' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_scripts_per_once.pyc'>) with frequency once
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Writing to /var/lib/cloud/sem/config_scripts_per_once.once - wb: [644] 20 bytes
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-scripts-per-once using lock (<FileLock using file '/var/lib/cloud/sem/config_scripts_per_once.once'>)
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module scripts-per-boot (<module 'cloudinit.config.cc_scripts_per_boot' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_scripts_per_boot.pyc'>) with frequency always
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-scripts-per-boot using lock (<cloudinit.helpers.DummyLock object at 0x7f35a40e08d0>)
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module scripts-per-instance (<module 'cloudinit.config.cc_scripts_per_instance' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_scripts_per_instance.pyc'>) with frequency once-per-instance
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_scripts_per_instance - wb: [644] 20 bytes
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-scripts-per-instance using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_scripts_per_instance'>)
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_scripts_user.pyc'>) with frequency once-per-instance
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_scripts_user - wb: [644] 20 bytes
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-scripts-user using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_scripts_user'>)
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module ssh-authkey-fingerprints (<module 'cloudinit.config.cc_ssh_authkey_fingerprints' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_ssh_authkey_fingerprints.pyc'>) with frequency once-per-instance
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_ssh_authkey_fingerprints - wb: [644] 20 bytes
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-ssh-authkey-fingerprints using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_ssh_authkey_fingerprints'>)
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Reading from /etc/ssh/sshd_config (quiet=False)
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Read 3935 bytes from /etc/ssh/sshd_config
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Reading from /home/ec2-user/.ssh/authorized_keys (quiet=False)
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Read 389 bytes from /home/ec2-user/.ssh/authorized_keys
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module keys-to-console (<module 'cloudinit.config.cc_keys_to_console' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_keys_to_console.pyc'>) with frequency once-per-instance
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_keys_to_console - wb: [644] 20 bytes
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-keys-to-console using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_keys_to_console'>)
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Running command ['/usr/libexec/cloud-init/write-ssh-key-fingerprints', '', 'ssh-dss'] with allowed return codes [0] (shell=False, capture=True)
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module phone-home (<module 'cloudinit.config.cc_phone_home' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_phone_home.pyc'>) with frequency once-per-instance
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_phone_home - wb: [644] 20 bytes
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-phone-home using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_phone_home'>)
Apr 16 19:08:10 cloud-init[5562]: cc_phone_home.py[DEBUG]: Skipping module named phone-home, no 'phone_home' configuration found
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module final-message (<module 'cloudinit.config.cc_final_message' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_final_message.pyc'>) with frequency always
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-final-message using lock (<cloudinit.helpers.DummyLock object at 0x7f35a3d848d0>)
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Read 15 bytes from /proc/uptime
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Cloud-init v. 0.7.6 finished at Tue, 16 Apr 2019 19:08:10 +0000. Datasource DataSourceEc2. Up 483.59 seconds
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Writing to /var/lib/cloud/instance/boot-finished - wb: [644] 52 bytes
Apr 16 19:08:10 cloud-init[5562]: stages.py[DEBUG]: Running module power-state-change (<module 'cloudinit.config.cc_power_state_change' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_power_state_change.pyc'>) with frequency once-per-instance
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Writing to /var/lib/cloud/instances/i-078302fec7edd117b/sem/config_power_state_change - wb: [644] 20 bytes
Apr 16 19:08:10 cloud-init[5562]: helpers.py[DEBUG]: Running config-power-state-change using lock (<FileLock using file '/var/lib/cloud/instances/i-078302fec7edd117b/sem/config_power_state_change'>)
Apr 16 19:08:10 cloud-init[5562]: cc_power_state_change.py[DEBUG]: no power_state provided. doing nothing
Apr 16 19:08:10 cloud-init[5562]: cloud-init[DEBUG]: Ran 9 modules with 0 failures
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Creating symbolic link from '/run/cloud-init/result.json' => '../../var/lib/cloud/data/result.json'
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: Read 15 bytes from /proc/uptime
Apr 16 19:08:10 cloud-init[5562]: util.py[DEBUG]: cloud-init mode 'modules' took 0.182 seconds (0.18)
There is no additional bootstarp action added by me.
Edited by: BurnswickNYC on Apr 16, 2019 1:53 PM