How do I install and troubleshoot Python libraries in Amazon EMR and Amazon EMR Serverless clusters?

I want to install and troubleshoot Python libraries in Amazon EMR and Amazon EMR Serverless clusters.

Resolution

Install Python libraries in Amazon EMR clusters

To install Python libraries in Amazon EMR clusters, use a bootstrap action.

Amazon EMR uses Puppet, an Apache Bigtop deployment mechanism, to configure and initialize applications on instances. Instance-controller is an Amazon EMR software component that runs on every cluster instance. It initializes and then provisions instances based on the instance configuration.

To start NodeProvisioner at cluster startup, instance-controller runs the provision node script /usr/share/aws/emr/node-provisioner/bin/provision-node. Then, NodeProvisioner provisions all of the Amazon EMR distribution applications for the node and cluster configuration. NodeProvisioner acts as the final bootstrap action: it runs on each cluster node after all other bootstrap actions complete.

For the latest Amazon EMR clusters, bootstrap actions run before Amazon EMR installs the applications that you specify at cluster creation and before cluster nodes process data. If you add nodes to a running cluster, then bootstrap actions also run on those nodes. You can create custom bootstrap actions and specify applications to install when you create your cluster.
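As a minimal sketch of a custom bootstrap action, the following script installs each pip requirement that is passed to it as an argument. The function name install_packages and the PIP_CMD variable are illustrative assumptions, not Amazon EMR settings, and the default pip command might differ on your cluster's image.

```shell
#!/bin/bash
# Hypothetical bootstrap action: install every pip requirement given as an argument.
# PIP_CMD is an illustrative override hook; the default assumes python3 with pip is present.
PIP_CMD="${PIP_CMD:-sudo python3 -m pip}"

install_packages() {
    local pkg
    for pkg in "$@"; do
        echo "Installing ${pkg}"
        # PIP_CMD is intentionally unquoted so that it splits into command and arguments.
        # Stop at the first failure so that the script exits with a nonzero status.
        $PIP_CMD install "$pkg" || return 1
    done
}

# With no arguments, the loop body never runs and the script does nothing.
install_packages "$@"
```

If you store a script like this in Amazon S3 and register it as a bootstrap action with arguments such as pandas==1.3.5, then it runs on each node at startup. Because the function returns a nonzero status on the first failed install, a bad package name surfaces as a bootstrap action failure instead of passing silently.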

Install Python libraries in Amazon EMR Serverless clusters

To install Python libraries and use their capabilities within your Spark jobs and notebooks, use one of the following methods based on your use case:

Troubleshoot Python libraries

Python libraries that are installed by bootstrap actions might be overridden by Amazon EMR default libraries. To resolve this issue, create a delayed bootstrap action, or a second-stage bootstrap action that runs in the background. Or, install the packages after the node provisioning state (NODEPROVISIONSTATE in the following script) changes to SUCCESSFUL.

The following bootstrap action upgrades the library after the application provisioning stage. Add this script as a bootstrap script that runs in the background and exits so that cluster provisioning continues. This script continues to monitor node provisioning and upgrades the library after provisioning.

Example script that installs specific library versions after node provisioning completes:

#!/bin/bash
set -x

# Write the second-stage script to disk. The quoted 'EOF' keeps the shell
# from expanding anything inside the here-document.
cat > /var/tmp/fix-bootstrap.sh <<'EOF'
#!/bin/bash
set -x

while true; do
    # Read the node provisioning status from the instance-controller state file.
    NODEPROVISIONSTATE=`sed -n '/localInstance [{]/,/[}]/{
    /nodeProvisionCheckinRecord [{]/,/[}]/ {
    /status: / { p }
    /[}]/a
    }
    /[}]/a
    }' /emr/instance-controller/lib/info/job-flow-state.txt | awk ' { print $2 }'`

    if [ "$NODEPROVISIONSTATE" == "SUCCESSFUL" ]; then
        echo "Running my post provision bootstrap"
        # your code here
        sudo /mnt/notebook-env/bin/pip install pandas==1.3.5
        sudo /mnt/notebook-env/bin/pip install boto==2.49.0
        sudo /mnt/notebook-env/bin/pip install boto3==1.25.0
        exit
    else
        echo "Sleeping till node is provisioned"
        sleep 10
    fi
done

EOF

# Run the script in the background so that this bootstrap action exits
# and cluster provisioning continues.
chmod +x /var/tmp/fix-bootstrap.sh
nohup /var/tmp/fix-bootstrap.sh 2>&1 &
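The status lookup inside the loop is dense. As a rough sketch, the same lookup can be written as a small awk-based function that takes the state file path as an argument, which makes it easier to test against a sample file. The function name node_provision_state is hypothetical, and the assumed file layout (a localInstance block that contains a nodeProvisionCheckinRecord block with a status field) is inferred from the sed expression in the preceding script, not from documentation.

```shell
#!/bin/bash
# Hypothetical helper: print the node provisioning status from a state file.
# The nested block layout is an assumption based on the sed expression above.
node_provision_state() {
    local state_file="${1:-/emr/instance-controller/lib/info/job-flow-state.txt}"
    awk '
        # Remember that the localInstance block has started.
        /localInstance [{]/ { in_local = 1 }
        # Enter the check-in record only when it appears after localInstance.
        in_local && /nodeProvisionCheckinRecord [{]/ { in_record = 1; next }
        # Print the status value and stop at the first match.
        in_record && /status: / { print $2; exit }
        # A closing brace ends the check-in record.
        in_record && /[}]/ { in_record = 0 }
    ' "$state_file"
}
```

Inside the polling loop, the comparison would then read [ "$(node_provision_state)" == "SUCCESSFUL" ]. Verify the output against the real state file on a cluster node before you rely on it.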

Note: YARN containers that run Python code might not use a package that the preceding script updates. As a result, you might receive "module not found" errors when your application imports the updated package. To prevent these errors, poll the state of the NodeManager service, and then run the desired bootstrap action when NodeManager starts.
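One way to poll for a service is a generic retry helper, sketched here. The function name wait_for is hypothetical, and the service name hadoop-yarn-nodemanager in the usage comment is an assumption that you should confirm on your cluster's release.

```shell
#!/bin/bash
# Hypothetical helper: run a check command until it succeeds or attempts run out.
# Usage: wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
    local max_attempts="$1" delay_seconds="$2" attempt=1
    shift 2
    while [ "$attempt" -le "$max_attempts" ]; do
        # The remaining arguments are the check command and its arguments.
        if "$@"; then
            return 0
        fi
        echo "Check failed (attempt ${attempt} of ${max_attempts})" >&2
        attempt=$((attempt + 1))
        sleep "$delay_seconds"
    done
    # Give up with a nonzero status so that the caller can log or react.
    return 1
}

# Example use (assumed service name, not run here):
# wait_for 60 10 systemctl is-active --quiet hadoop-yarn-nodemanager \
#     && sudo /mnt/notebook-env/bin/pip install pandas==1.3.5
```

Bounding the number of attempts means that the background script eventually exits with a failure that you can log, rather than looping forever if the service never starts.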

AWS OFFICIAL Updated 3 months ago
1 Comment

This method has downsides and risks. If the script fails, then the failure isn't reported to the customer. The customer must either develop separate monitoring or add failure handling to this script, and possibly stop NodeManager to prevent application failures.

A better alternative is to build this step into the Puppet script itself. Then, a failure results in a node provisioning failure that's visible to the customer.

AWS
SUPPORT ENGINEER
replied 2 months ago