How do I install libraries in my Amazon MWAA environment?

I want to install libraries in my Amazon Managed Workflows for Apache Airflow (Amazon MWAA) environment.

Short description

To install Python libraries on Amazon MWAA environments, use either requirements.txt or plugins.zip.

When you use requirements.txt, pip installs the listed packages from the Python Package Index (PyPI) by default.

If you install custom libraries or packages with compiled artifacts as .whl files, then use plugins.zip. When you install custom Apache Airflow operators, hooks, sensors, or interfaces, you must use plugins.zip. Plugins can also export environment variables and carry authentication and configuration files, such as .crt and .yaml files.

Amazon MWAA offers public network and private network web server options. When you use the public network, the web server has a route to the internet. When you use the private network, the web server doesn't have internet access. The private web server is located in the Amazon MWAA service virtual private cloud (VPC).

Note: For the earlier Amazon MWAA 2.0.2 and 1.10.12 versions, requirements and plugins aren't installed on the web server by default.

Resolution

To install Python dependencies on an Amazon MWAA environment that uses a private web server and runs version 2.2.2 or later, use Python wheels (.whl).

Set up your Amazon MWAA local environment

Complete the following steps:

  1. Build the Docker image, and then set up an Amazon MWAA local environment. For instructions, see aws-mwaa-local-runner on the GitHub website. The repository provides a command line interface (CLI) utility that locally replicates an Amazon MWAA environment.
  2. Add the Python library and its dependencies to a requirements.txt file.
  3. Run the following command to test the requirements.txt file:
    #aws-mwaa-local-runner % ./mwaa-local-env test-requirements

The output looks similar to the following example:

Installing requirements.txt
Collecting aws-batch (from -r /usr/local/airflow/dags/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/5d/11/3aedc6e150d2df6f3d422d7107ac9eba5b50261cf57ab813bb00d8299a34/aws_batch-0.6.tar.gz
Collecting awscli (from aws-batch->-r /usr/local/airflow/dags/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/07/4a/d054884c2ef4eb3c237e1f4007d3ece5c46e286e4258288f0116724af009/awscli-1.19.21-py2.py3-none-any.whl (3.6MB)
    100% |████████████████████████████████| 3.6MB 365kB/s 
...
...
...
Installing collected packages: botocore, docutils, pyasn1, rsa, awscli, aws-batch
  Running setup.py install for aws-batch ... done
Successfully installed aws-batch-0.6 awscli-1.19.21 botocore-1.20.21 docutils-0.15.2 pyasn1-0.4.8 rsa-4.7.2

For more information, see Installing Python dependencies using PyPi.org requirements file format.

Build the .whl files from the requirements.txt file

Run the following package-requirements local-runner command:

#aws-mwaa-local-runner % ./mwaa-local-env package-requirements

The command downloads all .whl files into the aws-mwaa-local-runner/plugins folder.
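Before you package the download folder, it can help to confirm which wheels actually landed there. The following is a minimal sketch, not part of the local-runner tooling: the list_wheels helper is a hypothetical name, and it assumes only the standard wheel filename layout (distribution, version, then the Python, ABI, and platform tags).

```python
from pathlib import Path


def list_wheels(plugins_dir):
    """Return (distribution, version, filename) for each .whl file in plugins_dir."""
    wheels = []
    for path in sorted(Path(plugins_dir).glob("*.whl")):
        # Wheel filenames look like:
        # {distribution}-{version}[-{build}]-{python}-{abi}-{platform}.whl
        parts = path.stem.split("-")
        if len(parts) < 5:
            continue  # not a well-formed wheel filename, so skip it
        wheels.append((parts[0], parts[1], path.name))
    return wheels
```

Call list_wheels("plugins") from the aws-mwaa-local-runner folder and compare the result with the packages pinned in requirements.txt. Files that are not wheels, such as source archives, are ignored by this helper.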

Create a plugins.zip file that includes the .whl files and an Amazon MWAA constraints file

Download the constraints.txt file, and then copy its contents into a constraints.txt file in the plugins directory. Then, run the following command to create the plugins.zip file:

#aws-mwaa-local-runner % zip -j requirements/plugins.zip plugins/constraints.txt
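The requirements.txt file later in this article expects constraints.txt and the wheels to sit directly in /usr/local/airflow/plugins, so the archive should be flat. The following is a minimal sketch for checking that layout with the Python standard library. The check_plugins_zip helper is a hypothetical name, not an Amazon MWAA utility.

```python
import zipfile


def check_plugins_zip(zip_path):
    """Return a list of layout problems found in the archive (empty if none)."""
    problems = []
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    if "constraints.txt" not in names:
        problems.append("constraints.txt is missing from the archive root")
    if not any(name.endswith(".whl") for name in names):
        problems.append("no .whl files found")
    # --find-links points at the plugins directory itself, and pip does not
    # search subfolders of a --find-links directory.
    nested = [n for n in names if n.endswith(".whl") and "/" in n]
    if nested:
        problems.append("wheels in subfolders: " + ", ".join(nested))
    return problems
```

For example, check_plugins_zip("requirements/plugins.zip") returns an empty list when the archive has the expected flat layout.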

Create a new requirements.txt file that points to the .whl files that are packaged in the plugins.zip file

Complete the following steps:

  1. Use a text editor to create the new requirements.txt file in the following format:

    =========new requirements.txt==========
    --find-links /usr/local/airflow/plugins
    
    --no-index
    
    --constraint "/usr/local/airflow/plugins/constraints.txt"
    
    aws-batch==0.6
    ====================================
  2. Upload the plugins.zip file and the requirements.txt file to the Amazon Simple Storage Service (Amazon S3) bucket of your Amazon MWAA environment.

  3. Update the environment.
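The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the required procedure: build_requirements and publish are hypothetical helper names, both files are assumed to be in the current folder and uploaded to the bucket root under the same names, and S3 versioning is assumed to be turned on for the bucket. The s3 and mwaa arguments are expected to behave like the boto3 S3 and MWAA clients.

```python
def build_requirements(pinned, plugins_dir="/usr/local/airflow/plugins"):
    """Return requirements.txt text that installs only from the unpacked plugins.zip."""
    lines = [
        f"--find-links {plugins_dir}",
        "--no-index",
        f'--constraint "{plugins_dir}/constraints.txt"',
    ]
    lines.extend(pinned)  # for example ["aws-batch==0.6"]
    return "\n".join(lines) + "\n"


def publish(s3, mwaa, bucket, environment_name,
            plugins_file="plugins.zip", requirements_file="requirements.txt"):
    """Upload both files, then point the environment at the new object versions."""
    versions = {}
    for filename in (plugins_file, requirements_file):
        # Assumption: the S3 key is the same as the local file name.
        s3.upload_file(filename, bucket, filename)
        # Assumption: bucket versioning is on, so head_object returns a VersionId.
        versions[filename] = s3.head_object(Bucket=bucket, Key=filename)["VersionId"]
    return mwaa.update_environment(
        Name=environment_name,
        PluginsS3Path=plugins_file,
        PluginsS3ObjectVersion=versions[plugins_file],
        RequirementsS3Path=requirements_file,
        RequirementsS3ObjectVersion=versions[requirements_file],
    )
```

In real use, write the text from build_requirements to requirements.txt, and then pass boto3.client("s3") and boto3.client("mwaa") to publish together with your own bucket and environment names. The clients are passed in as arguments so that the upload logic can be exercised with stand-in objects before it touches a live environment.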

Troubleshoot package installation

Use aws-mwaa-local-runner to test DAGs, custom plugins, and Python dependencies. To check how the installation ran on the environment, view the log streams in the Apache Airflow Worker or Scheduler log group.

Important: Before you install the packages or plugins.zip file, use the Amazon MWAA CLI utility to test Python dependencies and the plugins.zip file.

Related information

Option two: Python wheels (.whl)

Plugins

AWS OFFICIAL Updated 21 days ago