I have a memory-intensive application with a large memory footprint. I want to use the HugePages feature in Amazon Elastic Kubernetes Service (Amazon EKS) worker nodes to improve the performance of the app.
Short description
Huge pages improve performance for workloads that require large amounts of memory access. Instead of the default 4 KB allocation, the HugePages feature turns on the allocation of 2 MB and 1 GB memory pages in the Linux kernel. Huge pages are configurable on supported instance types (such as Nitro Enclaves) that use large contiguous memory regions.
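As a rough sizing aid, the number of huge pages to reserve is the workload's memory requirement divided by the page size, rounded up. The following is a minimal sketch; hugepages_needed is a hypothetical helper name, not part of any AWS or kernel tooling, and sizes are assumed to be given in MiB.

```shell
#!/usr/bin/env bash
# hugepages_needed: hypothetical helper for sizing a huge page reservation.
# Usage: hugepages_needed <workload_memory_MiB> [page_size_MiB]
hugepages_needed() {
  local mem_mib=$1
  local page_mib=${2:-2}   # assume 2 MiB pages unless told otherwise
  # Round up so the reservation always covers the requested memory.
  echo $(( (mem_mib + page_mib - 1) / page_mib ))
}

hugepages_needed 4096        # 4 GiB in 2 MiB pages -> 2048
hugepages_needed 4096 1024   # 4 GiB in 1 GiB pages -> 4
```

With 2 MiB pages, the value 2048 used throughout this article therefore reserves 4 GiB of memory on each node.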
Prerequisites
This procedure requires eksctl version 0.187.0 or later. Download and install the latest version from the eksctl website.
Resolution
Use Amazon EC2 user data to configure your worker nodes to allocate huge pages for your workloads. For more information, see Huge pages and transparent huge pages on the redhat.com website.
Note that you can configure huge pages with or without a launch template.
Turn on the HugePages feature with a launch template
Create a launch template to launch the worker nodes with the HugePages feature turned on:
1. Create a .txt file with the following content. Save the file with the name eks-hugepage-user-data.txt.
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash -e
# Show the current number of reserved huge pages
cat /proc/sys/vm/nr_hugepages
# Reserve 2048 huge pages (4 GiB with the default 2 MiB page size)
sudo sysctl -w vm.nr_hugepages=2048
# Persist the setting so that the pages are reserved again after a reboot
echo "vm.nr_hugepages=2048" | sudo tee -a /etc/sysctl.conf
# Print the resulting huge page counters
grep Huge /proc/meminfo
echo "hugepages user data script has finished successfully."
--==MYBOUNDARY==--
2. Encode the user data in base64 from the command line.
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
export BASE64_RANDOM_OUTPUT=$(base64 < eks-hugepage-user-data.txt | tr -d '\n')
echo "$BASE64_RANDOM_OUTPUT"
The commands store the encoded user data as a single line in the BASE64_RANDOM_OUTPUT variable, which the next step uses.
3. Create the launch template with the following command.
The shell replaces $BASE64_RANDOM_OUTPUT in the command with the encoded user data from step 2.
LAUNCH_TEMPLATE=$(aws ec2 create-launch-template \
    --launch-template-name ekshugepages \
    --version-description 'Using Huge Pages with Amazon EKS' \
    --launch-template-data "{\"UserData\":\"$BASE64_RANDOM_OUTPUT\",\"InstanceType\": \"m5.2xlarge\",\"TagSpecifications\":[{\"ResourceType\":\"instance\",\"Tags\":[{\"Key\":\"purpose\",\"Value\":\"hugepages\"}]}]}" \
    --query 'LaunchTemplate.LaunchTemplateId' \
    --output text)
echo "$LAUNCH_TEMPLATE"
Note the launch template ID (for example, lt-01234567890abcdef) because you will need it in the next step.
4. Create a text file with the following contents. Save the file as eks-nodegroup.yaml on your device.
Replace LAUNCH_TEMPLATE_ID with the launch template ID that you noted in step 3.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: region-code
  version: "1.29"
managedNodeGroups:
  # Launch templates - Amazon Linux 2023
  - name: hg-al2023
    labels: { use-case: large-memory-access }
    amiFamily: AmazonLinux2023 # or specify 'AmazonLinux2' for Amazon Linux 2
    launchTemplate:
      id: LAUNCH_TEMPLATE_ID
      version: "1"
Note: Managed node groups created in clusters on version 1.30 or newer automatically default to use Amazon Linux 2023.
5. Create the node group in your existing Amazon EKS cluster with the following command:
eksctl create nodegroup --config-file eks-nodegroup.yaml
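Before you create the launch template, it can help to confirm that the encoded user data from step 2 is a single line and decodes back to the original file. The following is a minimal sketch under stated assumptions: encode_user_data is a hypothetical helper name, and a throwaway file stands in for eks-hugepage-user-data.txt.

```shell
#!/usr/bin/env bash
# encode_user_data: hypothetical helper that prints a file as single-line base64.
encode_user_data() {
  base64 < "$1" | tr -d '\n'
}

# Demonstration with a throwaway file; use eks-hugepage-user-data.txt in practice.
sample_file=$(mktemp)
printf '#!/bin/bash -e\nsysctl -w vm.nr_hugepages=2048\n' > "$sample_file"

encoded=$(encode_user_data "$sample_file")
decoded=$(printf '%s' "$encoded" | base64 --decode)

# Compare the decoded text with the original file
# (command substitution strips the trailing newline on both sides).
if [ "$decoded" = "$(cat "$sample_file")" ]; then
  echo "round trip OK"
else
  echo "round trip FAILED" >&2
fi
rm -f "$sample_file"
```

Removing line breaks matters because the encoded value is embedded in a JSON string in step 3, and a JSON string cannot contain a literal newline.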
Turn on the HugePages feature without a launch template
Before you bootstrap an instance to the cluster, use preBootstrapCommands to turn on the HugePages feature. Create a HugePages managed node group with the following eksctl config file:
...
managedNodeGroups:
  - name: hg
    labels: { use-case: large-memory-access }
    instanceType: m5.2xlarge
    preBootstrapCommands:
      # enable huge pages
      - "sudo sysctl -w vm.nr_hugepages=2048"
      - "echo 'vm.nr_hugepages=2048' | sudo tee -a /etc/sysctl.conf"
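Both approaches append the setting to /etc/sysctl.conf, so the same line is added again if the commands run more than once. A possible way to keep the file clean is to append only when the line is missing. The following sketch uses a hypothetical helper, persist_sysctl, and demonstrates it against a temporary file instead of the real configuration file.

```shell
#!/usr/bin/env bash
# persist_sysctl: hypothetical helper that adds "key=value" to a sysctl file
# only when that exact line is not already present.
persist_sysctl() {
  local setting=$1 file=$2
  grep -qxF "$setting" "$file" 2>/dev/null || echo "$setting" >> "$file"
}

# Demonstration against a temporary file instead of /etc/sysctl.conf.
conf_file=$(mktemp)
persist_sysctl "vm.nr_hugepages=2048" "$conf_file"
persist_sysctl "vm.nr_hugepages=2048" "$conf_file"   # second call changes nothing
line_count=$(grep -c "vm.nr_hugepages" "$conf_file")
echo "$line_count"   # prints 1
rm -f "$conf_file"
```

On a worker node, the function would need to run with root privileges to write to /etc/sysctl.conf.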
Verify HugePages nodes
After you turn on HugePages nodes in your Amazon EKS cluster, verify the nodes with the following steps.
First, describe the nodes in your cluster with the following command:
kubectl describe node node_name | grep -E -A5 'Capacity|Allocatable'
Your output will look similar to the following example:
Capacity:
  cpu:                8
  ephemeral-storage:  83873772Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      4Gi
  memory:             32386520Ki
--
Allocatable:
  cpu:                7910m
  ephemeral-storage:  76224326324
  hugepages-1Gi:      0
  hugepages-2Mi:      4Gi
  memory:             27175384Ki
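The 4Gi reported for hugepages-2Mi corresponds to 2048 pages of 2 MiB each. If you have shell access to a worker node, the same figure can be derived from /proc/meminfo. The following is a minimal sketch; hugepage_total_mib is a hypothetical helper name, and the sample text is illustrative, not output captured from a real node.

```shell
#!/usr/bin/env bash
# hugepage_total_mib: hypothetical helper; reads meminfo-formatted text on stdin
# and prints the reserved huge page memory in MiB.
hugepage_total_mib() {
  awk '
    /^HugePages_Total:/ { total = $2 }      # number of reserved pages
    /^Hugepagesize:/    { size_kb = $2 }    # size of one page in kB
    END { print total * size_kb / 1024 }
  '
}

# Illustrative sample values, not captured from a real node.
sample_meminfo='HugePages_Total:    2048
HugePages_Free:     2048
Hugepagesize:       2048 kB'

printf '%s\n' "$sample_meminfo" | hugepage_total_mib   # prints 4096
# On a worker node: hugepage_total_mib < /proc/meminfo
```

A result of 4096 MiB matches the 4Gi capacity that kubectl reports for the node.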
Next, review how huge pages are allocated to workloads in your cluster:
kubectl describe nodes node_name | grep 'Allocated' -A9
Your output will look similar to the following example:
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                150m (1%)   0 (0%)
  memory             100Mi (0%)  100Mi (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      100Mi (2%)  100Mi (2%)
Events:
The example output shows that the hugepages-2Mi resource is currently consumed by a pod running on that node.
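For reference, a pod consumes huge pages by setting a hugepages-2Mi limit and mounting an emptyDir volume with the HugePages medium. The following sketch writes such a manifest to a file. The pod name, container image, and file name are placeholders chosen for this example, and the nodeSelector assumes the use-case label from the node group configuration in this article.

```shell
#!/usr/bin/env bash
# Write a hypothetical pod manifest that requests 100Mi of 2 MiB huge pages.
# The pod name, image, and file name are placeholders chosen for this sketch.
manifest=hugepages-example-pod.yaml
cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-example
spec:
  nodeSelector:
    use-case: large-memory-access   # label set on the node group
  containers:
    - name: app
      image: public.ecr.aws/amazonlinux/amazonlinux:2023
      command: ["sleep", "infinity"]
      volumeMounts:
        - mountPath: /hugepages
          name: hugepage
      resources:
        requests:
          memory: 100Mi
        limits:
          hugepages-2Mi: 100Mi   # huge page requests default to the limit
          memory: 100Mi
  volumes:
    - name: hugepage
      emptyDir:
        medium: HugePages
EOF
echo "wrote $manifest"
# To create the pod: kubectl apply -f hugepages-example-pod.yaml
```

After a pod like this is scheduled, the hugepages-2Mi row under Allocated resources for that node should show a non-zero request.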
Related information
Manage HugePages on the kubernetes.io website