My Amazon OpenSearch Service cluster has high Java Virtual Machine (JVM) memory pressure, and I want to reduce it.
Short description
By default, OpenSearch Service allocates 50% of an Amazon Elastic Compute Cloud (Amazon EC2) instance's RAM to the JVM heap, up to a maximum heap size of 32 GiB. JVM memory pressure is the percentage of the Java heap that's in use on each cluster node.
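The relationship between instance RAM, default heap size, and JVM memory pressure can be sketched as follows. The RAM and heap-usage values are hypothetical examples:

```python
# Minimal sketch: how the default heap size and JVM memory pressure
# relate. The instance RAM and heap-usage numbers are hypothetical.

def default_heap_gib(instance_ram_gib: float) -> float:
    """OpenSearch Service allots 50% of instance RAM to the JVM heap,
    capped at 32 GiB."""
    return min(instance_ram_gib * 0.5, 32.0)

def jvm_memory_pressure(heap_used_gib: float, heap_max_gib: float) -> float:
    """JVM memory pressure is the percentage of the heap that's in use."""
    return 100.0 * heap_used_gib / heap_max_gib

heap = default_heap_gib(64.0)           # e.g., a node with 64 GiB of RAM
print(heap)                             # 32.0 (capped at 32 GiB)
print(jvm_memory_pressure(24.0, heap))  # 75.0, the garbage collection threshold
```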
The following scenarios can cause high JVM memory pressure:
- Spikes in the number of requests to the cluster
- Aggregations, wildcards, and wide time ranges in queries
- Unbalanced shard allocations across nodes or too many shards in a cluster
- Field data or index mapping explosions
- Amazon EC2 instance types that can't manage inbound load
Resolution
Monitor patterns in your data
To resolve high JVM memory pressure issues, reduce traffic to the cluster.
Run the following command to get node-level statistics about your cluster and identify nodes that experience memory pressure or excessive garbage collection:
GET _nodes/stats/jvm
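To illustrate, the following sketch flags nodes under pressure from the response of the preceding command. The sample payload is abbreviated and hypothetical; a real response contains many more fields:

```python
# Minimal sketch: flag nodes under memory pressure from a
# GET _nodes/stats/jvm response. The sample payload below is abbreviated
# and hypothetical.
import json

sample_response = json.loads("""
{
  "nodes": {
    "abc123": {
      "name": "node-1",
      "jvm": {"mem": {"heap_used_percent": 91}}
    },
    "def456": {
      "name": "node-2",
      "jvm": {"mem": {"heap_used_percent": 48}}
    }
  }
}
""")

def nodes_under_pressure(stats: dict, threshold: int = 75) -> list[str]:
    """Return node names whose heap usage meets or exceeds the threshold."""
    return [
        node["name"]
        for node in stats["nodes"].values()
        if node["jvm"]["mem"]["heap_used_percent"] >= threshold
    ]

print(nodes_under_pressure(sample_response))  # ['node-1']
```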
To further identify faulty requests, activate slow logs. For more information, see Shard slow logs on the OpenSearch website. Make sure that the JVM memory pressure is below 90%. For more information about slow queries, see Advanced tuning: finding and fixing slow Elasticsearch queries on the Elasticsearch website.
Use Amazon CloudWatch to monitor JVM memory usage and garbage collection behavior over time. Use this information to detect patterns and take action before you experience cluster instability. Also, configure CloudWatch alarms to proactively detect and resolve high JVM memory pressure.
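For example, a CloudWatch alarm on the JVMMemoryPressure metric might use parameters like the following. The domain name, account ID, SNS topic, and 80% threshold are placeholders to adapt to your environment; in practice, you'd pass the dictionary to `boto3.client("cloudwatch").put_metric_alarm(**alarm_params)`:

```python
# Minimal sketch: parameters for a CloudWatch alarm on JVMMemoryPressure.
# The domain name, account ID, SNS topic ARN, and threshold are placeholders.

alarm_params = {
    "AlarmName": "opensearch-jvm-memory-pressure-high",  # hypothetical name
    "Namespace": "AWS/ES",                 # OpenSearch Service metric namespace
    "MetricName": "JVMMemoryPressure",
    "Dimensions": [
        {"Name": "DomainName", "Value": "my-domain"},   # placeholder domain
        {"Name": "ClientId", "Value": "123456789012"},  # placeholder account ID
    ],
    "Statistic": "Maximum",
    "Period": 300,                         # 5-minute datapoints
    "EvaluationPeriods": 3,
    "Threshold": 80.0,                     # example: alarm before the 92% write block
    "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # placeholder
}

print(alarm_params["MetricName"])  # JVMMemoryPressure
```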
Check your cache settings
To clear the field data cache, run the following query:
POST /index_name/_cache/clear?fielddata=true
Note: When you clear the cache, you might disrupt queries that are in progress.
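Before you clear the cache, you can check how much memory field data uses on each node. For example (the columns that this command returns vary by version):

```
GET _cat/fielddata?v
```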
If you exceed the JVM circuit breakers and your memory usage remains unchecked, then you receive a JVM OutOfMemoryError. To resolve this issue, modify the parent circuit breaker, field data circuit breaker, or request circuit breaker settings based on your configuration's requirements. For information about how to modify these cluster-level settings, see Cluster settings API on the OpenSearch website.
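For example, you can adjust the field data circuit breaker limit with the cluster settings API. The 40% value here is only an illustration; choose a limit that fits your workload:

```
PUT _cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "40%"
  }
}
```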
Optimize your configuration
Optimize your configuration to follow OpenSearch Service best practices, such as balancing shard allocation across nodes, avoiding wildcard queries over wide time ranges, and choosing instance types that can manage your inbound load.
For more information about how to troubleshoot high JVM memory pressure, see Why did my OpenSearch Service node crash?
Understand the effects of high JVM memory pressure
The following scenarios show how OpenSearch Service manages different JVM memory pressure percentages:
- If JVM memory pressure reaches 75%, then OpenSearch Service initiates the Concurrent Mark Sweep (CMS) garbage collector for x86 instance types. ARM-based AWS Graviton instance types use the Garbage-First (G1) garbage collector, which adds short pauses and defragments the heap as it runs.
Note: Garbage collection is a CPU-intensive process. If memory usage continues to grow, then you might encounter "ClusterBlockException", "JVM OutOfMemoryError", or other cluster performance issues.
- If JVM memory pressure exceeds 92% for 30 minutes, then OpenSearch Service blocks all write operations.
- If JVM memory pressure reaches 100%, then OpenSearch Service exits with the "OutOfMemory (OOM)" error message and eventually restarts the instance.
Related information
Troubleshooting OpenSearch Service
Get started with OpenSearch Service: How many shards do I need?