I want to troubleshoot skewed disk usage in my Amazon OpenSearch Service cluster, where disk space is unevenly distributed across the nodes.
Short description
The following issues cause skewed disk usage:
- Shard sizes are uneven in a cluster. OpenSearch Service evenly distributes the number of shards across nodes, but different shard sizes require different amounts of disk space.
- There isn't enough available disk space on a node. For more information, see Disk-based shard allocation settings on the Elastic website.
- The Elasticsearch shard allocation strategy is uneven.
Resolution
To rebalance the shard allocation in your OpenSearch Service cluster, check your shard allocation, shard sizes, and index sharding strategy. Then, complete the following resolution steps based on your findings.
Check the shard allocation, shard sizes, and index sharding strategy
To check how many shards OpenSearch Service allocated to each node and the amount of disk space each node uses, run the following API operation:
GET _cat/allocation?v
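To quantify the skew, you can request the same data as JSON (`GET _cat/allocation?format=json`) and compare the `disk.percent` value for each node. The following Python sketch is one possible way to do that. The function name and the sample rows are hypothetical, and the sketch assumes that you already fetched and parsed the response.

```python
from typing import Dict, List, Optional


def disk_usage_spread(rows: List[Dict[str, Optional[str]]]) -> Dict[str, object]:
    """Summarize per-node disk usage from parsed _cat/allocation JSON rows."""
    # The cat APIs return values as strings. Rows without disk figures
    # (for example, the UNASSIGNED row) are skipped.
    percents = {
        row["node"]: int(row["disk.percent"])
        for row in rows
        if row.get("disk.percent") is not None
    }
    fullest = max(percents, key=percents.get)
    emptiest = min(percents, key=percents.get)
    return {
        "fullest": fullest,
        "emptiest": emptiest,
        "spread_percent": percents[fullest] - percents[emptiest],
    }


# Hypothetical sample shaped like GET _cat/allocation?format=json output.
sample_rows = [
    {"shards": "12", "disk.percent": "81", "node": "node-a"},
    {"shards": "12", "disk.percent": "43", "node": "node-b"},
    {"shards": "12", "disk.percent": "47", "node": "node-c"},
    {"shards": "2", "disk.percent": None, "node": "UNASSIGNED"},
]

print(disk_usage_spread(sample_rows))
```

In this sample, every node holds the same number of shards, but disk usage still differs. That pattern suggests uneven shard sizes rather than an uneven shard count.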
To check which shards OpenSearch Service allocated to each node and the size of each shard, run the following API operation:
GET _cat/shards?v
Note: The output of the preceding API operation shows whether shard sizes vary across indices.
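To compare shard sizes for each index, you can group the rows from `GET _cat/shards?format=json&bytes=b` by index. The following Python sketch assumes that the `bytes=b` parameter is set so that the `store` column contains whole bytes. The function name and the sample index names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def primary_shard_sizes_by_index(
    rows: List[Dict[str, Optional[str]]]
) -> Dict[str, Dict[str, int]]:
    """Return the smallest and largest primary shard size (bytes) per index."""
    sizes = defaultdict(list)
    for row in rows:
        # Count primaries only ("p") so that replicas aren't counted twice.
        # Unassigned shards have no store size and are skipped.
        if row.get("prirep") == "p" and row.get("store") is not None:
            sizes[row["index"]].append(int(row["store"]))
    return {
        index: {"min_bytes": min(values), "max_bytes": max(values)}
        for index, values in sizes.items()
    }


# Hypothetical sample shaped like GET _cat/shards?format=json&bytes=b output.
sample_shards = [
    {"index": "logs-a", "shard": "0", "prirep": "p", "store": "53687091200"},
    {"index": "logs-a", "shard": "0", "prirep": "r", "store": "53687091200"},
    {"index": "logs-a", "shard": "1", "prirep": "p", "store": "52613349376"},
    {"index": "logs-b", "shard": "0", "prirep": "p", "store": "1073741824"},
    {"index": "logs-b", "shard": "1", "prirep": "p", "store": None},
]

for index, stats in primary_shard_sizes_by_index(sample_shards).items():
    print(index, stats)
```

A large gap between indices, such as shards of about 50 GB in one index and about 1 GB in another, is the kind of variation that leads to skewed disk usage even when the shard count per node is equal.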
To check the sharding strategy for indices, run the following API operation:
GET _cat/indices?v
Check that shards are of equal size across the indices
If index sizes vary significantly, then use the rollover API to create a new index when your index reaches a specified size. For more information, see Roll over to a new index on the Elastic website. Or, for OpenSearch Service versions 7.1 and later, use Index State Management (ISM) to create the new index. For more information about how to use ISM to roll over an alias, see rollover on the Open Distro website.
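As an illustration, a size-based rollover request targets a write alias and includes a `max_size` condition. The following Python sketch only builds the request and doesn't send it. The alias name `logs-write` and the helper name are hypothetical, and the 50 GB threshold is an example value rather than a recommendation for your workload.

```python
from typing import Any, Dict, Tuple


def build_rollover_request(alias: str, max_size: str) -> Tuple[str, str, Dict[str, Any]]:
    """Build the method, path, and body for a size-based rollover request."""
    # The alias must already point to a write index for the rollover to work.
    body = {"conditions": {"max_size": max_size}}
    return "POST", f"/{alias}/_rollover", body


method, path, body = build_rollover_request("logs-write", "50gb")
print(method, path, body)
```

You can send the resulting request with any HTTP client that signs or authenticates requests for your domain.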
Use appropriate shard sizes for your instance
If you use large Amazon Elastic Compute Cloud (Amazon EC2) instance types, then use the Petabyte scale for OpenSearch Service guidance to determine the maximum shard size for your instance. For example, an OpenSearch Service domain with several i3.16xlarge.search instances supports shard sizes of up to 100 GB because there are more resources available. For most instances, keep shard sizes between 10 GB and 50 GB. For more information about sharding strategy, see Choosing the number of shards.
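To translate a target shard size into a primary shard count, you can use a rough estimate. The following Python sketch assumes the rule of thumb (source data + room to grow) × (1 + indexing overhead) / desired shard size, with an assumed indexing overhead of 10 percent. The function name and the input values are hypothetical. Verify the result against your own workload.

```python
import math


def approximate_primary_shards(
    source_gb: float,
    growth_gb: float,
    target_shard_gb: float,
    indexing_overhead: float = 0.10,  # assumed default, adjust for your data
) -> int:
    """Estimate the number of primary shards for a target shard size."""
    if target_shard_gb <= 0:
        raise ValueError("target_shard_gb must be positive")
    total_gb = (source_gb + growth_gb) * (1 + indexing_overhead)
    # Round up so that no shard exceeds the target size.
    return math.ceil(total_gb / target_shard_gb)


# Hypothetical inputs: 66 GB today, 198 GB of expected growth, 30 GB shards.
print(approximate_primary_shards(66, 198, 30))
```

A count that is a multiple of the number of data nodes is easier for the cluster to distribute evenly.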
Add more data nodes to your OpenSearch Service cluster
If your OpenSearch Service cluster has high disk usage levels, then add more data nodes to your cluster. The addition of data nodes also adds more resources to improve cluster performance.
Note: OpenSearch Service doesn't automatically rebalance the cluster when the cluster lacks sufficient storage space. As a result, if a data node runs out of unused storage space, then the cluster blocks writes. For more information about disk space management, see How do I troubleshoot low storage space in my OpenSearch Service domain?
Update your sharding strategy
By default, OpenSearch Service has a sharding strategy of 5:1 and divides each index into five primary shards. Within each index, each primary shard also has a replica. OpenSearch Service automatically assigns primary shards and replica shards to separate data nodes and provides a backup in case of failure.
To modify the default behavior of OpenSearch Service, design your indices so that OpenSearch Service equally distributes your shards by size.
For existing indices, use the reindex API operation to change the number of primary shards. For more information, see Reindex documents on the Elastic website. The reindex API can merge smaller indices into a bigger index or split larger indices into smaller ones. When you split a large index into more primary shards, the shard sizes decrease.
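Because you set the number of primary shards when you create an index, a typical reindex flow first creates the destination index with the new shard count and then copies the documents. The following Python sketch builds both requests without sending them. The index names and the helper name are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Request = Tuple[str, str, Dict[str, Any]]


def build_reshard_requests(
    source_index: str, dest_index: str, primary_shards: int
) -> List[Request]:
    """Build the requests that copy an index into one with a new shard count."""
    # Step 1: create the destination index with the desired primary shards.
    create_dest: Request = (
        "PUT",
        f"/{dest_index}",
        {"settings": {"index": {"number_of_shards": primary_shards}}},
    )
    # Step 2: copy all documents from the source into the destination.
    copy_docs: Request = (
        "POST",
        "/_reindex",
        {"source": {"index": source_index}, "dest": {"index": dest_index}},
    )
    return [create_dest, copy_docs]


for request in build_reshard_requests("logs-old", "logs-new", 10):
    print(request)
```

Make sure that the cluster has enough free disk space for both copies of the data until you delete the source index.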
For new indices, use the index template API to define the number of primary and replica shards for your strategy. For more information, see Create or update index template on the Elastic website.
Then, update the index settings for your shards. For more information, see Update index settings on the Elastic website.
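As an illustration of the index template step, the following Python sketch builds a template body that sets the primary and replica shard counts for new indices that match a pattern. It assumes the composable template layout (`PUT _index_template/<name>`). Older Elasticsearch versions use a legacy template endpoint with a different layout, so check which one your domain version supports. The pattern, shard counts, and helper name are hypothetical.

```python
from typing import Any, Dict, List


def build_index_template(
    patterns: List[str], primary_shards: int, replicas: int
) -> Dict[str, Any]:
    """Build a composable index template body with explicit shard counts."""
    return {
        "index_patterns": list(patterns),
        "template": {
            "settings": {
                "index": {
                    "number_of_shards": primary_shards,
                    "number_of_replicas": replicas,
                }
            }
        },
    }


# Hypothetical example: new indices that match logs-* get 3 primaries, 1 replica.
print(build_index_template(["logs-*"], 3, 1))
```

The template applies only to indices that are created after you save it. Existing indices keep their current number of primary shards.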
Delete old or unused indices to free up disk space
Note: OpenSearch Service supports ISM for OpenSearch and for Elasticsearch versions 6.8 and later.
Use ISM to define custom management policies that delete old or unused indices after a duration that you specify.
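A minimal ISM policy for this purpose has two states: an initial state that waits until the index reaches a minimum age, and a final state that deletes the index. The following Python sketch builds such a policy body. The state names, the 30-day age, and the helper name are hypothetical. On OpenSearch domains, you create the policy with `PUT _plugins/_ism/policies/<policy_id>`. Elasticsearch domains use the `_opendistro/_ism` path instead.

```python
from typing import Any, Dict


def build_delete_policy(min_index_age: str) -> Dict[str, Any]:
    """Build an ISM policy body that deletes indices older than min_index_age."""
    return {
        "policy": {
            "description": f"Delete indices older than {min_index_age}",
            "default_state": "active",
            "states": [
                {
                    # Indices stay here until they reach the minimum age.
                    "name": "active",
                    "actions": [],
                    "transitions": [
                        {
                            "state_name": "delete",
                            "conditions": {"min_index_age": min_index_age},
                        }
                    ],
                },
                {
                    # Final state: remove the index and free its disk space.
                    "name": "delete",
                    "actions": [{"delete": {}}],
                    "transitions": [],
                },
            ],
        }
    }


print(build_delete_policy("30d"))
```

After you create the policy, attach it to the indices that you want to manage. The policy doesn't affect indices until you attach it.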
Related information
Get started with Amazon Elasticsearch Service: How many shards do I need?