The memory utilization behavior you're observing with Amazon OpenSearch Service is actually expected and not necessarily a cause for concern. Let me address your questions:
- Data loss conditions: Data loss in OpenSearch typically doesn't occur simply due to high memory or CPU usage. The data loss you experienced with t3.small and t3.medium instances was likely due to these being burstable instance types that aren't recommended for production workloads. These smaller instances have limited resources that can be exhausted quickly, potentially leading to cluster instability.
- Young GC Time: Increasing Young Garbage Collection time is normal as the JVM manages memory. While excessive GC can impact performance, it doesn't directly cause data loss. As you observed, larger instances show slower growth in GC time, which is expected.
- Memory utilization at 90%: High memory utilization in OpenSearch is by design. OpenSearch is built to use available memory, with a significant portion allocated to the JVM heap for caching and query processing. The 90% utilization you're seeing is most likely the JVM holding its allocated memory, which is normal even without data. On its own, this doesn't pose a risk to data persistence as long as the instances are appropriately sized.
- High memory utilization without data: OpenSearch reserves memory for its own operations, including the JVM heap, the field data cache, and system processes. That reservation happens even when the domain has no indexes or documents, which is why you see high utilization on an empty cluster.
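To make the memory point concrete, here is a rough sketch of how the heap reservation scales with instance size. It assumes the sizing rule from the OpenSearch Service documentation (half of the instance RAM goes to the JVM heap, capped at 32 GiB). The function name and the RAM figures in the table are illustrative assumptions, so check the current instance specifications before relying on them.

```python
def estimated_heap_gib(instance_ram_gib: float, heap_cap_gib: float = 32.0) -> float:
    """Estimate the JVM heap size: half of instance RAM, capped at heap_cap_gib."""
    return min(instance_ram_gib / 2, heap_cap_gib)


# Assumed RAM per instance type in GiB (illustrative values, verify before use).
assumed_ram_gib = {
    "t3.small.search": 2,
    "t3.medium.search": 4,
    "m7g.large.search": 8,
    "r7g.large.search": 16,
}

for instance_type, ram in assumed_ram_gib.items():
    heap = estimated_heap_gib(ram)
    # Whatever is not heap is shared by the OS, system processes, and file system cache.
    print(f"{instance_type}: ~{heap:g} GiB heap, ~{ram - heap:g} GiB for everything else")
```

On the smallest instances the memory left outside the heap is very small, which is consistent with an empty domain already showing high utilization.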
For your use case, I would recommend:
- Consider using r-type instances (like r6g or r7g), which are memory-optimized and better suited for OpenSearch workloads
- Ensure you have at least 3 data nodes for production environments to provide redundancy
- Enable dedicated master nodes for clusters that will handle significant workloads
- Monitor performance metrics beyond just memory utilization, such as cluster health, indexing rate, and search latency
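As a minimal sketch of the last recommendation, the snippet below checks a few metrics together instead of looking at memory utilization alone. The `DomainSnapshot` class, its field names, and the thresholds are all hypothetical and chosen for illustration. In practice you would fill the values from your monitoring tool and tune the limits for your workload.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DomainSnapshot:
    """One reading of a few domain-level metrics (hypothetical field names)."""
    cluster_status: str         # "green", "yellow", or "red"
    jvm_memory_pressure: float  # percent of JVM heap in use
    search_latency_ms: float    # average search latency in milliseconds


def find_warnings(
    snap: DomainSnapshot,
    max_jvm_pressure: float = 85.0,        # assumed threshold, tune for your workload
    max_search_latency_ms: float = 500.0,  # assumed threshold, tune for your workload
) -> List[str]:
    """Return a human-readable warning for each metric outside its limit."""
    warnings = []
    if snap.cluster_status != "green":
        warnings.append(f"cluster status is {snap.cluster_status}")
    if snap.jvm_memory_pressure > max_jvm_pressure:
        warnings.append(f"JVM memory pressure is {snap.jvm_memory_pressure:.0f}%")
    if snap.search_latency_ms > max_search_latency_ms:
        warnings.append(f"search latency is {snap.search_latency_ms:.0f} ms")
    return warnings


# Example: a domain with a healthy cluster status but a heavily used heap.
print(find_warnings(DomainSnapshot("green", 92.0, 120.0)))
```

The point of combining the checks is that a high memory reading with a green cluster and normal latency is usually not a problem, while the same reading with a yellow or red cluster deserves attention.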
The m7g instances you've been testing are general-purpose instances powered by AWS Graviton processors. They can work for OpenSearch, but r-type instances provide more memory per vCPU and may be a better fit for your production environment, which helps with cluster stability and performance.
