Are notifications sent for maintenance jobs, and is downtime expected during DocumentDB maintenance?

Hello, I am running Amazon DocumentDB (engine version 5.0.0) in a production account.

Around 4:57 a.m. (KST) on April 24, during the DB cluster's maintenance window, our service's server alerting notified us that the connection to DocumentDB had been cut off for about 10 seconds.

Related application log:

2024-04-24T04:59:14.252+09:00 WARN 1 --- o.s.b.a.data.mongo.MongoHealthIndicator : MongoDB health check failed
org.springframework.dao.DataAccessResourceFailureException: Prematurely reached end of stream at

The DocumentDB monitoring metrics showed the following changes at the same time:

    • VolumeReadIOPs increased (0 -> 1789)
    • ReadThroughput increased
    • ReadLatency increased
    • IndexBufferCacheHitRatio decreased (100 -> 99.9)
    • EngineUptime decreased (6.5M -> 785)

There was no notification in CloudTrail, on the AWS Health Dashboard, or by e-mail, so I retrieved the event history through the DocumentDB API.
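For reference, a minimal sketch of how such an event history could be retrieved with boto3. The helper name `fetch_events` is hypothetical, and `client` is assumed to be a boto3 DocumentDB client (`boto3.client("docdb")`); the sketch relies on the `describe_events` call and its `Marker` pagination.

```python
from datetime import datetime, timezone


def fetch_events(client, start, end, source_type=None, source_id=None):
    """Collect all events between start and end, following pagination markers.

    `client` is assumed to be boto3.client("docdb"). If `source_id` is given,
    `source_type` (e.g. "db-cluster" or "db-instance") must be given as well.
    """
    params = {"StartTime": start, "EndTime": end}
    if source_type:
        params["SourceType"] = source_type
    if source_id:
        params["SourceIdentifier"] = source_id

    events = []
    while True:
        page = client.describe_events(**params)
        events.extend(page.get("Events", []))
        marker = page.get("Marker")
        if not marker:
            return events
        # More results are available: request the next page.
        params["Marker"] = marker


# Usage (requires boto3 and AWS credentials; region taken from the ARNs below):
#   import boto3
#   client = boto3.client("docdb", region_name="ap-northeast-2")
#   window_start = datetime(2024, 4, 23, 19, 50, tzinfo=timezone.utc)
#   window_end = datetime(2024, 4, 23, 20, 10, tzinfo=timezone.utc)
#   for event in fetch_events(client, window_start, window_end):
#       print(event["Date"], event["SourceIdentifier"], event["Message"])
```

The events recorded for the maintenance window are listed next.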

{
    "SourceIdentifier": "example-documentdb",
    "SourceType": "db-cluster",
    "Message": "Database cluster engine version upgrade started.",
    "EventCategories": [
        "maintenance"
    ],
    "Date": "2024-04-23T19:58:31.292000+00:00",
    "SourceArn": "arn:aws:rds:ap-northeast-2:01234567890:cluster:example-documentdb"
}
{
    "SourceIdentifier": "example-documentdb2",
    "SourceType": "db-instance",
    "Message": "DB instance shutdown",
    "EventCategories": [
        "availability"
    ],
    "Date": "2024-04-23T19:59:13.484000+00:00",
    "SourceArn": "arn:aws:rds:ap-northeast-2:01234567890:db:example-documentdb2"
}
{
    "SourceIdentifier": "example-documentdb3",
    "SourceType": "db-instance",
    "Message": "Read replica has been disconnected from master. Restarting database.",
    "EventCategories": [],
    "Date": "2024-04-23T19:59:20.830000+00:00",
    "SourceArn": "arn:aws:rds:ap-northeast-2:01234567890:db:example-documentdb3"
}
{
    "SourceIdentifier": "example-documentdb",
    "SourceType": "db-instance",
    "Message": "Read replica has been disconnected from master. Restarting database.",
    "EventCategories": [],
    "Date": "2024-04-23T19:59:22.047000+00:00",
    "SourceArn": "arn:aws:rds:ap-northeast-2:01234567890:db:example-documentdb"
}
{
    "SourceIdentifier": "example-documentdb2",
    "SourceType": "db-instance",
    "Message": "DB instance restarted",
    "EventCategories": [
        "availability"
    ],
    "Date": "2024-04-23T19:59:25.460000+00:00",
    "SourceArn": "arn:aws:rds:ap-northeast-2:01234567890:db:example-documentdb2"
}
{
    "SourceIdentifier": "example-documentdb3",
    "SourceType": "db-instance",
    "Message": "DB instance restarted",
    "EventCategories": [
        "availability"
    ],
    "Date": "2024-04-23T19:59:33.382000+00:00",
    "SourceArn": "arn:aws:rds:ap-northeast-2:01234567890:db:example-documentdb3"
}
{
    "SourceIdentifier": "example-documentdb",
    "SourceType": "db-instance",
    "Message": "DB instance restarted",
    "EventCategories": [
        "availability"
    ],
    "Date": "2024-04-23T19:59:33.986000+00:00",
    "SourceArn": "arn:aws:rds:ap-northeast-2:01234567890:db:example-documentdb"
}
{
    "SourceIdentifier": "example-documentdb",
    "SourceType": "db-cluster",
    "Message": "Database cluster engine version has been upgraded.",
    "EventCategories": [
        "maintenance"
    ],
    "Date": "2024-04-23T20:00:36.305000+00:00",
    "SourceArn": "arn:aws:rds:ap-northeast-2:01234567890:cluster:example-documentdb"
}
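
To line these events up with the application log, the UTC timestamps can be converted to KST and the gap between the shutdown and restart events measured. The following is a small sketch using only the standard library; the helper names (`to_kst`, `outage_seconds`) are hypothetical, and the timestamps are copied from the event history above and from the health check log line.

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))

# (instance, message, UTC timestamp) copied from the event history above.
EVENTS = [
    ("example-documentdb2", "DB instance shutdown", "2024-04-23T19:59:13.484000+00:00"),
    ("example-documentdb2", "DB instance restarted", "2024-04-23T19:59:25.460000+00:00"),
]

# Timestamp of the "MongoDB health check failed" application log line.
HEALTH_CHECK_FAILURE = "2024-04-24T04:59:14.252+09:00"


def to_kst(stamp):
    """Parse an ISO 8601 timestamp and express it in KST (UTC+9)."""
    return datetime.fromisoformat(stamp).astimezone(KST)


def outage_seconds(events, instance):
    """Seconds between the shutdown and restart events of one instance."""
    times = {msg: datetime.fromisoformat(ts) for name, msg, ts in events if name == instance}
    return (times["DB instance restarted"] - times["DB instance shutdown"]).total_seconds()


shutdown = to_kst(EVENTS[0][2])
restart = to_kst(EVENTS[1][2])
failure = datetime.fromisoformat(HEALTH_CHECK_FAILURE)

print(shutdown.isoformat())   # 2024-04-24T04:59:13.484000+09:00
print(outage_seconds(EVENTS, "example-documentdb2"))  # 11.976
print(shutdown <= failure <= restart)  # True
```

Under these assumptions, the health check failure falls inside the roughly 12-second window between the shutdown and restart of example-documentdb2, which is consistent with the connection loss of about 10 seconds that we observed.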

I checked DocumentDB databases of the same engine version in another account. They had a pending maintenance action called "Bugfix".

When I applied the "Bugfix" action immediately on those databases, a similar event history and similar monitoring metrics were recorded. Those DBs were not clustered, and they had no downtime. There was also no "Read replica has been disconnected from master. Restarting database." message.
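
For reference, a minimal sketch of how pending maintenance actions such as "Bugfix" could be listed ahead of the maintenance window with boto3. The helper name `list_pending_actions` is hypothetical, and `client` is assumed to be a boto3 DocumentDB client (`boto3.client("docdb")`); the sketch relies on the `describe_pending_maintenance_actions` call.

```python
def list_pending_actions(client):
    """Return one dict per pending maintenance action across all resources.

    `client` is assumed to be boto3.client("docdb").
    """
    rows = []
    params = {}
    while True:
        page = client.describe_pending_maintenance_actions(**params)
        for resource in page.get("PendingMaintenanceActions", []):
            for detail in resource.get("PendingMaintenanceActionDetails", []):
                rows.append({
                    "resource": resource.get("ResourceIdentifier"),
                    "action": detail.get("Action"),
                    "description": detail.get("Description"),
                    # Date after which the action may be applied automatically, if set.
                    "auto_applied_after": detail.get("AutoAppliedAfterDate"),
                })
        marker = page.get("Marker")
        if not marker:
            return rows
        # More results are available: request the next page.
        params["Marker"] = marker


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   for row in list_pending_actions(boto3.client("docdb")):
#       print(row)
```

Running something like this on a schedule would be one way to see a pending action before it is applied in the maintenance window; whether it covers every kind of maintenance is not something I have confirmed.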

My questions

  1. Is the "Bugfix" operation the cause of the application connection failure?
  2. Why didn't I receive a notification for a job like "Bugfix"?
  3. What kind of setup is needed to receive notifications for this kind of maintenance?
hailey
asked a month ago (39 views)