RDS CloudFormation Stack Delete fails with Option Group in use


We are consistently getting the following error when deleting RDS CloudFormation stacks:

DELETE_FAILED The option group 'XXXXXXXXXX' cannot be deleted because it is in use. (Service: Rds, Status Code: 400, Request ID: XXXXXXXXXX)

The Option Group is not in use; it was associated with only one RDS instance and no manual snapshots. The instance was deleted at 09:23:27, almost two full minutes before the failed Option Group delete.

It happens even with manual deletions (not just CloudFormation stack deletions). I find that I must wait anywhere from 1 - 5 minutes after deleting the DB resource before I can successfully delete the Option Group.

After waiting a few minutes, a subsequent stack delete will remove the Option Group and the stack. I suspect some type of "race" condition is causing this error.
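The manual workaround described above (wait, then retry the delete) could be scripted roughly as follows. This is only a sketch: the function name and the timing defaults are made up for illustration, `rds_client` is assumed to behave like `boto3.client("rds")`, and the error code `InvalidOptionGroupStateFault` is an assumption about how the "in use" failure is reported.

```python
import time


def delete_option_group_with_retry(rds_client, option_group_name,
                                   max_wait_seconds=600, interval_seconds=30,
                                   sleep=time.sleep):
    """Retry delete_option_group while RDS still reports the group as in use.

    Returns the number of delete attempts that were made.
    """
    attempts = 0
    waited = 0
    while True:
        attempts += 1
        try:
            rds_client.delete_option_group(OptionGroupName=option_group_name)
            return attempts
        except Exception as exc:
            # botocore's ClientError exposes the service error in exc.response.
            error = getattr(exc, "response", {}).get("Error", {})
            # Assumption: the "in use" failure carries this error code.
            in_use = error.get("Code") == "InvalidOptionGroupStateFault"
            if not in_use or waited >= max_wait_seconds:
                raise  # a different error, or we have waited long enough
            sleep(interval_seconds)
            waited += interval_seconds


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# delete_option_group_with_retry(boto3.client("rds"), "my-option-group")
```

Any other error is re-raised immediately, so the loop only absorbs the delay seen after the instance is deleted.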

Any ideas on how to address this issue?

Thanks.

  • Thanks Jacco. I'm not crazy about the extra complexity of the wait condition; I think I will open a support request. I appreciate your help.

  • Hi LBC, did you ever get this resolved or find a solution? I am also experiencing the same thing!

2 Answers

Hello,

It is possible that the CloudFormation stack you are using does not include a proper dependency relationship between the RDS Instance and the Option Group.

Such a relation can be achieved in two ways:

  • implicitly: using a !Ref to the Option Group resource in the RDS Instance's OptionGroupName property
  • explicitly: adding a DependsOn attribute to the RDS Instance indicating that it depends on the Option Group
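To check a template for such a relation, the two rules above can be sketched in Python. This is a rough illustration, not CloudFormation's actual resolver: it assumes the template is already loaded as a dict in JSON form (where `!Ref X` appears as `{"Ref": "X"}`), the helper names `references` and `depends_on` are invented here, and it ignores `Fn::Sub`, which can also create an implicit dependency.

```python
def references(value, target):
    """Return True if a template value refers to `target` via Ref or Fn::GetAtt."""
    if isinstance(value, dict):
        if value.get("Ref") == target:
            return True
        get_att = value.get("Fn::GetAtt")
        # Fn::GetAtt may be written as ["Resource", "Attr"] or "Resource.Attr".
        if isinstance(get_att, list) and get_att and get_att[0] == target:
            return True
        if isinstance(get_att, str) and get_att.split(".")[0] == target:
            return True
        return any(references(v, target) for v in value.values())
    if isinstance(value, list):
        return any(references(v, target) for v in value)
    return False


def depends_on(template, resource, target):
    """Return True if `resource` depends on `target` explicitly or implicitly."""
    res = template["Resources"][resource]
    explicit = res.get("DependsOn", [])
    if isinstance(explicit, str):
        explicit = [explicit]
    return target in explicit or references(res.get("Properties", {}), target)


# Resource names borrowed from the template snippet in this thread.
template = {
    "Resources": {
        "ORADB": {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {"OptionGroupName": {"Ref": "ORAOptionGroup"}},
        },
        "ORAOptionGroup": {"Type": "AWS::RDS::OptionGroup", "Properties": {}},
    }
}
print(depends_on(template, "ORADB", "ORAOptionGroup"))  # True
print(depends_on(template, "ORAOptionGroup", "ORADB"))  # False
```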

If you do not have this dependency relation, CloudFormation does not "know" the resources are related and will create/destroy them in no guaranteed order.

Deleting an RDS Instance can take considerable time, and independent resources are deleted in parallel. That would explain why the Option Group delete is attempted before the RDS Instance is gone.

Also, this probably went unnoticed during creation because the Option Group was ready by the time the RDS Instance needed it.

It could in theory be something else but this is my best guess without more information.

Regards, Jacco

JaccoPK
answered a year ago
  • Jacco,

    We are using an implicit dependency in our template. Here is a snippet of our templates:

    Resources:
      ORADB:
        Type: 'AWS::RDS::DBInstance'
        Properties:
          DBInstanceIdentifier: !Ref DBInstanceName
          DBName: !Ref DBInstanceName
          ...
          AllocatedStorage: !Ref DBAllocatedStorage
          MaxAllocatedStorage: 16000
          DBParameterGroupName: !Ref ORADBParameterGroup
          OptionGroupName: !Ref ORAOptionGroup

    and further down in the template we have the definition for the option group:

      ORAOptionGroup:
        Type: AWS::RDS::OptionGroup
        Properties:
          OptionGroupName: !Ref DBInstanceName
          OptionConfigurations:
            - ...

    Thanks.

  • Ah, OK, then it must be a bug :-) The only suspicious thing I see in your code is that !Ref DBInstanceName is used in several places. It shouldn't matter here, but changing the parameter value would cause a disaster :-)

    Maybe it is worth looking into https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-waitcondition.html

    At least until they fix it.

    I would definitely raise a support ticket.


Hi There

This is likely happening because the option group is still associated with the instance's automated snapshots. When you delete the RDS instance, the automated snapshots are eventually cleaned up, but there's no documented time frame for when this happens. You can't manually delete automated snapshots. You could try changing the backup retention period to zero before deleting the stack; this should remove the automated snapshots immediately.

See https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_DeleteSnapshot.html

If you have automated DB snapshots that you want to delete without deleting the DB instance, change the backup retention period for the DB instance to 0. The automated snapshots are deleted when the change is applied. You can apply the change immediately if you don't want to wait until the next maintenance period.
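As a rough sketch, the retention change could be applied from a script before the stack delete is started. The function name below is hypothetical, `rds_client` is assumed to behave like `boto3.client("rds")`, and whether this actually clears the option group association in time is untested here.

```python
def disable_automated_backups(rds_client, db_instance_identifier):
    """Set the backup retention period to 0 and apply the change immediately.

    With retention at 0, RDS deletes the instance's automated snapshots
    when the modification is applied.
    """
    return rds_client.modify_db_instance(
        DBInstanceIdentifier=db_instance_identifier,
        BackupRetentionPeriod=0,
        ApplyImmediately=True,
    )


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# disable_automated_backups(boto3.client("rds"), "my-db-instance")
```

Note that if the stack is later updated rather than deleted, the template's own backup retention setting may be applied again.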

AWS
EXPERT
Matt-B
answered a year ago
