Don't Let CloudFormation Drop Your Tables: A Complete Guide to DeletionPolicy and UpdateReplacePolicy

/images/blog/posts/cloudformation-deletion-policy.png

One wrong CloudFormation update can delete your production database. Learn how to use DeletionPolicy, UpdateReplacePolicy, and Stack Policies to protect your stateful resources from accidental destruction.

The Nightmare Scenario

Picture this: You’re updating your CloudFormation stack to change the key schema of your DynamoDB table. You’ve updated this stack dozens of times. You run aws cloudformation update-stack, grab a coffee, and return to find that your table, along with 18 months of production data, has been deleted and replaced with a shiny new empty table.

This isn’t hypothetical. It happens regularly to teams who don’t understand how CloudFormation handles resource updates. Some changes trigger replacement, not modification, and without proper protection, your data vanishes.

This guide covers everything you need to prevent this disaster: DeletionPolicy, UpdateReplacePolicy, Stack Policies, and Termination Protection.


Understanding Resource Replacement

Before diving into protection mechanisms, you need to understand why CloudFormation deletes resources in the first place.

The Three Update Behaviors

When you modify a resource in your CloudFormation template, one of three things happens:

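Update Type | What Happens | Data Impact
No Interruption | Resource updated in place | Data preserved
Some Interruption | Resource updated in place with a brief service disruption | Data preserved
Replacement | New resource created, then old resource deleted | Data in the old resource lost unless it is retained or snapshotted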

What Triggers Replacement?

Each AWS resource has properties that, when changed, force replacement. Common examples:

DynamoDB Tables:

  • Changing the primary key (partition/sort key)
  • Changing TableName

RDS Instances:

  • Changing DBInstanceIdentifier
  • Changing Engine
  • Changing AvailabilityZone (single-AZ deployments)

S3 Buckets:

  • Changing BucketName

EC2 Instances:

  • Changing InstanceType (instance store-backed instances; EBS-backed instances are only interrupted)
  • Changing AvailabilityZone

Each property's "Update requires" entry in the resource reference documents this behavior, but the most reliable check for a specific update is to create a Change Set first:

aws cloudformation create-change-set \
  --stack-name my-stack \
  --template-body file://template.yaml \
  --change-set-name my-change-set

aws cloudformation describe-change-set \
  --change-set-name my-change-set \
  --stack-name my-stack

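If you see "Replacement": "True" for a resource, CloudFormation will create a new resource and delete the old one. A value of "Conditional" means the outcome depends on values that are only known when the update runs, so treat it as a possible replacement.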
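Scanning the full describe-change-set output by eye is error-prone on large stacks. A minimal sketch, assuming the stack and change set names used above, filters the output down to the logical IDs that would or might be replaced. The function name list_replacements is hypothetical; the JMESPath filter relies on Replacement being reported as the strings "True", "False", or "Conditional".

```shell
# list_replacements STACK CHANGE_SET
# Waits for the change set to finish creating, then prints the logical IDs
# of resources it would replace ("True") or might replace ("Conditional").
list_replacements() {
  aws cloudformation wait change-set-create-complete \
    --stack-name "$1" --change-set-name "$2" &&
  aws cloudformation describe-change-set \
    --stack-name "$1" --change-set-name "$2" \
    --query "Changes[?ResourceChange.Replacement=='True' || ResourceChange.Replacement=='Conditional'].ResourceChange.LogicalResourceId" \
    --output text
}

# Usage:
# list_replacements my-stack my-change-set
```

An empty result suggests the change set replaces nothing; any name it prints deserves a closer look before you execute the change set.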


DeletionPolicy: Your First Line of Defense

The DeletionPolicy attribute tells CloudFormation what to do with a resource when:

  1. The stack is deleted
  2. The resource is removed from the template

Available Options

Policy | Behavior | Use Case
Delete | Resource is deleted (default) | Non-critical, recreatable resources
Retain | Resource is preserved, but orphaned | Critical data that must survive stack deletion
Snapshot | Snapshot created before deletion | Databases, EBS volumes
RetainExceptOnCreate | Retain unless create fails | Recommended for most stateful resources

The RetainExceptOnCreate Game-Changer

Introduced in July 2023, RetainExceptOnCreate solves a long-standing problem with Retain: what happens when the initial resource creation fails?

With Retain, if your stack creation fails and rolls back, you’re left with an orphaned, empty resource. With RetainExceptOnCreate:

  • If stack creation fails → resource is deleted (it was empty anyway)
  • If stack deletion or update happens → resource is retained

This should be your default for stateful resources.

Examples by Resource Type

DynamoDB Table

Resources:
  UsersTable:
    Type: AWS::DynamoDB::Table
    DeletionPolicy: RetainExceptOnCreate
    Properties:
      TableName: users-prod
      AttributeDefinitions:
        - AttributeName: userId
          AttributeType: S
      KeySchema:
        - AttributeName: userId
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST

RDS Instance (with Snapshot)

Resources:
  ProductionDatabase:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot
    Properties:
      DBInstanceIdentifier: prod-mysql
      Engine: mysql
      DBInstanceClass: db.r5.large
      AllocatedStorage: 100
      MasterUsername: admin
      MasterUserPassword: !Ref DBPassword

When deleted, CloudFormation creates a final snapshot named something like my-stack-ProductionDatabase-XXXXXXXXXXXX.

S3 Bucket

Resources:
  DataBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
    Properties:
      BucketName: company-data-prod

Important: S3 buckets can only be deleted if empty. If you use Delete policy on a non-empty bucket, the stack deletion will fail.


UpdateReplacePolicy: Protection During Updates

Here’s the critical insight: DeletionPolicy doesn’t protect against replacement during stack updates.

If CloudFormation needs to replace a resource (delete old, create new), DeletionPolicy doesn’t apply—the resource is being replaced, not deleted.

That’s where UpdateReplacePolicy comes in.

How It Works

Policy | During Replacement
Delete | Old resource deleted (default)
Retain | Old resource preserved
Snapshot | Snapshot created before deletion

The Complete Protection Pattern

For critical stateful resources, use both policies:

Resources:
  CriticalDatabase:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot
    UpdateReplacePolicy: Snapshot
    Properties:
      DBInstanceIdentifier: critical-db
      Engine: postgres
      DBInstanceClass: db.r5.xlarge
      AllocatedStorage: 500

This ensures:

  • Stack deletion → Final snapshot created
  • Resource replacement → Snapshot created before old instance deleted

DynamoDB Complete Example

Resources:
  OrdersTable:
    Type: AWS::DynamoDB::Table
    DeletionPolicy: RetainExceptOnCreate
    UpdateReplacePolicy: Retain
    Properties:
      TableName: orders-production
      AttributeDefinitions:
        - AttributeName: orderId
          AttributeType: S
        - AttributeName: customerId
          AttributeType: S
      KeySchema:
        - AttributeName: orderId
          KeyType: HASH
      GlobalSecondaryIndexes:
        - IndexName: CustomerIndex
          KeySchema:
            - AttributeName: customerId
              KeyType: HASH
          Projection:
            ProjectionType: ALL
      BillingMode: PAY_PER_REQUEST
      PointInTimeRecoverySpecification:
        PointInTimeRecoveryEnabled: true

Stack Policies: Prevent Accidental Updates

DeletionPolicy and UpdateReplacePolicy protect your data if something goes wrong. Stack Policies prevent wrong things from happening in the first place.

A Stack Policy is a JSON document that defines which resources can be updated and how.

Default Behavior (No Stack Policy)

Without a stack policy, all resources can be updated or replaced by anyone with permission to update the stack.

Creating a Protective Stack Policy

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "Update:*",
      "Principal": "*",
      "Resource": "*"
    },
    {
      "Effect": "Deny",
      "Action": "Update:Replace",
      "Principal": "*",
      "Resource": "LogicalResourceId/ProductionDatabase"
    },
    {
      "Effect": "Deny",
      "Action": "Update:Delete",
      "Principal": "*",
      "Resource": "LogicalResourceId/ProductionDatabase"
    }
  ]
}

This policy:

  • Allows all updates by default
  • Denies any update that would replace or delete the ProductionDatabase resource

Applying a Stack Policy

aws cloudformation set-stack-policy \
  --stack-name my-stack \
  --stack-policy-body file://stack-policy.json
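To confirm which policy is attached, you can read it back with get-stack-policy. This is a small sketch; the wrapper name show_stack_policy is hypothetical.

```shell
# show_stack_policy STACK: print the stack policy currently attached.
show_stack_policy() {
  aws cloudformation get-stack-policy \
    --stack-name "$1" \
    --query StackPolicyBody \
    --output text
}

# Usage:
# show_stack_policy my-stack
```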

Temporarily Overriding Stack Policy

When you legitimately need to replace a protected resource:

aws cloudformation update-stack \
  --stack-name my-stack \
  --template-body file://template.yaml \
  --stack-policy-during-update-body file://temporary-policy.json
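The contents of temporary-policy.json are not shown above. A minimal sketch, assuming you want to lift every restriction for this one update, writes a policy that allows all update actions on all resources. The override applies only to the update that passes it; the stack policy set earlier remains attached afterwards. In practice you may prefer to allow only the specific resource you intend to replace.

```shell
# Write a one-off override policy that allows every update action on every
# resource. It is used only by the update-stack call that references it.
cat > temporary-policy.json <<'EOF'
{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "Update:*",
      "Principal": "*",
      "Resource": "*"
    }
  ]
}
EOF
```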

Stack Termination Protection

The final safety net: prevent the entire stack from being deleted.

Enable Termination Protection

# On new stack
aws cloudformation create-stack \
  --stack-name my-stack \
  --template-body file://template.yaml \
  --enable-termination-protection

# On existing stack
aws cloudformation update-termination-protection \
  --stack-name my-stack \
  --enable-termination-protection

When Someone Tries to Delete

An error occurred (ValidationError) when calling the DeleteStack operation:
Stack [my-stack] cannot be deleted while TerminationProtection is enabled

They must explicitly disable protection first, a deliberate step that is hard to take by accident.
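To check whether a stack is currently protected, describe-stacks reports the setting in its EnableTerminationProtection field. A short sketch, with a hypothetical wrapper name:

```shell
# termination_protection STACK: print whether termination protection is on.
termination_protection() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].EnableTerminationProtection" \
    --output text
}

# Usage:
# termination_protection my-stack
```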


Environment-Specific Policies

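You probably want different protection levels for development and production. DeletionPolicy and UpdateReplacePolicy normally accept only literal strings; to set them with conditions or other intrinsic functions, the template must declare the AWS::LanguageExtensions transform:

Transform: AWS::LanguageExtensions

Parameters: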
  Environment:
    Type: String
    AllowedValues:
      - dev
      - staging
      - prod

Conditions:
  IsProduction: !Equals [!Ref Environment, prod]

Resources:
  Database:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: !If [IsProduction, Snapshot, Delete]
    UpdateReplacePolicy: !If [IsProduction, Snapshot, Delete]
    Properties:
      DBInstanceIdentifier: !Sub "${Environment}-database"
      # ... other properties

Now:

  • Production: Snapshots on deletion/replacement
  • Dev/Staging: Clean deletion (no orphaned resources)

The Complete Protection Checklist

For Every Stateful Resource

  1. Add DeletionPolicy

    • RetainExceptOnCreate for tables, buckets
    • Snapshot for RDS, EBS, ElastiCache, Redshift
  2. Add UpdateReplacePolicy

    • Use Retain or Snapshot (RetainExceptOnCreate is valid only for DeletionPolicy)
    • Protects against replacement during updates
  3. Enable Point-in-Time Recovery (where available)

    • DynamoDB: PointInTimeRecoverySpecification
    • RDS: automated backups (BackupRetentionPeriod greater than 0)

For Production Stacks

  1. Create Stack Policy

    • Deny Update:Replace and Update:Delete on critical resources
  2. Enable Termination Protection

    • Prevents accidental stack deletion
  3. Always Use Change Sets

    • Review changes before applying
    • Look for "Replacement": "True" warnings
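Parts of this checklist can be checked mechanically before a template is deployed. The sketch below is a rough lint, not a YAML parser: it assumes block-style YAML with 2-space indentation, as in the templates in this guide, and an illustrative list of stateful resource types that you would extend for your own stacks. The function name check_policies is hypothetical.

```shell
# check_policies TEMPLATE
# Lists stateful resources that lack DeletionPolicy or UpdateReplacePolicy.
# Assumes 2-space indentation: resource names at indent 2, attributes at 4.
check_policies() {
  awk '
    function report() {
      if (name != "" && stateful) {
        if (!has_del) print name ": missing DeletionPolicy"
        if (!has_upd) print name ": missing UpdateReplacePolicy"
      }
    }
    # A top-level key starts or ends the Resources section.
    /^[A-Za-z]/ { report(); name = ""; in_res = ($1 == "Resources:"); next }
    # A key at indent 2 inside Resources starts a new resource.
    in_res && /^  [A-Za-z0-9]+: *$/ {
      report()
      name = $1; sub(/:$/, "", name)
      stateful = has_del = has_upd = 0
      next
    }
    in_res && /^    Type:/ {
      if ($2 ~ /^AWS::(DynamoDB::Table|RDS::DBInstance|RDS::DBCluster|S3::Bucket|EC2::Volume)$/)
        stateful = 1
    }
    in_res && /^    DeletionPolicy:/      { has_del = 1 }
    in_res && /^    UpdateReplacePolicy:/ { has_upd = 1 }
    END { report() }
  ' "$1"
}

# Usage:
# check_policies template.yaml
```

The function prints one line per missing attribute and nothing when every listed resource type carries both, which makes it easy to wire into a pre-deploy script.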

Example: Fully Protected Stack

AWSTemplateFormatVersion: '2010-09-09'
Description: Production stack with full protection

Parameters:
  Environment:
    Type: String
    Default: prod
  # Referenced by the Database resource below
  DBPassword:
    Type: String
    NoEcho: true

Resources:
  # DynamoDB with full protection
  UsersTable:
    Type: AWS::DynamoDB::Table
    DeletionPolicy: RetainExceptOnCreate
    UpdateReplacePolicy: Retain
    Properties:
      TableName: !Sub "${Environment}-users"
      AttributeDefinitions:
        - AttributeName: userId
          AttributeType: S
      KeySchema:
        - AttributeName: userId
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST
      PointInTimeRecoverySpecification:
        PointInTimeRecoveryEnabled: true

  # RDS with snapshot protection
  Database:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot
    UpdateReplacePolicy: Snapshot
    Properties:
      DBInstanceIdentifier: !Sub "${Environment}-db"
      Engine: postgres
      EngineVersion: '15.4'
      DBInstanceClass: db.r5.large
      AllocatedStorage: 100
      StorageEncrypted: true
      BackupRetentionPeriod: 30
      DeletionProtection: true
      MasterUsername: dbadmin  # RDS for PostgreSQL rejects "admin" as a reserved word
      MasterUserPassword: !Ref DBPassword

  # S3 bucket retained
  DataBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      BucketName: !Sub "${Environment}-data-${AWS::AccountId}"
      VersioningConfiguration:
        Status: Enabled

Recovery: What If It Already Happened?

If you’ve already lost data, your options depend on the resource type:

Resource | Recovery Options
DynamoDB | Point-in-time recovery (if enabled), S3 exports
RDS | Automated backups, manual snapshots, read replicas
S3 | Versioning (if enabled), cross-region replication
EBS | Snapshots

For future protection, enable these features before you need them.
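As one illustration of the RDS row, the sketch below finds the newest available snapshot for an instance identifier and restores it into a new instance. The function names and the restored instance identifier are hypothetical. The restored instance is created outside your stack, so you would still need to import it or point your applications at it.

```shell
# latest_snapshot INSTANCE_ID
# Prints the identifier of the newest available snapshot for the instance.
latest_snapshot() {
  aws rds describe-db-snapshots \
    --db-instance-identifier "$1" \
    --query "sort_by(DBSnapshots[?Status=='available'], &SnapshotCreateTime)[-1].DBSnapshotIdentifier" \
    --output text
}

# restore_from_latest OLD_INSTANCE_ID NEW_INSTANCE_ID
# Restores the newest snapshot of OLD_INSTANCE_ID into NEW_INSTANCE_ID.
restore_from_latest() {
  snapshot_id=$(latest_snapshot "$1") || return 1
  # With --output text, the CLI prints "None" when the query matches nothing.
  if [ -z "$snapshot_id" ] || [ "$snapshot_id" = "None" ]; then
    echo "no available snapshot found for $1" >&2
    return 1
  fi
  aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier "$2" \
    --db-snapshot-identifier "$snapshot_id"
}

# Usage:
# restore_from_latest prod-mysql prod-mysql-restored
```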


Conclusion

CloudFormation is powerful but unforgiving. A single misplaced property change can trigger resource replacement, and without proper protection, your data disappears.

The minimum viable protection for any production stack:

  1. DeletionPolicy: RetainExceptOnCreate or Snapshot on stateful resources
  2. UpdateReplacePolicy: Retain or Snapshot on the same resources
  3. Stack Termination Protection enabled
  4. Always review Change Sets before updating

These four practices take minutes to implement and can save you from catastrophic data loss.

