Snapshot automation for Amazon Redshift Serverless

Content level: Expert

Automate taking snapshots of Amazon Redshift Serverless and retaining them for a longer duration

Short description

Redshift Serverless Snapshot Automation is a utility that helps Amazon Redshift Serverless users automate snapshots. It takes backups on a schedule (for example, every hour or every day) and lets you choose their retention period.

Architecture

[Figure: Redshift Serverless Snapshot Automation architecture]

Solution overview

The solution consists of an AWS Lambda function written in Python with Boto3, the AWS SDK for Python, which can access the Amazon Redshift Serverless API. The function takes snapshots of all namespaces listed in the AWS CloudFormation template according to the schedule and keeps them for the retention period. The namespaces, schedule, and retention period are all input parameters of the template.

Download the CloudFormation template and save it locally. For this post, I saved the file locally as snapshot.cft.

AWSTemplateFormatVersion: '2010-09-09'
Description: Lambda function to automate the snapshot of redshift serverless namespace.
Parameters:
  inputNamespace:
    Description: Enter a comma-separated list of Redshift Serverless namespaces for which snapshots need to be created
    Type: String
    Default: 'namespace-rs-64, namespace-rs-65'
  inputRetentionPeriod:
    Description: Enter the retention period for the snapshot in days
    Type: String
    Default: '10'
  inputScheduleExpression:
    Description: Add the schedule details in format rate(value unit) unit:minute | minutes | hour | hours | day | days
    Type: String
    Default: 'rate(2 minutes)'
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      -
        Label:
          default: "Input Parameters"
        Parameters:
        - inputNamespace
        - inputRetentionPeriod
        - inputScheduleExpression

Resources:
  redshiftServerlessSnapshotAutomationRole :
      Type: AWS::IAM::Role
      Properties:
        Description : IAM Role for lambda functions to access Redshift and create cloud watch logs
        ManagedPolicyArns:
          - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
        AssumeRolePolicyDocument:
            Version: 2012-10-17
            Statement:
              -
                Effect: Allow
                Principal:
                  Service:
                    - lambda.amazonaws.com
                Action:
                  - sts:AssumeRole
        Path: /
        Policies:
            -
              PolicyName: redshiftServerlessSnapshotAutomationPolicy
              PolicyDocument :
                Version: 2012-10-17
                Statement:
                  -
                    Effect: Allow
                    Action:
                    - redshift-serverless:createsnapshot
                    Resource: "*"

  redshiftServerlessSnapshotAutomation:
    DependsOn:
        - redshiftServerlessSnapshotAutomationRole
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.9
      Role: !GetAtt 'redshiftServerlessSnapshotAutomationRole.Arn'
      Handler: index.lambda_handler
      Code:
          ZipFile: |
            import json
            import logging
            import os
            import sys
            import time

            # Installs the latest boto3 into /tmp at cold start and puts it first
            # on sys.path, so it takes precedence over the SDK bundled with the runtime.
            from pip._internal import main
            main(['install', '-I', '-q', 'boto3', '--target', '/tmp/', '--no-cache-dir', '--disable-pip-version-check'])
            sys.path.insert(0,'/tmp/')
            import boto3
            from botocore.exceptions import ClientError

            logger = logging.getLogger(__name__)

            def lambda_handler(event, context):

                serverless_client = boto3.client('redshift-serverless')
                namespace = os.getenv("namespace", "")
                retention_period = int(os.getenv("retention_period"))

                # Split the comma-separated parameter and drop empty entries.
                namespace_list = [name.strip() for name in namespace.split(",") if name.strip()]

                statuses = []
                for i_namespace in namespace_list:
                    snapshot_timestamp = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime(time.time()))
                    snapshot_name = i_namespace + '-' + 'snapshot' + '-' + snapshot_timestamp
                    try:
                        serverless_client.create_snapshot(namespaceName=i_namespace,retentionPeriod=retention_period,snapshotName=snapshot_name)
                        status = 'Snapshot creation initiated for ' + i_namespace
                    except ClientError as e:
                        # Only botocore ClientError carries a response with an error code.
                        msg = e.response['Error']['Code']
                        if msg == 'ConflictException':
                            status = 'Snapshot creation is already in progress for ' + i_namespace
                            logger.error(status)
                        elif msg == 'ResourceNotFoundException':
                            status = i_namespace + ' namespace does not exist'
                            logger.error(status)
                        else:
                            raise
                    statuses.append(status)
                # Return one status message per namespace (empty if none were configured).
                return statuses
      Description: This function initiates the creation of snapshots
      MemorySize: 128
      Timeout: 180
      Environment:
        Variables:
          namespace: !Ref inputNamespace
          retention_period: !Ref inputRetentionPeriod
      TracingConfig:
        Mode: Active

  redshiftServerlessSnapshotAutomationScheduledRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "Schedule rule to invoke the lambda function to create the snapshot"
      ScheduleExpression: !Ref inputScheduleExpression
      State: "ENABLED"
      Targets: 
        - 
          Arn: 
            Fn::GetAtt: 
              - "redshiftServerlessSnapshotAutomation"
              - "Arn"
          Id: "redshiftServerlessSnapshotAutomation"
  PermissionForEventsToInvokeLambda: 
    Type: AWS::Lambda::Permission
    Properties: 
      FunctionName: !Ref redshiftServerlessSnapshotAutomation
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"
      SourceArn: 
        Fn::GetAtt: 
          - "redshiftServerlessSnapshotAutomationScheduledRule"
          - "Arn"
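
The Lambda handler in the template above splits the comma-separated namespace parameter and builds a timestamped snapshot name for each namespace. The following is a minimal sketch of those two steps as standalone functions, so they can be tried locally without AWS access. The function names parse_namespaces and build_snapshot_name are hypothetical and do not appear in the template, and the sketch assumes UTC timestamps, whereas the template uses the local time of the Lambda environment.

```python
import time


def parse_namespaces(raw):
    """Split a comma-separated string into namespace names, dropping blanks."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def build_snapshot_name(namespace, when=None):
    """Return '<namespace>-snapshot-<YYYY-MM-DD-HH-MM-SS>'.

    `when` is seconds since the epoch; None means the current time.
    Assumption: the timestamp is rendered in UTC.
    """
    stamp = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime(when))
    return f"{namespace}-snapshot-{stamp}"


if __name__ == "__main__":
    for name in parse_namespaces("namespace-rs-64, namespace-rs-65"):
        print(build_snapshot_name(name))
```

Because the timestamp has one-second resolution, two invocations for the same namespace within the same second would produce the same snapshot name.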

Deployment using AWS CloudFormation

This solution uses AWS CloudFormation to automatically provision all the required resources in your AWS account. Deploying the CloudFormation template above creates the following resources in your account:

  1. AWS Lambda function
  2. AWS IAM role
  3. Amazon EventBridge rule
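
If you prefer to script the deployment rather than use the console, the stack can also be created with Boto3. The following is a minimal sketch under a few assumptions: the template is saved as snapshot.cft, the stack name is a hypothetical placeholder, and the helper names build_parameters and deploy_stack are made up for illustration. Because the template creates an IAM role, the call has to acknowledge CAPABILITY_IAM.

```python
def build_parameters(namespaces, retention_days, schedule):
    """Build the CloudFormation parameter list for the template's three inputs."""
    return [
        {"ParameterKey": "inputNamespace", "ParameterValue": ", ".join(namespaces)},
        {"ParameterKey": "inputRetentionPeriod", "ParameterValue": str(retention_days)},
        {"ParameterKey": "inputScheduleExpression", "ParameterValue": schedule},
    ]


def deploy_stack(stack_name, template_path, parameters):
    """Create the stack from a local template file and return its stack ID."""
    import boto3  # imported lazily so build_parameters works without the SDK

    with open(template_path) as handle:
        template_body = handle.read()

    cloudformation = boto3.client("cloudformation")
    response = cloudformation.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
        # Required because the template creates an IAM role.
        Capabilities=["CAPABILITY_IAM"],
    )
    return response["StackId"]


# Example usage (needs AWS credentials; the stack name is a placeholder):
# params = build_parameters(["namespace-rs-64", "namespace-rs-65"], 10, "rate(1 day)")
# print(deploy_stack("redshift-serverless-snapshots", "snapshot.cft", params))
```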

Open the AWS CloudFormation service from the AWS Management Console and follow these steps:

  1. Choose "Create stack" (with new resources).

  2. Choose "Upload a template file" and upload the file you saved in the earlier step.

  3. Provide the namespace, retention period, and schedule information.

| Parameter | Description | Example |
| --- | --- | --- |
| inputNamespace | Comma-separated list of namespaces for which snapshots need to be created | "default" or "default, namespace1" or "namespace-rs-64, namespace-rs-65" |
| inputRetentionPeriod | Retention period for the snapshot, in days | 10 |
| inputScheduleExpression | Schedule in the format rate(value unit), where unit is minute, minutes, hour, hours, day, or days | rate(1 minute), rate(30 minutes), rate(12 hours), rate(1 day) |
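
A malformed schedule expression causes the stack deployment to fail, so it can help to check the value before you enter it. The following is a minimal sketch of such a check. The function name is_valid_rate_expression is hypothetical, and the check only covers the rate(value unit) form with the units listed above. It follows the EventBridge convention that a value of 1 takes the singular unit and larger values take the plural.

```python
import re

# Matches rate(<positive integer> <unit>) with exactly one space inside.
_RATE_RE = re.compile(r"^rate\((\d+) (minute|minutes|hour|hours|day|days)\)$")


def is_valid_rate_expression(expression):
    """Return True if the string looks like a valid rate(value unit) expression."""
    match = _RATE_RE.match(expression)
    if match is None:
        return False
    value, unit = int(match.group(1)), match.group(2)
    if value < 1:
        return False
    # Singular unit for a value of 1, plural unit for anything larger.
    return unit.endswith("s") == (value != 1)


if __name__ == "__main__":
    for candidate in ["rate(1 day)", "rate(12 hours)", "rate(1 minutes)", "12 hours"]:
        print(candidate, is_valid_rate_expression(candidate))
```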