Cross Region AWS RDS Backup using SQS and Lambda

Published Jun 29, 2020

Having a disaster recovery (DR) environment for production is standard practice in today's IT world. In this AWS tutorial I will show you how to copy RDS snapshots across regions using SQS and Lambda for DR purposes. For this tutorial I am assuming that the DR environment lives in the same account but in a different region.
My RDS instances are in the us-east-1 region, and I will copy their snapshots to us-west-2, our DR region.

NOTE: Although the AWS Backup service also provides a backup solution for RDS, it does not support Aurora at the time of writing. Because of this constraint I had to implement an alternative approach that supports Aurora as well as the other RDS DB engines.

This tutorial is divided into the following sections:

  1. RDS setup
  2. CloudWatch Rule
  3. SQS Queue
  4. Lambda Function
  5. Lambda IAM Policy

RDS setup

I currently have three databases running in my account: two Aurora clusters and one MySQL instance. We will perform a cross-region backup of all three. All of them were created with standard settings.

image

CloudWatch Rule

AWS services emit events for many of the operations performed on them; RDS, for instance, emits an event when a snapshot is created. So I created a CloudWatch rule that is triggered whenever an RDS snapshot event is generated.

image

The rule above has the SQS queue from the next section registered as its target. I also had to write a custom event pattern, because AWS does not offer cluster and instance snapshot events under a single event type.

{
  "source": [
    "aws.rds"
  ],
  "detail-type": [
    "RDS DB Cluster Snapshot Event",
    "RDS DB Snapshot Event"
  ]
}
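
To see which events this pattern lets through, here is a minimal Python sketch of the matching logic. It is a simplified stand-in for the real rule engine and only handles flat patterns like the one above. The function name `matches_pattern` and the sample events are made up for illustration.

```python
# The event pattern from the rule, written as a Python dict.
EVENT_PATTERN = {
    "source": ["aws.rds"],
    "detail-type": [
        "RDS DB Cluster Snapshot Event",
        "RDS DB Snapshot Event",
    ],
}


def matches_pattern(event, pattern=EVENT_PATTERN):
    """Simplified matcher (an assumption, not the real rule engine):
    every key in the pattern must be present in the event and carry
    one of the values listed for that key."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


# Hypothetical events, trimmed to the fields the pattern inspects.
cluster_event = {"source": "aws.rds", "detail-type": "RDS DB Cluster Snapshot Event"}
instance_event = {"source": "aws.rds", "detail-type": "RDS DB Snapshot Event"}
other_event = {"source": "aws.rds", "detail-type": "RDS DB Instance Event"}

print(matches_pattern(cluster_event))   # True
print(matches_pattern(instance_event))  # True
print(matches_pattern(other_event))     # False
```

Listing both detail types under one key works because the values in a list are alternatives: an event matches if its `detail-type` equals any one of them.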

SQS Queue

The CloudWatch rule above sends events to the SQS queue. A CloudWatch rule can also send events directly to Lambda, but I am using SQS because it adds fault tolerance to the design: if the Lambda function fails for any reason, the event is still in the queue and I will not miss copying that snapshot.
Note that SQS also supports dead-letter queues. If the Lambda function keeps failing on an event a certain number of times, that event can be moved to another queue, and once the error is fixed the Lambda function can process those events. I am not configuring a dead-letter queue for this example, but you should definitely set one up for production use cases. This SQS queue is created with default settings.
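
If you do want a dead-letter queue, the sketch below shows one way to build the queue attribute SQS uses for it. The `RedrivePolicy` attribute and its `deadLetterTargetArn` and `maxReceiveCount` fields are part of SQS. The helper name, the queue name `RdsSnapshotCopyDLQ`, the account ID and the retry count of 5 are assumptions chosen for illustration.

```python
import json


def build_queue_attributes(dead_letter_queue_arn, max_receive_count=5):
    """Return SQS queue attributes that move a message to the dead-letter
    queue after it has been received max_receive_count times without
    being deleted."""
    redrive_policy = {
        "deadLetterTargetArn": dead_letter_queue_arn,
        "maxReceiveCount": max_receive_count,
    }
    # SQS expects the RedrivePolicy attribute as a JSON-encoded string.
    return {"RedrivePolicy": json.dumps(redrive_policy)}


# Hypothetical dead-letter queue ARN; replace the account ID and name.
attributes = build_queue_attributes(
    "arn:aws:sqs:us-east-1:123456789012:RdsSnapshotCopyDLQ"
)
print(attributes)

# With boto3 the attributes could then be applied to the existing queue:
# boto3.client("sqs").set_queue_attributes(QueueUrl=queue_url, Attributes=attributes)
```

The dead-letter queue has to exist before the main queue references it, and it should be in the same region as the main queue.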

image

Lambda Function

The SQS queue above triggers a Lambda function that performs the cross-region copy of DB snapshots. The function was created with default settings, and the SQS trigger was added after the function was created.

image

The Lambda function runs the following code.

import json
import boto3
import os

def lambda_handler(event, context):

    # source_region is the region where rds snapshot exists.
    source_region = os.environ['source_region']
    # destination region is the region where snapshot needs to get copied.
    destination_region = os.environ['destination_region']
    # initializing boto3 client.
    destination_client = boto3.client('rds', region_name = destination_region)

    for record in event['Records']:

        rds_event = json.loads(record['body'])
        snapshot_source_arn = rds_event['detail']['SourceArn']
        snapshot_source_identifier = rds_event['detail']['SourceIdentifier']

        # automated snapshot identifiers contain a colon (the "rds:" prefix),
        # which is not allowed in the target identifier, so remove it.
        if ":" in snapshot_source_identifier:
            snapshot_source_identifier = snapshot_source_identifier.replace(':','')
        
        # checking if cluster snapshot is created.
        if "cluster snapshot created" in rds_event['detail']['Message']:
            destination_client.copy_db_cluster_snapshot(
                SourceDBClusterSnapshotIdentifier = snapshot_source_arn,
                TargetDBClusterSnapshotIdentifier = snapshot_source_identifier,
                SourceRegion = source_region,
                KmsKeyId = 'alias/aws/rds'
            )

        # checking if instance snapshot is created.
        elif "snapshot created" in rds_event['detail']['Message']:
            destination_client.copy_db_snapshot(
                SourceDBSnapshotIdentifier = snapshot_source_arn,
                TargetDBSnapshotIdentifier = snapshot_source_identifier,
                SourceRegion = source_region,
                KmsKeyId = 'alias/aws/rds'
            )
    
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

The handler code expects two environment variables, source_region and destination_region. They can be set in the Lambda section of the AWS console, as shown in the image below.

image
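
The branching inside the handler can be exercised locally without calling AWS. The sketch below pulls the same decision logic into a helper and runs it against a hand-written SQS record. The helper name `plan_copy` and all values in the sample record (account ID, snapshot name, message text) are made up for illustration; only the field names match the ones the handler reads.

```python
import json


def plan_copy(record):
    """Return (boto3_method_name, source_arn, target_identifier) for one
    SQS record, or None if the event is not a snapshot-created event."""
    detail = json.loads(record["body"])["detail"]
    # Colons are stripped from the identifier, as in the handler.
    target_identifier = detail["SourceIdentifier"].replace(":", "")

    # Check the cluster case first: "snapshot created" is also a
    # substring of "cluster snapshot created".
    if "cluster snapshot created" in detail["Message"]:
        return ("copy_db_cluster_snapshot", detail["SourceArn"], target_identifier)
    if "snapshot created" in detail["Message"]:
        return ("copy_db_snapshot", detail["SourceArn"], target_identifier)
    return None


# Hypothetical SQS record wrapping an RDS instance snapshot event.
sample_record = {
    "body": json.dumps({
        "source": "aws.rds",
        "detail-type": "RDS DB Snapshot Event",
        "detail": {
            "SourceArn": "arn:aws:rds:us-east-1:123456789012:snapshot:rds:mydb-2020-06-29-06-10",
            "SourceIdentifier": "rds:mydb-2020-06-29-06-10",
            "Message": "Automated snapshot created",
        },
    })
}

print(plan_copy(sample_record))
```

The order of the two checks matters: testing for "snapshot created" first would send cluster snapshots down the instance branch.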

Lambda IAM Policy

The Lambda function needs permissions to perform a cross-region copy of snapshots. In AWS, permissions are granted through IAM roles, which have policies attached to them. A policy is a set of rules that defines permissions for accessing resources. More about policies can be found in the AWS IAM documentation.
I created the IAM policy below and attached it to the IAM role that AWS created along with the Lambda function.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sqs:DeleteMessage",
                "sqs:ReceiveMessage",
                "sqs:GetQueueAttributes"
            ],
            "Resource": [
                "arn:aws:sqs:us-east-1:<AWS_ACCOUNT_ID>:RdsSnapshotCopy"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "rds:CopyDBSnapshot",
                "rds:CopyDBClusterSnapshot"
            ],
            "Resource": [
                "arn:aws:rds:*:*:snapshot:*",
                "arn:aws:rds:*:*:cluster-snapshot:*"
            ]
        }
    ]
}
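
If you prefer to generate the policy rather than edit it by hand, a small Python sketch like the one below fills in the account ID and serializes the document with `json.dumps`, which always produces syntactically valid JSON. The function name `build_policy` and the account ID `123456789012` are placeholders; the queue name and region are taken from the policy above.

```python
import json


def build_policy(account_id, queue_name="RdsSnapshotCopy", queue_region="us-east-1"):
    """Build the IAM policy document for the snapshot-copy Lambda role."""
    queue_arn = f"arn:aws:sqs:{queue_region}:{account_id}:{queue_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the Lambda trigger read and delete queue messages.
                "Effect": "Allow",
                "Action": [
                    "sqs:DeleteMessage",
                    "sqs:ReceiveMessage",
                    "sqs:GetQueueAttributes",
                ],
                "Resource": [queue_arn],
            },
            {
                # Lets the function copy instance and cluster snapshots.
                "Effect": "Allow",
                "Action": [
                    "rds:CopyDBSnapshot",
                    "rds:CopyDBClusterSnapshot",
                ],
                "Resource": [
                    "arn:aws:rds:*:*:snapshot:*",
                    "arn:aws:rds:*:*:cluster-snapshot:*",
                ],
            },
        ],
    }


# Placeholder account ID; substitute your own.
policy_json = json.dumps(build_policy("123456789012"), indent=4)
print(policy_json)
```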

This concludes the setup required for cross-region copying of RDS snapshots within the AWS account. Now, whenever an RDS snapshot is created, manually or automatically, for either an RDS cluster or an RDS instance, the setup above will copy it to the DR region.