Handling service roles for DMS in AWS CDK Constructs


Introduction

While building my cdk-dms-replication construct, I wanted users to be able to create all of their DMS dependencies with a single call to new DmsMigrationPipeline(). That meant solving a big problem: creating the dms-vpc-role and dms-cloudwatch-logs-role IAM roles inside the construct, even though they are account-level resources with fixed names. That led to a second problem: if the user has already created one DMS migration with the construct, how can they create another without hitting the following error?

EntityAlreadyExists: Role with name dms-vpc-role already exists.

In the most recent release of cdk-dms-replication, I solved this by creating the roles through an AwsCustomResource, which makes role creation idempotent. That introduces a new problem: the roles are no longer tracked by the stack that deployed them, so cdk destroy will not remove them. When is it acceptable to break the stack’s tracking of a resource? As with most things in software architecture, the answer is “It depends”.

Why do we need dms-vpc-role and dms-cloudwatch-logs-role and why are they problematic?

AWS DMS is a managed service that runs replication instances (which are ultimately EC2 instances) inside your VPC, and it needs dms-vpc-role to do so. DMS assumes this role to call EC2 APIs on your behalf and manage the networking for those instances, for example creating and deleting network interfaces in your subnets. The dms-cloudwatch-logs-role role gives DMS access to CloudWatch so that it can publish replication task logs.

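If you create your DMS migration in the AWS Console, these roles are created for you behind the scenes when you create a replication instance. If you create the migration any other way, such as the AWS CLI, the API, CloudFormation, or CDK, you have to create the roles yourself. That is where the trouble starts: DMS looks for these exact role names, and IAM role names are unique within an account, so only one copy of each role can exist no matter how many stacks want to create it.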
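For reference, each of the two roles boils down to a fixed name, a trust policy that lets the DMS service assume the role, and one AWS managed policy. The sketch below writes that out as plain TypeScript data. The DmsServiceRoleSpec type and the constant names are made up for illustration and are not part of cdk-dms-replication, and the policy ARNs assume the standard aws partition.

```typescript
// Illustrative data only: DmsServiceRoleSpec, dmsTrustPolicy, and dmsServiceRoles
// are hypothetical names, not part of cdk-dms-replication or the AWS SDK.
interface DmsServiceRoleSpec {
  roleName: string;          // DMS looks the role up by this exact name
  managedPolicyArn: string;  // AWS managed policy attached to the role
}

// Trust policy shared by both roles: the DMS service principal may assume them.
const dmsTrustPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: { Service: 'dms.amazonaws.com' },
      Action: 'sts:AssumeRole',
    },
  ],
};

// ARNs below assume the standard "aws" partition.
const dmsServiceRoles: DmsServiceRoleSpec[] = [
  {
    roleName: 'dms-vpc-role',
    managedPolicyArn: 'arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole',
  },
  {
    roleName: 'dms-cloudwatch-logs-role',
    managedPolicyArn: 'arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole',
  },
];

// IAM APIs take the trust policy as a JSON string.
const assumeRolePolicyDocument = JSON.stringify(dmsTrustPolicy);
```

Nothing in this data is specific to one migration, which is why a single pair of roles can serve every DMS stack in the account.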

How cdk-dms-replication addresses the roles problem

With cdk-dms-replication I aimed to make it possible to provision a DMS migration with a single construct call, which meant the construct had to create the required IAM roles itself. In the initial release those roles were created with new iam.Role(). As a result, deploying two migrations to the same account with the construct’s default settings failed, because the second stack tried to create roles that already existed.

Starting with version 0.1.0, the construct creates these roles through AwsCustomResource instead, which lets it ignore the error when a role already exists. (CDK, like the CloudFormation templates it synthesizes, has no built-in way to create a resource only if it does not already exist.) If you deploy a second DMS stack to the same account, it reuses the existing dms-vpc-role and dms-cloudwatch-logs-role roles without throwing an error.
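Conceptually, the behavior is "try to create, and treat an already-exists error as success". The sketch below models that in plain TypeScript with an in-memory stand-in for IAM. It is not the construct’s actual code: FakeIam, ServiceError, errorCode, and createIgnoring are hypothetical names invented for this illustration. In CDK, an AwsCustomResource SDK call accepts an ignoreErrorCodesMatching pattern that plays the same part as ignorePattern here.

```typescript
// Plain TypeScript model of "create, but tolerate 'already exists'".
// FakeIam, ServiceError, errorCode, and createIgnoring are hypothetical
// stand-ins for illustration; they are not CDK or AWS SDK APIs.

class ServiceError extends Error {
  readonly code: string;

  constructor(code: string, message: string) {
    super(message);
    this.code = code;
  }
}

// In-memory stand-in for the account-level IAM role namespace.
class FakeIam {
  private readonly roles = new Set<string>();

  createRole(roleName: string): void {
    if (this.roles.has(roleName)) {
      throw new ServiceError('EntityAlreadyExists', `Role with name ${roleName} already exists.`);
    }
    this.roles.add(roleName);
  }

  hasRole(roleName: string): boolean {
    return this.roles.has(roleName);
  }
}

// Extracts a string `code` property from an unknown thrown value, if present.
function errorCode(err: unknown): string | undefined {
  if (typeof err === 'object' && err !== null && 'code' in err) {
    const code = (err as { code: unknown }).code;
    return typeof code === 'string' ? code : undefined;
  }
  return undefined;
}

// Runs `create`; swallows errors whose code matches `ignorePattern` and rethrows the rest.
// Returns true if the resource was created, false if an error was ignored.
function createIgnoring(create: () => void, ignorePattern: RegExp): boolean {
  try {
    create();
    return true;
  } catch (err) {
    const code = errorCode(err);
    if (code !== undefined && ignorePattern.test(code)) {
      return false;
    }
    throw err;
  }
}

// Two "stacks" in the same account each ask for dms-vpc-role.
const fakeIam = new FakeIam();
const firstStackCreated = createIgnoring(() => fakeIam.createRole('dms-vpc-role'), /EntityAlreadyExists/);
const secondStackCreated = createIgnoring(() => fakeIam.createRole('dms-vpc-role'), /EntityAlreadyExists/);
```

The second call returns false instead of throwing, which is what lets a second stack deploy cleanly. Errors with any other code still propagate, so a genuine failure such as missing permissions is not hidden.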

What happens when we delete the stack?

The downside of this approach is that deleting the stack has no effect on these roles. Even if you delete every cdk-dms-replication stack in an account, any stack that was created with the default settings leaves these two roles behind, and you have to delete them manually with the CLI or the Console. (With the CLI, detach the role’s managed policy first; IAM refuses to delete a role that still has policies attached.)

I think this is the right design if the goal is to provision the entire DMS migration without external dependencies. I had to choose between keeping the roles under the stack’s control and keeping the construct idempotent, and idempotency won. Because these roles are well defined and I see little harm in leaving them in place after the migrations are finished and removed, I did not consider it important to keep them tied to the stack.

But that is an opinionated approach coming from my own goals for the project, and maybe your goals are different. Maybe you need to keep these roles associated with the stack, so that you can delete the roles when you call cdk destroy.

How can I have the stack track the roles?

You may need stack-level cleanup of the roles with cdk destroy, be concerned about orphaned resources, or have least-privilege objections to DMS keeping its CloudWatch access after DMS is no longer in use. In any of those cases, you can define the roles yourself in your stack and pass createDmsServiceRoles: false to stop cdk-dms-replication from creating them automatically:

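// Assumed imports: the aws-cdk-lib paths are standard; the 'cdk-dms-replication'
// module path is inferred from the construct's name, so check it against the package README.
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as iam from 'aws-cdk-lib/aws-iam';
import {
  DmsMigrationPipeline, EndpointEngine, MigrationType,
  ReplicationInstanceClass, TableMappings,
} from 'cdk-dms-replication';

// The code below belongs inside a Stack constructor.

// Create dms-vpc-role as a stack-tracked resource.
// DMS looks for this exact role name when placing replication instances in a VPC.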
const dmsVpcRole = new iam.Role(this, 'DmsVpcRole', {
  roleName: 'dms-vpc-role',
  assumedBy: new iam.ServicePrincipal('dms.amazonaws.com'),
  managedPolicies: [
    iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonDMSVPCManagementRole'),
  ],
});

// Create dms-cloudwatch-logs-role as a stack-tracked resource.
// DMS looks for this exact role name when publishing replication task logs to CloudWatch.
const dmsCloudWatchLogsRole = new iam.Role(this, 'DmsCloudWatchLogsRole', {
  roleName: 'dms-cloudwatch-logs-role',
  assumedBy: new iam.ServicePrincipal('dms.amazonaws.com'),
  managedPolicies: [
    iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonDMSCloudWatchLogsRole'),
  ],
});

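// vpc, db, dmsBucket, and dmsS3Role are assumed to be defined earlier in the same stack.
const pipeline = new DmsMigrationPipeline(this, 'Migration', {
  vpc,
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_ISOLATED },
  replicationInstanceClass: ReplicationInstanceClass.T3_MEDIUM,
  migrationType: MigrationType.FULL_LOAD,
  createDmsServiceRoles: false, // roles are managed above
  sourceEndpoint: {
    engine: EndpointEngine.MYSQL,
    serverName: db.dbInstanceEndpointAddress,
    port: 3306,
    username: 'dms_user',
    password: db.secret!.secretValueFromJson('password'),
    databaseName: 'yourdb',
  },
  targetEndpoint: {
    engine: EndpointEngine.S3,
    s3Settings: {
      bucketName: dmsBucket.bucketName,
      serviceAccessRoleArn: dmsS3Role.roleArn,
    },
  },
  tableMappings: new TableMappings().includeSchema('%').toJson(),
});

// Make sure both roles exist before any DMS resource in the pipeline is created.
// Assumption: the construct does not add this ordering itself when createDmsServiceRoles is false.
pipeline.node.addDependency(dmsVpcRole, dmsCloudWatchLogsRole);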

If you go this route and deploy multiple stacks that each include a DmsMigrationPipeline, you will need to keep track of which stack owns the roles so that you do not delete them while another stack still needs them. If you only intend to run one migration at a time in an account, however, this is a good way to keep the roles tracked for deletion.

Conclusion

In software architecture there is often no single right or wrong decision outside the context of your application or your company. Because it was important to me that CDK create the roles as seamlessly as the Console does, I made my construct handle that. For you it may be more important to avoid stray resources that are left behind when you destroy your stack. The createDmsServiceRoles: false option gives you that control, and if you have more to contribute, feel free to open an issue or a pull request on GitHub.