A Fluent Builder for DynamoDB Object Mappings in cdk-dms-replication v0.2.0


Introduction

My earlier post on migrating MySQL to DynamoDB with DMS revealed the hardest part of the process: writing the tableMappings JSON. Every entity needs its own object-mapping rule with the same shape repeated over and over: rule-type, rule-action, object-locator, mapping-parameters, and a list of attribute-mappings for the partition key, sort key, and any GSI keys. The cdk-dms-replication construct simplified many of the steps that took a lot of trial and error in earlier migrations, but it left this one big pain point unaddressed.

In cdk-dms-replication v0.2.0, I’m adding a fluent builder method, mapToDynamoDb(), that produces the same object-mapping rules with TypeScript’s compile-time type safety. This post walks through the new API, the design choices behind it, and what the original migration’s table mappings look like when rewritten in the new style.

What’s in v0.2.0

The heart of this change is the new method on TableMappings:

TableMappings.mapToDynamoDb(
  schemaName: string,
  tableName: string,
  options: DynamoDbObjectMappingOptions,
): TableMappings

The DynamoDbObjectMappingOptions interface specifies the target table name, the partition key, an optional sort key, and any additional attribute mappings (GSI keys, for example). There’s also an excludeColumns list for source columns that are not meant to be carried over. This is useful if you’ve added helper columns to the source (e.g. a sharding hash) that you don’t want in the target, or if you have columns that make sense in a relational schema but not in DynamoDB.

interface DynamoDbObjectMappingOptions {
  readonly targetTableName: string;
  readonly partitionKey: DynamoDbKeyMapping;
  readonly sortKey?: DynamoDbKeyMapping;
  readonly excludeColumns?: string[];
  readonly attributeMappings?: DynamoDbAttributeMapping[];
}

The DynamoDbKeyMapping determines where the value comes from and where it goes in the new table:

interface DynamoDbKeyMapping {
  readonly sourceColumn?: string;       // for `${col}` mappings
  readonly value?: string;              // for composite expressions
  readonly targetAttributeName: string;
  readonly attributeSubType: DynamoDbAttributeSubType;
}

sourceColumn and value are mutually exclusive — exactly one must be set. Use sourceColumn when you want a straight column copy, and value when you want a composite expression like 'CUSTOMER#${customer_id}' — particularly useful in single-table design. If you set neither or both, the builder throws at synth time rather than letting DMS reject it at deploy time.
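The exactly-one rule can be sketched as a small standalone check. This is an illustration only, not the library's source: the KeyMappingLike type, the resolveKeyValue function, and the error message are hypothetical names I made up for the example.

```typescript
// Hypothetical, trimmed-down stand-in for DynamoDbKeyMapping (illustration only).
interface KeyMappingLike {
  readonly sourceColumn?: string;
  readonly value?: string;
  readonly targetAttributeName: string;
}

// Returns the value expression for a mapping, or throws when the
// "exactly one of sourceColumn / value" rule is violated.
function resolveKeyValue(mapping: KeyMappingLike): string {
  const { sourceColumn, value } = mapping;
  // Both set or both unset: the two comparisons agree, so reject.
  if ((sourceColumn === undefined) === (value === undefined)) {
    throw new Error(
      `'${mapping.targetAttributeName}': set exactly one of sourceColumn or value`,
    );
  }
  // A straight column copy becomes the placeholder form ${column};
  // a composite expression is passed through unchanged.
  return sourceColumn !== undefined ? '${' + sourceColumn + '}' : (value as string);
}

console.log(resolveKeyValue({ sourceColumn: 'language_id', targetAttributeName: 'pk' }));
console.log(resolveKeyValue({ value: 'CUSTOMER#${customer_id}', targetAttributeName: 'pk' }));
```

Failing here, while the CDK app is still being synthesized, means the mistake never reaches a deployed DMS task.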

A simple example

Here’s a one-table mapping with just a partition key:

import { TableMappings, DynamoDbAttributeSubType } from 'cdk-dms-replication';

const tableMappings = new TableMappings()
  .mapToDynamoDb('yourdb', 'language', {
    targetTableName: 'migration-target-table',
    partitionKey: {
      sourceColumn: 'language_id',
      targetAttributeName: 'pk',
      attributeSubType: DynamoDbAttributeSubType.STRING,
    },
  })
  .toJson();

That call produces this rule in the JSON shape that DMS expects:

{
  "rule-type": "object-mapping",
  "rule-id": "1",
  "rule-action": "map-record-to-record",
  "object-locator": { "schema-name": "yourdb", "table-name": "language" },
  "target-table-name": "migration-target-table",
  "mapping-parameters": {
    "partition-key-name": "pk",
    "attribute-mappings": [
      {
        "target-attribute-name": "pk",
        "attribute-type": "scalar",
        "attribute-sub-type": "string",
        "value": "${language_id}"
      }
    ]
  }
}

rule-id is assigned automatically and increments as you chain more selection, transformation, and object-mapping rules off the same builder. It has to be unique per rule, which is a common tripping point when writing this mapping in JSON by hand, as an errant copy-paste can repeat the same rule-id unintentionally.
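To make the auto-numbering concrete, here is a minimal sketch of how a chained builder could stamp sequential rule-ids. RuleListSketch and its add() method are hypothetical and far simpler than the real TableMappings class; the only thing taken from DMS is the top-level rules array and the string-valued rule-id.

```typescript
// Hypothetical minimal builder (illustration only, not the library's class).
type Rule = { 'rule-type': string; 'rule-id': string; [key: string]: unknown };

class RuleListSketch {
  private readonly rules: Rule[] = [];

  // Appends a rule and stamps it with the next sequential id, so ids stay
  // unique no matter how many rules are chained.
  add(ruleType: string, body: Record<string, unknown>): this {
    const ruleId = String(this.rules.length + 1);
    this.rules.push({ ...body, 'rule-type': ruleType, 'rule-id': ruleId });
    return this;
  }

  toJson(): { rules: Rule[] } {
    return { rules: this.rules };
  }
}

const ids = new RuleListSketch()
  .add('selection', { 'rule-action': 'include' })
  .add('selection', { 'rule-action': 'exclude' })
  .add('object-mapping', { 'rule-action': 'map-record-to-record' })
  .toJson()
  .rules.map((rule) => rule['rule-id']);

console.log(ids.join(',')); // 1,2,3
```

Because the id comes from the builder's own state rather than from the caller, a copy-pasted rule cannot carry a stale id along with it.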

Composite key values for single-table design

DynamoDB mappings are almost never one-to-one column copies if you’re using single-table design. Instead, they’re composites like LANGUAGE#${language_id} and #METADATA that group related entities into item collections for efficient querying. Use the value field on the key mapping to define that expression:

new TableMappings()
  .mapToDynamoDb('yourdb', 'film_actor', {
    targetTableName: 'migration-target-table',
    partitionKey: {
      value: 'FILM#${film_id}',
      targetAttributeName: 'pk',
      attributeSubType: DynamoDbAttributeSubType.STRING,
    },
    sortKey: {
      value: 'ACTOR#${actor_id}',
      targetAttributeName: 'sk',
      attributeSubType: DynamoDbAttributeSubType.STRING,
    },
    attributeMappings: [
      {
        value: 'ACTOR#${actor_id}',
        targetAttributeName: 'gsi1pk',
        attributeSubType: DynamoDbAttributeSubType.STRING,
      },
      {
        value: 'FILM#${film_id}',
        targetAttributeName: 'gsi1sk',
        attributeSubType: DynamoDbAttributeSubType.STRING,
      },
    ],
  });
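To see what these expressions evaluate to for a single source row, here is a rough approximation of placeholder substitution. It only mimics the ${column} syntax and is not how DMS is implemented; renderExpression and the sample row are hypothetical.

```typescript
// Mimics (does not reproduce) ${column} substitution for one source row.
function renderExpression(
  expression: string,
  row: Record<string, string | number>,
): string {
  return expression.replace(/\$\{(\w+)\}/g, (_match, column: string) => {
    if (!(column in row)) {
      throw new Error(`row has no column '${column}'`);
    }
    return String(row[column]);
  });
}

// Hypothetical film_actor row.
const row = { film_id: 1, actor_id: 10 };

console.log(renderExpression('FILM#${film_id}', row));   // FILM#1
console.log(renderExpression('ACTOR#${actor_id}', row)); // ACTOR#10
console.log(renderExpression('#METADATA', row));         // #METADATA
```

For this row, the item would end up with pk FILM#1 and sk ACTOR#10, while gsi1pk and gsi1sk hold the same two values inverted, which is what lets the index answer "which films is this actor in?".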

Validating excludeColumns at synth time

excludeColumns lets you drop source columns that shouldn’t make it to DynamoDB — internal flags, debug fields, soft-delete markers, anything that isn’t relevant to the target item. That can lead to an easy mistake: listing the same column in both excludeColumns and as a sourceColumn for a key or attribute mapping. DMS itself doesn’t error on this directly, and depending on rule ordering you can end up with surprising behavior at runtime.

In v0.2.0 the builder catches this at synth time:

new TableMappings()
  .mapToDynamoDb('yourdb', 'customer', {
    targetTableName: 'migration-target-table',
    partitionKey: {
      sourceColumn: 'customer_id',
      targetAttributeName: 'pk',
      attributeSubType: DynamoDbAttributeSubType.STRING,
    },
    excludeColumns: ['customer_id'], // throws: customer_id is in excludeColumns
  });

The validation only applies to sourceColumn-based mappings. A composite value expression that happens to reference an excluded column is left alone, as this has some real use cases in migrations. In one migration, I needed to use partition sharding to increase write speed for a DynamoDB table, so I created a hash in the source table that I carried over in the composite key. Since I didn’t actually need that column in the target table, I had it in the excludeColumns as well.
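The rule described above can be sketched as follows. This is my own simplified illustration rather than the library's code: MappingLike, assertNoExcludedSourceColumns, the error message, and the shard_hash and order_id columns are all hypothetical.

```typescript
// Hypothetical, trimmed-down mapping shape (illustration only).
interface MappingLike {
  readonly sourceColumn?: string;
  readonly value?: string;
  readonly targetAttributeName: string;
}

// Throws if a column is both excluded and used as a sourceColumn.
// Composite `value` expressions are deliberately not inspected.
function assertNoExcludedSourceColumns(
  mappings: readonly MappingLike[],
  excludeColumns: readonly string[] = [],
): void {
  const excluded = new Set(excludeColumns);
  for (const mapping of mappings) {
    if (mapping.sourceColumn !== undefined && excluded.has(mapping.sourceColumn)) {
      throw new Error(
        `'${mapping.sourceColumn}' is in excludeColumns but is the sourceColumn ` +
          `for '${mapping.targetAttributeName}'`,
      );
    }
  }
}

// Allowed: the excluded helper column only appears inside a composite value.
assertNoExcludedSourceColumns(
  [{ value: 'ORDER#${shard_hash}#${order_id}', targetAttributeName: 'pk' }],
  ['shard_hash'],
);
```

Passing every key and attribute mapping through one check like this keeps the rule consistent across the partition key, sort key, and attributeMappings.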

Rewriting the original migration

Here’s the language and film_actor portion of the JSON blob from the original migration post, expressed with the builder:

const tableMappings = new TableMappings()
  .includeSchema('yourdb')
  .excludeTable('yourdb', 'film_text')
  .mapToDynamoDb('yourdb', 'language', {
    targetTableName: 'migration-target-table',
    partitionKey: {
      value: 'LANGUAGE#${language_id}',
      targetAttributeName: 'pk',
      attributeSubType: DynamoDbAttributeSubType.STRING,
    },
    sortKey: {
      value: '#METADATA',
      targetAttributeName: 'sk',
      attributeSubType: DynamoDbAttributeSubType.STRING,
    },
  })
  .mapToDynamoDb('yourdb', 'film_actor', {
    targetTableName: 'migration-target-table',
    partitionKey: {
      value: 'FILM#${film_id}',
      targetAttributeName: 'pk',
      attributeSubType: DynamoDbAttributeSubType.STRING,
    },
    sortKey: {
      value: 'ACTOR#${actor_id}',
      targetAttributeName: 'sk',
      attributeSubType: DynamoDbAttributeSubType.STRING,
    },
    attributeMappings: [
      {
        value: 'ACTOR#${actor_id}',
        targetAttributeName: 'gsi1pk',
        attributeSubType: DynamoDbAttributeSubType.STRING,
      },
      {
        value: 'FILM#${film_id}',
        targetAttributeName: 'gsi1sk',
        attributeSubType: DynamoDbAttributeSubType.STRING,
      },
    ],
  })
  .toJson();

The builder doesn’t do the hard part for you, which is thinking carefully about the single-table design. Once the design is settled, however, the builder can help you get the transformation right before you deploy your DMS task.

Conclusion

The fluent builders on TableMappings and TaskSettings are a huge help in getting the JSON right for DMS migrations, and the new mapToDynamoDb() helper should ease the process for DynamoDB migrations specifically. The newest version of cdk-dms-replication, v0.2.0, is out now on npm, PyPI, Maven Central, NuGet, and Go. As always, feedback and contributions are welcome on GitHub, and you can find all the information you need to use cdk-dms-replication on ConstructHub.

