Fri, 16 Nov 2018
In this post I’ll outline a generalized approach to building an event-driven, serverless architecture on AWS when you rely on data from a legacy system. If you are trying to build new, cutting-edge architectures in AWS but depend on data in a legacy database, this is for you.
In the world of serverless, it’s important to think differently about architecture: goodbye monolith, hello event-driven, or even further, stream-based architecture. Your various components of compute (e.g. Lambda), data storage (e.g. DynamoDB or S3), and routing (e.g. SNS or API Gateway) need to relate to each other through events. Event-driven thinking can also help you understand your core business processes and make it easier to extend your architecture. If you fall back on old monolithic constructs, you miss many of the benefits of serverless architectures: scalability, efficiency, and flexibility.
That said, rarely are new architectures designed in a vacuum. Almost always there is some other data in your enterprise that is important to your new application. Usually that data is sitting in a SQL database, and often that database is on-premise. At a high level, the problem is this: how do you get your on-premise data into your new serverless application in near real-time?
This is a challenge we have dealt with many times at Trek10. The approaches for addressing it are all over the map and always custom: on-premise agents querying data and pushing it into AWS, custom replication to a database in AWS, or vendor-specific solutions. These all have their flaws, but more importantly, they are always custom built from scratch, increasing project time and cost.
A generalized pattern for solving this challenge could be applied quickly and easily to the vast majority of scenarios. This would result in faster implementation and a more battle-tested solution. Here is a concept for one such pattern…
With this architecture, all changes in your legacy SQL database (with support for a wide variety of DB vendors) are pushed onto a Kinesis event stream, ready to be consumed. The only thing left to build is the custom business logic that consumes these events: for example, using Kinesis Analytics for alerting, processing the data with a Lambda function, or archiving it into an S3 data lake with Kinesis Firehose.
All of your database changes will come onto the stream with a representation of the old data (if any) and the new data, both in JSON format. The beauty of this is that if you choose to replicate all columns (*) in your tables, schema updates automatically flow through the system: it automatically identifies new columns and adds the corresponding keys to the JSON.
So at the end of the day, your row-oriented SQL data will end up in a Kinesis stream for Lambda functions and other consumers as JSON elements with “old” and “new” records for every record added, edited, and deleted.
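As a rough illustration, a downstream consumer Lambda might unpack these events as follows. This is a sketch only: the exact payload shape (the “old”/“new” key names and the per-record JSON encoding) depends on how the glue function described later serializes the records.

```python
import base64
import json

def handler(event, context):
    """Example Kinesis consumer: log every change replicated from the legacy database."""
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))

        old_row = payload.get("old")  # None for inserts
        new_row = payload.get("new")  # None for deletes

        if old_row is None:
            print(f"INSERT: {new_row}")
        elif new_row is None:
            print(f"DELETE: {old_row}")
        else:
            print(f"UPDATE: {old_row} -> {new_row}")
```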
(If you are familiar with these components, you may be wondering, why not just drive Lambda off of a DynamoDB stream? We’ll explain more below, but the short answer is: unifying multiple tables into a single stream.)
Let’s go through some of the system’s components in more detail.
The first component is the AWS Database Migration Service (DMS), which is our key to extracting data in a generalized way. As some at AWS will tell you, the service has been misnamed: it was originally intended for one-off migrations, but so many customers have begun to use it for long-term replication that it should really be called the “Database Replication Service”. AWS has embraced this, adding features needed for production-ready replication, like multi-AZ failover, and DMS is now a common choice for long-term replication.
Of course, DMS supports a diverse list of sources, including all the legacy favorites like Oracle and SQL Server. Critically, because it uses Change Data Capture (CDC), it adds very little load to the source servers, which is far more efficient than building a custom agent that issues extra reads against your database.
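As a concrete (and hedged) sketch, an ongoing-replication task can be created with a few boto3 calls. The ARNs below are placeholders for endpoints and a replication instance you would create separately, and the selection rule simply includes every table; DMS object-mapping rules can further control how rows land in DynamoDB.

```python
import json
import boto3

dms = boto3.client("dms")

# Selection rule: replicate every table in every schema ("%" is the DMS wildcard).
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "replicate-everything",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }
    ]
}

dms.create_replication_task(
    ReplicationTaskIdentifier="legacy-sql-to-dynamodb",
    SourceEndpointArn="arn:aws:dms:us-east-1:111111111111:endpoint:SOURCE",  # legacy SQL database
    TargetEndpointArn="arn:aws:dms:us-east-1:111111111111:endpoint:TARGET",  # DynamoDB target
    ReplicationInstanceArn="arn:aws:dms:us-east-1:111111111111:rep:INSTANCE",
    MigrationType="full-load-and-cdc",  # initial load, then ongoing change data capture
    TableMappings=json.dumps(table_mappings),
)
```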
DynamoDB provides us a convenient waypoint for the data for a few reasons, the most important being DynamoDB Streams, which is our link to the next component of the system.
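For illustration, if a target table does not already have a stream enabled, it can be turned on with a single boto3 call (the table name here is a placeholder); the NEW_AND_OLD_IMAGES view type is what gives us both the old and new row representations.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# NEW_AND_OLD_IMAGES puts both the before and after row representations on the stream.
dynamodb.update_table(
    TableName="legacy_orders",  # placeholder: one of the tables DMS created
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)
```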
If your processing needs are simple, you could skip this step and do your data processing in a Lambda function directly off of DynamoDB Streams. In our reference architecture, however, we use a Lambda function as glue between DynamoDB Streams and Kinesis: this simple function receives events from the DynamoDB stream and drops them onto the Kinesis stream. This has a few benefits, chief among them unifying change events from many tables into a single stream (as noted above).
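A minimal sketch of what that glue function might look like in Python is below. The Kinesis stream name, the partition-key choice, and the conversion from DynamoDB’s typed attribute format to plain JSON are all assumptions for illustration; a production version would also need batching, retries, and error handling to respect Kinesis limits.

```python
import json
import os

import boto3
from boto3.dynamodb.types import TypeDeserializer

kinesis = boto3.client("kinesis")
deserializer = TypeDeserializer()

STREAM_NAME = os.environ.get("KINESIS_STREAM_NAME", "legacy-db-changes")  # placeholder


def to_plain_json(image):
    """Convert a DynamoDB-typed image ({"col": {"S": "value"}, ...}) into a plain dict."""
    if not image:
        return None
    return {name: deserializer.deserialize(value) for name, value in image.items()}


def handler(event, context):
    records = []
    for record in event["Records"]:
        ddb = record["dynamodb"]
        payload = {
            # The table name is the second segment of the stream ARN.
            "table": record["eventSourceARN"].split("/")[1],
            "old": to_plain_json(ddb.get("OldImage")),  # absent for inserts
            "new": to_plain_json(ddb.get("NewImage")),  # absent for deletes
        }
        records.append({
            "Data": json.dumps(payload, default=str).encode("utf-8"),
            # Partition by primary key so changes to the same row stay ordered.
            "PartitionKey": json.dumps(ddb["Keys"], sort_keys=True),
        })

    if records:
        # A production version would chunk to at most 500 records per call
        # and retry any failed records reported in the response.
        kinesis.put_records(StreamName=STREAM_NAME, Records=records)
```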
A few important things to keep in mind:
All of the components here support CloudFormation, so a great next step would be to template out this architecture so it could be applied repeatedly. There is some added complexity in this case because DMS dynamically creates the destination DynamoDB tables, so the automation would have to be flexible enough to adapt to dynamically created tables. But once that is complete, this is a system that could be up and running in minutes, allowing the builder to focus on source system configuration and business logic to process the incoming events. Trek10 hopes to open source a solution along these lines in the coming months.
We’d love to hear your thoughts about this proposed solution. See any holes in it? Have you tried something like this and can report on your experience? Let us know @Trek10Inc, and keep building!