RDBMS and Apache Geode Data Movement: Low Latency ETL Pipeline By Using Cloud-Native Event Driven Microservices

Data, Databases

Day: Thursday
Time: 11:50
Room: 2018

Using extract, transform, load (ETL) to move massive data sets from one data source to another has always been complex and expensive. This is especially true if the source system is a traditional RDBMS with complicated relationships between tables. Most of the time, traditional ETL processes are implemented with batch, monolithic, and tightly coupled approaches. As a result, they are often considered fragile, hard to maintain, and difficult to tune, and they often introduce high data latency between source and destination systems.

In this session, Paul and Heather will cover how to build an ETL pipeline of cloud-native, event-driven microservices between an RDBMS and Apache Geode by using Cloud Foundry, Spring Cloud Stream, and RabbitMQ/Kafka. The pipelines can handle high-volume data sets and complex database queries while keeping data latency low between the source RDBMS and Apache Geode. In addition, the design is highly tunable and scalable. The session will also cover an analysis of performance metrics based on implementations of real-world use cases.


Paul Warren
Senior Engineer

Heather Riddle
Senior Engineer