SlideShare a Scribd company logo
www.flydata.com
What Change Data
Capture (CDC) is and
Why It’s Important
What is CDC?
Copyright © 2015 FlyData Inc. All rights reserved.
CDC captures just the
change data so that
replication can be done more
efficiently.
www.flydata.com
www.flydata.com
Why do we need CDC?
• Increase in data collected
• Queries take longer because database size
is growing
Need for BI Databases (Data Warehousing)
&
Need for Data Replication growing
Copyright © 2015 FlyData Inc. All rights reserved.
www.flydata.com
Methods for Syncing Data
▪ Dumping the whole DB
▪ Change Data Capture
(CDC)
▪ Other approaches
Copyright © 2015 FlyData Inc. All rights reserved.
www.flydata.com
Benefits of CDC
▪ No need for full
export/import
▪ Near real-time replication
is possible => data always
available for queries
Copyright © 2015 FlyData Inc. All rights reserved.
www.flydata.com
Steps for CDC
1. Capture Change Data
2. Transform data to format
supported on destination
3. Upload to destination
4. (Handle any errors as they
occur)
Copyright © 2015 FlyData Inc. All rights reserved.
www.flydata.com
1. Methods for Change Data Capture
• Transaction Logs
• Triggers
• Custom Solution at Application Layer
Copyright © 2015 FlyData Inc. All rights reserved.
www.flydata.com
1. Methods for Capturing Change Data
Method Pros Cons
Transaction Logs • Minimal performance impact
• No change to tables or application
necessary
• Transaction logs are in binary
format.
• Lacking good documentation
Triggers • Handle everything at the SQL level • Significant performance impact to
the master DB
• Need to define triggers on each
table
Custom Solutions • If INSERTs only, relatively easy to
implement
• With UPDATEs and DELETEs,
keeping data integrity becomes
much harder.
Copyright © 2015 FlyData Inc. All rights reserved.
www.flydata.com
2. Transforming Change Data
Destination DB may have certain data type
restrictions:
• Identify such differences in data type
• Convert the data
Example of differences between MySQL and Amazon Redshift:
• Case sensitive table/column names: MySQL - yes, Redshift - no
• TIME column value beyond 23:59:59: MySQL - yes, Redshift - no
• VARCHAR length: MySQL - character length, Redshift - byte length
• Date ‘0000-00-00’: MySQL - allowed, Redshift - not allowed
• BINARY, VARBINARY: MySQL - supported, Redshift - not supported
• ENUM, SET: MySQL - supported, Redshift - not supported
Copyright © 2015 FlyData Inc. All rights reserved.
www.flydata.com
3. Upload to Destination
• Uploading data to BI databases
could be tricky.
• Error Handling Crucial for CDC
data integrity:
• Recognize when errors occur
• Buffer change data after the error (order
important!)
• Recognize when error is handled
• Resume uploading
• Do this in parallel to scale
Copyright © 2015 FlyData Inc. All rights reserved.
www.flydata.com www.flydata.com
Check us out!
-> http://flydata.com
sales@flydata.com
Toll Free: 1-855-427-9787
http://flydata.com
We are an official data integration
partner of Amazon Redshift
Ad

More Related Content

What's hot (20)

Change data capture
Change data captureChange data capture
Change data capture
Ron Barabash
 
Data Vault and DW2.0
Data Vault and DW2.0Data Vault and DW2.0
Data Vault and DW2.0
Empowered Holdings, LLC
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Flink Forward
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Paris Data Engineers !
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
Databricks
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta Lake
Knoldus Inc.
 
Oracle data guard for beginners
Oracle data guard for beginnersOracle data guard for beginners
Oracle data guard for beginners
Pini Dibask
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
Ishan Bhawantha Hewanayake
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Databricks
 
Microservices Patterns with GoldenGate
Microservices Patterns with GoldenGateMicroservices Patterns with GoldenGate
Microservices Patterns with GoldenGate
Jeffrey T. Pollock
 
Understanding Azure Data Factory: The What, When, and Why (NIC 2020)
Understanding Azure Data Factory: The What, When, and Why (NIC 2020)Understanding Azure Data Factory: The What, When, and Why (NIC 2020)
Understanding Azure Data Factory: The What, When, and Why (NIC 2020)
Cathrine Wilhelmsen
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
Kent Graziano
 
Oracle Transparent Data Encryption (TDE) 12c
Oracle Transparent Data Encryption (TDE) 12cOracle Transparent Data Encryption (TDE) 12c
Oracle Transparent Data Encryption (TDE) 12c
Nabeel Yoosuf
 
Dataguard presentation
Dataguard presentationDataguard presentation
Dataguard presentation
Vimlendu Kumar
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
Yifeng Jiang
 
Optimize the performance, cost, and value of databases.pptx
Optimize the performance, cost, and value of databases.pptxOptimize the performance, cost, and value of databases.pptx
Optimize the performance, cost, and value of databases.pptx
IDERA Software
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Alluxio, Inc.
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
James Serra
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
James Serra
 
Logical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsLogical Data Fabric: Architectural Components
Logical Data Fabric: Architectural Components
Denodo
 
Change data capture
Change data captureChange data capture
Change data capture
Ron Barabash
 
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Squirreling Away $640 Billion: How Stripe Leverages Flink for Change Data Cap...
Flink Forward
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Paris Data Engineers !
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
Databricks
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta Lake
Knoldus Inc.
 
Oracle data guard for beginners
Oracle data guard for beginnersOracle data guard for beginners
Oracle data guard for beginners
Pini Dibask
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Databricks
 
Microservices Patterns with GoldenGate
Microservices Patterns with GoldenGateMicroservices Patterns with GoldenGate
Microservices Patterns with GoldenGate
Jeffrey T. Pollock
 
Understanding Azure Data Factory: The What, When, and Why (NIC 2020)
Understanding Azure Data Factory: The What, When, and Why (NIC 2020)Understanding Azure Data Factory: The What, When, and Why (NIC 2020)
Understanding Azure Data Factory: The What, When, and Why (NIC 2020)
Cathrine Wilhelmsen
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
Kent Graziano
 
Oracle Transparent Data Encryption (TDE) 12c
Oracle Transparent Data Encryption (TDE) 12cOracle Transparent Data Encryption (TDE) 12c
Oracle Transparent Data Encryption (TDE) 12c
Nabeel Yoosuf
 
Dataguard presentation
Dataguard presentationDataguard presentation
Dataguard presentation
Vimlendu Kumar
 
Optimize the performance, cost, and value of databases.pptx
Optimize the performance, cost, and value of databases.pptxOptimize the performance, cost, and value of databases.pptx
Optimize the performance, cost, and value of databases.pptx
IDERA Software
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Alluxio, Inc.
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
James Serra
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
James Serra
 
Logical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsLogical Data Fabric: Architectural Components
Logical Data Fabric: Architectural Components
Denodo
 

Similar to What is Change Data Capture (CDC) and Why is it Important? (20)

[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ...
[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ...[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ...
[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ...
Insight Technology, Inc.
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
Precisely
 
Big Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe Data
Big Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe DataBig Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe Data
Big Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe Data
CA Technologies
 
Faster, Cheaper, Easier... and Successful Best Practices for Big Data Integra...
Faster, Cheaper, Easier... and Successful Best Practices for Big Data Integra...Faster, Cheaper, Easier... and Successful Best Practices for Big Data Integra...
Faster, Cheaper, Easier... and Successful Best Practices for Big Data Integra...
DataWorks Summit
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
Cloudera, Inc.
 
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
TechWell
 
Unlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQLUnlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQL
Matt Lord
 
Google take on heterogeneous data base replication
Google take on heterogeneous data base replication Google take on heterogeneous data base replication
Google take on heterogeneous data base replication
Svetlin Stanchev
 
Data Warehouse Design Considerations
Data Warehouse Design ConsiderationsData Warehouse Design Considerations
Data Warehouse Design Considerations
Ram Kedem
 
Big Data with KNIME is as easy as 1, 2, 3, ...4!
Big Data with KNIME is as easy as 1, 2, 3, ...4!Big Data with KNIME is as easy as 1, 2, 3, ...4!
Big Data with KNIME is as easy as 1, 2, 3, ...4!
KNIMESlides
 
Big Data as easy as 1, 2, 3, ... 4 ... with KNIME
Big Data as easy as 1, 2, 3, ... 4 ... with KNIMEBig Data as easy as 1, 2, 3, ... 4 ... with KNIME
Big Data as easy as 1, 2, 3, ... 4 ... with KNIME
Rosaria Silipo
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
DataWorks Summit
 
The Good, The Bad and the Ugly
The Good, The Bad and the UglyThe Good, The Bad and the Ugly
The Good, The Bad and the Ugly
Roy Salazar
 
Migrating to Cassandra
Migrating to CassandraMigrating to Cassandra
Migrating to Cassandra
Instaclustr
 
Module 9: CDB Technical Intro
 Module 9: CDB Technical Intro Module 9: CDB Technical Intro
Module 9: CDB Technical Intro
Tail-f Systems
 
Building Data Warehouse in SQL Server
Building Data Warehouse in SQL ServerBuilding Data Warehouse in SQL Server
Building Data Warehouse in SQL Server
Antonios Chatzipavlis
 
How to configure SQL Server like a pro
How to configure SQL Server like a proHow to configure SQL Server like a pro
How to configure SQL Server like a pro
SolarWinds
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
Mark Rittman
 
What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?
DataWorks Summit
 
Hive 3 New Horizons DataWorks Summit Melbourne February 2019
Hive 3 New Horizons DataWorks Summit Melbourne February 2019Hive 3 New Horizons DataWorks Summit Melbourne February 2019
Hive 3 New Horizons DataWorks Summit Melbourne February 2019
alanfgates
 
[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ...
[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ...[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ...
[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ...
Insight Technology, Inc.
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
Precisely
 
Big Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe Data
Big Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe DataBig Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe Data
Big Iron + Big Data = BIG DEAL! Unlock The Power of Your Mainframe Data
CA Technologies
 
Faster, Cheaper, Easier... and Successful Best Practices for Big Data Integra...
Faster, Cheaper, Easier... and Successful Best Practices for Big Data Integra...Faster, Cheaper, Easier... and Successful Best Practices for Big Data Integra...
Faster, Cheaper, Easier... and Successful Best Practices for Big Data Integra...
DataWorks Summit
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
Cloudera, Inc.
 
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
Can Your Mobile Infrastructure Survive 1 Million Concurrent Users?
TechWell
 
Unlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQLUnlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQL
Matt Lord
 
Google take on heterogeneous data base replication
Google take on heterogeneous data base replication Google take on heterogeneous data base replication
Google take on heterogeneous data base replication
Svetlin Stanchev
 
Data Warehouse Design Considerations
Data Warehouse Design ConsiderationsData Warehouse Design Considerations
Data Warehouse Design Considerations
Ram Kedem
 
Big Data with KNIME is as easy as 1, 2, 3, ...4!
Big Data with KNIME is as easy as 1, 2, 3, ...4!Big Data with KNIME is as easy as 1, 2, 3, ...4!
Big Data with KNIME is as easy as 1, 2, 3, ...4!
KNIMESlides
 
Big Data as easy as 1, 2, 3, ... 4 ... with KNIME
Big Data as easy as 1, 2, 3, ... 4 ... with KNIMEBig Data as easy as 1, 2, 3, ... 4 ... with KNIME
Big Data as easy as 1, 2, 3, ... 4 ... with KNIME
Rosaria Silipo
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
DataWorks Summit
 
The Good, The Bad and the Ugly
The Good, The Bad and the UglyThe Good, The Bad and the Ugly
The Good, The Bad and the Ugly
Roy Salazar
 
Migrating to Cassandra
Migrating to CassandraMigrating to Cassandra
Migrating to Cassandra
Instaclustr
 
Module 9: CDB Technical Intro
 Module 9: CDB Technical Intro Module 9: CDB Technical Intro
Module 9: CDB Technical Intro
Tail-f Systems
 
Building Data Warehouse in SQL Server
Building Data Warehouse in SQL ServerBuilding Data Warehouse in SQL Server
Building Data Warehouse in SQL Server
Antonios Chatzipavlis
 
How to configure SQL Server like a pro
How to configure SQL Server like a proHow to configure SQL Server like a pro
How to configure SQL Server like a pro
SolarWinds
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
Mark Rittman
 
What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?What is New in Apache Hive 3.0?
What is New in Apache Hive 3.0?
DataWorks Summit
 
Hive 3 New Horizons DataWorks Summit Melbourne February 2019
Hive 3 New Horizons DataWorks Summit Melbourne February 2019Hive 3 New Horizons DataWorks Summit Melbourne February 2019
Hive 3 New Horizons DataWorks Summit Melbourne February 2019
alanfgates
 
Ad

More from FlyData Inc. (11)

What's So Unique About a Columnar Database?
What's So Unique About a Columnar Database?What's So Unique About a Columnar Database?
What's So Unique About a Columnar Database?
FlyData Inc.
 
Three Things to Consider When Making Investments in Your Big Data Infrastructure
Three Things to Consider When Making Investments in Your Big Data InfrastructureThree Things to Consider When Making Investments in Your Big Data Infrastructure
Three Things to Consider When Making Investments in Your Big Data Infrastructure
FlyData Inc.
 
Cognitive Biases in Data Science
Cognitive Biases in Data ScienceCognitive Biases in Data Science
Cognitive Biases in Data Science
FlyData Inc.
 
How to Extract Data from Amazon Redshift
How to Extract Data from Amazon RedshiftHow to Extract Data from Amazon Redshift
How to Extract Data from Amazon Redshift
FlyData Inc.
 
Amazon Redshift - Create an Amazon Redshift Cluster
Amazon Redshift - Create an Amazon Redshift ClusterAmazon Redshift - Create an Amazon Redshift Cluster
Amazon Redshift - Create an Amazon Redshift Cluster
FlyData Inc.
 
The Internet of Things
The Internet of ThingsThe Internet of Things
The Internet of Things
FlyData Inc.
 
Create an Amazon Redshift Cluster with FlyData!
Create an Amazon Redshift Cluster with FlyData!Create an Amazon Redshift Cluster with FlyData!
Create an Amazon Redshift Cluster with FlyData!
FlyData Inc.
 
Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData
FlyData Inc.
 
FlyData Autoload: 事例集
FlyData Autoload: 事例集FlyData Autoload: 事例集
FlyData Autoload: 事例集
FlyData Inc.
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
FlyData Inc.
 
Amazon Redshift ベンチマーク Hadoop + Hiveと比較
Amazon Redshift ベンチマーク  Hadoop + Hiveと比較 Amazon Redshift ベンチマーク  Hadoop + Hiveと比較
Amazon Redshift ベンチマーク Hadoop + Hiveと比較
FlyData Inc.
 
What's So Unique About a Columnar Database?
What's So Unique About a Columnar Database?What's So Unique About a Columnar Database?
What's So Unique About a Columnar Database?
FlyData Inc.
 
Three Things to Consider When Making Investments in Your Big Data Infrastructure
Three Things to Consider When Making Investments in Your Big Data InfrastructureThree Things to Consider When Making Investments in Your Big Data Infrastructure
Three Things to Consider When Making Investments in Your Big Data Infrastructure
FlyData Inc.
 
Cognitive Biases in Data Science
Cognitive Biases in Data ScienceCognitive Biases in Data Science
Cognitive Biases in Data Science
FlyData Inc.
 
How to Extract Data from Amazon Redshift
How to Extract Data from Amazon RedshiftHow to Extract Data from Amazon Redshift
How to Extract Data from Amazon Redshift
FlyData Inc.
 
Amazon Redshift - Create an Amazon Redshift Cluster
Amazon Redshift - Create an Amazon Redshift ClusterAmazon Redshift - Create an Amazon Redshift Cluster
Amazon Redshift - Create an Amazon Redshift Cluster
FlyData Inc.
 
The Internet of Things
The Internet of ThingsThe Internet of Things
The Internet of Things
FlyData Inc.
 
Create an Amazon Redshift Cluster with FlyData!
Create an Amazon Redshift Cluster with FlyData!Create an Amazon Redshift Cluster with FlyData!
Create an Amazon Redshift Cluster with FlyData!
FlyData Inc.
 
Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData
FlyData Inc.
 
FlyData Autoload: 事例集
FlyData Autoload: 事例集FlyData Autoload: 事例集
FlyData Autoload: 事例集
FlyData Inc.
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
FlyData Inc.
 
Amazon Redshift ベンチマーク Hadoop + Hiveと比較
Amazon Redshift ベンチマーク  Hadoop + Hiveと比較 Amazon Redshift ベンチマーク  Hadoop + Hiveと比較
Amazon Redshift ベンチマーク Hadoop + Hiveと比較
FlyData Inc.
 
Ad

Recently uploaded (20)

Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
Maxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINKMaxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINK
younisnoman75
 
Landscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature ReviewLandscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature Review
Hironori Washizaki
 
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
ssuserb14185
 
How can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptxHow can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptx
laravinson24
 
How to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud PerformanceHow to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud Performance
ThousandEyes
 
FL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full VersionFL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full Version
tahirabibi60507
 
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage DashboardsAdobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
BradBedford3
 
Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025
kashifyounis067
 
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Lionel Briand
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
EASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License CodeEASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License Code
aneelaramzan63
 
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdfMicrosoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
TechSoup
 
Adobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest VersionAdobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest Version
kashifyounis067
 
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Eric D. Schabell
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
Andre Hora
 
Expand your AI adoption with AgentExchange
Expand your AI adoption with AgentExchangeExpand your AI adoption with AgentExchange
Expand your AI adoption with AgentExchange
Fexle Services Pvt. Ltd.
 
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Orangescrum
 
Douwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License codeDouwan Crack 2025 new verson+ License code
Douwan Crack 2025 new verson+ License code
aneelaramzan63
 
Maxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINKMaxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINK
younisnoman75
 
Landscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature ReviewLandscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature Review
Hironori Washizaki
 
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
ssuserb14185
 
How can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptxHow can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptx
laravinson24
 
How to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud PerformanceHow to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud Performance
ThousandEyes
 
FL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full VersionFL Studio Producer Edition Crack 2025 Full Version
FL Studio Producer Edition Crack 2025 Full Version
tahirabibi60507
 
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage DashboardsAdobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
Adobe Marketo Engage Champion Deep Dive - SFDC CRM Synch V2 & Usage Dashboards
BradBedford3
 
Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025
kashifyounis067
 
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Lionel Briand
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
EASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License CodeEASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License Code
aneelaramzan63
 
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdfMicrosoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
TechSoup
 
Adobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest VersionAdobe Illustrator Crack FREE Download 2025 Latest Version
Adobe Illustrator Crack FREE Download 2025 Latest Version
kashifyounis067
 
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Eric D. Schabell
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
Andre Hora
 
Expand your AI adoption with AgentExchange
Expand your AI adoption with AgentExchangeExpand your AI adoption with AgentExchange
Expand your AI adoption with AgentExchange
Fexle Services Pvt. Ltd.
 
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025Why Orangescrum Is a Game Changer for Construction Companies in 2025
Why Orangescrum Is a Game Changer for Construction Companies in 2025
Orangescrum
 

What is Change Data Capture (CDC) and Why is it Important?

  • 1. www.flydata.com What Change Data Capture (CDC) is and Why It’s Important
  • 2. What is CDC? Copyright © 2015 FlyData Inc. All rights reserved. CDC captures just the change data so that replication can be done more efficiently. www.flydata.com
  • 3. www.flydata.com Why do we need CDC? • Increase in data collected • Queries take longer because database size is growing Need for BI Databases (Data Warehousing) & Need for Data Replication growing Copyright © 2015 FlyData Inc. All rights reserved.
  • 4. www.flydata.com Methods for Syncing Data ▪ Dumping the whole DB ▪ Change Data Capture (CDC) ▪ Other approaches Copyright © 2015 FlyData Inc. All rights reserved.
  • 5. www.flydata.com Benefits of CDC ▪ No need for full export/import ▪ Near real-time replication is possible => data always available for queries Copyright © 2015 FlyData Inc. All rights reserved.
  • 6. www.flydata.com Steps for CDC 1. Capture Change Data 2. Transform data to format supported on destination 3. Upload to destination 4. (Handle any errors as they occur) Copyright © 2015 FlyData Inc. All rights reserved.
  • 7. www.flydata.com 1. Methods for Change Data Capture • Transaction Logs • Triggers • Custom Solution at Application Layer Copyright © 2015 FlyData Inc. All rights reserved.
  • 8. www.flydata.com 1. Methods for Capturing Change Data Method Pros Cons Transaction Logs • Minimal performance impact • No change to tables or application necessary • Transaction logs are in binary format. • Lacking good documentation Triggers • Handle everything at the SQL level • Significant performance impact to the master DB • Need to define triggers on each table Custom Solutions • If INSERTs only, relatively easy to implement • With UPDATEs and DELETEs, keeping data integrity becomes much harder. Copyright © 2015 FlyData Inc. All rights reserved.
  • 9. www.flydata.com 2. Transforming Change Data Destination DB may have certain data type restrictions: • Identify such differences in data type • Convert the data Example of differences between MySQL and Amazon Redshift: • Case sensitive table/column names: MySQL - yes, Redshift - no • TIME column value beyond 23:59:59: MySQL - yes, Redshift - no • VARCHAR length: MySQL - character length, Redshift - byte length • Date ‘0000-00-00’: MySQL - allowed, Redshift - not allowed • BINARY, VARBINARY: MySQL - supported, Redshift - not supported • ENUM, SET: MySQL - supported, Redshift - not supported Copyright © 2015 FlyData Inc. All rights reserved.
  • 10. www.flydata.com 3. Upload to Destination • Uploading data to BI databases could be tricky. • Error Handling Crucial for CDC data integrity: • Recognize when errors occur • Buffer change data after the error (order important!) • Recognize when error is handled • Resume uploading • Do this in parallel to scale Copyright © 2015 FlyData Inc. All rights reserved.
  • 11. www.flydata.com www.flydata.com Check us out! -> http://flydata.com [email protected] Toll Free: 1-855-427-9787 http://flydata.com We are an official data integration partner of Amazon Redshift