Cross-Technology Data(base) Replication with Dataddo: Flexible, Reliable, Cost-Effective

By Tom Sedlacek | 7 min read

If you’re reading this, it’s because you’ve realized that replicating data across databases is a pain to do with an in-house solution, and you want to find an external tool that can automate it. Maybe you need to utilize operational data for analytics? Maybe you need to migrate or back up data across systems? Whatever the reason, this article will help you understand what you can achieve by doing it with Dataddo and how.

Multiple Methods & Database Technologies, One Fixed Price

Most organizations replicating their data across databases use a combination of two general methods: full replication and change data capture (CDC) replication. Dataddo supports both, including various types of CDC, which we will discuss in more detail below. We also support replications across any two technologies (like MySQL to Google BigQuery). 

So, when replicating data with us, you are not confined to any single method, nor are you confined to any one database technology (although you can be, if you want to).

“That sounds nice,” you’re thinking to yourself, “but what about pricing?”

Price depends on how many tables you want to replicate, which only you know at this moment. But rest assured that once you’ve chosen a pricing plan, you’ll always pay a predictable amount in accordance with that plan. In an industry characterized by convoluted enterprise pricing packages, you’ll find this refreshingly transparent, especially given that Dataddo is a proper enterprise data integration tool. If you need to upgrade or downgrade, that’s easy, too.

 

Your Concerns, Our Livelihood

If any of the following sounds familiar, you’re in the right place.

  • I need a technology-agnostic replication tool that will connect the databases I use now and in the future.

Dataddo can connect any two databases, regardless of the technologies they are based on. This covers databases that use the same technology (like PostgreSQL on AWS RDS and PostgreSQL on Google Cloud SQL) as well as databases that use different technologies.

If you need a database connector we don’t have, we’ll build it for free in about 10 business days.

  • I need a system that can detect and resolve replication errors quickly, and I need to keep downtime to a minimum.

Dataddo’s platform and connectors are fully managed, and client pipelines are proactively monitored by a team of integration specialists, who ensure that downtime is always kept to a minimum. Additionally, the platform has an embedded monitoring system, an anomaly detector, and a growing box of no-code tools for easy troubleshooting.

  • I want to save money by replicating only newly added or updated data, including deleted rows.

We specialize in both log-based and query-based CDC, allowing you to granularly define which changes to capture, including deleted rows. No need to waste money on unnecessary full replicas.

  • I don’t want to waste a lot of time deploying a replication tool.

For most replication cases, Dataddo can be deployed in minutes thanks to easy initial load configuration via the platform’s interface.

 

More on Replication Methods: How Dataddo Does It

No single replication method is perfect for everything, which is why Dataddo supports a combination of any of the following methods, to any extent, for any two databases (regardless of the technology they are based on).

 

Full Replication

Full replication is the simplest and most reliable method for replicating data. Most often, it is used to replicate small tables. 

Replication of large tables puts a heavy load on the production database, and may be cost-inefficient because large tables tend to contain rows that don’t change from day to day. Nevertheless, the full replication method is sometimes necessary for organizations that need to migrate or back up large tables, or preserve full snapshots of their data over time.

In general, change data capture (CDC) is the better method for replicating data in large tables.
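To make the trade-off concrete, here is a minimal sketch of what a full replication does. It is not Dataddo's implementation: it uses Python's built-in sqlite3 module, two in-memory databases stand in for the source and destination, and the `orders` table and the `full_replicate` function are hypothetical names chosen for illustration.

```python
import sqlite3


def full_replicate(source, dest, table, columns):
    """Copy every row of `table` from source to dest, replacing the old copy."""
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    # Read the whole table: this is the load that grows with table size.
    rows = source.execute(f"SELECT {col_list} FROM {table}").fetchall()
    with dest:  # single transaction, so the replica is never half-written
        dest.execute(f"DELETE FROM {table}")
        dest.executemany(
            f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", rows
        )
    return len(rows)


# Hypothetical source and destination, both in memory for the demo.
source = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
for db in (source, dest):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "new"), (2, "paid"), (3, "shipped")],
)
source.commit()

copied = full_replicate(source, dest, "orders", ["id", "status"])
print(copied)  # 3
```

Every run reads and rewrites all rows, whether or not they changed, which is why this approach suits small tables and periodic snapshots better than large, mostly static tables.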

 

Change Data Capture (CDC) Data Replication

CDC data replication is the copying of only changed data (newly added, modified, or deleted rows) from one database to another. Imagine you have a table with around 1,000 rows, but only 5 of them are added or updated each day. In this case, a daily full replica would copy about 995 unchanged rows, a near-complete waste of resources. Enter CDC.

There are several use cases for CDC, the most common of which are:

  • Data warehousing for analytics. CDC helps deliver analytics data to business teams in near real time, because it quickly sends changes in operational data to the warehouse as they are made.
  • Database migration or recovery. CDC helps keep all data available in case of downtime in one database. 

Dataddo supports two main CDC techniques: query-based and log-based. When used together, these can essentially cover all CDC needs.

Query-Based CDC

Query-based CDC is best for capturing newly updated or added rows. It is a highly flexible technique that allows you to get very granular in your customization of the replication process.

This is easy to automate with Dataddo, because queries can be configured and maintained in the Dataddo UI, without any coding. By the same token, troubleshooting is easy.

Query-based CDC is often simpler to set up, but it can be less efficient for databases with high transaction rates, because the frequent queries add load to the source. It also cannot track deleted rows, since a deleted row no longer appears in any query result. This is where log-based CDC comes in.
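One common way to do query-based CDC is to keep a "high-water mark" on a last-modified column and only fetch rows above it. The sketch below assumes this approach; it is not taken from Dataddo's product. The `orders` table, its `updated_at` column (a plain integer here), and the `sync_changes` function are all hypothetical, and in-memory SQLite databases stand in for the real source and destination.

```python
import sqlite3


def sync_changes(source, dest, last_seen):
    """Copy rows modified after `last_seen`; return the new high-water mark."""
    rows = source.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    with dest:
        # INSERT OR REPLACE adds new rows and overwrites existing ones by id.
        dest.executemany(
            "INSERT OR REPLACE INTO orders (id, status, updated_at) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return rows[-1][2] if rows else last_seen


source = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
for db in (source, dest):
    db.execute(
        "CREATE TABLE orders "
        "(id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)"
    )
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(1, "new", 100), (2, "paid", 101)]
)
source.commit()

mark = sync_changes(source, dest, last_seen=0)  # initial load: both rows

# One row is updated and one is deleted in the source.
source.execute("UPDATE orders SET status = 'shipped', updated_at = 102 WHERE id = 1")
source.execute("DELETE FROM orders WHERE id = 2")
source.commit()

mark = sync_changes(source, dest, mark)  # only the updated row is fetched
replica = dest.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
print(replica)  # [(1, 'shipped'), (2, 'paid')]
```

Note that row 2 is still present in the replica after the second sync: the query never sees the deleted row, so the deletion is not propagated.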

Log-Based CDC

Most database technologies maintain transaction logs that record all changes to the database (for example, the binary log in MySQL or the write-ahead log in PostgreSQL). Log-based CDC copies the changes, including inserts, updates, and even deletes, from these logs rather than from the tables themselves. With log-based CDC, you can also achieve lower latency than with query-based CDC.

Like query-based CDC, log-based CDC is easy to automate in Dataddo, because all you need to do is choose a table, select the columns you want to replicate, and set a log start time.
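Conceptually, log-based CDC replays an ordered stream of change events against the replica. The sketch below illustrates that idea only. Real database logs are binary and engine-specific, so the `ChangeEvent` structure, the in-memory `log` list, the `apply_log` function, and the use of an integer position in place of a log start time are all simplifying assumptions, not Dataddo's or any database's actual format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeEvent:
    position: int               # offset of the event in the (simulated) log
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row values; None for deletes


def apply_log(replica, log, start_position, columns):
    """Replay events from start_position onward; return the next position."""
    next_position = start_position
    for event in log:
        if event.position < start_position:
            continue  # already applied in an earlier run
        if event.op == "delete":
            replica.pop(event.key, None)
        else:
            # Inserts and updates both carry the full new row here;
            # only the selected columns are kept in the replica.
            replica[event.key] = {c: event.row[c] for c in columns}
        next_position = event.position + 1
    return next_position


log = [
    ChangeEvent(0, "insert", 1, {"status": "new", "note": "a"}),
    ChangeEvent(1, "insert", 2, {"status": "paid", "note": "b"}),
    ChangeEvent(2, "update", 1, {"status": "shipped", "note": "a"}),
    ChangeEvent(3, "delete", 2),
]

replica = {}
next_pos = apply_log(replica, log, start_position=0, columns=["status"])
print(replica)   # {1: {'status': 'shipped'}}
print(next_pos)  # 4
```

Unlike the query-based approach, the delete of row 2 reaches the replica, because the log records it as an event in its own right.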

These three data replication methods (full replication, query-based CDC, and log-based CDC) cover the vast majority of database replication needs: small tables, large tables, full replicas, newly changed data, periodic batch processing, near real-time processing, cross-technology replications, same-technology replications, and more.

 

Case Studies: The Proof in the Pudding

Interested in how other organizations are using Dataddo to replicate data?

Read our case study on food and beverage company FoodChéri, which reduced its monthly infrastructure bill by 25-50% via CDC-based replication.

Or check out our case study on healthcare organization Sensire, which used Dataddo’s replication functionality to accelerate migration of a proprietary, on-premise data infrastructure to the cloud.

Or click the button below to schedule a 15-minute call.

 

Connect All Your Data with Dataddo

Move your data from any online service to any data warehouse, between any two warehouses, and from any warehouse into any operational application.

Contact Sales


Category: Product, Tools, tips-tricks, data-strategy
