Let us first look at what Change Data Capture (CDC) means in the context of the Oracle database management system (DBMS).
In basic terms, CDC is a software design pattern, widely used by organizations around the world, for tracking changes made to a database. Oracle CDC is Oracle's implementation of that pattern. Based on the captured changes, businesses can then take the actions needed to improve their operating efficiency.
Apart from tracking, CDC also covers identifying, capturing, and delivering the data that results from changes made to an enterprise's primary databases.
Technology Behind Oracle Change Data Capture
Oracle CDC serves several purposes, which can be summarized as follows.
- It speeds up data warehousing, because only changed data needs to be moved, which in turn improves the availability and performance of the databases involved.
- It supports real-time data integration across the business.
- It assists in carrying out different types of replication activities. These include migrating databases to the cloud and offloading queries from production databases to data warehouses or similar platforms.
- With Oracle CDC, systems do not have to be shut down at any point during a database migration.
In a nutshell, Oracle CDC extracts data from a source database and transfers it to a data warehouse. The same applies to incremental data generated after the initial migration is complete.
Capturing and preserving data in a data warehouse is one of the most important functions of Oracle Change Data Capture. Developers can implement it at different layers of a system, from application logic down to physical storage, either in a single layer or in a combination of them.
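The two phases described above, an initial load followed by incremental changes, can be sketched in a few lines of Python. This is a minimal illustration and not Oracle's implementation: the `Change` record, the operation codes "I", "U", and "D", and the use of plain dictionaries as the source and target are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

Row = Dict[str, object]


@dataclass(frozen=True)
class Change:
    """One captured change (hypothetical format, for illustration only)."""
    op: str                    # "I" = insert, "U" = update, "D" = delete
    key: int                   # primary key of the affected row
    row: Optional[Row] = None  # new row image; not needed for deletes


def initial_load(source: Dict[int, Row]) -> Dict[int, Row]:
    """Phase 1: copy every row from the source into a fresh target."""
    return {key: dict(row) for key, row in source.items()}


def apply_changes(target: Dict[int, Row], changes: Iterable[Change]) -> Dict[int, Row]:
    """Phase 2: apply only the changes captured after the initial load."""
    for change in changes:
        if change.op in ("I", "U"):
            if change.row is None:
                raise ValueError("insert/update needs a row image")
            target[change.key] = dict(change.row)
        elif change.op == "D":
            target.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op!r}")
    return target
```

After `initial_load`, the target is kept current by calling `apply_changes` with each new batch of captured changes, so the full table never has to be copied again.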
Growth of the Oracle Change Data Capture Technology
Oracle first shipped Change Data Capture as a built-in feature with version 9i of its database management system.
In this version, Oracle CDC could track and record changes to database tables. These changes were stored in dedicated change tables for use by Extract, Transform, and Load (ETL) applications. After processing and formatting, the changed data was loaded into data warehouses and other databases.
In this version, Oracle CDC worked through triggers on the source tables. However, this method did not find favor with DBAs, who found it both complex and time-consuming.
Keeping this feedback in mind, Oracle changed its CDC tool and introduced a new form of it with version 10g. It was built on Oracle Streams and worked by reading the redo logs of the source Oracle database. It could automatically detect changes in the database and transfer them to a target data store with little impact on the performance of the source database.
This version of Oracle CDC was well received, but Oracle withdrew the feature in version 12c of the DBMS, where Streams was also deprecated. Instead, users had to license Oracle GoldenGate, which has CDC built in, or seek alternative solutions.
The Current Format of Oracle CDC
Two systems must work in tandem for CDC to function. One is the system in which the data changes, and the other takes specific actions based on those changes. The first is called the source database; the second, to which the data is transferred, is the target database.
Oracle CDC also works when the source and target databases are the same, and several CDC solutions may coexist in a single system. Changes made to the source data can be picked up by Oracle Data Integrator (ODI), which supports the following two journalizing modes.
- Simple Journalizing Mode: This mode tracks changes in individual datastores, without regard to any relationships between them.
- Consistent Set Journalizing Mode: This mode tracks changes to a group of datastores while taking into account the referential integrity between them.
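The difference between the two modes can be illustrated with a short Python sketch. It is a simplified picture of the idea, not ODI's actual mechanism: the function name `consistent_order`, the `(table, key)` change format, and the `parents` mapping are hypothetical. The sketch reorders captured changes so that parent tables are processed before the child tables that reference them, which is what keeps a consistent set valid on the target.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set, Tuple

Change = Tuple[str, int]  # (table name, primary key): hypothetical format


def consistent_order(changes: List[Change],
                     parents: Dict[str, Set[str]]) -> List[Change]:
    """Order changes so that parent tables come before their children.

    `parents` maps each child table to the tables it references
    through foreign keys.
    """
    tables = {table for table, _ in changes}
    # Keep only dependencies between tables that actually have changes.
    graph = {t: parents.get(t, set()) & tables for t in tables}
    order = list(TopologicalSorter(graph).static_order())
    rank = {table: position for position, table in enumerate(order)}
    # sorted() is stable, so arrival order within one table is preserved.
    return sorted(changes, key=lambda change: rank[change[0]])
```

In simple mode each table's changes would be consumed independently, so an order row could reach the target before the customer it refers to. Ordering by dependency, as above, avoids that.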
Forms of Oracle Change Data Capture
Organizations looking for Oracle CDC solutions have two options depending on the requirements.
- Synchronous Change Data Capture: This form of CDC uses triggers on the source tables. The triggers fire whenever a change is made and insert a record of it into a change table as part of the same transaction.
For this mode to function, a user must act as the change data publisher, with access to the source database tables whose changes are to be tracked and captured. The publisher sets up the change sets and change tables in which the changes are recorded. A script then copies the data, builds the records, and transfers the data to the target database.
A drawback of Synchronous Change Data Capture is that the triggers add overhead to every transaction on the source tables and so slow down the database.
- Asynchronous Change Data Capture: In this form of Oracle CDC, changes are read from the redo log files. A change can be captured only after the SQL statement performing the DML operation has been committed. Transactions are not affected by CDC, because capturing the changed data is not part of the transaction that changed the source table. Asynchronous Change Data Capture is built on Oracle Streams and has a relational interface.
There are three modes in this form of Oracle CDC, namely, HotLog, Distributed HotLog, and AutoLog.
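To make the trigger-based (synchronous) form more concrete, here is a minimal sketch in Python. It uses the standard library's `sqlite3` module with an in-memory SQLite database rather than Oracle, so the SQL dialect differs from what a DBA would write in practice. The table `products`, the change table `products_ct`, the trigger names, and the operation codes are all hypothetical.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create a source table, a change table, and the capturing triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);

        -- Change table: one row per captured change, in capture order.
        CREATE TABLE products_ct (
            seq   INTEGER PRIMARY KEY AUTOINCREMENT,
            op    TEXT NOT NULL,      -- 'I', 'U' or 'D'
            id    INTEGER NOT NULL,
            price REAL
        );

        CREATE TRIGGER products_ai AFTER INSERT ON products
        BEGIN
            INSERT INTO products_ct (op, id, price)
            VALUES ('I', NEW.id, NEW.price);
        END;

        CREATE TRIGGER products_au AFTER UPDATE ON products
        BEGIN
            INSERT INTO products_ct (op, id, price)
            VALUES ('U', NEW.id, NEW.price);
        END;

        CREATE TRIGGER products_ad AFTER DELETE ON products
        BEGIN
            INSERT INTO products_ct (op, id, price)
            VALUES ('D', OLD.id, OLD.price);
        END;
    """)
    return conn


def captured_changes(conn: sqlite3.Connection) -> List[Tuple[str, int, float]]:
    """Return the captured changes in the order they occurred."""
    rows = conn.execute(
        "SELECT op, id, price FROM products_ct ORDER BY seq"
    )
    return rows.fetchall()
```

Because each trigger runs inside the transaction that modifies `products`, every insert, update, or delete pays for an extra write to `products_ct`. That is the overhead attributed to the synchronous form. An asynchronous design would instead read committed changes from a log afterwards, outside the original transaction.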
Even though users must now pay for Oracle CDC, the many features that it brings to the table make it a good investment. It has, over the years, optimized database administration as well as migration and replication activities in organizations.