Data Replication


Table of Contents

Overview

Supported Configurations

Data Synchronization

Full Resync

Smart Sync

Optimized Sync

Two-Way Sync

Replication Prediction

Replication Logs

Throttling

Orphan Files

Things to Consider

Data Replication Monitor

Out Of Band Sync

Replicate the Destination Data Back to the Source Computer

Important Considerations

General

Windows

Unix

Registry Keys for Data Replication

Overview

Data Replication is the process of copying specified file-level content from one computer (the source computer) to another (the destination computer). An initial transfer copies the specified data, after which the replicated copy is kept updated in real time with any changes made to the data on the source computer. The replicated copy on the destination computer provides ongoing, near-real-time disaster recovery protection for the source computer, unlike most data protection solutions, which require significant time to perform a complete data protection operation. In addition, data replication provides a basis for further data protection activities, such as Recovery Points (snapshots) and backups of Recovery Points.

The content for replication can be defined at the directory or volume level on a source computer and replicated to a destination computer. Once the initial transfer is complete, a driver on the source computer performs the following:

A persistent connection serves as the data transfer mechanism, optionally compressing and encrypting data across the network; through it, the destination computer is kept in sync with the defined content on the source computer. If the connection is interrupted at any point, the log continues to be maintained on the source computer, and once the connection is restored, ContinuousDataReplicator (CDR) automatically re-syncs with the destination computer, bringing the replica up to date. Note that re-syncing is time- and disk-space-intensive and should be avoided where possible. For additional discussion of this subject, see Interruptions and Restarts. If multiple Replication Pairs are active, CDR uses multiple threads to perform these operations on all Replication Pairs in parallel. CDR operations on a T1 link are fully certified. The success of CDR operations on a slower link is not guaranteed.

Supported Configurations

Some of the scenarios for data replication are listed below, but this is not a complete list of all the possible data replication configurations.

Data Synchronization

The following options can be used to perform data transfer from source to destination:

Full Resync

Full Resync should be necessary only when no data is present on the destination. Full Resync copies all files from the source computer to the destination computer. When you start data replication at the Replication Set or Replication Pair level, you can specify Full Resync, which causes the Replication Pair to begin at the Baseline Scan phase.

Smart Sync

Smart Sync (Smart Re-Sync) is the default behavior of CDR when activities are interrupted and cannot be seamlessly restarted from the same point. In this case, all new or modified data is transferred from the source to the destination.

Optimized Sync

If replication is interrupted and the data on the destination may have been manually deleted or modified in part, the destination path is considered inconsistent. In that case, Optimized Sync is recommended to rebuild the destination from the current data in the source path, taking into account the data already present on the destination.

Optimized Sync transfers new and modified files from the source computer to the destination computer, along with any data missing on the destination. If previous sync attempts had failures, the failed operations are retried when Optimized Sync runs.

Optimized Sync can be used in the following scenarios:
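The selection described above (new, modified, or missing data gets transferred; data already present and unchanged does not) can be illustrated with a minimal sketch. This is not CDR's implementation: `digest_tree` and `files_to_transfer` are hypothetical helpers, and comparing files by SHA-256 content digest is an assumption made purely for illustration.

```python
import hashlib
import os


def digest_tree(root):
    """Map each relative file path under root to the SHA-256 digest of its content."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                content = handle.read()
            digests[os.path.relpath(full, root)] = hashlib.sha256(content).hexdigest()
    return digests


def files_to_transfer(source_root, dest_root):
    """Return source files that are missing on the destination or differ from it.

    Hypothetical selection rule: a file is skipped only when the destination
    already holds a copy with identical content.
    """
    src = digest_tree(source_root)
    dst = digest_tree(dest_root)
    return sorted(path for path, digest in src.items() if dst.get(path) != digest)
```

Files that exist only on the destination are deliberately ignored here; they fall under orphan file handling rather than sync selection.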

See Start Data Replication Activity for step-by-step instructions.

To perform an Optimized Sync, see Add a Replication Pair for step-by-step instructions.

To change the state of one or more Replication Pairs at once from the Replication Set level, see Change the state of Replication Pair for step-by-step instructions.

Two-Way Sync

You can use CDR to perform a two-way sync of data between the source and destination computers. Two-way sync ensures that the data on the source and destination client computers is the same. The advantage of this feature is that changes made to the data on the destination computer are replicated to the source computer. See Two-Way Sync for more information.

Replication Prediction

Replication Prediction tracks the size of the data that is added or modified while a pair is active and monitoring. For Windows file systems, monitoring is performed at the volume or folder level; for UNIX, monitoring is performed at the file system level. This information is used to estimate the data throughput required per hour, per day, and so on, and thus whether the bandwidth of the current connection will be sufficient for the predicted data replication activity. For instance, to see how much data will be replicated for an Exchange Server during each workday or for the whole week, you can start monitoring all folders used by the Exchange Server (stores, logs, etc.). After 24 hours or a week, you can check the size of the modified data and use that information to estimate bandwidth requirements.
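As a worked example of the estimate described above, the following sketch converts a measured volume of changed data into an average required throughput and compares it with a link rate. The function names and the `headroom` parameter are hypothetical and not part of the product; the T1 figure is the nominal 1.544 Mbit/s line rate. An average over the window ignores peaks, so treat the result as a lower bound.

```python
T1_MBPS = 1.544  # nominal T1 line rate in megabits per second


def required_mbps(bytes_changed, window_seconds):
    """Average throughput (Mbit/s) needed to replicate bytes_changed within the window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return bytes_changed * 8 / window_seconds / 1_000_000


def link_is_sufficient(bytes_changed, window_seconds, link_mbps, headroom=0.5):
    """True if the average required rate fits within the usable share of the link.

    headroom is an assumed fraction of the link available for replication.
    """
    return required_mbps(bytes_changed, window_seconds) <= link_mbps * headroom


# 54 GB modified over 24 hours needs an average of 5.0 Mbit/s,
# which exceeds what a T1 link can carry.
one_day = 24 * 60 * 60
print(required_mbps(54 * 10**9, one_day))
print(link_is_sufficient(54 * 10**9, one_day, T1_MBPS))
```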

Replication Prediction reports the following for each monitored folder, volume, or file system:

For step-by-step instructions, see Perform Replication Prediction.

Replication Logs

CDR maintains logs on the source computer, recording all file write activity (new files and changes to existing files) involving the directories and volumes specified in the source paths of all Replication Pairs on that computer. These replication logs are transferred to the destination computer and replayed, ensuring that the destination remains a real-time replica of the source. For more information, see Replication Logs.
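The log-and-replay idea can be sketched in a few lines. This is a conceptual illustration only, assuming an in-memory file table and a hypothetical `WriteRecord` type; it does not reflect the actual format of CDR replication logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRecord:
    """One logged write: which file, at what byte offset, with what data."""
    path: str
    offset: int
    data: bytes


def apply_write(files, record):
    """Apply one write to an in-memory file table, zero-filling any gap before the offset."""
    content = files.setdefault(record.path, bytearray())
    end = record.offset + len(record.data)
    if len(content) < end:
        content.extend(b"\x00" * (end - len(content)))
    content[record.offset:end] = record.data


def replay(log, files):
    """Replay logged writes in their original order against the destination file table."""
    for record in log:
        apply_write(files, record)
    return files


# The source applies each write locally and appends it to the log;
# the destination reaches the same state by replaying that log.
source, log = {}, []
for rec in (WriteRecord("a.txt", 0, b"hello"), WriteRecord("a.txt", 1, b"E")):
    apply_write(source, rec)
    log.append(rec)
destination = replay(log, {})
```

Replaying in order matters: later writes may overwrite earlier ones, so the destination matches the source only if the sequence is preserved.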

Throttling

Throttling enables you to monitor and control the data replication activities. It also allows you to configure the rate of data transfer over the network, based on the throttling parameters. The various throttling options (including throttling amount and rules) can be configured. For more information, see Throttling.

Orphan Files

Files that are in the destination directory, but not the source directory, are orphan files. You can choose to ignore, log, or delete such files that are identified in the destination path; these settings are configured in the Orphan Files tab of the Replication Set Properties.
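The definition above amounts to a set difference between the destination and source file listings. The sketch below illustrates that definition and the three handling choices (ignore, log, delete); the function names and the string values for `policy` are hypothetical and do not correspond to the settings in the Orphan Files tab.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def find_orphans(source_root, dest_root):
    """Files present under the destination but absent under the source."""
    return sorted(relative_files(dest_root) - relative_files(source_root))


def process_orphans(source_root, dest_root, policy="ignore"):
    """Handle orphans according to policy and return the paths acted on."""
    if policy not in ("ignore", "log", "delete"):
        raise ValueError(f"unknown policy: {policy}")
    if policy == "ignore":
        return []
    orphans = find_orphans(source_root, dest_root)
    for rel in orphans:
        if policy == "log":
            print(f"orphan file: {rel}")
        else:
            os.remove(os.path.join(dest_root, rel))
    return orphans
```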

To configure Orphan File settings, see Configure Orphan File Processing for step-by-step instructions.

To view Orphan Files, see View Orphan Files for step-by-step instructions.

Things to Consider

Data Replication Monitor

Replication is a continuous activity, and details of ongoing data replication activity are shown in the Data Replication Monitor in the CommCell Console. The process of starting data replication with CDR involves several job phases, as follows:

For more detailed information about Job Phases and Job States, see Monitoring Data Replication.

All other job-based activity, such as Recovery Point creation, is reflected in the Job Controller. See Controlling Jobs in Job Management for comprehensive information.

Out Of Band Sync

In cases where large amounts of data must be transferred from the Source computer to the Destination computer during Baselining, but the connection between the source and the destination is constrained, such as a slow WAN connection, you may not want to begin replication using the Baselining Phases. You may prefer, for instance, to back up the source and restore it to the destination to effect the initial transfer of data.

To perform the initial transfer of data without using baseline, see Out Of Band Sync from the Replication Set for step-by-step instructions. After the transfer of data, start the Replication Pair with Start, so that only the data that is new or modified since the backup will need to be replicated.

Replicate the Destination Data Back to the Source Computer (Windows Only)

It is recommended that you keep the following in mind when replicating data back to the source computer:

To Replicate the Destination Data Back to the Source Computer, see Replicate the Destination Data Back to the Source Computer for step-by-step instructions.

Important Considerations

It is recommended that you keep the following in mind when performing data replication:

General

Windows

Unix

Cross Platform Replication

Data replication across Unix platforms is now supported. For example, you can replicate data from an AIX source computer to a Solaris destination computer, or from Solaris to Linux, etc. However, ACLs and Extended Attributes will be lost.

Registry Keys for Data Replication

Use the following registry keys to modify the default behavior of Data Replication:

| Topic | Registry Key(s) | Description |
| --- | --- | --- |
| Change Journal | dwCJSizeAsPercentOfVolumeSize | ContinuousDataReplicator on Windows and the Windows File System iDataAgent use Change Journal to track updates made to Windows File Systems. On very large or very busy file systems, it may be necessary to increase the size of the change journal in cases where the agent or enabler is performing full scans too frequently. You can control the amount of volume space that is allocated for Change Journal when it is created by using the dwCJSizeAsPercentOfVolumeSize registry key value. |
| Pipeline Buffer Size | PipelineBufferSizeInKiloBytes | The pipeline buffer size can be increased to improve the speed at which data is replicated. It can be reconfigured from the default size of 64KB up to a maximum of 256KB (in increments of 32KB) using the PipelineBufferSizeInKiloBytes registry key. See PipelineBufferSizeInKiloBytes in Registry Keys for more information. |
| Connection Attempt | MaxConnectionAttempts | When communication is interrupted between the source and destination computers, the source computer makes 30 attempts to reconnect to the pipeline (this default number can be changed using the MaxConnectionAttempts registry key), after which the Replication Pair(s) show a state of Failed. Each connection attempt takes several minutes; this interval is neither programmatic nor configurable. |
| Access Control Files | nDoNotReplicateACLs | For Windows, the nDoNotReplicateACLs registry key can be used to disable the replication of the security stream of files. This stream includes user and group access control list (ACL) settings for file access. If this registry key is not present, ACLs are replicated. |
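The PipelineBufferSizeInKiloBytes constraint (default 64KB, maximum 256KB, increments of 32KB) can be checked before a value is applied. The sketch below is a hypothetical helper that only validates a number; it does not read or write the registry, and it assumes the 32KB increments are counted from the 64KB default.

```python
DEFAULT_KB = 64   # default pipeline buffer size stated in the table
MAX_KB = 256      # maximum pipeline buffer size stated in the table
STEP_KB = 32      # increment stated in the table


def is_valid_pipeline_buffer_kb(value):
    """True if value is an integer from 64 to 256 that is reachable in 32 KB steps from 64."""
    return (
        isinstance(value, int)
        and DEFAULT_KB <= value <= MAX_KB
        and (value - DEFAULT_KB) % STEP_KB == 0
    )


# Enumerate every size the stated rule allows.
valid_sizes = [kb for kb in range(DEFAULT_KB, MAX_KB + 1, STEP_KB)]
print(valid_sizes)  # [64, 96, 128, 160, 192, 224, 256]
```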
