Data Replication



Overview

Replication Scenarios

Replicating Data with ContinuousDataReplicator

Replication Logs

Interruptions and Restarts

Throttling

Replication Prediction

Orphan Files

How to use CDR to Replicate Data

Fan-Out Considerations

Best Practices


Overview

Data Replication is the process of copying specified, file-level content from one computer, the source computer, to another, the destination computer. This is achieved through an initial transfer of the specified data, after which the replicated copy is kept updated in nearly real time with any changes that are made to the data on the source computer. This replicated copy on the destination computer provides on-going, nearly-real-time disaster recovery protection for the source computer, unlike most data protection solutions which require significant time to perform a complete data protection operation. In addition, data replication provides a basis for additional data protection activities, such as Recovery Points (snapshots) and backups of Recovery Points, which are discussed in more detail below.


Replication Scenarios

Several common scenarios for data replication are illustrated below, but these by no means illustrate all of the ways in which data replication can be configured.

Replication from one Source computer to one Destination computer:

This is the most fundamental configuration for data replication. A single computer on the LAN or WAN has its data replicated to another computer, either local or remote. This provides protection of the source computer against catastrophic failure of the computer itself.

Fan-In Configurations: Replication from multiple Source computers to a single Destination computer:

In a Fan-In configuration, multiple computers on the LAN or WAN have their data replicated to a single computer, either local or remote. This provides protection of all of the source computers against catastrophic failure, while maximizing the use of resources by directing all the data to a single destination computer.

On Windows, most replication and Recovery Point options can be configured from the Fan-In tab of the Agent Properties on the destination computer, and these settings are automatically applied to all the source computers. On UNIX, replication and Recovery Point options must be configured on each source computer.

Scalability

Although the scalability of a Fan-In setup can vary based on network and system resources, it is recommended that each Fan-In setup contains no more than 100 source clients.

For maximum performance and robustness, the total number of Replication Pairs configured for the same source volume should be kept to a minimum. If multiple Replication Pairs must be configured for the same source volume, the recommended upper limit is five.
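The recommended limits above can be checked mechanically while a Fan-In setup is being planned. The following Python sketch is illustrative only: the `FanInPlan` structure and its field names are hypothetical and are not part of any CDR interface. Only the two limits (100 source clients per Fan-In setup, five Replication Pairs per source volume) come from the text.

```python
from collections import Counter
from dataclasses import dataclass, field

# Recommended limits quoted in the Scalability text above.
MAX_SOURCE_CLIENTS = 100
MAX_PAIRS_PER_SOURCE_VOLUME = 5


@dataclass
class FanInPlan:
    """Hypothetical description of a planned Fan-In setup (not a CDR API)."""
    destination: str
    # Each pair is a tuple: (source_client, source_volume, source_path).
    pairs: list = field(default_factory=list)


def check_fan_in_plan(plan: FanInPlan) -> list:
    """Return a warning string for each recommended limit the plan exceeds."""
    warnings = []

    # Count distinct source clients feeding the single destination.
    clients = {client for client, _volume, _path in plan.pairs}
    if len(clients) > MAX_SOURCE_CLIENTS:
        warnings.append(
            f"{len(clients)} source clients exceeds the recommended "
            f"maximum of {MAX_SOURCE_CLIENTS}"
        )

    # Count Replication Pairs configured for each source volume.
    per_volume = Counter((client, volume) for client, volume, _path in plan.pairs)
    for (client, volume), count in sorted(per_volume.items()):
        if count > MAX_PAIRS_PER_SOURCE_VOLUME:
            warnings.append(
                f"{client} {volume}: {count} Replication Pairs exceeds the "
                f"recommended maximum of {MAX_PAIRS_PER_SOURCE_VOLUME}"
            )
    return warnings
```

An empty list means the plan stays within both recommendations; because actual scalability varies with network and system resources, a clean result is not a guarantee of performance.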

For more information, see Using ContinuousDataReplicator in a Fan-In Configuration.

Recovery Points for Fan-In Configurations

Recovery Points created for a Fan-In configuration use VSS or ONTAP as the snap engine for creating snapshots. The snap engine used depends on the destination: VSS is used when the destination is a fixed volume, and ONTAP is used when the destination is a filer.

Consider the following for ONTAP snapshots:

 

Fan-Out Configurations: Replication from one Source computer to multiple Destination computers:

This configuration for data replication, referred to as "Fan-Out", adds significantly to the protection afforded to the source computer, because of the redundancy. A single computer on the LAN or WAN has its data replicated to multiple computers, any of which can be either local or remote. This provides protection against catastrophic failure of an entire site, as well as the source computer itself.

For more information, see Fan-Out Considerations below.



Replicating Data with ContinuousDataReplicator

ContinuousDataReplicator (CDR) can be used in a wide variety of environments to replicate data within a CommCell. It integrates fully with other Agents, all of which are controlled through the CommCell Console. For more information about CDR, see Overview - ContinuousDataReplicator.

Using CDR, content for replication can be defined at the directory or volume level on a source computer and replicated to a destination computer. Once the initial transfer is complete, a driver on the source computer performs the following:

Data is transferred over a persistent connection, optionally compressed and encrypted across the network, which keeps the destination computer in sync with the defined content on the source computer. If the connection is interrupted at any point, the log continues to be maintained on the source computer; once the connection is restored, CDR automatically re-syncs with the destination computer, bringing the replica up to date. Re-syncing is time- and disk-space-intensive and should be avoided where possible. For additional discussion, see Interruptions and Restarts. If multiple Replication Pairs are active, CDR uses multiple threads to perform these operations on all Replication Pairs in parallel. CDR operations on a T1 link are fully certified; success on a slower link is not guaranteed.
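The behavior described above, in which changes keep being logged on the source while the connection is down and are transferred once it returns, can be pictured with a small model. This is a conceptual Python sketch and not CDR's implementation: the `SourceLog` class, its methods, and the dictionary standing in for the destination are all assumptions made for illustration.

```python
from collections import deque


class SourceLog:
    """Toy model of a source-side change log (illustrative, not CDR code)."""

    def __init__(self):
        self.pending = deque()   # logged changes not yet transferred
        self.connected = True    # state of the link to the destination

    def record_write(self, path, data, destination):
        """Log a write on the source, then transfer it if the link is up."""
        self.pending.append((path, data))
        if self.connected:
            self.flush(destination)

    def flush(self, destination):
        """Transfer logged changes in order and replay them on the destination."""
        while self.pending:
            path, data = self.pending.popleft()
            destination[path] = data   # "replay" the change on the replica

    def reconnect(self, destination):
        """Restore the link and bring the replica up to date."""
        self.connected = True
        self.flush(destination)
```

In this model, writes made while `connected` is `False` simply accumulate in `pending` and are replayed in their original order by `reconnect`. A real re-sync involves considerably more work, which is why the text recommends avoiding it where possible.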

The process of starting data replication with CDR involves several job phases, as follows:

For more detailed information about Job Phases and Job States, see Monitoring Data Replication.



Replication Logs

CDR maintains logs on the source computer, logging all file write activity (new files and changes to existing files) involving the directories and volumes specified in the source paths of all the Replication Pair(s) on that computer. These replication logs are transferred to the destination computer and replayed, ensuring that the destination remains a nearly real-time replica of the source. Note the following differences in behavior for CDR on Windows and CDR on UNIX:

Destination Computer Considerations

For CDR on Windows, Replication Logs are replayed serially on the destination computer, not in parallel. Thus, if you have many Replication Pairs all configured to use the same destination computer, that computer must be able to receive and replay the Replication Logs at the same rate at which they arrive. Ensure that the destination computer is suitable in the following areas; otherwise a backlog of Replication Log files will cause the allocated Log space to diminish to the point that throttling of the source computer(s) will result:

For CDR on UNIX, Replication Logs for different Replication Sets are replayed in parallel, as multiple replay threads are utilized on the destination computer. Ensure that the destination computer is suitable in the following areas; otherwise a backlog of Replication Log files greater than memory capacity will cause the Replication Pair to be aborted:

Location of Replication Logs

Log Space Requirements

Sufficient log file space is required on the source computer and, for CDR on Windows, on the destination computer as well. If a source computer runs out of log space (Windows) or attempts to create new entries in a log file before the old entries have been transferred (UNIX), logging will stop and all logs will be deleted. To avoid an interruption and restart, it is therefore important to allocate sufficient space for logs. For minimum log space requirements, see System Requirements - ContinuousDataReplicator. Treat these minimums as a starting point; allow more space than recommended if it is available.
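As a rough planning aid, the log space that accumulates on the source can be estimated from the rate of change on the source and the rate at which logs can be transferred. The Python sketch below is back-of-the-envelope arithmetic under stated assumptions (a steady write rate, a steady transfer rate, and a hypothetical safety factor). It is not the product's sizing formula, and the documented minimums in System Requirements still apply.

```python
def estimate_log_space_mb(write_rate_mb_per_s: float,
                          transfer_rate_mb_per_s: float,
                          window_s: float,
                          safety_factor: float = 2.0) -> float:
    """Estimate the log space (MB) that accumulates on the source over a window.

    Logs grow at the difference between the rate at which changes are
    written and the rate at which logs are transferred. A transfer rate
    of 0 models a complete outage. safety_factor is an assumed planning
    margin, not a documented value.
    """
    if min(write_rate_mb_per_s, transfer_rate_mb_per_s, window_s) < 0:
        raise ValueError("rates and window must be non-negative")
    # No backlog builds up when transfer keeps pace with writes.
    backlog_rate = max(write_rate_mb_per_s - transfer_rate_mb_per_s, 0.0)
    return backlog_rate * window_s * safety_factor
```

For example, a source writing 2 MB/s during a one-hour outage accumulates 7,200 MB of logs, so `estimate_log_space_mb(2.0, 0.0, 3600)` returns 14400.0 with the default margin of 2.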

Consider the following when allocating space for logs:

The location of log file space is specified when you Install ContinuousDataReplicator, and can be changed in the CommCell Console. To configure the Replication log file location, see Specify CDR Log File Location on Source and Destination Computers for step-by-step instructions.

Deletion of Log Files



Interruptions and Restarts

By default, CDR will always try to handle interruptions by seamlessly restarting replication, or if that is not possible, restarting with Smart Re-Sync; however, some interruptions are of such a nature or duration that a Full Re-Sync will be required.

Smart Re-Sync

Smart Re-Sync is the default behavior of CDR when activities are interrupted and cannot be seamlessly restarted at the same point again. In general, CDR endeavors to do the following in such cases, wherever possible:

For examples of common types of interruptions, and how Smart Re-Sync handles the recovery, refer to System Behavior when Replication is Interrupted.

For a detailed listing of each phase, and the specifics of the exact point at which Smart Re-Sync restarts activities, refer to Job Phases.

Full Re-Sync

Full Re-Sync should be necessary only in cases such as the following:

In such a case, all existing content in the destination path is considered inconsistent, and Full Re-Sync is recommended to rebuild it from the current data in the specified source path. When you start replication from the Replication Set or Replication Pair level, you can specify Full Re-Sync, causing the Replication Pair to begin at the Baseline Scan phase.

Changes that Interrupt Data Replication

Changes to the following configuration items will not be effective until data replication activity has been interrupted and restarted:

The following will require data replication to be interrupted and restarted:

System Behavior when Replication is Interrupted

There are several ways in which data replication activity can be interrupted, and CDR recovers from each of them in a similar manner. The listing below gives common causes of interruption, their effect on Baselining, SmartSync, and data replication, and how CDR recovers from them. For specific details about how restarts are handled in each particular phase, refer to the Comments section of the table in the Job Phases section.

Interruption: Abort a Replication Pair during Baselining phases
Effect and Smart Re-Sync: Baselining activities stop on the source.

When the Replication Pair is restarted, Baselining activities will resume, restarting at the beginning of the phase if necessary, then SmartSync and data replication activities will begin automatically.

Interruption: Abort a Replication Pair during SmartSync phases
Effect and Smart Re-Sync: Logging stops on the source.

When the Replication Pair is restarted, SmartSync activities will resume, restarting at the beginning of a phase if necessary, and data replication activities will begin automatically.

Interruption: Abort a Replication Pair during Replication phase
Effect and Smart Re-Sync: Logging stops on the source.

When the Replication Pair is restarted, for NTFS or UNIX, Smart Re-Sync will continue the data replication activities automatically; for FAT file systems, Full Re-Sync will be necessary.

Interruption: Suspend a Replication Set
Effect and Smart Re-Sync: Baselining, SmartSync, and data replication activities stop for all Replication Pairs, but any logging activities will continue on the source.

When the Replication Set is resumed:

  • for any Replication Pairs that were performing data replication, CDR will transfer the accumulated logs to the destination, and data replication will continue.
  • for Replication Pairs that were in the Baselining or SmartSync phases, how activities begin again will depend on the exact phase the Replication Pairs were in, as well as the operating system type.

Interruption: Graceful or non-graceful shutdown of the source computer
Effect and Smart Re-Sync: The destination computer continues to replay the logs it has received.

When the source computer and software are running again, Replication Pair(s) will be in the "System Aborted" state for some time, then Smart Re-Sync will be performed.

Interruption: Graceful or non-graceful shutdown of the destination computer
Effect and Smart Re-Sync: Logging continues on the source.

When the destination computer and software are running again:

  • for any Replication Pairs that were performing data replication, CDR will transfer the accumulated logs to the destination, and data replication will continue.
  • for Replication Pairs that were in the Baselining or SmartSync phases, how activities begin again will depend on the exact phase the Replication Pairs were in, as well as the operating system type. Refer to the Comments section of the table in Job Phases for specific details.

Interruption: CDR software shutdown on the source
Effect and Smart Re-Sync: All CDR-related activities stop.

When the software is restarted, CDR will start Smart Re-Sync.

Interruption: CDR software shutdown on the destination
Effect and Smart Re-Sync: Logging continues on the source.

  • for any Replication Pairs that were performing data replication, CDR will transfer the accumulated logs to the destination, and data replication will continue.
  • for Replication Pairs that were in the Baselining or SmartSync phases, how activities begin again will depend on the exact phase the Replication Pairs were in, as well as the operating system type. Refer to the Comments section of the table in Job Phases for specific details.

Interruption: Replication Service is stopped on the source
Effect and Smart Re-Sync: Baselining, SmartSync, and data replication activities stop for all Replication Pairs, but logging continues on the source, and the destination computer continues to replay the logs it had received before the service was stopped.

When the Replication Service is started again:

  • for any Replication Pairs that were performing data replication, CDR will transfer the accumulated logs to the destination, and data replication will continue.
  • for Replication Pairs that were in the Baselining or SmartSync phases, how activities begin again will depend on the exact phase the Replication Pairs were in, as well as the operating system type. Refer to the Comments section of the table in Job Phases for specific details.

Interruption: Replication Service is suspended on the destination
Effect and Smart Re-Sync: Baselining, SmartSync, and data replication activities stop for all Replication Pairs, and log replay stops on the destination, but logging continues on the source.

When the Replication Service is started again:

  • for any Replication Pairs that were performing data replication, CDR will transfer the accumulated logs to the destination, and data replication will continue.
  • for Replication Pairs that were in the Baselining or SmartSync phases, how activities begin again will depend on the exact phase the Replication Pairs were in, as well as the operating system type. Refer to the Comments section of the table in Job Phases for specific details.

Interruption: Interruption of network connectivity (source and/or destination)
Effect and Smart Re-Sync: Baselining, SmartSync, and data replication activities stop for all Replication Pairs, but logging continues on the source, and the destination computer continues to replay the logs it had received before the network connectivity was interrupted.

When network connectivity is restored:

  • for any Replication Pairs that were performing data replication, CDR will transfer the accumulated logs to the destination, and data replication will continue.
  • for Replication Pairs that were in the Baselining or SmartSync phases, how activities begin again will depend on the exact phase the Replication Pairs were in, as well as the operating system type. Refer to the Comments section of the table in Job Phases for specific details.

If the network interruption is for a significant amount of time, the following will occur:

  • For CDR on Windows, the status of the Replication Pair will become "Failed", and will need to be restarted manually with Smart Re-Sync when connectivity is restored.
  • For CDR on UNIX, CDR will continue to retry sending the logs to the destination computer until network connectivity is restored.

Interruption: Source computer runs out of log space (Windows), or source computer tries to create new entries in a log before the old entries have been transferred to the destination (UNIX)
Effect and Smart Re-Sync: Logging will stop, all logs will be deleted, and all Replication Pairs will be System Aborted.

  • For CDR on Windows, the system will wait 3 minutes, then check space on the log volume. If there is sufficient space, a Smart Re-Sync will occur; if not, the Replication Pair will be Aborted.
  • For CDR on UNIX, a Smart Re-Sync will occur.
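Two of the recovery rules listed above are simple enough to express as decision functions, which may help when scripting checks or runbooks around them. The Python sketch below only restates those two rules. The function names, the string labels, and the `space_ok_after_wait` flag are hypothetical, and nothing here queries or controls CDR.

```python
# The listing above states that CDR on Windows waits 3 minutes before
# rechecking space on the log volume.
LOG_SPACE_RECHECK_WAIT_S = 3 * 60


def resync_after_replication_abort(file_system: str) -> str:
    """Re-sync type when a pair aborted during the Replication phase restarts."""
    label = file_system.strip().upper()
    if label in ("NTFS", "UNIX"):
        return "Smart Re-Sync"
    if label.startswith("FAT"):
        return "Full Re-Sync"
    raise ValueError(f"no rule listed for file system {file_system!r}")


def outcome_after_log_exhaustion(platform: str,
                                 space_ok_after_wait: bool = True) -> str:
    """Outcome after logging stops and all logs are deleted on the source."""
    label = platform.strip().lower()
    if label == "unix":
        return "Smart Re-Sync"
    if label == "windows":
        # Windows rechecks the log volume after the wait; the outcome
        # depends on whether sufficient space is available at that point.
        return "Smart Re-Sync" if space_ok_after_wait else "Aborted"
    raise ValueError(f"no rule listed for platform {platform!r}")
```

Inputs that the listing does not cover raise `ValueError` rather than guessing an outcome.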

 

Notes:
  • Multiple sources (Fan-In) or multiple destinations (Fan-Out) - each of the cases listed will generally work the same for Fan-In and Fan-Out configurations; bear in mind that when more than one source or destination is involved, the implications for each one of them must be considered in this context. For additional considerations, see Fan-Out Considerations below.
  • Data Replication will be interrupted if a hard disk used for either a source or destination is put into the 'standby' state through the power scheme configuration. After such an event, it will be necessary to abort activity for all affected Replication Sets and restart them using Start Full Resync.

For instructions on restarting replication after it has been interrupted, see Start/Suspend/Resume/Abort Data Replication Activity.



Throttling

You can configure several throttling options for CDR at the Agent level, in the Operational Parameters tab of the CDR Properties screen.

Replication Activity Throttling (CDR on Windows only)

The following can be configured on the Source computer:

The following can be configured on the Destination computer, and is recommended; it will impact all source computers that use this destination computer:

Network Bandwidth Throttling

The following can be configured on the Source computer:

Notes:
  • When configuring throttling, you should consider what unintended effects throttling might have on operations. As one example, if you have a source computer that has significant file write activity, and you impose network bandwidth throttling which makes it impossible to transfer the logs quickly enough to the destination computer to keep pace with the rate of change on the source computer, log file space requirements would increase dramatically on the source computer. In such a case, provision must be made for sufficient log file space, based on the expected activity and throttling.
  • An example of a beneficial use of throttling involves configurations where multiple source computers are all configured to use the same destination computer. In this case, you may want to impose throttling on the source computers to allow the destination computer enough time to keep pace with all the log files it is receiving, and ensure sufficient log space on the destination computer as well to accommodate all of the logs it will be receiving.
  • You can configure Alerts to be generated when throttling is imposed, or when 80 percent or more of a volume's disk space is consumed, for all of the client computer's volumes. For more information, see Alerts and Monitoring.
  • On Windows in a clustered environment, when a cluster node is the active node for more than one virtual server at the same time, throttling rules are applied equally to all of the virtual servers hosted by that physical node, using the highest numbers specified for any one of them. For example, consider an active node hosting three virtual servers simultaneously, with throttling configured as follows on each of the virtual servers, VS1, VS2, and VS3:
    Throttling Parameter | VS1 | VS2 | VS3
    Throttling based on percentage of free log space on destination | 30% | 35% | 40%
    Stop replication based on percentage of free log space on destination | 80% | 70% | 60%
    Abort source based on percentage of free log space on source | 75% | 80% | 70%
    Network Bandwidth Throttling amount | 10Mbps | 40Mbps | 90Mbps

    Since throttling for all Virtual Servers will be based on the highest number specified for any one of them, all three Virtual Servers would be subject to the highest value in each row of the table, not necessarily the numbers specified individually. If throttling is imposed based on the destination computer running low on log space, in this example, when free log space reaches 40% on any virtual server, the maximum transfer rate will be reduced by 50% on each of the virtual servers: to 5Mbps on VS1, 20Mbps on VS2, and 45Mbps on VS3.
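The clustered example above can be reproduced with a few lines of arithmetic. The Python sketch below takes the values from the example table, picks the highest value per parameter, and halves each virtual server's configured bandwidth as the example does. The dictionary keys and function names are hypothetical labels chosen for this illustration, not CDR setting names.

```python
# Values from the example table above (percentages and Mbps).
configured = {
    "throttle_free_log_space_dest_pct": {"VS1": 30, "VS2": 35, "VS3": 40},
    "stop_free_log_space_dest_pct": {"VS1": 80, "VS2": 70, "VS3": 60},
    "abort_free_log_space_source_pct": {"VS1": 75, "VS2": 80, "VS3": 70},
    "network_bandwidth_mbps": {"VS1": 10, "VS2": 40, "VS3": 90},
}


def highest_per_parameter(settings: dict) -> dict:
    """Return, for each parameter, the highest value set on any virtual server."""
    return {name: max(per_vs.values()) for name, per_vs in settings.items()}


def reduced_rates(bandwidth_mbps: dict, reduction: float = 0.5) -> dict:
    """Apply the 50% reduction from the example to each server's configured rate."""
    return {vs: rate * (1 - reduction) for vs, rate in bandwidth_mbps.items()}


effective = highest_per_parameter(configured)
throttled = reduced_rates(configured["network_bandwidth_mbps"])
```

With the table's values, `effective` holds 40, 80, 80, and 90 for the four parameters, and `throttled` holds 5.0, 20.0, and 45.0 Mbps for VS1, VS2, and VS3, matching the figures quoted in the example.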

 

For step-by-step instructions, see Configure Throttling for CDR Replication Activities.



Replication Prediction

Replication Prediction can be used to track the size of the data that has been added or modified while a pair is active and monitoring. For Windows file systems, monitoring is performed at the volume or folder level; for UNIX, monitoring is performed at the file system level. This information is used to estimate the data throughput required per hour, per day, and so on, and thus whether the bandwidth of the current connection will be sufficient for the predicted data replication activity. For instance, to see how much data will be replicated for an Exchange Server during each workday or for the whole week, you can start monitoring all folders used by the Exchange Server (stores, logs, etc.). After 24 hours or a week, you can check the size of the data modified and use that information to estimate bandwidth requirements.
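Once the size of the modified data is known, converting it to a bandwidth figure is straightforward arithmetic. The Python sketch below shows one way to do the conversion; the function name is hypothetical, and the result is an average rate, so bursts of write activity may need more headroom than the figure suggests. The T1 rate of 1.544 Mbps is the standard line rate and is included only as a point of comparison.

```python
T1_MBPS = 1.544  # standard T1 line rate, in megabits per second


def required_mbps(bytes_changed: int, period_hours: float) -> float:
    """Average throughput (megabits per second) needed to replicate
    bytes_changed within period_hours."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    bits = bytes_changed * 8
    seconds = period_hours * 3600
    return bits / seconds / 1_000_000


def fits_link(bytes_changed: int, period_hours: float,
              link_mbps: float = T1_MBPS) -> bool:
    """True if the average required rate does not exceed the link rate."""
    return required_mbps(bytes_changed, period_hours) <= link_mbps
```

For example, 10 GB (10,000,000,000 bytes) modified over 24 hours averages roughly 0.93 Mbps, which is below the T1 rate, while the same amount modified over 8 hours averages roughly 2.78 Mbps, which is not.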

Replication Prediction reports the following for each monitored folder, volume, or file system:

To use Replication Prediction, see Perform Replication Prediction for step-by-step instructions.



Orphan Files

Files that are in the destination directory, but not the source directory, are orphan files. You can choose to ignore, log, or delete such files that are identified in the destination path; these settings are configured in the Orphan Files tab of the Replication Set Properties.
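The definition above amounts to a set difference between the files under the destination path and the files under the source path. The Python sketch below illustrates that idea and the three actions (ignore, log, delete) on ordinary directories. It is a conceptual example only: the function names are hypothetical, it compares files by relative path alone, and it does not reflect how CDR itself detects or processes orphan files.

```python
import os
from pathlib import Path


def find_orphans(source: Path, destination: Path) -> list:
    """Return relative paths of files found under destination but not source."""
    def relative_files(root: Path) -> set:
        # Collect every file under root as a path relative to root.
        return {
            Path(dirpath, name).relative_to(root)
            for dirpath, _dirs, files in os.walk(root)
            for name in files
        }
    return sorted(relative_files(destination) - relative_files(source))


def process_orphans(source: Path, destination: Path,
                    action: str = "ignore") -> list:
    """Apply one of the three actions named in the text to each orphan file."""
    if action not in ("ignore", "log", "delete"):
        raise ValueError(f"unknown action: {action!r}")
    orphans = find_orphans(source, destination)
    for rel in orphans:
        if action == "log":
            print(f"orphan file: {rel}")
        elif action == "delete":
            (destination / rel).unlink()
    return orphans
```

Running `process_orphans(src, dst, "log")` first is a cautious way to review what a later `"delete"` pass would remove.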

To configure Orphan File settings, see Configure Orphan File Processing for step-by-step instructions.

To view Orphan Files, see View Orphan Files for step-by-step instructions.

Things to Consider



How to use CDR to Replicate Data

The following section provides the steps required to use CDR for data replication, based on a single source and single destination. If your environment uses a different scenario, adjust your steps accordingly.

  1. Select two computers on which to install CDR, one designated as the source computer, and one designated as the destination computer.
  2. If you are using QSnap, consider the following:
  3. When using CDR on UNIX to replicate files with non-ASCII character names, perform the procedure detailed in Handling Files with non-ASCII Characters.
  4. For CDR on Windows, if you will be replicating application data, see Change Account for Accessing Application Servers.
  5. On both the source and destination computers, it is recommended that you Configure Throttling for CDR Replication Activities.
  6. It is recommended that you also Configure Alerts. For more information, see Application Management Alerts for CDR and Job Management Alerts for CDR.

    For CDR on Windows, when using VSS or QSnap on a source computer it is recommended that you also see Space Check for the Quick Recovery and ContinuousDataReplicator Agents and configure the Disk Space Low alert to provide warning that the source computer is running out of disk space, which will ultimately cause replication activity to be System Aborted.

  7. Create a Replication Set. (You can also use the Wizard for this, by right-clicking the CDR icon and selecting Replication Set Creation Wizard from the All Tasks menu.)
  8. Optionally, Configure CDR Recovery Points.
  9. Optionally, Configure CDR for Backups of Recovery Points.
  10. Add a Replication Pair. (If you created the Replication Set using the Wizard, you can skip this step.)
  11. Start Data Replication Activity.
  12. Monitor Data Replication Activities.

 



Fan-Out Considerations

For an overview of a Fan-Out configuration, see Fan-Out.

Follow the guidelines in How to use CDR to Replicate Data to install and configure all the computers that will function as either a source or destination.

Consider the following for Fan-Out configuration:



Best Practices

It is recommended that you keep the following in mind when performing data replication:

 

