ContinuousDataReplicator Disaster Recovery Solution for Building a Standby SQL Server

Prepare the Production Server
Prepare the Standby Server
Setting up Replication

Overview

This document describes the procedure necessary to enable rapid recovery of Standby SQL Server at the disaster recovery site using ContinuousDataReplicator (CDR). In addition to using the live replicated data, this procedure also uses Consistent Recovery Points (CRPs) to allow for alternative recovery points in the event the live (replicated) data can not be used by SQL Server due to consistency issues. This procedure involves pre-configuring SQL Server on the Standby machine to eliminate these steps during an actual disaster recovery, reducing the recovery time.

The configuration described below uses the Standby SQL Server machine as the target. However, an intermediate server could also be used as the CDR target.

This procedure has been validated using SQL Server 2000 on Windows 2003 with SP1. Newer versions of SQL should behave the same in these scenarios.

Advantages

ContinuousDataReplicator provides a mechanism for asynchronously replicating file system and application data to a remote site. Data is replicated at the byte and file level which provides for a very effective use of available network bandwidth, thus making it ideal for disaster recovery scenarios. As SQL Server is not dependent on Active Directory no domain membership is required, and this further simplifies the fail-over process.

Configuration

The configuration described here involves setting up CDR to replicate directly to the Standby Server. The Standby SQL Server machine will replace the production machine by assuming its name, but will retain its existing IP address. The original DNS entry for the Standby Exchange server will remain intact but the IP address for the Production Exchange server will be set to the IP address of the Standby Server. The diagram depicts the configuration used in this approach.

Prepare the SQL Production Server

This procedure assumes that the Production SQL Server (iota) is operational and installed with all the required service packs and/or patches. It also assumes that the Standby SQL Server (pi), which is also the CDR target, is running as a member of the domain if domain membership is required and that the CommServe has already been installed.

Make sure that all databases, including the system databases, are moved off of the C: drive. This is usually the case in production environments but the location should be verified.
Install the CDR agent on the server along with the File System agent so the CRPs can be backed up. Considering SQL Server iDataAgent is also installed. It is expected that standard backups will be executed on the Production SQL Server to manage the logs. Note that CDR will not truncate SQL Server logs since all backup operations associated with CDR are file system based. This means that although a CRP will quiesce SQL Server and provide for a consistent point in time view there is no knowledge of the application during the backup phase. Hence, the SQL Server APIs which truncate logs are not utilized. Note that as the standard SQL Server iDataAgent backup causes logs to be truncated on the production server the deletion of the log files will be replicated to the target by CDR, so cleanup will be automatic on the target server.
NOTE: If using the standard SQL DB agent is not possible on the Production SQL Server then the logs must be managed through an alternate mechanism.

Prepare the Standby SQL Server

Configuring the Standby SQL Server will require that it be disconnected from the production server’s network to avoid a name clash. If the Standby Server is being created at recovery time then it can be connected to the production network since the original production server will be down. Though these steps could be performed during the actual recovery, it would increase the recovery time and potentially introduce more opportunity for error. So it is recommended that these steps be performed in advance to provide the most straight forward recovery procedure.

Install the ContinuousDataReplicator.
Create volumes and folders for storing SQL Server data and logs equivalent to those on the Production server. These will be used for replication targets for the live data as well as operational data and logs locations when the Standby SQL Server machine is being used.
Disconnect the Standby Server (pi) from the network.
Change the name of the Standby Server (pi) to the name of the production SQL Server (iota) and reboot.
Install MS SQL Server using the /DisasterRecovery.
Example: Z:> setup.exe /DisasterRecovery
1. You may receive an informational message on installing the service pack for SQL Server. This situation will be corrected by CommCell and may be disregarded:
2. When the installation wizard starts make sure that you install all of the components that were on the Production SQL Server. Note that you will need to select the same installation folders that were used on the Production server.
Load the Service Pack(s) and patches that were loaded on the production SQL Server also using the DisasterRecovery switch option The installation wizard will run and install the updates.
Make sure that all of the SQL Server services are stopped and set their startup type to Manual.
Rename the Standby SQL Server machine back to its original name (pi) and reboot for the change to take effect.
Reconnect the Standby SQL Server machine to the production network.

Setting up Replication

Now that the servers are prepared and the CommCell software has been, installed the replication set and associated replication pairs can be set up and replication started. The replication set and pairs can be either configured manually, or by the use of wizard. Note the destination paths on the target must be set to mimic the source. See ContinuousDataReplicator for more details on installation and configuration of CDR.

At this point the pairs are in sync and all of the initial data has been replicated to the target (pi). The next steps would be to schedule Consistent Recovery Points along with backups of those Recovery Points. This allows for Point-In-Time recovery for the SQL Server with known consistent data. Alternatively, the live data can be utilized which will provide the closest to up to the minute recovery. However, live data will be crash consistent at best and is not guaranteed to be recoverable by SQL Server.

Bring Back the Standby Server

As there are no tight dependencies with Active Directory recovering the SQL Server application on the Standby SQL machine is rather straight forward. If the machines do belong to a domain then you should DELETE the production SQL Server machine (iota) entry from the Active Directory.

Perform the following steps to bring the Standby Server online:

It is likely that the initiation of the fail-over to the Standby SQL Server machine is due to the Production SQL Server crashing but in any case the production server (iota) should be shutdown.
If the machines were domain members then on the Domain Controller using Active Directory Users and Computers delete the entry for the Production SQL Server (iota).
Rename the Standby SQL Server (pi) to the Production SQL Server (iota) and reboot.
In DNS change the IP address of the Production SQL Server (iota) to match the IP address of the Standby SQL Server (pi). By changing the IP address of the original Production SQL Server (iota) to the Standby Server’s address clients will be able to connect to SQL correctly. Replacing the host entry with an alias record will accomplish the same result.
Leave the existing entry for the Standby Exchange server (pi) intact so that the system can be accessed by its original name as well.
At this point the live replica data can be used or a recovery point selected to recover SQL Server. Note that that live data is crash consistent at best and can pre-date the crash point in time due to the asynchronous nature of CDR. However, it will provide the closest to current data for recovery purposes and not require a copy operation since it is in the proper location already. If recovery from the latest replicated data is not desirable or possible due to inconsistencies, then a CRP or backup can be chosen instead. From the CommCell Console browse the CRPs and select the one to be utilized on the Standby Server.
Recovery of CRP data can be done two ways. (Use one method.)
1. The Copyback button can be used to initiate the copy operation from the GUI. The original Standby SQL Server (pi) is selected as the recovery host and the Copyback function is initiated. This is the most straight forward method.
  NOTE: The copy back will overwrite the live copy so be sure that the live copy will not be required in the future. Otherwise manually copy the data to a new location before issuing the Copyback function. After selecting the Standby SQL Server machine by its original name (pi) click OK to start the copy operation.
2. Alternatively, the CRP could be mounted and manually copied to the required location. This is accomplished through the View Snapshots function. Click View Snapshots Each of the snapshots in the CRP will be listed and each should be mounted by selecting the individual snapshot and clicking the Mount button which exposes the mount point dialog so that a drive letter can be assigned.
Start the SQL services on the Standby SQL Server and set the Startup Type for each to automatic if the configuration will remain intact for a sufficient period of time.
Verify that SQL is operational and all databases are accessible.