QR Disaster Recovery Solution for an Exchange Cluster Using a Standby Server


Overview

Configuration

Bring the Standby Server online


Overview

This document describes how to prepare a Standby Exchange cluster server that can take over in the event that a Production Exchange cluster server is temporarily or permanently damaged.

This procedure is supported for Exchange 2000 or Exchange 2003 on a Windows 2000 MSCS cluster, or Exchange 2003 on a Windows 2003 MSCS cluster. The cluster is assumed to be active/passive. This procedure has been certified with the Quick Recovery Agent and QSnap, using either the VSS snap engine for Windows 2003 or the QSnap snap engine for Windows 2000. Familiarity with the functionality and configuration of the Quick Recovery Agent is necessary to conduct this procedure properly.

The key to successfully recovering your Exchange data is proper configuration of the Production and Standby Servers. This ensures that the data can quickly and easily be brought online from the QR Volumes on the Standby Server.

The system administrator prepares the Standby cluster server by installing Exchange on both physical nodes and using the QR Agent to mirror the Exchange database and log volumes from the Production Server. In the event of a Production Server failure, the system administrator creates a new group on the Standby cluster using the Production Server's network name and IP address. The QR Volume copies are then added as physical disk resources, and the Exchange System Attendant resource is added to complete the transformation of your Standby Server.

Advantages


Configuration

Refer to Quick Recovery Agent for details on installing and configuring the QR Agent in a clustered environment.

Prepare the Production Server

This section assumes that Exchange 2000/2003 Server is already installed on the Production cluster server with the latest required service packs and patches, and that users and mailboxes have already been created and configured.

  1. Install QSnap and the Windows File System iDataAgent on the Production Cluster's active physical node, referred to as Physical Node A. If necessary, install the CommCell Console as a Stand-Alone Application.
  2. Reboot Physical Node A. This will cause a failover, and Physical Node B will become the active physical node.
  3. Install QSnap and the Windows File System iDataAgent on Physical Node B. If necessary, install the CommCell Console as a Stand-Alone Application.
  4. Reboot Physical Node B. This will cause another failover, and Physical Node A will once again become the active physical node.
  5. Install the Quick Recovery Agent (and the VSS Enabler if you want to use VSS for your snaps) on the Virtual Server hosting the Exchange Server. QSnap is not needed on the Virtual Server. Perform this installation from the active node of the Production cluster, which at this point should again be Physical Node A. A reboot is not necessary after the virtual server installation.

    NOTE:
    For the <Software Installation folder>, select a volume that belongs to the virtual server hosting the Exchange server, but not one that contains the data or logs.

  6. Record your Exchange configuration on your Production cluster server. Specifically, record the Exchange transaction log location, Exchange system path location, database location, and streaming database location for each storage group that you would like to protect with this Disaster Recovery solution. Also, you will need to record your Exchange data directory, which is the location of your virtual Exchange server's installation. This information will be used when setting up your destination volumes on your Standby Server.
  7. From Active Directory Users and Computers, verify that your Exchange server is a member of the Exchange Domain Servers group. Right-click the server, select Properties, and click the Member Of tab. If the group is not listed, add it.
  8. If you are using the same Virtual Server name and IP address for the Standby Server virtual node as for the Production Server virtual node, shut down the Virtual node resources of the Production Server.
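
The details recorded in Step 6 can be kept in a small structured record so that the Standby volumes are derived from written data rather than from memory. The following Python sketch shows one possible way to do this. The class, the function names, and every path are hypothetical placeholders; in particular, which drive holds the logs and which holds the databases is an assumption made only for illustration. Nothing here is produced by Exchange or by the QR Agent.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath


@dataclass(frozen=True)
class StorageGroupRecord:
    """Locations recorded on the Production server for one storage group."""
    name: str
    transaction_log_path: str
    system_path: str
    database_path: str
    streaming_database_path: str

    def paths(self):
        """Return all recorded locations for this storage group."""
        return (self.transaction_log_path, self.system_path,
                self.database_path, self.streaming_database_path)


def drive_letter(path):
    """Return the drive part of a Windows path in upper case, e.g. 'G:'."""
    drive = PureWindowsPath(path).drive
    if not drive:
        raise ValueError(f"no drive letter in path: {path!r}")
    return drive.upper()


def copied_drive_letters(groups):
    """Return the sorted drive letters that hold storage group data."""
    return sorted({drive_letter(p) for g in groups for p in g.paths()})


# Hypothetical values for illustration only; record your own.
first_sg = StorageGroupRecord(
    name="First Storage Group",
    transaction_log_path=r"F:\Logs",
    system_path=r"F:\Logs",
    database_path=r"G:\Data\priv1.edb",
    streaming_database_path=r"G:\Data\priv1.stm",
)
# Data directory as used in this document's example.
exchange_data_directory = r"H:\EXCHSRVR"

print("Volumes to copy:", copied_drive_letters([first_sg]))
print("Data directory volume:", drive_letter(exchange_data_directory))
```

Keeping the data directory separate from the storage group record mirrors the procedure: its volume must exist on the Standby cluster, but its contents are not copied.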

Prepare the Standby Server

  1. Based on Step 6 above, you will need to create the destination volumes, which will house your QR-copied Exchange data files. These destination volumes' drive letters should match those of their Production Server counterparts. 
    Here is an example assuming that you only want to protect the 'First Storage Group'. If on the Production cluster server –

    You would need to create 3 volumes on the Standby cluster with the same drive letters.

    QR Agent operates with Exchange on a storage group level, so you will only be copying the volumes which house your databases and transaction logs for a particular storage group. From the example above, that would mean you should copy your –

    You should make sure that your QR Volumes are the same size as, or larger than, their Production Server counterparts (this is just normal QR functionality). Although the contents of the data directory (the H: drive in the example) do not need to be copied to the Standby cluster in your QR operation, a volume with that drive letter needs to exist on your Standby cluster. When you install your virtual server on the Standby cluster during a disaster recovery, it will look for this data directory on your Standby Server.

    NOTE:
    The destination volumes, which you create in this step, should NOT be assigned as resources to the cluster. QR Agent cannot copy to destination volumes which are cluster physical disk resources. Instead, you should just create and configure them on the active physical node of your Standby cluster. If the volumes are 'shared' (which is expected in a normal cluster setup) please remove the drive letters from these volumes on the PASSIVE nodes OR shut down all the passive nodes of the cluster. This ensures that they can only be written to on the Active physical node, thereby avoiding corruption issues. Once the QR operation is finished and a disaster occurs (assuring that QR operations will no longer occur), start the rest of the cluster nodes and reassign these volumes as physical disk resources to your Standby cluster.

  2. Install Quick Recovery Agent, QSnap, and MediaAgent on the active physical node of your Standby cluster.

    NOTE:
    You do not need to install any of these products on the passive node of your Standby cluster, or on the virtual server of your Standby cluster, in order for this procedure to work.

  3. Install Microsoft Exchange on each physical node of the Standby cluster. Use the same path that is used on the Production cluster server (for example, if C:\Program Files\Exchsrvr is used on the Production cluster server for each Exchange installation, use that same path on your Standby cluster). Apply all necessary patches and service packs to bring Exchange up to the same level as on the Production Server.
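
The volume requirements from Step 1 (matching drive letters, destination volumes at least as large as their Production counterparts, and a data directory volume that only needs to exist) can be checked before the first QR copy. The Python sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, the sizes are made up, and gathering the real sizes from each cluster is left to whatever tool you already use.

```python
def check_standby_volumes(production_sizes, standby_sizes, existence_only=()):
    """Return a list of problems; an empty list means the layout looks usable.

    production_sizes: drive letter -> size in bytes of each volume to be copied
    standby_sizes:    drive letter -> size in bytes of each Standby volume
    existence_only:   drive letters that must exist but are not copied
    """
    problems = []
    for letter, needed in sorted(production_sizes.items()):
        available = standby_sizes.get(letter)
        if available is None:
            problems.append(f"{letter} is missing on the Standby cluster")
        elif available < needed:
            problems.append(f"{letter} is smaller than its Production counterpart")
    for letter in existence_only:
        if letter not in standby_sizes:
            problems.append(f"{letter} is missing on the Standby cluster")
    return problems


GIB = 1024 ** 3
# Hypothetical sizes for illustration only.
production = {"F:": 20 * GIB, "G:": 100 * GIB}
standby = {"F:": 20 * GIB, "G:": 80 * GIB}

for problem in check_standby_volumes(production, standby, existence_only=["H:"]):
    print(problem)
```

With these made-up numbers the check reports that G: is too small and that H: does not exist yet, which are the two conditions the procedure warns about.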

Configure the Quick Recovery Agent

  1. Create a scratch volume pool: from the CommCell Console, create a new scratch volume pool containing the volumes that you created and configured to house your database and transaction log data, as discussed in Prepare the Standby Server. Using the example above, this would be the G: drive and F: drive.
  2. Create and configure a QR Policy:
    1. Use VSS as your snap engine type for Windows 2003 clusters and QSnap as your snap engine type for Windows 2000 clusters.
    2. Set the LAN Copy Manager of your Standby Server's active node as your copy manager.
    3. Use the scratch volume pool just created in Step 1.
       
  3. Create a new subclient on the virtual server of your Production Server and associate it with the QR Policy just created. Use the Add App button in the Subclient Properties Configuration tab to discover the Exchange volumes for your storage group(s) that host your database files, transaction logs, and system path. From the example, this would be the G: drive and F: drive.

    NOTE:
    If Add App does not discover the Exchange Server volumes, you might need to specify the name of the Exchange server (the name that appears in Exchange System Manager as your server). To do this, open the CommCell Console, expand your Production Server virtual server client, right-click Quick Recovery Agent, and select Properties. In the Authentication tab, select Exchange and click Edit. Enter a valid user name and password for your Exchange server and then add your Exchange Server Name. This should resolve any issues you may have in discovering Exchange on your cluster.

  4. Set up a QR incremental update schedule so you can update your Exchange data on the Standby Server.

NOTE:
MAKE SURE THAT YOU CONFIGURE THE SCHEDULE SO THAT EACH VOLUME DRIVE LETTER ON THE PRODUCTION CLUSTER WILL BE COPIED TO THE SAME DRIVE LETTER ON THE STANDBY CLUSTER. From the example above, G: on the Production Server would copy to G: on the Standby Server, and F: on the Production Server would copy to F: on the Standby Server.
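
Because a mismatched drive letter is easy to overlook, the planned source and destination pairs can be reviewed before the schedule is saved. The Python sketch below is only an illustration: the function name and the list of pairs are hypothetical, and the pairs must be entered by hand from your own schedule settings.

```python
def mismatched_copy_pairs(copy_pairs):
    """Return the (source, destination) pairs whose drive letters differ."""
    def normalize(letter):
        # Treat 'g:', 'G:' and 'G:\\' as the same drive.
        return letter.strip().rstrip(":\\").upper()

    return [(src, dst) for src, dst in copy_pairs
            if normalize(src) != normalize(dst)]


# Hypothetical plan containing one deliberate mistake.
planned_pairs = [("G:", "G:"), ("F:", "G:")]

for src, dst in mismatched_copy_pairs(planned_pairs):
    print(f"Production {src} would be copied to Standby {dst}")
```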

Refer to Quick Recovery Agent for details on configuring the Quick Recovery Agent.


Bring the Standby Server online

Because of the dependencies that Exchange Server has within Active Directory, some changes are required to get Exchange operating correctly on the Standby cluster.

  1. At this point of the procedure we assume that a problem has caused the Production cluster to fail.
  2. Shut down all the physical nodes of the Production cluster server, one at a time.
  3. Reset the Network Name for the Production virtual server from Active Directory Users and Computers. To do this, open the Computers container, right-click your Production Exchange server, and select Reset Account.
  4. If you have shut down passive nodes on the Standby Server, start them one by one.
  5. From Cluster Administrator of the Standby cluster, create a virtual server (a new group) using the Production Exchange server network name. Create the network name and IP address resources. You must use the same network name that was used in the Production cluster, but you can choose whether to use the same IP address or not. You may also need to enable Kerberos Authentication in the Parameters tab of your network name in order for the network name resource to start successfully.

    NOTE:
    If you use a different IP address, you will need to change some advanced settings in IIS to get Exchange's HTTP and SMTP services operational; your domain administrator should be able to handle this.

  6. Add physical disk resources for both the QR Volumes and the Exchange data directory to the Virtual Server you just created.
    Using the example above, this would be your G:, F:, and H: drives.
    Do not add any dependencies to these physical disk resources.
  7. Start all the resources you have created in the new group and verify that they start successfully. Check the MS Windows Event Viewer for error messages. Fail over the group to the passive node and back to the original active node to verify that the services start.
  8. From the Cluster Administrator, right-click the newly created group and select New Resource to create the Exchange System Attendant. This resource should have a dependency on your physical disk resources and the network name resource.
  9. While creating the System Attendant Resource, verify the correct Exchange data directory is shown in the last step. In our example, this would be H:\EXCHSRVR.
  10. After successful Exchange Resource creation, bring your Standby Server's virtual server online. Check for error messages and warnings in MS Windows Event Viewer.
  11. At this time, you may open Outlook and verify your Exchange users' mailboxes. Your Standby Server is now your Production Server.
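
The resource dependencies described in Steps 5 through 8 determine the order in which the resources in the new group can come online. The Python sketch below models them as a dependency graph and derives one valid start order with the standard library's `graphlib` module (Python 3.9 or later). The resource labels are illustrative, and the dependency of the network name on the IP address is an assumption based on common cluster practice; it is not stated in this procedure. The sketch does not interact with the cluster.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
resource_dependencies = {
    "IP Address": set(),
    "Network Name": {"IP Address"},      # assumption, not stated above
    "Physical Disk F:": set(),           # disks have no dependencies (Step 6)
    "Physical Disk G:": set(),
    "Physical Disk H:": set(),
    "Exchange System Attendant": {       # Step 8
        "Network Name",
        "Physical Disk F:",
        "Physical Disk G:",
        "Physical Disk H:",
    },
}


def online_order(dependencies):
    """Return the resources in an order that starts dependencies first."""
    return list(TopologicalSorter(dependencies).static_order())


for position, resource in enumerate(online_order(resource_dependencies), start=1):
    print(position, resource)
```

The relative order of the independent disk resources may vary, but the System Attendant always comes last because it depends, directly or indirectly, on every other resource in the group.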
