QR Disaster Recovery Solution for an Exchange Cluster Using a Standby Server
Overview
Configuration
Bring the Standby Server online
This document describes the procedure necessary to create a Standby Exchange
cluster server in the event that a Production Exchange cluster server is
temporarily or permanently damaged.
This procedure is supported with Exchange 2000 or Exchange 2003 on a Windows 2000
MSCS cluster, or with Exchange 2003 on a Windows 2003 MSCS cluster. The cluster is
assumed to be active/passive. The procedure has been certified with the Quick
Recovery Agent and QSnap, using either the VSS snap engine for Windows 2003
or the QSnap snap engine for Windows 2000. Familiarity with the functionality
and configuration of the Quick Recovery Agent is necessary to conduct this
procedure properly.
The key to successfully recovering your Exchange data is proper configuration
of the Production and Standby Servers. This ensures that the data
can quickly and easily be brought online from the QR Volumes on the Standby Server.
The system administrator prepares the Standby cluster server by installing
Exchange on both physical nodes and using QR Agent to mirror the Exchange database
and log volumes from the Production Server. In the event of a Production
Server failure, the system administrator creates a new group on the
Standby cluster using the Production Server's network name and IP address.
The QR Volume copies are then added as physical disk resources, and the Exchange
System Attendant resource is added to complete the transformation of your
Standby Server.
Advantages
- There is no downtime of the Production Server during configuration.
- Recovery of the system state and Active Directory objects is not necessary.
- Minimal time is needed to bring up the Standby Server after a disaster on the
Production Server.
Refer to Quick Recovery Agent
for details on installing and/or configuring QR Agent in a Clustered Environment.
This section assumes that the Production cluster server has already been installed
with Exchange 2000/2003 Server and with the latest service packs or patches that
may be needed, and that users and mailboxes have already been created and configured.
- Install QSnap and the Windows File System
iDataAgent on the Production Cluster's active physical node,
referred to as Physical Node A. If necessary, install the CommCell
Console as a Stand-Alone Application.
- Reboot Physical Node A. This will cause a failover, and Physical
Node B will become the active physical node.
- Install QSnap and the Windows File System
iDataAgent on Physical Node B. If necessary, install
the CommCell Console as a Stand-Alone Application.
- Reboot Physical Node B. This will cause another failover, and
Physical Node A will once again become the active physical node.
- Install Quick Recovery Agent (and the VSS Enabler if you want to use VSS
for your snaps) to the Virtual Server hosting the Exchange Server. QSnap is
not needed on the Virtual Server. This installation should be done from the
active node of the Production cluster, which at this time should again
be Physical Node A. A reboot is not necessary after the Virtual Server
installation.
NOTE:
For the <Software Installation folder>, select a volume that belongs to the virtual server
hosting the Exchange server, but not one of the volumes containing the data or logs.
- Record your Exchange configuration on your Production cluster server.
Specifically, record the Exchange transaction log location, Exchange system
path location, database location, and streaming database location for each storage
group that you would like to protect with this Disaster Recovery solution. Also,
you will need to record your Exchange data directory, which is the location
of your virtual Exchange server's installation. This information will be used
when setting up your destination volumes on your Standby Server.
- From Active Directory Users and Computers, verify that your Exchange
server is a member of the Exchange Domain Servers group. Right-click the server,
select Properties, and click the Member Of tab. If the group is
not listed, add it.
- Shut down the virtual node resources of the Production Server if
the Standby Server virtual node will use the same Virtual Server name and
IP address as the Production Server virtual node.
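The configuration recorded for each storage group can be kept in a small structured record, which makes it easy to work out the drive letters involved. The following is a minimal sketch, not part of the product: the class, the function names, and the file paths are hypothetical, and the drive letters are borrowed from the example used later in this procedure (Exchange installed on H:, databases on G:, transaction logs and system path on F:). It assumes the streaming database sits on the same drive as the database.

```python
import ntpath
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageGroupRecord:
    """Locations recorded for one storage group on the Production cluster."""
    name: str
    transaction_log_path: str
    system_path: str
    database_path: str
    streaming_database_path: str


def drive_letter(path: str) -> str:
    """Return the drive part of a Windows path, for example 'G:'."""
    drive, _ = ntpath.splitdrive(path)
    return drive.upper()


def volumes_to_copy(group: StorageGroupRecord) -> list[str]:
    """Drives that hold the databases, logs, and system path (QR-copied)."""
    paths = (
        group.transaction_log_path,
        group.system_path,
        group.database_path,
        group.streaming_database_path,
    )
    return sorted({drive_letter(p) for p in paths})


def volumes_required_on_standby(group: StorageGroupRecord,
                                data_directory: str) -> list[str]:
    """Copied drives plus the data directory drive (must exist, not copied)."""
    return sorted(set(volumes_to_copy(group)) | {drive_letter(data_directory)})


# Hypothetical paths; only the drive letters come from the document's example.
first_sg = StorageGroupRecord(
    name="First Storage Group",
    transaction_log_path=r"F:\Exchsrvr\logs",
    system_path=r"F:\Exchsrvr\logs",
    database_path=r"G:\Exchsrvr\mdbdata\priv1.edb",
    streaming_database_path=r"G:\Exchsrvr\mdbdata\priv1.stm",
)
data_directory = r"H:\EXCHSRVR"

print(volumes_to_copy(first_sg))                              # ['F:', 'G:']
print(volumes_required_on_standby(first_sg, data_directory))  # ['F:', 'G:', 'H:']
```

Keeping the data directory separate from the storage group record mirrors the procedure: that drive must exist on the Standby cluster, but it is not part of the QR copy.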
- Based on the Exchange configuration recorded in Step 6 above, create the
destination volumes that will house your QR-copied Exchange data files. The drive
letters of these destination volumes should match those of their Production
Server counterparts.
Here is an example assuming that you only want to protect the 'First Storage
Group'. If on the Production cluster server –
- Exchange was installed on the Virtual Server to the
H: drive
- The databases for First Storage Group were on the
G: drive
- The transaction logs and system path were on the
F: drive
You would need to create 3 volumes on the Standby cluster with the
same drive letters.
QR Agent operates with Exchange on a storage group level, so you will only
be copying the volumes which house your databases and transaction logs for a
particular storage group. From the example above, that would mean you should
copy your –
- G: drive on the Production cluster to
the G: drive on your Standby cluster
- F: drive on the Production cluster to
the F: drive on the Standby cluster.
Make sure that your QR Volumes are the same size as, or larger
than, their Production Server counterparts (this is standard QR behavior).
Although the contents of the data directory (the H:
drive in the example) do not need to be copied to the Standby cluster
in your QR operation, a volume with that drive letter needs to exist on your
Standby cluster. When you install your virtual server on the Standby
cluster during a disaster recovery, it will look for this data directory on
your Standby Server.
NOTE:
The destination volumes, which you create in this step, should
NOT be assigned as resources to the cluster. QR Agent cannot copy
to destination volumes which are cluster physical disk resources. Instead, you
should just create and configure them on the active physical node of
your Standby cluster. If the volumes are 'shared' (which is expected
in a normal cluster setup) please remove the drive letters from these volumes
on the PASSIVE nodes OR shut down all the passive nodes of the cluster.
This ensures that they can only be written to on the Active physical node, thereby
avoiding corruption issues. After a disaster occurs, and QR operations will
therefore no longer run, start the rest of the cluster nodes and reassign these
volumes as physical disk resources to your Standby cluster.
- Install Quick Recovery Agent, QSnap, and MediaAgent on the active physical
node of your Standby cluster.
NOTE:
You do not need to install any of these products on the passive node of your
Standby cluster, or on the virtual server of your Standby cluster,
in order for this procedure to work.
- Install Microsoft Exchange on each physical node of the Standby
cluster. Use the same path that is used on the Production cluster server
(for example, if C:\Program Files\Exchsrvr is
used on the Production cluster server for each Exchange installation,
use that same path on your Standby cluster). Apply all necessary patches
and service packs to match the level of Exchange installed on the Production
Server.
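The destination-volume rules described in this section (matching drive letters, equal or larger size for copied volumes, and no cluster physical disk resources while QR copies are running) can be checked against a planned layout before any copy is scheduled. The sketch below is only an illustration: the `Volume` class, the function name, and all sizes are hypothetical, and it reads the "not a cluster resource" rule as applying to the copied volumes only, which is an interpretation rather than something the procedure states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    letter: str                      # for example "G:"
    size_gb: int                     # hypothetical sizes, for illustration only
    is_cluster_resource: bool = False


def check_destination_volumes(production, standby, copied_letters):
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    standby_by_letter = {v.letter: v for v in standby}
    for source in production:
        dest = standby_by_letter.get(source.letter)
        if dest is None:
            problems.append(f"{source.letter} is missing on the Standby cluster")
            continue
        if source.letter in copied_letters:
            if dest.size_gb < source.size_gb:
                problems.append(
                    f"{dest.letter} is smaller than its Production counterpart")
            if dest.is_cluster_resource:
                problems.append(
                    f"{dest.letter} must not be a cluster disk resource during QR copies")
    return problems


production = [Volume("F:", 50), Volume("G:", 200), Volume("H:", 20)]
copied = {"F:", "G:"}  # H: only needs to exist; its contents are not copied

good_standby = [Volume("F:", 50), Volume("G:", 250), Volume("H:", 20)]
bad_standby = [Volume("F:", 50, is_cluster_resource=True), Volume("G:", 150)]

print(check_destination_volumes(production, good_standby, copied))  # []
for problem in check_destination_volumes(production, bad_standby, copied):
    print(problem)
```

With the second layout, the check reports one problem per rule broken: F: is a cluster resource, G: is too small, and H: does not exist.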
- Create a scratch volume pool: from the CommCell Console, create a new scratch
volume pool which contains the volumes created/configured to house your database
and transaction log data, as discussed in Standby Server configuration.
Using the example above, this would be the G:
drive and F: drive.
- Create and configure a QR Policy:
- Use VSS as your snap engine type for Windows 2003 clusters and QSnap
as your snap engine type for Windows 2000 clusters.
- Set the LAN Copy Manager of your Standby Server's active node
as your copy manager.
- Use the scratch volume pool just created in Step 1.
- Create a new subclient on the virtual server of your Production Server
and associate it with the QR Policy just created. Use the Add App button
in the Subclient Properties Configuration tab to discover the Exchange volumes
for your storage group(s) that host your database files, transaction logs,
and system path. From the example, this would be the
G: drive and F:
drive.
NOTE:
If Add App does not discover the Exchange Server volumes, you might
need to specify the name of the Exchange server (the name that appears in
Exchange System Manager as your server). To do this, open the CommCell Console,
expand your Production Server virtual server client, right-click Quick Recovery
Agent and select Properties. In the Authentication tab, select
Exchange and click Edit. Enter a valid user name and password for
your Exchange server and then add your Exchange Server Name. This should resolve
problems with discovering Exchange on your cluster.
- Set up a QR incremental update schedule so you can update your Exchange
data on the Standby Server.
NOTE:
MAKE SURE THAT YOU CONFIGURE THE SCHEDULE SO THAT EACH VOLUME DRIVE LETTER ON THE
PRODUCTION CLUSTER WILL BE COPIED TO THE SAME DRIVE LETTER ON THE STANDBY CLUSTER.
From the example above, G: on the Production
Server would copy to G: on the Standby Server,
and F: on the Production Server would copy
to F: on the Standby Server.
Refer to Quick Recovery Agent
for details on configuring the Quick Recovery Agent.
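Two of the choices above are mechanical enough to express as a quick pre-flight check: the snap engine follows from the Windows version of the cluster, and every scheduled copy must map a drive letter to the same drive letter. The following sketch illustrates that reasoning only. The function names are hypothetical, and nothing here calls the CommCell Console or any product API.

```python
def snap_engine_for(windows_version: str) -> str:
    """Pick the snap engine the procedure names for a given Windows version."""
    engines = {"2003": "VSS", "2000": "QSnap"}
    try:
        return engines[windows_version]
    except KeyError:
        raise ValueError(
            f"Windows {windows_version} is not covered by this procedure") from None


def mismatched_pairs(copy_pairs):
    """Return the (source, destination) pairs whose drive letters differ."""
    return [(src, dst) for src, dst in copy_pairs if src.upper() != dst.upper()]


# Copy pairs from the example: G: to G: and F: to F:.
schedule = [("G:", "G:"), ("F:", "F:")]

print(snap_engine_for("2003"))     # VSS
print(snap_engine_for("2000"))     # QSnap
print(mismatched_pairs(schedule))  # []
```

A pair such as `("G:", "F:")` would be returned by `mismatched_pairs`, flagging a schedule that copies a Production volume to the wrong drive letter on the Standby cluster.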
Due to dependencies that Exchange server has within Active Directory, changes
are required to get Exchange operating correctly on the Standby cluster.
- At this point of the procedure we assume that a problem has caused the
Production cluster to fail.
- Shut down all the physical nodes of the Production cluster server,
one at a time.
- Reset the Network Name for the Production virtual server from Active
Directory Users and Computers. To do this, open the Computers container,
right-click your Production Exchange server, and select
Reset Account.
- If you have shut down passive nodes on the Standby Server, start
them one by one.
- From Cluster Administrator of the Standby cluster, create a virtual
server (a new group) using the Production Exchange server network name.
Create the network name and IP address resources. You must use the same network
name that was used in the Production cluster, but you can choose whether
to use the same IP address or not. You may also need to enable Kerberos Authentication
in the Parameters tab of your network name in order for the network name resource
to start successfully.
NOTE:
If you use a different IP address, you will need to change some advanced settings
in IIS in order to get things like Exchange's web server operational (HTTP and
SMTP services), but your domain administrator should be able to handle this.
- Add physical disk resources for both the QR Volumes and the Exchange data
directory to the Virtual Server you just created.
Using the example above, this would be your G:,
F:, and H: drives.
Do not add any dependencies to these physical disk resources.
- Start all resources you have created in the new group. Verify that they start
successfully. Check MS Windows Event Viewer for error messages. Fail over the
group to the passive node and back to the original active node to verify that
services start.
- From the Cluster Administrator, right-click the newly created group and
select New Resource to create the Exchange System Attendant. This resource
should have a dependency on your physical disk resources and the network name
resource.
- While creating the System Attendant Resource, verify the correct Exchange
data directory is shown in the last step. In our example, this would be
H:\EXCHSRVR.
- After successful Exchange Resource creation, bring your Standby Server's
virtual server online. Check for error messages and warnings in MS Windows Event
Viewer.
- At this time, you may open Outlook and verify your Exchange users' mailboxes.
Your Standby Server is now your Production Server.
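The resource dependencies described in this procedure can be written down as a small graph and checked before the group is brought online. The sketch below is a planning aid under stated assumptions, not a call into Cluster Administrator: the resource names are illustrative, the disks follow the F:, G:, H: example, and the dependency of the network name on the IP address is assumed here (it is the usual MSCS arrangement, but this document does not state it). It uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
dependencies = {
    "IP Address": set(),
    "Network Name": {"IP Address"},   # assumption, not stated in the procedure
    "Disk F:": set(),                 # disks must have no dependencies
    "Disk G:": set(),
    "Disk H:": set(),
    "Exchange System Attendant": {"Network Name", "Disk F:", "Disk G:", "Disk H:"},
}


def check_group(deps, attendant="Exchange System Attendant",
                network_name="Network Name"):
    """Return a list of rule violations for the planned group."""
    problems = []
    disks = [name for name in deps if name.startswith("Disk ")]
    for disk in disks:
        if deps[disk]:
            problems.append(f"{disk} should not have dependencies")
    missing = ({network_name} | set(disks)) - deps.get(attendant, set())
    for name in sorted(missing):
        problems.append(f"{attendant} should depend on {name}")
    return problems


def start_order(deps):
    """Return one valid order in which the resources can be brought online."""
    return list(TopologicalSorter(deps).static_order())


print(check_group(dependencies))  # []
print(start_order(dependencies))
```

Any order returned by `start_order` brings the System Attendant online last, which matches the procedure: the network name, IP address, and disks are started and verified before the Exchange resource is created.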