QR Disaster Recovery Solution for Building a UNIX Standby Oracle Server with Log Recovery


Overview

Configuration

Bring the Standby Server online

Reinstall the Agent software on the new Production Server

Appendices


Overview

In this setup, there are three servers: a Production Server, which runs the Oracle database; a Catalog Server, on which the Recovery Catalog Database is configured; and a Standby Server.

The Quick Recovery Agent takes a snapshot of the running instance. All tablespaces of the database are put in backup mode before the snapshot (Point-In-Time-Image) of the data partitions is taken. As soon as that snapshot is complete, all the tablespaces are taken out of backup mode. The online redo logs are then archived, and a snapshot of the archive log partition(s) is taken. This ensures that any transactions that occur while the tablespaces are in backup mode are archived. The archived redo logs can be used later for Quick Recovery.
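The sequence described above can be sketched as the SQL that a script would hand to SQL*Plus around the two snapshots. This is an illustration only and not the Quick Recovery Agent's actual implementation; the function hot_backup_sql, its phase names, and the tablespace names are hypothetical.

```shell
#!/bin/sh
# Sketch only: prints the SQL for each phase of a snapshot-based hot backup.
# hot_backup_sql and its phase names are hypothetical helpers.

hot_backup_sql() {
    phase=$1
    shift
    case "$phase" in
        begin)    # before the data-partition snapshot
            for ts in "$@"; do
                echo "ALTER TABLESPACE $ts BEGIN BACKUP;"
            done
            ;;
        end)      # right after the data-partition snapshot
            for ts in "$@"; do
                echo "ALTER TABLESPACE $ts END BACKUP;"
            done
            ;;
        archive)  # before the archive-log snapshot
            echo "ALTER SYSTEM ARCHIVE LOG CURRENT;"
            ;;
        *)
            echo "usage: hot_backup_sql begin|end|archive [tablespace ...]" >&2
            return 1
            ;;
    esac
}

# Example with hypothetical tablespace names
hot_backup_sql begin SYSTEM USERS
echo "-- snapshot of the data partitions is taken here"
hot_backup_sql end SYSTEM USERS
hot_backup_sql archive
echo "-- snapshot of the archive log partition(s) is taken here"
```

Keeping the window between the begin and end phases short limits the extra redo generated while the tablespaces are in backup mode.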

The Production Server contains the source (Primary) volumes, which are copied and updated incrementally via the LAN Copy Manager to QR Volumes on the Standby Server. The data is therefore physically ready to use (no lengthy restore from tape or other media is necessary) on the Standby Server. The Standby Server should have the same version of the operating system and the same version of Oracle installed on the same path as the Production Server. The Catalog Server hosts a Recovery Catalog Database used to recover the Production database in case of failure.

In cases where the Production Server suffers a failure, or requires downtime, the system administrator can remove the Production Server from the network and then add the Standby Server to the network (with the same name and configuration as the disconnected Production Server). After the Standby Server is up, a Recover using Archive Logs can be performed before opening the database. This will recover the data changed after the last successful QR Volume creation job and up to the last successful Archive Logs backup.

The key to a successful QR Recovery is properly configuring the Production and Standby Servers. This ensures that the database(s) can quickly and easily be run from the QR volumes on the Standby Server.

This procedure is written for bringing up one instance on the Standby Server. If multiple instances are present on one client, we strongly recommend creating a separate subclient for each instance and following the same procedure for each instance individually.

It should be possible to accomplish this procedure between different versions of operating systems and Oracle databases, as well as different locations for Oracle installations. However, due to the significant complexity of dealing with all the additional factors involved, this document only covers the most common scenario.


Configuration

The following sections discuss preparing the Catalog Database, Production, and Standby Servers as well as configuring the QR Agent and Oracle iDataAgent. The basic workflow is described below. For detailed instructions on installation and configuration options, see Quick Recovery Agent.

Prepare the Catalog Database Server

To restore and recover the data that has changed after the last successful QR volume creation job, you must configure the Recovery Catalog. For detailed descriptions of how to create and configure the Recovery Catalog for the version of Oracle you are using, refer to the Oracle documentation.

Prepare the Production Server

The Quick Recovery Agent supports Oracle 9.2. The database must be in “archive log mode” for QR Volume Creation. We recommend that data files, archive logs, control files, and online redo logs each reside on their own partition (a separate mount point for each of the four file types).

Following is a sample configuration of an Oracle 9i installation on a Solaris 2.8 machine, which has an Oracle instance, “CV”, running. Oracle requires that the Standby database be on the same OS level as that on the Production Server.

Mount Point      Description
/oracle901       Contains the Oracle 9i installed binaries
/oracle901       Is the ORACLE_HOME of instance CV
/cv_data         Contains all the data files of instance CV
/cv_archlog      Contains all the archived logs of instance CV
/cv_control1     Contains current control file 1 of instance CV
/cv_control2     Contains current control file 2 of instance CV
/cv_redologs     Contains online redo logs of instance CV

The redo logs and control files are not backed up by the QR Volume creation jobs since they are constantly changing. To take a consistent control file backup, Oracle provides specific commands for backing up online control files. We strongly recommend keeping each instance's data files and archived logs on volumes used exclusively by that instance. (Example: /cv_data and /cv_archlog are on volumes exclusive to instance CV and should not hold data or archive logs of any other instance running on that machine.)
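The recommendation to keep each file type on its own partition can be checked with a short script. The following is a minimal sketch, assuming a POSIX shell and a df that accepts the -P option; mount_of and check_separate are hypothetical helper names, not part of the product.

```shell
#!/bin/sh
# Sketch only: check that no two paths live on the same file system.
# mount_of and check_separate are hypothetical helpers; assumes a POSIX df.

# Print the mount point that holds the given path.
mount_of() {
    # With -P, the second output line describes the file system and its
    # last field is the mount point.
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Succeed only if every path given is on a different mount point.
check_separate() {
    mounts=""
    for path in "$@"; do
        m=$(mount_of "$path")
        if [ -z "$m" ]; then
            echo "cannot determine mount point of $path" >&2
            return 2
        fi
        mounts="$mounts$m
"
    done
    dups=$(printf '%s' "$mounts" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "more than one path on: $dups" >&2
        return 1
    fi
    echo "all $# paths are on separate mount points"
}

# Example, using the mount points from the sample configuration:
#   check_separate /cv_data /cv_archlog /cv_control1 /cv_redologs
```

A non-zero exit status means that at least two of the paths share a partition, or that a path could not be resolved.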

  1. If the Production Server is not already installed with Oracle 9.2, install the desired version of Oracle and apply any required patches.
  2. Create the required Oracle database.
    NOTE: When creating the instances or when installing Oracle, record your Oracle database configuration and storage locations so that Oracle can be installed identically on the Standby Server.
  3. Register the database with the Recovery Catalog. For more information refer to the Oracle procedure for creating and using Recovery Catalog.
  4. Install the QR Agent. If the installation detects that you have not already installed the Base software and the File System iDataAgent prior to installing the QR Agent, you will be prompted to install them.
  5. Install the Oracle iDataAgent and configure the instance from the CommCell Console. Create a subclient that will back up the logs and control files. You will need to re-link the database by running the script Ora_install.sh (in the iDataAgent folder under the software installation) if you didn't specify the instance during the Oracle iDataAgent installation.
    NOTE: While configuring the instance, specify the connect string to the Recovery Catalog Database.
  6. Shut down the Oracle instance to configure the disks used by Oracle as CXBF devices. Use Volume Explorer to perform this operation. If the devices are mounted you can select the Unmount before Configure option from the Volume Configuration dialog. The device will automatically be mounted back. Select Detect Volumes from Volume Explorer and verify all mount points used by the Oracle instance are configured and mounted.
  7. Start up the Oracle instance on the Production Server.

Prepare the Standby Server

Before installing any software on the Standby Server, consider the following:

Install the same version of Oracle on the Standby Server as is installed on the Production Server. You must use the same Oracle user ID, the same Oracle group ID, and the same Oracle installation path as on the Production Server.

  1. Set up the Oracle instance with the same instance name and configuration as that of the Production Server:
    1. Create the required directories (create, cdump, bdump, udump, and pfile) for the instance on the Standby Server, matching the Production Server. (Example location: $ORACLE_BASE/admin/<ORACLE_SID>/cdump)
    2. The Oracle database may use either an spfile or an init file to store the initialization parameters. Copy the init file or spfile (init<ORACLE_SID>.ora or spfile<ORACLE_SID>.ora) from the Production Server to the Standby Server. Create the necessary links for the init file if required. (Example location of these files: $ORACLE_HOME/dbs)
    3. Copy the password file (example: orapw<ORACLE_SID>) of the instance from the Production Server to the Standby Server. (Example location of this file: $ORACLE_HOME/dbs)
    4. Copy the TNSNAMES file from the Production Server. It contains information about the Recovery Catalog Database. Copy any other files specific to your configuration that you may need to connect to Recovery Catalog. Use TNSPING to verify the connection.
      NOTE: Run the Oracle netmgr tool (from the machine console) if Oracle has a problem resolving the instance name.
    5. On the Production Server, note the name and location of each control file. This step is required to copy back the backup control file to each control file location when bringing up the Standby Server.
      Example: /cv_control1/control01.ctl, /cv_control2/control02.ctl, /cv_control3/control03.ctl

    NOTE:
    It is very important to follow Step 1 precisely. These files are used for bringing up the database on the Standby Server in the event the Production Server goes down. Failure to properly configure the Standby Server will result in not being able to bring up the Standby Server successfully.

  2. Install the MediaAgent and Quick Recovery Agent software on the Standby Server. If the installation detects that you have not already installed the Base software and the File System iDataAgent prior to installing the QR Agent, you will be prompted to install them.
  3. Start up the instance using the nomount option.
  4. Install the Oracle iDataAgent and configure the instance from the CommCell Console. Specify the connect string to the Recovery Catalog Database. You may need to re-link the instance as mentioned above.
  5. Shut down the Oracle instance on the Standby Server.
  6. Use Volume Explorer to configure all volumes you are planning to use in the Volume Scratch Pool on the Standby Server as CXBF devices. Detect volumes when finished.
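The directory and file preparation in step 1 can be scripted. The following is a minimal sketch, assuming a POSIX shell run as the Oracle user; make_admin_dirs and check_dbs_files are hypothetical helper names, and the files themselves still have to be copied from the Production Server by whatever means you normally use.

```shell
#!/bin/sh
# Sketch only: helpers for step 1 of preparing the Standby Server.
# make_admin_dirs and check_dbs_files are hypothetical names.

# Create the per-instance directories named in step 1.1.
make_admin_dirs() {
    base=$1   # ORACLE_BASE on the Standby Server
    sid=$2    # ORACLE_SID of the instance
    for d in create cdump bdump udump pfile; do
        mkdir -p "$base/admin/$sid/$d" || return 1
    done
}

# Confirm that the files copied in steps 1.2 and 1.3 are in place.
check_dbs_files() {
    dbs=$1/dbs   # $1 is ORACLE_HOME
    sid=$2
    if [ ! -f "$dbs/init$sid.ora" ] && [ ! -f "$dbs/spfile$sid.ora" ]; then
        echo "no init or spfile for $sid in $dbs" >&2
        return 1
    fi
    if [ ! -f "$dbs/orapw$sid" ]; then
        echo "no password file for $sid in $dbs" >&2
        return 1
    fi
    echo "parameter and password files found for $sid"
}

# Example:
#   make_admin_dirs "$ORACLE_BASE" "$ORACLE_SID"
#   check_dbs_files "$ORACLE_HOME" "$ORACLE_SID"
```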

Configure the Quick Recovery Agent and Oracle iDataAgent

  1. Configure the scratch volume pool. The scratch volume pool for this recovery scenario should consist only of the disk resources attached to the Standby Server. The QR volumes and incremental updates will be written to these volumes. The on-line help contains detailed steps for this operation.
  2. Create a QR Policy. Select the LAN Copy Manager (LANVolCopy) on the Standby Server as the copy manager, and associate this QR Policy with the scratch volume pool that was created in the previous step. Multiple QR Policies may be associated with a given scratch volume pool.
  3. In the Authentication tab of the QR Agent Properties dialog box, add the selected Oracle instance.
  4. Create the QR Agent subclient(s) on the Production Server. Please note the following:
  5. Create an Oracle iDataAgent subclient on the Production Server. As subclient content, select only Archive Logs. Deselect data files.
  6. Start or schedule the QR Volume creation job. In this particular scenario, incremental updates should be scheduled. Also schedule backups for Archive Logs using the Oracle iDataAgent. In general, Archive Log backups should occur often, in between QR Volume creation jobs. There should be at least one more Archive Log backup after the last QR Volume creation job. These backups will be used to recover the database to a point in time after the last snapshot.

    From the QR Volume Creation Advanced Options dialog box, assign each volume on the Production Server to its corresponding destination volume on the Standby Server. The mount path of each standby volume should be the same as its counterpart production volume. For each source raw partition on the Production Server, the corresponding destination raw partition on the Standby Server should be selected in the QR Volume Creation Advanced Options dialog box.

    If any raw partitions are used for data files or control files, you must create the necessary links for appropriate devices on the Standby Server before QR Volume Creation. The soft link pointing to the destination raw partition should be the same as that of the source raw partition.

    For example, if the production database has a data file linked to a raw volume (data file) as follows:

    /ora_data/<ORACLE_SID>/raw.dbf -> /dev/cxbf/rdsk/c1t1d1s1 (source)

    then the links to the destination volume should be created on the Standby Server as follows (after selecting the destination volume in the Advanced tab and before starting the QR Volume creation process):

    /ora_data/<ORACLE_SID>/raw.dbf -> /dev/cxbf/rdsk/c2t1d1s1 (destination)

  7. To test this configuration, allow some scheduled updates to complete. If necessary, execute a few test Oracle transactions between updates to verify that changed/new blocks are being copied to the QR volumes.
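The soft link for a raw partition, described in step 6, can be created as follows. This is a minimal sketch, assuming a POSIX shell; link_raw_datafile is a hypothetical helper name, and the device path in the example is the destination device from the text above.

```shell
#!/bin/sh
# Sketch only: point a data file path at a raw partition on the Standby Server.
# link_raw_datafile is a hypothetical helper.

link_raw_datafile() {
    link=$1     # path Oracle opens, e.g. /ora_data/<ORACLE_SID>/raw.dbf
    device=$2   # destination raw partition chosen for the QR volume
    # Refuse to replace anything that is not already a symbolic link.
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "$link exists and is not a symbolic link" >&2
        return 1
    fi
    mkdir -p "$(dirname "$link")" || return 1
    rm -f "$link"
    ln -s "$device" "$link" || return 1
    ls -l "$link"   # show the resulting link for verification
}

# Example, using the destination device from the text above:
#   link_raw_datafile "/ora_data/$ORACLE_SID/raw.dbf" /dev/cxbf/rdsk/c2t1d1s1
```

The helper refuses to overwrite a regular file so that an existing data file is never removed by mistake.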

Bring the Standby Server online

In cases where the Production Server suffers a failure, or requires downtime, the Standby Server can quickly be brought on-line to host the database from the QR Volumes you have created. The following steps must be taken to add the Standby Server to the network.

  1. Shut down the Production Server and/or remove it completely from the network.
  2. Mount all Oracle volumes on the Standby Server. Set ownership to Oracle user if needed.
  3. Locate the backup.ctl.galaxy file (the backup control file) in one of the archive log destinations of the instance on the Standby Server. (Example location: /ora_logs/admin/sid/arch)

    Copy the backup.ctl.galaxy file to each control file location noted in Step 1 of “Prepare the Standby Server”.

    Example:
    cp backup.ctl.galaxy /cv_control1/control01.ctl
    cp backup.ctl.galaxy /cv_control2/control02.ctl
    cp backup.ctl.galaxy /cv_control3/control03.ctl

    If the control file is on a raw device, then the above example would change to:

    dd if=backup.ctl.galaxy of=/cv_control1/controlraw.ctl

  4. Start the database in the mount mode after connecting as the sysdba user (Example: sys/password as sysdba):

    SQL> startup mount

  5. Restore Archive logs:
    1. Using the CommCell Console, select the Oracle iDataAgent on the Production Server and select Browse.
    2. Select the Oracle instance and click Recover all selected.
    3. Deselect Restore Data and select Restore Archive logs. Click Advanced Options and select Restore to End (clear Start). Select a time before or equal to the last Archive logs backup and click OK.
    4. Change the restore location to the Standby Server. Provide Recovery Catalog login information and click OK to start the restore.
  6. Recover the Database:
    1. Using the CommCell Console select the Oracle iDataAgent on the Production Server and select Browse.
    2. Select the Oracle instance and click Recover all selected.
    3. Deselect Restore Data and select Recover. Click Advanced Options and select Recover to End (clear Start). Select a time before or equal to the last Archive logs you restored. Alternately, you can use SCN.
    4. Change the recover location to the Standby Server. Provide Recovery Catalog login information and click OK to start the recovery.
    5. At the end of the recovery, the database will be altered to OPEN. Verify in the RMAN Log that the operation was successful.
  7. Rename the Standby Server to match the name of the Production Server that has been removed from the network. (Refer to the Solaris user documentation for renaming the machine. One possible way is to use the sys-unconfig command and then delete the DNS entries; when the machine restarts, enter the name and IP address of the Production Server.) If necessary, change the IP address to match the one on the Production Server. Using the old IP address with a new name could cause problems with name resolution. Add the Standby Server to the network if necessary, and configure TCP/IP and any other applicable network settings.

    After bringing up the Standby Server, verify that the destination volumes are mounted back. These are the volumes that were present in the scratch volume pool while creating QR volumes. Also verify that the Oracle user is the owner of all mount points, directories, and files. If necessary, change the ownership to the appropriate Oracle user and group; otherwise your instance might fail to access the files.

    NOTE:
    Do not try to detect volumes in Volume Explorer on either the Production or the Standby Server. The Oracle recovery procedure is not accomplished with Volume Explorer.

    OPTIONAL:
    Add the destination volumes and their mount points to /etc/vfstab on the Standby Server, to ensure the volumes will be mounted back in the event of a re-boot.

After the successful completion of Step 7, the database should be in OPEN mode and ready to use.
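The control file copy in step 3 can be wrapped in a small script that also verifies each copy. This is a minimal sketch, assuming a POSIX shell and control files stored as regular files; restore_controlfiles is a hypothetical helper name. For a control file on a raw device, use dd as shown in step 3 instead.

```shell
#!/bin/sh
# Sketch only: copy the backup control file over each control file location.
# restore_controlfiles is a hypothetical helper; it handles regular files only.

restore_controlfiles() {
    backup=$1   # path to backup.ctl.galaxy
    shift
    if [ ! -f "$backup" ]; then
        echo "backup control file not found: $backup" >&2
        return 1
    fi
    for target in "$@"; do
        cp "$backup" "$target" || return 1
        # Verify the copy byte for byte before moving on.
        cmp -s "$backup" "$target" || return 1
        echo "restored $target"
    done
}

# Example, using the locations from step 3:
#   restore_controlfiles backup.ctl.galaxy \
#       /cv_control1/control01.ctl \
#       /cv_control2/control02.ctl \
#       /cv_control3/control03.ctl
```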


Reinstall the Agent software on the new Production Server

Once your Oracle database is up and running and the computer name changed to the Production Server name, you must reinstall the Agent software if you plan to perform backups from this machine. This is because all configuration settings still refer to the old name of the machine (Standby Server Name).

  1. Uninstall all Agent software from the new Production Server. Do not delete remaining client entries from the CommCell Console.
  2. Reinstall the same modules you had on the Production Server. They will be installed under the new machine name (the Production Server name).
  3. If you have the new Standby Server ready, configure it according to the section above, "Prepare the Standby Server."
  4. Proceed with the section above, "Configure the Quick Recovery Agent and Oracle iDataAgent."


Appendices

Rename a Solaris Machine

There is a utility called sys-unconfig which is used for resetting the network configuration. After using it, reboot the machine and enter the new IP address, hostname, and so on. You can achieve the same effect by editing the files listed below and then rebooting.

/etc/hosts

This is where you specify the IP address of your hostname. Change the hostname here to match your host's new name. It should be the same as what you specify in /etc/nodename.

/etc/nodename

This is just like /etc/HOSTNAME in Linux. It defines the real name of the host. Simply change it to whatever you want to call your host.

/etc/hostname.hme0 (or other interface name)

Change these to line up with what you specified in /etc/hosts.

/etc/net/tic*/hosts

Change everything in here to line up with the files above.

/etc/resolv.conf

Just like Linux. This is where you specify your DNS servers and domain resolution information.

/etc/defaultrouter

Enter the IP address of the default router for your Solaris host. Note that Sun does not create this file by default, nor is there any other location where you may specify a default route.
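The manual edits above can be partly scripted. The following is a minimal sketch, assuming a POSIX shell run as root; rename_host_in_file is a hypothetical helper, and the host names standby1 and prod1 in the example are placeholders. It performs a plain text substitution, so review each file before rebooting.

```shell
#!/bin/sh
# Sketch only: replace the old host name with the new one in a file.
# rename_host_in_file is a hypothetical helper. It does a plain text
# substitution, so it assumes the old name contains no regular-expression
# metacharacters and is not a substring of another name in the file.

rename_host_in_file() {
    old=$1
    new=$2
    file=$3
    cp "$file" "$file.bak" || return 1          # keep a backup copy
    sed "s/$old/$new/g" "$file.bak" > "$file"   # rewrite the file in place
}

# Example (standby1 and prod1 are placeholder host names):
#   for f in /etc/hosts /etc/nodename /etc/hostname.hme0 /etc/net/tic*/hosts; do
#       rename_host_in_file standby1 prod1 "$f"
#   done
```

The IP address entries in /etc/hosts, /etc/resolv.conf, and /etc/defaultrouter are not touched by this helper and still need to be checked by hand.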
