QR Disaster Recovery Solution for Building a UNIX Standby Oracle Server with Log Recovery
Overview
Configuration
Bring the Standby Server online
Reinstall the Agent software on the new Production Server
Appendices
This setup uses three servers: a Production Server, which runs the Oracle
Database; a Catalog Server, on which the Recovery Catalog Database is
configured; and a Standby Server.
The Quick Recovery Agent takes a snapshot of the running instance. All tablespaces
of the database are put in backup mode before the snapshot (Point-In-Time Image)
of the data partitions is taken. As soon as the snapshot is taken, the tablespaces
are taken out of backup mode. Then, after the online redo logs are archived, a
snapshot of the archive log partition(s) is taken. This ensures that any transactions
that occur while the tablespaces are in backup mode are archived. These archived
redo logs can be used later for Quick Recovery.
The Production Server contains the source (Primary) volumes, which are
copied and updated incrementally via the LAN Copy Manager to QR Volumes on the
Standby Server. The data is therefore physically ready to use (no lengthy
restore from tape or other media is necessary) on the Standby Server. The
Standby Server should have the same version of the operating system and the
same version of Oracle installed on the same path as the Production Server.
The Catalog Server hosts a Recovery Catalog Database used to recover the
Production database in case of failure.
In cases where the Production Server suffers a failure, or requires downtime,
the system administrator can remove the Production Server from the network
and then add the Standby Server to the network (with the same name and configuration
as the disconnected Production Server). After the Standby Server is
up, a Recover using Archive Logs can be performed before opening the database. This
will recover the data changed after the last successful QR Volume creation job and
up to the last successful Archive Logs backup.
The key to a successful QR Recovery is properly configuring the Production
and Standby Servers. This ensures that the database(s) can quickly and easily
be run from the QR volumes on the Standby Server.
This procedure is written for bringing up one instance on the Standby Server.
If multiple instances are present on one client, we strongly recommend creating a
separate subclient for each instance and following the same procedure for each
instance individually.
It should be possible to accomplish this procedure between different versions
of operating systems and Oracle databases, as well as different locations for Oracle
installations. However, due to the significant complexity of dealing with all the
additional factors involved, this document only covers the most common scenario.
The following sections discuss preparing the Catalog Database, Production, and Standby
Servers as well as configuring the QR Agent and Oracle iDataAgent. The basic workflow
is described below. For detailed instructions on installation and configuration
options, see Quick Recovery Agent.
To restore and recover the data that has changed after the last successful QR
volume creation job, you must configure the Recovery Catalog. For detailed descriptions
on how to create and configure the Recovery Catalog for the version of Oracle you
are using, refer to Oracle Documentation.
- Create an Oracle Instance on the server you are planning to use as the
Recovery Catalog Server.
- Create a schema and catalog using RMAN.
- Verify that the Target Database (Production Server database) is accessible
using TNSPING. Edit TNSNAMES to correct any connection problems.
- Continue with the configuration of the Production Server. Later, you will
need to register the Target Database with the Recovery Catalog from Production Server.
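The catalog steps above can be sketched as follows. This is a minimal sketch, not the product's own procedure: the catalog owner `rman`, the catalog TNS alias `catdb`, the target alias `CV`, and the password placeholder are hypothetical names, and the catalog owner is assumed to already exist with the RECOVERY_CATALOG_OWNER role. The functions only print RMAN commands so they can be reviewed before being piped into `rman`.

```sh
#!/bin/sh
# Hypothetical names: catalog owner "rman", catalog alias "catdb", target alias "CV".

# Print the RMAN commands that create the recovery catalog in the owner's schema.
create_catalog_cmds() {
    printf '%s\n' 'CREATE CATALOG;' 'EXIT;'
}

# Print the RMAN commands that register the target database in the catalog.
register_target_cmds() {
    printf '%s\n' 'REGISTER DATABASE;' 'EXIT;'
}

# Typical use as the Oracle user (shown as comments, not executed here):
#   tnsping CV                                   # check the target alias resolves
#   create_catalog_cmds  | rman catalog rman/<password>@catdb
#   register_target_cmds | rman target / catalog rman/<password>@catdb
```

The registration command is run from the Production Server, as the last list item notes, once the target database exists.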
The Quick Recovery Agent supports Oracle 9.2. The database must
be in archive log mode for QR Volume creation. We recommend that data files, archive
logs, control files, and online redo logs each reside on a separate partition
(mount point), so that no partition holds more than one of these four types of
files.
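The layout recommendation can be checked with a small helper. This is a sketch under the assumption that you already know the mount point used by each file type; the function name `all_distinct` is hypothetical. The log mode itself can be confirmed in SQL*Plus with `SELECT log_mode FROM v$database;`, which returns ARCHIVELOG when archive log mode is enabled.

```sh
#!/bin/sh
# Succeed only if every argument (a mount point) differs from all the others.
# Expects at least one argument.
all_distinct() {
    total=$#
    # Count the unique values; tr strips any padding that wc may add.
    unique=$(printf '%s\n' "$@" | sort -u | wc -l | tr -d ' ')
    [ "$total" -eq "$unique" ]
}

# Example call with the sample mount points used in this document:
#   all_distinct /cv_data /cv_archlog /cv_control1 /cv_redologs \
#       && echo "each file type has its own mount point"
```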
Following is a sample configuration of an Oracle 9i installation on a Solaris
2.8 machine, which has an Oracle instance, “CV”, running. Oracle requires that the
Standby database be on the same OS level as that on the Production Server.
Mount Point | Description |
/oracle901 | Contains the Oracle 9i installed binaries |
/oracle901 | Is the ORACLE_HOME of instance CV |
/cv_data | Contains all the data files of instance CV |
/cv_archlog | Contains all the archived logs of instance CV |
/cv_control1 | Contains current control file 1 of instance CV |
/cv_control2 | Contains current control file 2 of instance CV |
/cv_redologs | Contains online redo logs of instance CV |
The redo logs and control files are not backed up by the QR Volume creation jobs
since they are constantly changing. In order to take a consistent control file backup,
Oracle provides specific commands to back up online control files. We strongly
recommend placing each instance's data files and archived logs on volumes used
exclusively by that instance. (Example: /cv_data and /cv_archlog are each on a
volume used only by instance CV, and hold no data or archive logs of any other
instance running on that machine.)
- If the Production Server is not already installed with Oracle 9.2, install the desired version of Oracle and apply any required patches.
- Create the required Oracle database.
NOTE: When creating the instances or when installing Oracle, record your
Oracle database configuration and storage locations so that Oracle can be installed
identically on the Standby Server.
- Register the database with the Recovery Catalog. For more information
refer to the Oracle procedure for creating and using Recovery Catalog.
- Install the QR Agent. If the installation detects that you have not already
installed the Base software and the File System iDataAgent prior to installing
the QR Agent, you will be prompted to install them.
- Install the Oracle iDataAgent and configure the instance from the
CommCell Console. Create a subclient that will back up the logs and
control files. If you did not specify the instance during the Oracle iDataAgent
installation, you will need to re-link the database by running the script
Ora_install.sh (in the iDataAgent folder under the software installation).
NOTE: While configuring the instance, specify the connect string to the
Recovery Catalog Database.
- Shut down the Oracle instance to configure the disks used by Oracle as CXBF
devices. Use Volume Explorer to perform this operation. If the devices are mounted
you can select the Unmount before Configure option from the Volume Configuration
dialog. The device will automatically be mounted back. Select Detect Volumes
from Volume Explorer and verify all mount points used by the Oracle instance
are configured and mounted.
- Start up the Oracle instance on the Production Server.
Before installing any software on the Standby Server, consider the following:
- The Standby Server should have adequate resources to host the Oracle
database(s).
- The Standby Server should have the same operating system version
as the Production Server.
- The scratch volumes on the Standby Server must be of equal or greater
size than the Primary volumes on the Production Server.
Install the same version of Oracle in the Standby Server as is installed
on the Production Server. You must use the same Oracle user ID, Oracle group
ID, and the same installation path of Oracle as the Production Server.
- Set up the Oracle instance with the same instance name and configuration
as that of the Production Server:
- Create the required directories (create, cdump, bdump, udump, and pfile)
for the instance in the Standby Server matching the Production
Server. (Example location: $ORACLE_BASE/admin/<ORACLE_SID>/cdump)
- The Oracle database may use either an spfile or an init file to store the
initialization parameters. Copy the init file or spfile (init<ORACLE_SID>.ora or
spfile<ORACLE_SID>.ora) from the Production Server to the Standby Server. Create
the necessary links for the init file if required. (Example location of these
files: $ORACLE_HOME/dbs)
- Copy the password file (example: orapw<ORACLE_SID>) of the instance
from the Production Server to the Standby Server. (Example
location of this file: $ORACLE_HOME/dbs)
- Copy the TNSNAMES file from the Production Server. It contains
information about the Recovery Catalog Database. Copy any other files specific
to your configuration that you may need to connect to Recovery Catalog.
Use TNSPING to verify the connection.
NOTE: Run the Oracle netmgr tool (from the machine console) if Oracle
has a problem resolving the instance name.
- On the Production Server, note the name and location of each
control file. This step is required to copy back the backup control file
to each control file location when bringing up the Standby Server.
Example:
/cv_control1/control01.ctl, /cv_control2/control02.ctl, /cv_control3/control03.ctl
NOTE:
It is very important to follow Step 1 precisely. These files are used for
bringing up the database on the Standby Server in the event the Production
Server goes down. Failure to properly configure the Standby Server
will result in not being able to bring up the Standby Server successfully.
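One way to record the control file locations in Step 1 is to query them on the Production Server with `SELECT name FROM v$controlfile;` in SQL*Plus and keep the result in a file. The helper below is only a sketch for the case where you have the locations as a comma-separated list, as in the example above; the function name `split_control_files` is hypothetical.

```sh
#!/bin/sh
# Print each control file path from a comma-separated list on its own line,
# with surrounding spaces removed.
split_control_files() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -e 's/^ *//' -e 's/ *$//'
}

# Example: save the list for later use when bringing up the Standby Server.
#   split_control_files "/cv_control1/control01.ctl, /cv_control2/control02.ctl" \
#       > control_file_locations.txt
```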
- Install the MediaAgent and Quick Recovery Agent software on the Standby
Server. If the installation detects that you have not already installed
the Base software and the File System iDataAgent prior to installing
the QR Agent, you will be prompted to install them.
- Start up the instance using the nomount option.
- Install the Oracle iDataAgent and configure the instance from
the CommCell Console. Specify the connect string to the Recovery Catalog
Database. You may need to re-link the instance as mentioned above.
- Shut down the Oracle instance on the Standby Server.
- Use Volume Explorer to configure all volumes you are planning to use in
the Volume Scratch Pool on the Standby Server as CXBF devices. Detect
volumes when finished.
- Configure the scratch volume pool. The scratch volume pool for this recovery
scenario should consist only of the disk resources attached to the Standby
Server. The QR volumes and incremental updates will be written to these
volumes. The on-line help contains detailed steps for this operation.
- Create a QR Policy. Select the LAN Copy Manager (LANVolCopy) on the Standby
Server as the copy manager, and associate this QR Policy with the scratch
volume pool that was created in the previous step. Multiple QR Policies may
be associated with a given scratch volume pool.
- In the Authentication tab of the QR Agent Properties dialog
box, add the selected Oracle instance.
- Create the QR Agent subclient(s) on the Production Server. Please
note the following:
- The QR Agent manages Oracle databases at the instance level. Therefore,
different databases can be distributed across multiple subclients.
- Add the Oracle instance to a QR subclient through the Add App
option located in the Subclient Contents property box. Verify all
Oracle volumes containing data files and archive logs are listed. Disks
containing redo log files and control files will not be listed unless they
reside on the same volume as data or archive logs.
- Create an Oracle iDataAgent subclient on the Production Server.
As subclient content, select only Archive Logs. Deselect data files.
- Start or schedule the QR Volume creation job. In this particular scenario,
incremental updates should be scheduled. Also schedule backups for Archive Logs
using the Oracle iDataAgent. In general, Archive Log backups should occur
often, in between QR Volume creation jobs. There should be one more Archive
Log backup after the last QR Volume creation job. These will be used to recover
the database to a point-in-time, which is after the last snap.
From the
QR Volume Creation Advanced Options dialog box, assign each volume on the
Production Server to its corresponding destination volume on the Standby
Server. The mount path of each standby volume should be the same
as its counterpart production volume. For each source raw partition on
the Production Server, the corresponding destination raw partition on
the Standby Server should be selected in the QR Volume Creation Advanced
Options dialog box.
If any raw partitions are used for data files or control files, you must create the necessary links for appropriate devices on the Standby
Server before QR Volume Creation. The soft link pointing to the destination
raw partition should be the same as that of the source raw partition.
For example, if the production database has a data file linked to a raw volume
(data file) as follows:
/ora_data/<ORACLE_SID>/raw.dbf -> /dev/cxbf/rdsk/c1t1d1s1
(source)
then the links to the destination volume should be created on the Standby
Server as follows (after selecting the destination volume in the Advanced
tab and before starting the QR Volume creation process):
/ora_data/<ORACLE_SID>/raw.dbf -> /dev/cxbf/rdsk/c2t1d1s1
(destination)
- To test this configuration, allow some scheduled updates to complete. If
necessary, execute a few test Oracle transactions between updates to verify
that changed/new blocks are being copied to the QR volumes.
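The raw-partition links described above can be created with a small helper like the one below. It is a sketch only: the function name `link_raw_device` is hypothetical, and the device path must be the destination volume you selected in the QR Volume Creation Advanced Options dialog box. The helper refuses to replace a regular file, since that could be a real data file.

```sh
#!/bin/sh
# Create (or re-point) a symbolic link from a data file path to a raw device.
link_raw_device() {
    link=$1
    device=$2
    # Refuse to replace anything that is not already a symbolic link.
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "$link exists and is not a symbolic link" >&2
        return 1
    fi
    mkdir -p "$(dirname "$link")" || return 1
    rm -f "$link"
    ln -s "$device" "$link"
}

# Example on the Standby Server, using the paths from this document
# (replace <ORACLE_SID> with the instance name):
#   link_raw_device /ora_data/<ORACLE_SID>/raw.dbf /dev/cxbf/rdsk/c2t1d1s1
```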
In cases where the Production Server suffers a failure, or requires downtime,
the Standby Server can quickly be brought on-line to host the database from
the QR Volumes you have created. The following steps must be taken to add the
Standby Server to the network.
- Shut down the Production Server and/or remove it completely from
the network.
- Mount all Oracle volumes on the Standby Server. Set ownership to
Oracle user if needed.
- There is a backup.ctl.galaxy (Backup Controlfile) file located in one of
the archive log destinations of the instance on the Standby Server. (Example
location: /ora_logs/admin/sid/arch)
Copy the backup.ctl.galaxy file to each control file location noted
in Step 1 of “Prepare the Standby Server”.
Example:
cp backup.ctl.galaxy /cv_control1/control01.ctl
cp backup.ctl.galaxy /cv_control2/control02.ctl
cp backup.ctl.galaxy /cv_control3/control03.ctl
If the control file is on a raw device, then the above example would change
to:
dd if=backup.ctl.galaxy of=/cv_control1/controlraw.ctl
NOTES:
- This is the copy of control files backed up by the QR Agent.
- Oracle user should copy these control files to the appropriate location.
- Before proceeding to Step 4, set up the Oracle environment (such as
ORACLE_HOME and ORACLE_SID) on the destination machine to match that of
the Production Server.
(Example: export ORACLE_HOME=/oracle901 and ORACLE_SID=CV)
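The copy in Step 3 can be wrapped in a helper that picks `cp` or `dd` per destination, following the two examples above. This is a sketch, with the hypothetical function name `restore_control_files`; it assumes it is run as the Oracle user and that a raw control file location is a device or a link to one.

```sh
#!/bin/sh
# Copy the backup control file to every destination given after it.
# Destinations that are (or link to) a character or block device are written
# with dd; all others are written with cp.
restore_control_files() {
    backup=$1
    shift
    [ -r "$backup" ] || { echo "cannot read $backup" >&2; return 1; }
    for dest in "$@"; do
        if [ -c "$dest" ] || [ -b "$dest" ]; then
            dd if="$backup" of="$dest" || return 1
        else
            cp "$backup" "$dest" || return 1
        fi
    done
}

# Example, using the locations from this document:
#   restore_control_files backup.ctl.galaxy \
#       /cv_control1/control01.ctl /cv_control2/control02.ctl /cv_control3/control03.ctl
```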
- Start the database in the mount mode after connecting as the sysdba user
(Example: sys/password as sysdba):
SQL> startup mount
- Restore Archive logs:
- Using the CommCell Console, select the Oracle iDataAgent on the
Production Server and select Browse.
- Select the Oracle instance and click Recover all selected.
- Deselect Restore Data and select Restore Archive logs.
Click Advanced Options and select Restore to End (clear Start). Select a time
before or equal to the last Archive logs backup and click OK.
- Change the restore location to the Standby Server. Provide Recovery Catalog
login information and click OK to start the restore.
- Recover the Database:
- Using the CommCell Console, select the Oracle iDataAgent on the
Production Server and select Browse.
- Select the Oracle instance and click Recover all selected.
- Deselect Restore Data and select Recover. Click Advanced
Options and select Recover to End (clear Start). Select
a time before or equal to the last Archive logs you restored. Alternatively,
you can use an SCN.
- Change the recover location to the Standby Server. Provide Recovery Catalog
login information and click OK to start the recovery.
- At the end of the recovery, the database will be altered to OPEN. Verify
in the RMAN Log that the operation was successful.
- Rename the Standby Server to match the name of the Production
Server that has been removed from the network. (Refer to the Solaris
user documentation for renaming the machine. One possible way is to use the
sys-unconfig command, then delete DNS entries; when the machine restarts,
use the name and IP address of the Production Server.) If necessary, change the
IP address to match the one on the Production Server. Using the old IP
address with a new name could cause problems with name resolution. Add the Standby
Server to the network if necessary, and configure TCP/IP and any other applicable
network settings.
After bringing up the Standby Server, verify that
the destination volumes are mounted again. These are the volumes that were present
in the scratch volume pool while creating QR volumes. Also verify that the Oracle
user owns all mount points, directories, and files. If necessary, change
the ownership to the appropriate Oracle user and group; otherwise your instance
might fail to access the files.
NOTE:
Do not try to detect volumes in Volume Explorer on either the Production
or the Standby Server. The Oracle recovery procedure is not accomplished
with Volume Explorer.
OPTIONAL:
Add the destination volumes and their mount points to /etc/vfstab on the
Standby Server, to ensure the volumes will be mounted again in the event
of a reboot.
After the successful completion of Step 7, the database should be in OPEN mode
and ready to use.
NOTE:
- If you receive any messages that files cannot be accessed or access is denied
during the startup process, check again that you have mounted all volumes, copied
the files, and set the ownership to Oracle user.
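The ownership check mentioned above can be sketched with `find`. The function name `not_owned_by` is hypothetical, and the user `oracle` and group `dba` in the usage comment are assumptions; use the Oracle user and group from your Production Server.

```sh
#!/bin/sh
# List every file and directory under the given paths that is NOT owned by
# the given user (a user name or a numeric user ID).
not_owned_by() {
    owner=$1
    shift
    find "$@" ! -user "$owner" -print
}

# Example with the sample mount points from this document:
#   not_owned_by oracle /cv_data /cv_archlog /cv_control1 /cv_control2 /cv_redologs
# Anything listed can then be corrected, for example:
#   chown -R oracle:dba /cv_data
```

An empty listing means every file under the given paths already belongs to that user.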
Once your Oracle database is up and running and the computer name changed to
the Production Server name, you must reinstall the Agent software if you plan to
perform backups from this machine. This is because all configuration settings still refer to the
old name of the machine (Standby Server Name).
- Uninstall all Agent software from the new Production Server. Do not
delete remaining client entries from the CommCell Console.
- Reinstall the same modules you had on the Production Server. They
will be installed with the new machine name (that of the Production Server).
- If you have the new Standby Server ready, configure it according
to the section above, "Prepare the Standby Server."
- Proceed with the section above, "Configure the QR Agent and Oracle iDataAgent."
NOTE:
- In most scenarios, the machine that used to be Production Server
could be easily turned into a Standby Server after its maintenance or
repair is completed; follow the procedures above.
The sys-unconfig utility resets the network configuration. After using it,
reboot the machine and enter the new IP address, hostname, and so on. You can
achieve the same effect by editing the files listed below and then rebooting.
/etc/hosts
This is where you specify the IP address of your hostname. Change the hostname
here to match your host's new name. It should be the same as what you specify in
/etc/nodename.
/etc/nodename
This is just like /etc/HOSTNAME in Linux. It
defines the real name of the host. Simply change it to whatever you want to call
your host.
/etc/hostname.hme0 (or other interface name)
Change these to line up with what you specified in /etc/hosts.
/etc/net/tic*/hosts
Change everything in here to line up with the files above.
/etc/resolv.conf
Just like Linux. This is where you specify your DNS servers and domain resolution
information.
/etc/defaultrouter
Enter the IP address of the default router for your Solaris host. Note that Sun
does not create this file by default, nor is there any other location where you
may specify a default route.
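When editing these files by hand, it can help to list which of them still contain the old host name. The helper below is a sketch; the function name `files_mentioning` and the host name `standby01` in the usage comment are hypothetical.

```sh
#!/bin/sh
# Print the names of the given files that contain the given string.
# The string is matched literally (-F), so dots in a name are not wildcards.
files_mentioning() {
    name=$1
    shift
    grep -l -F -e "$name" "$@" 2>/dev/null
}

# Example: look for the old Standby Server name in the files listed above.
#   files_mentioning standby01 /etc/hosts /etc/nodename /etc/hostname.* \
#       /etc/net/tic*/hosts
```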