Advanced - Global Deduplication

Table of Contents

Overview

When to Use Global Deduplication

Deduplicating Remote Office Backups

Seeding Data

Storage Policies with Different Retention Rules

When Not To Use Global Deduplication

Storage Policy Copies with Global Deduplication

Creating Primary Copies

Creating Secondary Copies

Block Size for Global Deduplication

Managing the Global Deduplication Store

Setting up the Minimum Free Space

Setting up an Alert For Free Space

Setting the Age of the Primary Block

Setting up Data Compression

Changing the MediaAgent Hosting the Store

Changing the Location of the Deduplication Store

Configuring Deduplication Store Creation

Sealing Deduplication Store

Deduplication Store Failover

Manually Reconstructing a Store

Rebooting a MediaAgent Hosting the Global Deduplication Store

Configuring Data Paths

Backing up Deduplication Store Database

Configure Alerts for Deduplication Store Backup

License Requirements

Overview

Global deduplication provides a common deduplication store that can be shared by multiple storage policy copies. Data from those copies is deduplicated against each other, which eliminates data that is redundant between the copies and makes deduplication more effective than deduplicating each copy on its own.

The first step in implementing global deduplication is to create a Global Deduplication Policy and associate it with the deduplication-enabled storage policy copies that you want to consolidate. This is explained in Getting Started.

When to Use Global Deduplication

Whether to associate deduplication-enabled storage policy copies with a global deduplication policy depends on your requirements. The following use cases show where it helps.

Deduplicating Remote Office Backups

Consider a setup with multiple remote sites and a centralized data center. Each remote site backs up the internal data using individual storage policies and saves a copy of the backup on the centralized data center. Here, though the redundant data within the individual backup can be eliminated using deduplication on primary copies at the remote site, the secondary copies stored at the data center might still contain redundant data among the copies. This redundant data can be identified and eliminated using global deduplication.

For step-by-step instructions on how to set up remote office backups, see Remote Office Backup Using Global Deduplication.

Seeding Data

Seeding a storage policy configured with global deduplication over a Wide Area Network (WAN) is useful when you need to perform backup or auxiliary copy operations from a MediaAgent computer at a remote office site to a data center. This method seeds the source-side database from the remote office and verifies the data from the data center, instead of transferring large amounts of data from one site to the other.

For step-by-step instructions on how to use the seeding process, see Seeding a Global Deduplication Storage Policy.

Storage Policies with Different Retention Rules

For smaller sites that do not have a large amount of data but require different retention settings, a global deduplication policy provides a common deduplication store that can be shared by multiple storage policy copies with different retention rules. However, all participating storage policy copies share the same data paths, which consist of MediaAgents and disk library mount paths.

When Not To Use Global Deduplication

A global deduplication policy is not required in the following situation:

Storage Policy Copies with Global Deduplication

Global deduplication must be configured at the time of copy creation, for primary as well as secondary copies. This is done by associating the copy to the Global Deduplication Policy. Once configured:

When configuring global deduplication between copies, ensure that the MediaAgents of all copies have access to the deduplication store.

If multiple MediaAgents are involved, then the disk library containing the deduplication store must be configured as a static share library.

Creating Primary Copies

To enable global deduplication for a Primary copy, you must create a new storage policy. Use the following steps to create a storage policy with global deduplication:

  1. From the CommCell Browser, expand Policies.
  2. Right-click Storage Policies, and then click New Storage Policy.
  3. Click Next.
  4. Enter the name in the Storage Policy Name box and click Next.
  5. Under the Use Existing Global Deduplication Policy, click Yes and then click Next.
  6. From Existing Global Deduplication Policy list, click the name of a Global Deduplication Policy and click Next.
  7. Click Finish.
  8. The storage policy with global deduplication policy enabled will appear under Storage Policies node.

Creating Secondary Copies

To enable global deduplication for a Secondary copy, create a new Storage Policy Copy. Use the following steps to create a storage policy copy with global deduplication:

  1. From the CommCell Browser, navigate to Policies | Storage Policies.
  2. Right-click <storage_policy>, point to All Tasks and then click Create New Copy.
  3. Specify name in Copy Name.
  4. From the Library list, click the name of a disk library.
  5. From the MediaAgent list, click the name of a MediaAgent.
  6. Select Enable Deduplication box.
  7. Select Global Deduplication Policy box and click the name of a global deduplication policy from the list.
  8. Click OK.
  9. Click OK to accept the default schedule.
  10. You can view the Secondary Copy in the Storage Policy pane.

Block Size for Global Deduplication

If a storage policy is set with a block size, that block size applies to all copies in the policy except copies with global deduplication, which inherit the block size set in the global deduplication policy. To get the maximum benefit from deduplication, all copies in a storage policy should use the same block size. If one or more copies of a storage policy are associated with a global deduplication policy, configure the storage policy and the global deduplication policy with the same block size.

By default, the block size is 128 KB in both the deduplication-enabled storage policy and the global deduplication policy.
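
To see why matching block sizes matter, the following sketch splits the same data into fixed-size blocks at two different sizes and counts how many block fingerprints the two runs have in common. It is an illustration only: this document does not describe how blocks are fingerprinted, so fixed-size splitting and SHA-256 are assumptions, and block_hashes and common_blocks are hypothetical helper names. It assumes a Linux shell with the GNU coreutils tools split, sha256sum, comm, and seq.

```shell
#!/bin/sh
# Illustration only: fixed-size blocks hashed with SHA-256 (an assumption,
# not the product's documented algorithm).

# block_hashes FILE BLOCK_BYTES: print one sorted, unique hash per block.
block_hashes() {
    bh_dir=$(mktemp -d) || return 1
    split -b "$2" "$1" "$bh_dir/blk."
    for bh_f in "$bh_dir"/blk.*; do
        sha256sum < "$bh_f" | cut -d' ' -f1
    done | sort -u
    rm -rf "$bh_dir"
}

# common_blocks FILE_A SIZE_A FILE_B SIZE_B: count hashes present in both.
common_blocks() {
    cb_a=$(mktemp) || return 1
    cb_b=$(mktemp) || return 1
    block_hashes "$1" "$2" > "$cb_a"
    block_hashes "$3" "$4" > "$cb_b"
    comm -12 "$cb_a" "$cb_b" | wc -l | tr -d ' '
    rm -f "$cb_a" "$cb_b"
}

# Sample data: the first 512 KiB of a deterministic, non-repeating text.
full=$(mktemp)
sample=$(mktemp)
seq 1 200000 > "$full"
head -c 524288 "$full" > "$sample"

same=$(common_blocks "$sample" 131072 "$sample" 131072)   # 128 KB vs 128 KB
mixed=$(common_blocks "$sample" 131072 "$sample" 65536)   # 128 KB vs 64 KB
echo "same block size:  $same shared blocks"
echo "mixed block size: $mixed shared blocks"
rm -f "$full" "$sample"
```

With identical block sizes every block of the sample matches; with different block sizes no block matches, even though the underlying data is the same.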

Use the following steps to configure the block size in a global deduplication policy:

  1. From the CommCell Browser, navigate to Policies | Storage Policies.
  2. Right-click <global_deduplication_policy>, and then click Properties.
  3. Click Advanced tab.
  4. From Block level Deduplication factor list, click the block size to be used for deduplication.
  5. Click OK.

Use the following steps to configure the block size in a deduplication enabled storage policy:

  1. From the CommCell Browser, navigate to Policies | Storage Policies.
  2. Right-click <storage_policy>, and then click Properties.
  3. Click Advanced tab.
  4. From Block level Deduplication factor list, click the block size to be used for deduplication.
  5. Click OK.
 

Managing the Global Deduplication Store

Setting up the Minimum Free Space

You can set the minimum free space that must be available at all times on the volume on which the Deduplication Store is configured. By default, jobs will not continue if the free space on the volume hosting the Deduplication Store is less than 10%.

Use the following steps to set the minimum free space.

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab, and then click the Settings tab.
  4. In the Minimum Free Space box, type or select the minimum amount of free space to maintain.
  5. Click OK.
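
As a rough pre-check outside the CommCell Console, the free-space percentage of the volume can be compared against the threshold from a Linux shell. This is a sketch, not a product command: free_pct and below_min_free are hypothetical helper names, and the 10% default is taken from the description above.

```shell
#!/bin/sh
# free_pct PATH: print the free space of the volume holding PATH as a whole
# percentage, derived from the POSIX-format output of df.
free_pct() {
    df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print 100 - $5 }'
}

# below_min_free FREE_PCT [MIN_PCT]: succeed (exit 0) when FREE_PCT is below
# the minimum, that is, when jobs would not continue. MIN_PCT defaults to 10.
below_min_free() {
    [ "$1" -lt "${2:-10}" ]
}

# Threshold checks that do not depend on the local disk:
below_min_free 8  && echo "8% free: below the minimum"
below_min_free 25 || echo "25% free: above the minimum"
```

To check a real volume, pass the output of free_pct for the store location (a placeholder path here) to the comparison, for example: below_min_free "$(free_pct /path/to/store)" && echo "free space too low".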

Setting up an Alert For Free Space

If the amount of free space on the volume that holds the Deduplication Store falls below the specified amount, the MediaAgent generates an event message and, if configured, the MediaAgents (Disk Space Low) alert.

Use the following steps to set the free space threshold that generates the alert:

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab, and then click the Settings tab.
  4. In the Free Space Warning box, type or select the amount of free space below which the alert is generated.
  5. Click OK.

Setting the Age of the Primary Block

You can set the number of days after which a block cannot be used for new deduplication. Setting this value will ensure that very old blocks are not allowed as the 'origin' data for newer backup jobs that are deduplicated.

Use the following steps to set the number of days after which a block cannot be used for deduplication:

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab, and then click the Settings tab.
  4. In the Do not Deduplicate against objects older than box, type or select the number of days after which a block can no longer be used as a deduplication reference.
      If you do not specify a value, the default is infinite.
  5. Click OK.

Setting up Data Compression

You can enable data compression for all subclients associated with the storage policy copy. Note that this option enables data compression on those subclients even if compression is not enabled at the corresponding subclient level.

Use the following steps to enable data compression for all subclients:

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab, and then click the Settings tab.
  4. Select the Enable Software Compression with Deduplication box.

    This option is enabled by default. It is recommended to have data compression enabled when using deduplication.

    Note that this option supersedes the compression option set in the corresponding subclients.

  5. Click OK.

Changing the MediaAgent Hosting the Store

Perform the following to change the MediaAgent hosting the deduplication store:

  1. Stop the Services
  2. Copy the Deduplication Store Content
  3. Change MediaAgent Hosting Deduplication Store
  4. Start the Services

Stop the Services

Make sure that no SIDB.exe or SIDB2.exe processes are running on the MediaAgent on which the SIDB currently resides. Use the following steps to stop the services:

For Windows:
  1. Click the Start button on the Windows task bar and then click All Programs.
  2. Navigate to bull | Calypso and click Service Control Manager.
  3. Select All Services in Services.
  4. Click Stop to stop all services.

For Linux:

  1. Log on to the computer as root.
  2. Run the following command to stop services:

    Calypso stop
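
After the services are stopped on Linux, a quick check along the following lines can confirm that no deduplication database processes remain. This is a minimal sketch: list_sidb and sidb_running are hypothetical helper names, and matching on the string "sidb" is an assumption about how the processes are named on Linux.

```shell
#!/bin/sh
# list_sidb: read process names on stdin and print those that look like
# deduplication database processes (case-insensitive match on "sidb").
list_sidb() {
    grep -i 'sidb' || true
}

# sidb_running: succeed (exit 0) if any such process is currently running.
sidb_running() {
    [ -n "$(ps -e -o comm= | list_sidb)" ]
}

if sidb_running; then
    echo "SIDB processes are still running; wait before copying the store."
else
    echo "No SIDB processes found."
fi
```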

Copy the Deduplication Store Content

You must manually copy the content of the current Deduplication Store to the new MediaAgent that will host the Deduplication Store. Use the following steps to copy the content:

  You cannot copy the deduplication database (SIDB) from a Windows location to a Linux location, or from a Linux location to a Windows location.
  1. Log in to the MediaAgent hosting the current Deduplication Store.
  2. Navigate to the location of the deduplication database.
  3. Copy the following content to a shared drive:
    • SIDB
    • icl_label.txt files
  4. Log in to the new MediaAgent that will host the Deduplication Store.
  5. From the shared drive, copy the files into the folder that will host the Deduplication Store.

    Make a note of the directory to which the files are copied.

  6. Verify that the copied directory is the same size as the original to ensure that all data was copied.
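
On Linux, the verification in the last step can be sketched as a comparison of the two directory trees. The example below goes slightly further than comparing sizes by also comparing a checksum of every file. tree_digest and same_copy are hypothetical helper names, and the demonstration runs on throwaway directories rather than a real store.

```shell
#!/bin/sh
# tree_digest DIR: print a single checksum over the CRC, size and relative
# path of every file under DIR, so that two copies can be compared.
tree_digest() {
    ( cd "$1" && find . -type f -exec cksum {} + | sort ) | cksum
}

# same_copy SRC DST: succeed when both directories exist and have
# identical file names and contents.
same_copy() {
    [ -d "$1" ] && [ -d "$2" ] &&
        [ "$(tree_digest "$1")" = "$(tree_digest "$2")" ]
}

# Self-contained demonstration on temporary directories.
src=$(mktemp -d)
dst=$(mktemp -d)
printf 'label\n' > "$src/icl_label.txt"
cp -R "$src/." "$dst/"
same_copy "$src" "$dst" && echo "copy verified"
rm -rf "$src" "$dst"
```

For a real store, call same_copy with the original store directory and the copied directory once both are reachable from the same computer.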

Change MediaAgent Hosting Deduplication Store

Use the following steps to change the MediaAgent hosting the Deduplication Store:

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the Primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab.
  4. In the Deduplication Storage Access Path area, click Change Host button.
  5. You will be prompted with a Warning message.

    If you have performed Stop the Services and Copy the Deduplication Store Content steps, then click OK and then click Change Host button.

    If you did not stop the services and did not copy the files, click OK, and then Stop the Services and Copy the Deduplication Store Content and then follow from step 1.

  6. From the Deduplication Storage Access Path dialog box, perform the following:
    • Select <MediaAgent> from MediaAgent Name drop-down list.
    • In the Deduplication Store Location box, type the name of the folder in which the deduplication database must be located, or click Browse to select the folder.

    The store information is displayed in the Deduplication Store Access Path area.

  7. Click OK.

Start the Services

If the old MediaAgent is still used to host deduplication stores for other storage policies and libraries, or for backups, use the following steps to start the services.

For Windows:
  1. Click the Start button on the Windows task bar and then click All Programs.
  2. Navigate to bull | Calypso and click Service Control Manager.
  3. Select All Services in Services.
  4. Click Start to start all services.

For Linux:

  1. Log on to the computer as root.
  2. Run the following command to start services:

    Calypso start

 

Changing the Location of the Deduplication Store

Use the following steps to change the location of the Deduplication Store in the existing MediaAgent:

  1. Make sure that no running jobs are currently accessing the store.
  2. Stop the Services on existing MediaAgent by performing the following:

    For Windows:

    • Click the Start button on the Windows task bar and then click All Programs.
    • Navigate to bull | Calypso and click Service Control Manager.
    • Select All Services in Services.
    • Click Stop to stop all services.

    For Linux:

    • Log on to the computer as root.
    • Run the following command to stop services:

      Calypso stop

  3. Copy the deduplication database files (SIDB, icl_label.txt files) to the new location.
  4. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  5. Right-click the Primary copy displayed in the right pane and click Properties.
  6. Click the Deduplication tab.
  7. In the Deduplication Store Access Path area, select <MediaAgent> and click the Properties button.
  8. You will be prompted with a Warning message.

    If you have stopped the services and copied the store files to new location, then click OK and then click Properties button.

    If you did not stop the services and did not copy the files, click OK, and then follow from step 2.

  9. In the Deduplication Access Path dialog, perform the following:
    • Click Change button.
    • In the Deduplication Store Location box, type the name of the folder in which the deduplication database must be located.
    • Click OK.
  10. In the Deduplication Store Data dialog box, click Yes, if you have completed the steps provided in the dialog box.

    The store information will be displayed in the Deduplication Store Access Path area.

  11. Click OK.
  12. Start the services.

Configuring Deduplication Store Creation

By default, a new Deduplication Store is created for every 100 TB of data. Note that this is the amount of data stored on the media after deduplication.

Use the following steps to configure when a new Deduplication Store is created:

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab.
  4. In the Deduplication Store Creation area, select from the following options:
    • Select Create new store every - days and specify the number of days after which a new Deduplication Store must be created.
    • Select Create new store every - TB and specify the store size at which a new Deduplication Store must be created.
        If both of the above options are set, a new Deduplication Store is created when either condition is satisfied.
    • Select Create new store every - Month(s). Starting from... and specify the month and start date for a new Deduplication Store creation.
  5. Click OK.
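
The "either condition" rule for the days and TB options can be sketched as follows. This is an illustration of the rule as described above, not product code: needs_new_store is a hypothetical name, a limit of 0 stands for an option that is not selected, and treating a limit as reached when the value is equal to it is an assumption.

```shell
#!/bin/sh
# needs_new_store AGE_DAYS SIZE_TB MAX_DAYS MAX_TB
# Succeeds (exit 0) when either configured limit has been reached.
# A limit of 0 means "option not selected" (a convention of this sketch).
needs_new_store() {
    if [ "$3" -gt 0 ] && [ "$1" -ge "$3" ]; then return 0; fi
    if [ "$4" -gt 0 ] && [ "$2" -ge "$4" ]; then return 0; fi
    return 1
}

# Store is 40 days old and holds 100 TB; limits are 90 days and 100 TB.
needs_new_store 40 100 90 100 && echo "create new store (size limit reached)"
# Store is 40 days old and holds 60 TB; neither limit is reached.
needs_new_store 40 60 90 100 || echo "keep current store"
```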

Sealing Deduplication Store

The currently active Deduplication Store can be sealed on-demand.

When a Deduplication Store is sealed:

The option to seal Deduplication Stores is useful in rare cases when there are hardware issues or a disk malfunction. Creating a new store prevents new data from referencing any of the old data on the malfunctioning disks.

Use the following steps to seal the Deduplication Store:

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane, point to All Tasks and then click Seal Deduplication Store.
  3. Click Yes on the Confirm Seal Deduplication Store dialog.

Deduplication Store Failover

You can choose to automatically create a new Deduplication Store if the active store goes offline and a deduplication database backup is not available. When this is configured and an offline active store is detected, that store is automatically sealed and a new store is created.

Use the following steps to automatically create a new Deduplication Store when the active store goes offline and a deduplication database backup is not available:

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab, and then click the Settings tab.
  4. Click Failover to New Store.
  5. Click OK.

Manually Reconstructing a Store

You can choose to recover from an offline deduplication store by manually reconstructing the store. If an offline store is detected, all jobs on that copy are paused until the store is manually reconstructed.

Use the following steps to configure and perform manual reconstruction:

Setting Up Manual Reconstruction as the Default Option

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab, and then click the Settings tab.
  4. Click Pause and Recover current store option.
  5. Click On-Demand.
  6. In the Create recovery points every - Hour(s) box, type or select the frequency of Deduplication Store snapshots.
      You can use the deduplication store snapshot frequency only if you have not already created a subclient for backing up the deduplication database.
  7. Click OK.

Manually Reconstruct a Store

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane, point to All Tasks and then click Recover Store.
  3. In the Reconstruct Dedupe Database Options dialog, select Allow Maximum.
  4. Click OK.

Setting Up Automatic Recovery

When the system detects an offline deduplication store, the recover job automatically runs to restore the deduplication store from the deduplication database backup that was created using the DDB subclient.

See Backing Up Deduplication Store Database for more information on backing up deduplication store.

Use the following steps to revert to the default settings if you have changed the store recovery points.

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the primary copy displayed in the right pane and click Properties.
  3. Click the Deduplication tab, and then click the Settings tab.
  4. Click Pause and Recover current store.
  5. Click Automatically.
  6. In the Create recovery points every - Hour(s) box, type or change the frequency of Deduplication Store snapshots.
      You can use the deduplication store snapshot frequency only if you have not already created a subclient for backing up the deduplication database.
  7. Click OK.

Rebooting a MediaAgent Hosting the Global Deduplication Store

You may want to reboot a MediaAgent for installing updates or maintenance purposes. For MediaAgents controlling the deduplication database, you will have to ensure that all the deduplication transactions in the memory are completed before rebooting. Failure to follow the recommendations might result in the sealing of the Global Deduplication Store, which will increase the amount of storage space consumed in the primary disk library.

Reboot a Windows MediaAgent
  1. Click the Start button on the Windows task bar and then click All Programs.
  2. Navigate to bull | Calypso and click Service Control Manager.
  3. Select All Services in Services.
  4. Click Stop.
  5. When the services are stopped, open the Windows Task Manager.
  6. Select the Processes tab and locate the SIDB.exe or SIDB2.exe process. If either of the processes is located, then wait until the process is complete.

    Depending on the size of the Deduplication database, this process might take as long as 30 minutes to complete.

  7. Once the process is complete and no longer displayed on the task manager, reboot the computer.

Reboot a Unix MediaAgent
  1. Log on to the computer as root and run the following command to stop services:

    Calypso stop

  2. When the services are stopped, type the following command to view all the deduplication processes that are still running.

    ps -aef | grep sidb

  3. If either the SIDB.exe or SIDB2.exe process is found running, then wait until the process is complete.

    Depending on the size of the Deduplication database, this process might take as long as 30 minutes to complete.

  4. Repeat Step 2 to confirm that the processes are no longer running and then reboot the computer.
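
Steps 2 through 4 amount to polling until the processes have exited. A minimal sketch of such a polling loop is shown below. wait_for_exit is a hypothetical helper name, and the polling interval is an arbitrary choice.

```shell
#!/bin/sh
# wait_for_exit CHECK_CMD TIMEOUT_SECS [INTERVAL_SECS]
# Re-run CHECK_CMD until it prints nothing (no matching processes) or the
# timeout elapses. Returns 0 when nothing is left, 1 on timeout.
wait_for_exit() {
    wfe_waited=0
    while [ -n "$(eval "$1")" ]; do
        [ "$wfe_waited" -ge "$2" ] && return 1
        sleep "${3:-10}"
        wfe_waited=$((wfe_waited + ${3:-10}))
    done
    return 0
}

# Harmless stand-in check that prints nothing, so the loop returns at once.
if wait_for_exit "true" 5 1; then
    echo "No matching processes remain."
fi
```

For the procedure above, the check command would be the ps listing from step 2 with a 30-minute limit, for example: wait_for_exit "ps -aef | grep -i '[s]idb'" 1800. The brackets in '[s]idb' keep grep from matching its own command line.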

Configuring Data Paths

A global deduplication copy can be configured to use more than one data path.

  When a storage policy copy is associated with a global deduplication policy, all data paths are strictly inherited from the global deduplication policy. Data paths cannot be added to the storage policy copy.

Use the following steps to add the data path to the global deduplication policy:

  1. From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Deduplication_Policy>.
  2. Right-click the <global deduplication primary copy> displayed in the right pane and click Properties.
  3. Click the Data Paths tab.
  4. Click Add to add one or more data paths.
  5. Click OK.

Backing up Deduplication Store Database

Use the following method to back up the deduplication database so that it can be reconstructed in the unlikely event that the deduplication database goes offline. If this method is not used, the system uses the automatic recovery process described in Setting Up Automatic Recovery to reconstruct the database.

This is the recommended method of protecting the deduplication store database. If there are multiple deduplication databases on the MediaAgent, this method automatically backs up all the deduplication databases.

This method performs a FULL backup of the deduplication database and the backup data is sent to the appropriate backup media based on the storage policy selected for the Deduplication Database Store subclient.

  If the deduplication database is hosted on a Linux Intel Itanium (IA64) machine, deduplication database backups using the DDB subclient are not supported. To back up deduplication databases, use the automatic recovery process described in Setting Up Automatic Recovery.

Use the following steps to set up regular backup of deduplication store:

1. File System iDataAgent must be installed on the MediaAgent hosting the deduplication store.

You can install the File System iDataAgent as a Restore Only Agent without consuming any license.

To do so, make sure to select Install Agents for Restore Only check box from the Select Platforms dialog box during File System iDataAgent installation.

See Getting Started - Windows File System Deployment for step-by-step procedure.
2. From the CommCell Browser, navigate to Client Computers | <MA_client_hosting_global_dedup_store> | File System.

Right-click the defaultBackupSet, point to All Tasks and then click New Subclient.

3.
  • Type a name for the subclient in the Subclient name.
  • Select the DDB Subclient check box.
  DDB Subclient check box is available only when creating new subclients under defaultBackupSet.

4.
  • Click the Storage Device tab.
  • From the Storage Policy list, select a storage policy that does not have deduplication enabled for primary copy.
  • Click OK.
5. Click Schedule and then click OK.
6.
  • By default, the Backup type is always Full.

    Note that other backup types, such as incremental and differential, are not supported.

  • From Job Initiation, select the Schedule option and then click Configure.
7.
  • In the Schedule Name box, type a name of the schedule.
  • Select the appropriate scheduling options.

    For example, use the following steps to create a daily schedule that repeats at a set interval:

    • Type a name for the schedule in the Schedule Name box.
    • Select Daily.
    • Type the Start Time to start the schedule.
    • Click Options >> button.
    • From the Advanced Schedule Options dialog box, select Repeat every check box.

      The default value is set to 8 hours.

    • Click OK.
8. Click OK to close the Schedule Details dialog box.

The new Deduplicate Store Database subclient will be displayed in the right-pane.

9. When the schedule is run, the Job Controller window will display the backup job.
  If the system detects a reboot of a DDB MediaAgent during a DDB backup job, the job goes into the Pending state. After the reboot, the DDB backup job restarts from the beginning by creating a new snapshot of the DDB to back up.
10.
  • From the CommCell Browser, navigate to Policies | Storage Policies | <Global_Dedup_Storage_Policy> where the deduplication store was created.
  • Right-click the global deduplication primary copy displayed in the right pane and click Properties.
11.
  • Click the Deduplication tab, and then click the Settings tab.
  • Ensure that Pause and Recover current store and Automatically options are selected.
  • Click OK.
12. When the system detects an offline deduplication store, the Job Controller window will display the recover job.

Configure Alerts for Deduplication Store Backup

Additionally, you can configure an alert for global deduplication store backup jobs, so that you are notified when a deduplication store backup job fails and when no deduplication backup jobs have run.

Use the following steps to set up an alert for the global deduplication store backup subclient:

  1. From the CommCell Browser, click Control Panel and then double-click the Email and IIS Configuration.
  2. In the Mail Server box, specify the mail server to be used by alerts. The Mail Server must support SMTP messages.
  3. In the Mail Server Port box, select the port number.
  4. In the Mail Server Size limit, specify the size limit per e-mail.
  5. Click OK.
  6. From the CommCell Browser, click Control Panel and then double-click the Alerts.
  7. From the Alert dialog box, click Add button.
  8. In the Add Alert Wizard, specify the following:
    • In the Display Name box, specify the name for the alert.
    • In the Category list box, select Job Management.
    • In the Type list box, select Data Protection.
    • Click Next.
    • From the Entities Selection, navigate to <client_computer> | File System and then select the <Deduplication_Backup_Subclient> and then click Next.
        If you have multiple deduplication backup subclients on multiple MediaAgent(s), select the <deduplication_backup_subclient> of each MediaAgent and then click Next.
    • By default, Job Failed check box is selected which allows you to receive alerts when deduplication backup job fails.

      Select No Data Protection check box to receive alerts when there are no deduplication backup jobs.

      Clear Delayed by 1 Hrs and Job Succeeded with Errors check boxes and then click Next.

    • Select the way in which the alert is to be sent to its intended recipient. Select the Select [Email/Pager] for notification check box.

      If you wish to customize the e-mail or pager notification, click a token from the list and then click Add Token.

    • Select the CommCell users and/or CommCell user groups that will receive the alert, or

      specify the e-mail addresses of the recipients in the Email to Recipients box. These recipients can reside within an external domain.

      Click Next.

    • Verify the options you have selected for the alert in the Summary and then click Finish.

      The alert is now configured and displayed in the Alerts dialog box.

    • Click OK to close the Alerts dialog box.

License Requirements

Global Deduplication requires the following licenses on the MediaAgent hosting the deduplication store, based on the License Type: