TSM 6.3: Setup the Deduplication


Deduplication keeps only a single instance of redundant data in storage (the storage pool must be on a sequential-access device). The advantage is that it saves space (= saves money; a TSM license is not cheap). But it also brings some disadvantages: e.g. it requires at least 16 GB of memory, adds system load, and increases the database size significantly (4-5 times).
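To make the idea concrete, here is a minimal Python sketch of chunk-level deduplication. It is an illustration only: the fixed 4 KiB chunk size, the SHA-256 hash and the names CHUNK_SIZE, dedup_store and rebuild are my own assumptions for this example, not a description of how the TSM server does it internally.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


def dedup_store(data, store=None):
    """Split data into chunks and keep one copy of each distinct chunk.

    Returns (store, recipe): store maps chunk hash -> chunk bytes,
    recipe is the ordered list of hashes needed to rebuild the data.
    """
    if store is None:
        store = {}
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only the first copy is stored
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe):
    """Reassemble the original bytes from the chunk store."""
    return b"".join(store[digest] for digest in recipe)
```

The store grows only when a chunk has not been seen before, while the recipe still lists every chunk, so the original data can always be rebuilt.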

In TSM 6.3, there are two types of deduplication: client-side and server-side. You can find out which one a node uses by running ‘q node nodename f=d’ and checking the Deduplication field. In my example, I will test both. Let’s start with ServerOnly, which is also the default.

image

1) Because only sequential-access devices support deduplication, a device class of type FILE has to be created.

define devclass FILE devtype=file format=drive

image

update dev FILE maxcapacity=10240M

image

2) Create a new storage pool that uses the FILE device class, and make sure to include deduplicate=yes (it is disabled by default).

define stgpool DEDUPPOOL FILE pooltype=primary maxscratch=200 deduplicate=yes

image

3) Create a new folder named deduppool on each of the G:, H: and I: drives.

4) Create the volumes for the DEDUPPOOL storage pool: each volume is 10 GB, with 5 volumes on each drive.

define vol DEDUPPOOL G:\deduppool\dedupvol F=10240 N=5
define vol DEDUPPOOL H:\deduppool\dedupvol F=10240 N=5
define vol DEDUPPOOL I:\deduppool\dedupvol F=10240 N=5

image
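As a quick sanity check on the sizing in step 4, the sketch below builds the same three commands and adds up the preallocated space. The helper names define_volume_commands and total_capacity_gb are made up for this example; only the command text itself comes from the step above.

```python
def define_volume_commands(pool, drives, folder, prefix, size_mb, count):
    """Build one 'define vol' command per drive, in the form used in step 4."""
    return [
        f"define vol {pool} {drive}:\\{folder}\\{prefix} F={size_mb} N={count}"
        for drive in drives
    ]


def total_capacity_gb(drives, size_mb, count):
    """Total preallocated space in GB, taking 1 GB as 1024 MB."""
    return len(drives) * count * size_mb / 1024


commands = define_volume_commands(
    "DEDUPPOOL", "GHI", "deduppool", "dedupvol", 10240, 5
)
for cmd in commands:
    print(cmd)
print(total_capacity_gb("GHI", 10240, 5), "GB in total")
```

With 3 drives, 5 volumes per drive and 10240 MB per volume, the pool has 150 GB of preallocated space.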

5) Change the storage pool destination of the STANDARD copy group (domain FILE, policy set NORMAL, management class DAILY) to DEDUPPOOL, for both backup and archive data.

update copygroup FILE NORMAL DAILY STANDARD destination=DEDUPPOOL
update copygroup FILE NORMAL DAILY STANDARD type=archive destination=DEDUPPOOL

validate policyset FILE NORMAL
activate policyset FILE NORMAL

image

6) Log in to TSM_Client01 and run a full backup of a 1.5 GB file.

image

7) Log in to TSM_Sandbox to check the DB and storage pool status.

image

8) Run another full backup on TSM_Client01 (choose Always backup, as the backup is incremental by default).

image

9) Check the status again. The duplicate data shows 0 (0%). This is because duplicate data is not removed as soon as it is backed up; that happens later, once the data has been copied and reclaimed (steps 11 to 13).

image

q stg DEDUPPOOL f=d

image

10) Check how much data has been identified as duplicate. In my example, it is about 1.6 GB.

q pro

image

11) Duplicate data will not be removed until the primary pool has been backed up to a non-deduplicated copy pool. This behaviour can be changed by setting the server option with ‘setopt deduprequiresbackup no’ (not recommended in a production environment). In my test, I create a 50 GB copy pool to back up DEDUPPOOL.

define stgpool COPYPOOL FILE pooltype=copy maxscratch=200
define vol COPYPOOL J:\copypool\copyvol F=10240 N=5

image

12) Start backing up data from the primary pool to the copy pool.

backup stg DEDUPPOOL COPYPOOL

image

13) The process that removes the duplicate data is called reclamation. It does not start until the reclaim threshold (the percentage of reclaimable space on a volume) has been met; the default is 60%. To make it happen right away, I set the threshold to 1%.

update stg DEDUPPOOL reclaim=1

image

Start the reclaim process: reclaim stg DEDUPPOOL

image
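The two conditions from steps 11 and 13 can be written down as a small Python sketch. This is a simplified model of what I observed, not the server's actual logic: the function names are made up, and whether the comparison is "greater than" or "greater than or equal to" the threshold is an assumption.

```python
def eligible_for_reclaim(pct_reclaimable, threshold=60):
    """A volume qualifies once its reclaimable space reaches the threshold.

    The >= comparison is an assumption made for this sketch.
    """
    return pct_reclaimable >= threshold


def duplicates_can_be_removed(backed_up_to_copy_pool, dedup_requires_backup=True):
    """Step 11: with the default setting, a copy-pool backup must come first."""
    return backed_up_to_copy_pool or not dedup_requires_backup


# A volume that is 50% reclaimable waits under the default threshold of 60,
# but is picked up immediately once the threshold is lowered to 1.
print(eligible_for_reclaim(50))
print(eligible_for_reclaim(50, threshold=1))
```

This is why lowering the threshold to 1 in step 13 lets the reclaim run straight away in a small test setup.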

14) Let’s check the DEDUPPOOL status after the storage pool reclaim process has finished. 50% of the used space has been reclaimed.

image
image

15) Next I am going to enable client-side deduplication. First, add ‘DEDUPLICATION Yes’ to the client dsm.opt file. Second, run the following command in the TSM admin console.

update node TSM_Client01 deduplication=clientorserver

image
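Both settings in step 15 are needed, and the destination pool must have deduplication enabled as well (step 2). A rough Python sketch of that decision follows; the function dedup_location and its return values are hypothetical, and the model ignores any other conditions the real client and server may check.

```python
def dedup_location(node_setting, client_option_yes, pool_dedup_enabled):
    """Return where duplicate data would be identified for a backup.

    node_setting      : value of the node's deduplication parameter
    client_option_yes : True if the client options file enables deduplication
    pool_dedup_enabled: True if the destination pool has deduplicate=yes
    """
    if not pool_dedup_enabled:
        return "none"
    if node_setting.lower() == "clientorserver" and client_option_yes:
        return "client"
    return "server"


print(dedup_location("ServerOnly", True, True))
print(dedup_location("ClientOrServer", True, True))
```

If either the node setting or the client option is missing, the sketch falls back to server-side deduplication.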

16) Run another full backup on TSM_Client01 and look at the report. Only 343.06 KB of data was transferred to TSM_Sandbox for this full backup.

image
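The tiny transfer size makes sense if the client only sends data the server does not already hold. Here is a minimal Python sketch of that idea, with the same caveats as any illustration: the chunk size, the SHA-256 hash and the dict standing in for the server are assumptions, not the real client/server protocol.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size, for illustration only


def client_backup(data, server_chunks):
    """Send only the chunks whose hash the server does not already hold.

    server_chunks is a dict of hash -> chunk that stands in for the server.
    Returns the number of chunk bytes that had to be transferred.
    """
    sent = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in server_chunks:  # server has never seen this chunk
            server_chunks[digest] = chunk
            sent += len(chunk)
    return sent
```

Backing up the same data a second time sends no chunk data at all in this model, because every hash is already known to the server.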

The duplicate data is now 66%.

image
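The 50% in step 14 and the 66% here appear consistent with simple arithmetic, assuming the same unchanged file was backed up in full each time and ignoring metadata overhead. The function name duplicate_fraction is just for this example.

```python
def duplicate_fraction(full_backups):
    """Fraction of backed-up data that is redundant after N identical full backups."""
    if full_backups < 1:
        raise ValueError("need at least one backup")
    return (full_backups - 1) / full_backups


for n in (1, 2, 3):
    print(n, "backups:", f"{duplicate_fraction(n):.1%}")
```

Two identical backups give one redundant copy out of two, and three give two redundant copies out of three.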

17) Now I am going to simulate a scenario in which a volume in the primary pool is damaged and has to be restored from the copy pool.

image

18) Delete G:\DEDUPPOOL\DEDUPVOL003.

image

19) Log in to TSM_Client01 to restore the file. As I have backed it up 3 times, there are 3 versions.

image

20) The restore failed because the data cannot be found in the primary pool.

image

21) Log back in to TSM_Sandbox and mark the deleted volume as destroyed.

update vol G:\DEDUPPOOL\DEDUPVOL003 access=destroyed

image

image

22) Run the restore again. Now it works, as it is restoring from the copy pool.

image

23) Restore the damaged volume from the copy pool.

restore vol G:\DEDUPPOOL\DEDUPVOL003

image

As I have physically deleted the volume from the hard disk, the data is restored to another volume.

image
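Steps 20 to 22 can be summed up in a short Python sketch. It only models what I saw in this test (a restore fails while the missing volume is still marked as usable, and works from the copy pool once it is marked destroyed); the function read_file and its arguments are hypothetical, not server internals.

```python
def read_file(primary_volume_access, primary_file_exists, copy_pool_has_file):
    """Return which pool serves a restore, or raise if none can."""
    if primary_volume_access != "destroyed":
        if primary_file_exists:
            return "primary"
        # The volume is still considered usable, so the restore fails (step 20).
        raise IOError("volume marked as available but the data is missing")
    if copy_pool_has_file:
        return "copy"  # step 22: data is read from the copy pool
    raise IOError("no usable copy of the data")


print(read_file("destroyed", False, True))
```

Marking the volume as destroyed is what lets the restore fall through to the copy pool.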


8 thoughts on “TSM 6.3: Setup the Deduplication”

  1. Hi jakie chen
    We are using TSM 6.3.4 and have enabled server-side deduplication.
    SQL databases are being deduplicated, but Exchange databases are not.

    Please let me know whether any additional parameters need to be modified to get deduplication for Exchange DBs.
