Deduplication keeps only a single instance of redundant data in storage (the storage pool must be on a sequential-access device). The advantage is saved space, which saves money, as the TSM license is not cheap. It also has disadvantages: for example, it requires at least 16 GB of memory, adds system load, and increases the database size significantly (4-5 times).
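To make the "single instance" idea concrete, here is a toy sketch in Python. It is not how TSM works internally: the fixed 4-byte chunk size, the SHA-256 digest, and the function names are all assumptions chosen for illustration only.

```python
import hashlib


def dedup_store(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep one copy of each distinct chunk.

    Toy model only: chunk size and hash function are illustrative assumptions.
    """
    store = {}   # digest -> chunk bytes, each distinct chunk stored once
    recipe = []  # ordered digests needed to rebuild the original data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only the first copy is kept
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the stored chunks."""
    return b"".join(store[digest] for digest in recipe)


data = b"ABCDABCDABCDWXYZ"
store, recipe = dedup_store(data)
print(len(recipe), "chunks referenced,", len(store), "chunks stored")
```

The recipe still lists every chunk in order, which is why the database grows: the references have to be tracked even though the duplicate bytes are stored only once.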
In TSM 6.3, there are two types of deduplication: client-side and server-side. You can find out which one a node uses by running ‘q node nodename f=d’ and checking the Deduplication field. In my example, I will test both. Let’s start with ServerOnly, which is also the default.
1) As only sequential devices support deduplication, a FILE device class has to be created.
define devclass FILE devtype=file format=drive
update dev FILE maxcapacity=10240M
2) Create a new storage pool that uses the FILE device class. Make sure to include deduplicate=yes (it is disabled by default).
define stgpool DEDUPPOOL FILE pooltype=primary maxscratch=200 deduplicate=yes
3) Create a new folder named deduppool on each of the G:, H:, and I: drives.
4) Create the volumes for the DEDUPPOOL storage pool: each volume is 10 GB, with 5 volumes on each drive.
define vol DEDUPPOOL G:\deduppool\dedupvol F=10240 N=5
define vol DEDUPPOOL H:\deduppool\dedupvol F=10240 N=5
define vol DEDUPPOOL I:\deduppool\dedupvol F=10240 N=5
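As a quick check of step 4, the three commands above and the resulting pool capacity can be generated with a small helper. This is only a convenience sketch: the function name and its parameters are hypothetical, and it simply builds the same command strings shown above.

```python
def define_volume_commands(pool, drives, folder, prefix, size_mb, count):
    """Build one 'define vol' command per drive (hypothetical helper)."""
    return [
        f"define vol {pool} {drive}:\\{folder}\\{prefix} F={size_mb} N={count}"
        for drive in drives
    ]


drives = ["G", "H", "I"]
commands = define_volume_commands(
    "DEDUPPOOL", drives, "deduppool", "dedupvol", size_mb=10240, count=5
)
for command in commands:
    print(command)

# 3 drives x 5 volumes x 10240 MB each
total_gb = len(drives) * 5 * 10240 / 1024
print("Total pool capacity:", total_gb, "GB")
```

With 5 volumes of 10 GB on each of the 3 drives, the pool holds 150 GB in total.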
5) Change the storage pool destination of the FILE/NORMAL/DAILY/STANDARD management class to DEDUPPOOL.
update copygroup FILE NORMAL DAILY STANDARD destination=DEDUPPOOL
update copygroup FILE NORMAL DAILY STANDARD type=archive destination=DEDUPPOOL
validate policyset FILE NORMAL
activate policyset FILE NORMAL
6) Log in to the TSM_Client and run a full backup of a 1.5 GB file.
7) Log in to TSM_Sandbox to check the DB and storage pool status.
8) Run another full backup on TSM_Client01 (choose Always backup, as the backup is incremental by default).
9) Check the status again. The duplicate data shows 0 (0%). This is because the deduplicated data will not be removed until data migration or copy happens.
q stg DEDUPPOOL f=d
10) Check how much data has been identified as duplicated data. In my example, it is about 1.6 GB.
q pro
11) Deduplication will not start until the primary pool is backed up to a non-deduplicated copy pool. This can be changed by setting the server option ‘deduprequiresbackup no’ (not recommended in a production environment). In my test, I created a 50 GB copy pool to back up the DEDUPPOOL.
define stgpool COPYPOOL FILE pooltype=copy maxscratch=200
define vol COPYPOOL J:\copypool\copyvol F=10240 N=5
12) Start backing up data from the primary pool to the copy pool.
backup stg DEDUPPOOL COPYPOOL
13) The process that removes the duplicate data is called reclamation. It does not start until the reclaim threshold (the percentage of reclaimable space on a volume) has been met; the default is 60%. To make it happen right away, I set the threshold to 1%.
update stg DEDUPPOOL reclaim=1
Start the reclaim process:
reclaim stg DEDUPPOOL
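The effect of lowering the threshold in step 13 can be pictured with a simplified model. The sketch below assumes the threshold is compared against each volume's percentage of reclaimable space; the volume names and percentages are made up for illustration and do not come from my test system.

```python
def volumes_to_reclaim(reclaimable_pct_by_volume, threshold=60):
    """Return the volumes whose reclaimable space meets the threshold.

    Simplified model: real reclamation has more conditions than this.
    """
    return [
        name
        for name, pct in reclaimable_pct_by_volume.items()
        if pct >= threshold
    ]


# Hypothetical volumes with hypothetical reclaimable-space percentages
volumes = {"VOL_A": 0, "VOL_B": 35, "VOL_C": 50, "VOL_D": 75}

print(volumes_to_reclaim(volumes))               # default threshold of 60
print(volumes_to_reclaim(volumes, threshold=1))  # threshold lowered to 1
```

With the default of 60, only the volume with the most reclaimable space qualifies. With a threshold of 1, every volume that has any reclaimable space is picked up, which is why the process starts right away.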
14) Let’s check the DEDUPPOOL status after the storage pool reclaim process has finished. 50% of the used space has been reclaimed.
15) Next, I am going to enable client-side deduplication. First, add ‘DEDUPLICATION Yes’ to the client dsm.opt file. Second, run the following command in the TSM admin console.
update node TSM_Client01 deduplication=clientorserver
16) Run another full backup on TSM_Client and observe the report. Only 343.06 KB of data was transferred to TSM_Sandbox for this full backup.
The duplicate data is now 66%.
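The 50% and 66% figures are what you would expect when the same file is backed up two and then three times. A rough calculation, assuming every backup is byte-for-byte identical and ignoring metadata overhead (the function name is my own):

```python
def dedup_savings_pct(identical_backups: int) -> float:
    """Percentage of data not stored when N identical backups are deduplicated.

    Idealized model: one copy is kept, the other N - 1 are duplicates.
    """
    if identical_backups < 1:
        raise ValueError("at least one backup is required")
    return 100.0 * (identical_backups - 1) / identical_backups


print(dedup_savings_pct(2))            # two identical backups -> 50.0
print(round(dedup_savings_pct(3), 1))  # three identical backups -> 66.7
```

Real results will differ somewhat, since backups are rarely fully identical and the server stores extra bookkeeping data.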
17) Now I am going to simulate a scenario in which a volume in the primary pool is damaged, and then restore it from the copy pool.
18) Delete G:\DEDUPPOOL\DEDUPVOL003.
19) Log in to TSM_Client01 to restore the file. As I have backed it up 3 times, there are 3 versions.
20) The restore fails because the data cannot be found in the primary pool.
21) Log back in to TSM_Sandbox and mark the deleted volume as destroyed.
update vol G:\DEDUPPOOL\DEDUPVOL003 access=destroyed
22) Run the restore again. Now it works, as it is restoring from the copy pool.
23) Restore the damaged volume from the copy pool.
restore vol G:\DEDUPPOOL\DEDUPVOL003
As I have physically deleted the volume from the hard disk, the data will be restored to another volume.
Great Guide!
Very useful!
wooww..
this is one of the super link for TSM 6.3..
Please keep this updated ..
nice post,go ahead…………
How can I calculate how much space deduplication saves?
Check ‘Duplicate Data Not Stored’ in the output of ‘q stg f=d’.
Jackie,
Super POST thanx
Djamel,
You helped me understand better how deduplication works.
Very good work, please continue.
Hi Jackie Chen,
We are using TSM 6.3.4 and have enabled server-side deduplication.
SQL databases are being deduplicated, but Exchange databases are not.
Please let me know whether we need to modify any additional parameters to get deduplication for Exchange DBs.