ZFS Compression Vs Deduplication (dedup)
I've been playing with ZFS dedupe for the last two weeks and wanted to share my findings.
Setup: OpenSolaris build 131
Sun X4200, 2 x dual-core Opteron 2.6GHz, 8GB RAM, 4 x 73GB SAS 10K rpm
root@osol:~# zfs list
NAME      USED  AVAIL  REFER  MOUNTPOINT
compress   72K  66.9G    21K  /compress
dedupe     72K  66.9G    21K  /dedupe
root@osol:~# zfs set compression=on compress
root@osol:~# zfs set dedup=on dedupe
I wanted to see how well real data would dedupe.
I loaded my company's project and software folders: 68,000 files (Visio, PDF, Project, Word, OpenOffice, Excel, ISOs...), 38.9GB in total.
Load times when copying the files from a local UFS filesystem to each ZFS dataset:
root@osol:/ufs# ptime tar cf - iso projects software | pv | ( cd /dedupe/ ; tar xf - )
real 19:51.930407394
user 5.807881662
sys 1:48.025965013
38.8GB 0:19:51 [33.3MB/s]
root@osol:/ufs# ptime tar cf - iso projects software | pv | ( cd /compress/ ; tar xf - )
real 18:46.544321180
user 3.368262960
sys 1:52.065809786
38.8GB 0:18:46 [35.3MB/s]
The dedupe pool took about 65 seconds longer to load than the compress pool.
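To double-check the gap between the two runs, here is a small sketch that converts the two `real` values reported by ptime into seconds and prints the difference. The helper name `to_seconds` is made up for illustration, and it assumes times are written as `mm:ss.fraction` or as plain seconds.

```shell
# Convert a ptime-style "mm:ss.fraction" (or plain "ss.fraction") value to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; printf "%.2f\n", s }'
}

# "real" times copied from the two tar runs above.
dedupe_real=$(to_seconds 19:51.930407394)
compress_real=$(to_seconds 18:46.544321180)

# Absolute and relative slowdown of the dedupe run versus the compress run.
awk -v d="$dedupe_real" -v c="$compress_real" \
    'BEGIN { printf "dedupe slower by %.1f s (%.1f%%)\n", d - c, (d - c) / c * 100 }'
```

With the numbers above this works out to a gap of about 65.4 seconds, or roughly 5.8% of the compress run.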
Let's see how much space each method saved:
root@osol:/ufs# zpool list
NAME      SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
compress   68G  36.1G  31.9G  53%  1.00x  ONLINE  -
dedupe     68G  38.4G  29.6G  56%  1.02x  ONLINE  -
rpool      67G  49.8G  17.2G  74%  1.00x  ONLINE  -
root@osol:/ufs# zfs get compressratio compress
NAME      PROPERTY       VALUE  SOURCE
compress  compressratio  1.08x  -
The compressed pool did a better job than the dedupe pool, saving about 6% more storage (36.1G allocated versus 38.4G).
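The 6% figure can be sanity-checked straight from the ALLOC column of the zpool list output. The sketch below is only an illustration: `alloc_saving` is a hypothetical helper name, the pool lines are pasted in as text rather than read from a live system, and it assumes both ALLOC values use the same unit suffix.

```shell
# zpool list output for the two test pools, copied from the post.
zpool_output='NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
compress 68G 36.1G 31.9G 53% 1.00x ONLINE -
dedupe 68G 38.4G 29.6G 56% 1.02x ONLINE -'

# Print how much less space pool $1 allocates than pool $2,
# as a percentage of pool $2's allocation.
alloc_saving() {
    echo "$zpool_output" | awk -v a="$1" -v b="$2" '
        $1 == a { alloc_a = $3 + 0 }   # "+ 0" keeps the number and drops the unit letter
        $1 == b { alloc_b = $3 + 0 }
        END { printf "%.1f\n", (alloc_b - alloc_a) / alloc_b * 100 }'
}

alloc_saving compress dedupe
```

For these two pools the result is 6.0, which matches the gap between the 1.08x compression ratio and the 1.02x dedup ratio.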
Conclusion
There isn't any advantage to dedupe on a general home file share: performance is slightly slower and less space is saved than with compression.
So why would you want to dedupe? Well, just look at my dedupe ratio of 2.28x for an NFS share used by VMware. Now this is exciting!
root@osol:~$ zpool list vm-dedupe
NAME       SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
vm-dedupe   68G  16.1G  51.9G  23%  2.28x  ONLINE  -
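As a rough sanity check on what a ratio like that means, multiplying ALLOC by the DEDUP ratio gives an estimate of how much data the pool would hold without dedup. This is only an approximation (it ignores metadata and any blocks that are not in the dedup table), and `logical_estimate` is a hypothetical helper name.

```shell
# Rough estimate of the data referenced before dedup:
# allocated space multiplied by the reported dedup ratio.
logical_estimate() {
    awk -v alloc="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", alloc * ratio }'
}

logical_estimate 16.1 2.28   # vm-dedupe: ALLOC 16.1G at 2.28x
logical_estimate 38.4 1.02   # dedupe:    ALLOC 38.4G at 1.02x
```

For vm-dedupe this comes to about 36.7G of data stored in 16.1G, while the file-share pool comes out at about 39.2G stored in 38.4G.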
Therefore I can only say "Some data is more equal than others."
Andy
(Minor edit 7.02.10)