AR-Dedupe: An Efficient Deduplication Approach for Cluster Deduplication System
XING Yu-xuan1* (邢玉轩), XIAO Nong1 (肖侬), LIU Fang1 (刘芳), SUN Zhen1 (孙振), HE Wan-hui2 (何晚辉)
(1. State Key Laboratory of High Performance Computing, National University of Defense Technology,
Changsha 410073, China; 2. Command Department, Nanjing Artillery Academy, Nanjing 210000, China)
XING Yu-xuan, XIAO Nong, LIU Fang, SUN Zhen, HE Wan-hui. AR-Dedupe: An Efficient Deduplication Approach for Cluster Deduplication System[J]. Journal of Shanghai Jiaotong University (Science), 2015, 20(1): 76-81.