Electronic Information and Electrical Engineering

A Consistency Checking Method for Erasure-Coded Striped Data

1. Shanghai Children’s Medical Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200127, China
2. Shanghai Xiaoyun Info Tech Co., Ltd., Shanghai 200240, China
3. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Received date: 2024-01-24

  Revised date: 2024-02-20

  Accepted date: 2024-02-22

  Online published: 2024-04-30

Abstract

Erasure coding is commonly used in distributed storage systems. In erasure-coded data, the stripe is the basic unit of consistency checking; each stripe consists of multiple original stripe units and verification stripe units. To reduce the read cost of consistency checks on erasure-coded striped data, and to improve the efficiency of both consistency checking and read-after-write, self-correction data tags (SCDTs) are added to each stripe unit when erasure-coded data is written in striping mode, and the consistency check of each stripe is then performed on the basis of these tags. The proposed method can complete the consistency check of a stripe without reading all data units in the stripe, which improves the efficiency of consistency checks by a factor of 1.7 to 2.6. Moreover, when the number of stripe units updated by a write is below a critical value, the method effectively reduces the number of input/output (IO) interactions required for the write. The proposed method therefore handles partial updates of striped data sets better while also improving the efficiency of consistency checks.
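
The idea of checking a stripe from per-unit tags rather than from the unit contents can be illustrated with a minimal sketch. The abstract does not specify what an SCDT contains, so everything below is an assumption made for illustration: the tag layout (a stripe write version plus a CRC32 of the unit), the names `Tag`, `StripeUnit`, `write_stripe`, `stripe_is_consistent`, and `unit_is_intact`, and the use of a single XOR parity unit in place of a real erasure code such as Reed-Solomon. It is not the authors' implementation.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag layout (assumption): write version + CRC32 of the unit."""
    version: int
    crc: int


@dataclass
class StripeUnit:
    payload: bytes
    tag: Tag


def xor_parity(payloads):
    """Single XOR parity over equal-length payloads (stand-in for a real erasure code)."""
    out = bytearray(len(payloads[0]))
    for payload in payloads:
        for i, byte in enumerate(payload):
            out[i] ^= byte
    return bytes(out)


def make_unit(payload, version):
    """Attach a tag to a unit at write time."""
    return StripeUnit(payload, Tag(version, zlib.crc32(payload)))


def write_stripe(data_payloads, version):
    """Write all data units plus one parity unit, tagging each with the same version."""
    payloads = list(data_payloads) + [xor_parity(data_payloads)]
    return [make_unit(p, version) for p in payloads]


def stripe_is_consistent(tags):
    """Tag-only check: every unit must carry the same write version.

    Only the small tags are read here, not the unit payloads.
    """
    return len({t.version for t in tags}) == 1


def unit_is_intact(unit):
    """Per-unit check: the stored CRC must match the payload that was read."""
    return zlib.crc32(unit.payload) == unit.tag.crc


if __name__ == "__main__":
    stripe = write_stripe([b"abcd", b"efgh"], version=1)
    print(stripe_is_consistent([u.tag for u in stripe]))  # full-stripe write

    # A partial update that rewrites one data unit but not the parity unit
    # leaves mixed versions in the stripe, which the tag-only check detects.
    stripe[1] = make_unit(b"EFGH", version=2)
    print(stripe_is_consistent([u.tag for u in stripe]))
```

In this sketch the stripe-level check touches only the tags, which mirrors the abstract's claim that a stripe can be checked without reading all of its data units; how the actual SCDTs achieve this, and how the critical value for partial updates is determined, is described in the full paper rather than here.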

Cite this article

XU Liangye, SHI Lianxing, SHAN Rongsheng. A Consistency Checking Method for Erasure-Coded Striped Data[J]. Journal of Shanghai Jiaotong University, 2024, 58(4): 579-584. DOI: 10.16183/j.cnki.jsjtu.2024.035
