Piggybacking codes built on Maximum Distance Separable (MDS) codes effectively reduce the repair bandwidth overhead of systematic nodes, but they still suffer from a large repair degree and high repair bandwidth for parity nodes. Existing piggybacking codes also ignore the fact that data in practical distributed storage systems differ in how hot or cold they are. To address these problems, a construction of piggybacking codes based on block design is proposed, which provides a higher level of protection for hot data nodes. Specifically, block design is used to group hot and cold data nodes non-uniformly, and the hot data symbols are piggybacked onto the corresponding parity nodes. Cold data parity blocks, hot data parity blocks, and slant parity blocks are then generated according to given rules, so that nodes achieve a low repair bandwidth rate and a low repair degree rate. Theoretical analysis and simulation experiments show that, compared with existing piggybacking codes, the block design-based piggybacking codes significantly reduce the average repair bandwidth rate and the average repair degree rate of failed nodes, and that the repair bandwidth overhead of hot data nodes is lower than that of cold data nodes.