Journal of Shanghai Jiao Tong University


A Line-Drawing-Guided and Transformer-Enhanced U-Net Method

  

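  1. Key Laboratory of Linguistic and Cultural Computing of the Ministry of Education, Northwest Minzu University, Lanzhou 730000, China; 2. Institute of Mathematics and Computer Science, Northwest Minzu University, Lanzhou 730000, China; 3. School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
  • About the author: LI Qiaoqiao (李巧巧, b. 1988), associate professor, works on image processing and deep learning; E-mail: liqq@xbmu.edu.cn
  • Funding:
    National Natural Science Foundation of China (62466053)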

 A Line-Drawing-Guided and Transformer-Enhanced U-Net Method

  1. Key Laboratory of Linguistic and Cultural Computing of the Ministry of Education, Northwest Minzu University, Lanzhou 730000, China; 2. Institute of Mathematics and Computer Science, Northwest Minzu University, Lanzhou 730000, China; 3. School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China

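Abstract: This paper proposes a line-drawing-guided, Transformer-enhanced U-Net (LDG-TEUN) method for inpainting images of Dunhuang murals. First, a cross-attention module that fuses axial attention with two-dimensional positional encoding is embedded in the encoder; by modeling global structure and long-range dependencies, it mitigates the loss of structural information caused by large missing regions. Second, a dual-domain partial convolution (DPConv) unit is designed to model spatial-domain and frequency-domain features jointly, which strengthens the model's ability to render complex textures and edge details and addresses the difficulty of recovering fine detail. Finally, a composite loss function is constructed that constrains training along three dimensions, namely structural consistency, texture fidelity, and plausibility of the color distribution, thereby improving overall restoration quality; in color restoration in particular, it helps bring the result closer to the murals' original historical appearance. Experimental results show that the method achieves better performance in both structural coherence and color restoration, confirming its effectiveness and practical value for the digital restoration of Dunhuang murals.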

Key words: image inpainting, Transformer, U-Net, line-drawing guidance, Dunhuang murals

Abstract: This paper introduces a line-drawing-guided, Transformer-enhanced U-Net (LDG-TEUN) for the digital restoration of Dunhuang murals. First, a cross-attention module that integrates axial attention with two-dimensional positional encoding is embedded in the encoder to capture global structure and long-range dependencies, thereby alleviating the loss of structural information caused by large damaged regions. Second, a dual-domain partial convolution (DPConv) unit is designed to model spatial- and frequency-domain features jointly, improving the reconstruction of complex textures and fine edges and addressing the difficulty of detail recovery. Finally, a composite loss function is formulated to constrain training along three dimensions (structural consistency, texture fidelity, and plausibility of the color distribution), which improves overall restoration quality and, in particular, brings the restored colors closer to the murals' original historical appearance. Experimental results demonstrate that the proposed method outperforms state-of-the-art approaches in both structural coherence and color restoration, confirming its effectiveness and practical value for the digital conservation of Dunhuang murals.

Key words: image inpainting, Transformer, U-Net, line-drawing-guided, Dunhuang murals
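
The abstract does not describe the internals of the DPConv unit. As a point of reference only, the sketch below shows the standard spatial-domain partial convolution (a mask-renormalized convolution followed by a mask update) that the unit's name suggests it builds on. The frequency-domain branch is not modeled. The function name `partial_conv2d`, the single-channel list-of-lists layout, and the zero-padding choice are assumptions made for illustration, not the authors' implementation.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """Single-channel partial convolution with zero padding (illustrative).

    image, mask: H x W lists of lists; mask is 1 for known pixels, 0 for holes.
    kernel: k x k list of lists, k odd.
    Returns (output, updated_mask), both H x W.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    window_size = k * k
    out = [[0.0] * w for _ in range(h)]
    new_mask = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0   # weighted sum over known pixels only
            valid = 0   # number of known pixels under the window
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - r, j + dj - r
                    # Out-of-bounds positions count as holes (zero padding).
                    if 0 <= y < h and 0 <= x < w and mask[y][x]:
                        acc += kernel[di][dj] * image[y][x]
                        valid += 1
            if valid > 0:
                # Rescale by window_size / valid to compensate for missing pixels.
                out[i][j] = acc * (window_size / valid) + bias
                new_mask[i][j] = 1  # position becomes known after this layer
            # Otherwise the output stays 0 and the position remains a hole.
    return out, new_mask


# Hypothetical usage: a constant 5 x 5 image with a 3 x 3 hole in the middle.
img = [[2.0] * 5 for _ in range(5)]
hole = [[0 if 1 <= i <= 3 and 1 <= j <= 3 else 1 for j in range(5)]
        for i in range(5)]
box = [[1.0 / 9.0] * 3 for _ in range(3)]
filled, shrunk = partial_conv2d(img, hole, box)
```

Each application shrinks the hole by one ring of pixels for a 3 x 3 kernel, which is why stacked partial-convolution layers can progressively fill large missing regions from their boundaries inward.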

CLC number: