Journal of Shanghai Jiao Tong University, 2024, Vol. 58, Issue (10): 1606-1617. doi: 10.16183/j.cnki.jsjtu.2023.043

• Electronic Information and Electrical Engineering •

A Transformer-Based Diffusion Model for All-in-One Weather-Degraded Image Restoration

QIN Jing, WEN Yuanbo, GAO Tao, LIU Yao

  1. School of Information Engineering, Chang’an University, Xi’an 710064, China
  • Received: 2023-02-10 Revised: 2023-03-02 Accepted: 2023-03-09 Online: 2024-10-28 Published: 2024-11-01
  • Corresponding author: WEN Yuanbo, doctoral student; E-mail: wyb@chd.edu.cn
  • About the author: QIN Jing (b. 1975), lecturer; current research focuses on signal processing and image processing.
  • Funding:
    National Natural Science Foundation of China (52172379); Fundamental Research Funds for the Central Universities, Chang’an University (300102242901)

Abstract:

Image restoration under adverse weather conditions is important for downstream high-level computer vision tasks. However, most existing image restoration algorithms remove only a single type of weather degradation, and few unified models have been proposed for all-in-one weather-degraded image restoration. To address this, a Transformer-based diffusion model for all-in-one weather-degraded image restoration is proposed, combining the denoising diffusion probabilistic model with the Vision Transformer. First, the weather-degraded image is used as a condition to guide the reverse sampling of the diffusion model, which generates the corresponding clean background image. Second, a subspace transposed Transformer for noise estimation (NE-STT) is proposed, which uses the degraded image and the noisy state to estimate the noise distribution. It comprises a subspace transposed self-attention (STSA) mechanism and a dual grouped gated feed-forward network (DGGFFN). STSA uses subspace transformation coefficients to capture global long-range feature dependencies effectively while significantly reducing the computational burden. DGGFFN employs a dual grouped gating mechanism to enhance the nonlinear representation ability of the feed-forward network. Experimental results on 5 weather-degraded image datasets show that, compared with the recent algorithms All-in-One and TransWeather, the proposed method improves the average peak signal-to-noise ratio by 3.68 and 3.08 dB and the average structural similarity by 2.93% and 3.13%, respectively.
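The conditional reverse sampling described in the abstract can be illustrated with a minimal sketch. It assumes the standard ancestral sampling rule of the denoising diffusion probabilistic model with a linear noise schedule; the abstract does not state the schedule, the number of steps, or the exact update used in the paper, so these are assumptions. The names `make_schedule`, `reverse_step`, `restore`, and `eps_model` are hypothetical, images are flattened to plain lists of floats, and `eps_model` is only a placeholder for a trained noise estimator such as NE-STT, which is not reproduced here.

```python
import math
import random


def make_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed choice); returns (betas, alphas, alpha_bars)."""
    betas = [beta_start + (beta_end - beta_start) * t / max(T - 1, 1) for t in range(T)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a  # cumulative product of alphas
        alpha_bars.append(prod)
    return betas, alphas, alpha_bars


def reverse_step(x_t, cond, t, eps_model, schedule, rng):
    """One reverse step x_t -> x_{t-1}, guided by the degraded image `cond`."""
    betas, alphas, alpha_bars = schedule
    # The noise estimator sees both the noisy state and the degraded image.
    eps_hat = eps_model(x_t, cond, t)
    coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
    mean = [(x - coef * e) / math.sqrt(alphas[t]) for x, e in zip(x_t, eps_hat)]
    if t == 0:
        return mean  # no noise is added at the final step
    sigma = math.sqrt(betas[t])
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


def restore(cond, eps_model, T=50, seed=0):
    """Start from Gaussian noise and run all reverse steps, conditioned on `cond`."""
    rng = random.Random(seed)
    schedule = make_schedule(T)
    x = [rng.gauss(0.0, 1.0) for _ in cond]
    for t in range(T - 1, -1, -1):
        x = reverse_step(x, cond, t, eps_model, schedule, rng)
    return x
```

In practice `eps_model` would be a trained network called as `eps_model(x_t, cond, t)`; any callable with that signature that returns a list of the same length can be plugged in to exercise the sampling loop.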

Key words: computer vision, diffusion model, image restoration, Transformer, weather-degraded image

CLC number: