A Reconstruction Algorithm for Speech Compressive Sensing Using Structural Features

JIA Xiaoli (贾晓立), JIANG Xiaobo (江晓波), JIANG Sanxin (蒋三新), LIU Peilin (刘佩林)

doi:10.16183/j.cnki.jsjtu.2017.09.014

Journal of Shanghai Jiao Tong University, 2017, Vol. 51, Issue 9: 1111-1116

Ordnance Industry

Shanghai Key Laboratory of Navigation and Location-Based Services, Shanghai Jiao Tong University

Published online: 2017-09-20

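Summary

Speech signals are not sparse enough in the transform domain, which makes compressive sensing reconstruction difficult. To address this problem, a reconstruction algorithm that exploits structural features in the frequency domain is proposed. The algorithm introduces two hidden variables, amplitude and state, for the modified discrete cosine transform (MDCT) coefficients of a single speech frame, and models the continuity of the amplitude and of the state along the frequency axis with a Gauss-Markov process and a Markov chain, respectively. On this basis, the joint posterior distribution of the coefficients and their amplitudes and states is represented by a factor graph, on which Turbo message passing iteratively computes the posterior mean of the coefficients; the original speech signal is then reconstructed from these means. Compared with several recent algorithms, the proposed algorithm achieves higher reconstruction accuracy at different frame lengths and compression ratios, and the energy distribution of its reconstructed signal in the spectrogram is the closest to that of the original speech. Exploiting the continuity of the frequency-domain coefficients of speech through Turbo message passing therefore yields high reconstruction accuracy in compressive sensing.

Keywords: speech signal; compressive sensing; Gaussian mixture model; Markov chain; message passing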
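The structured prior and the compressive measurement described above can be illustrated with a minimal Python sketch. It assumes a two-state Markov chain for the state variable, a stationary first-order Gauss-Markov recursion for the amplitude, and an i.i.d. Gaussian measurement matrix. The function names and every parameter value (`p01`, `p10`, `alpha`, `var`) are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def sample_structured_coeffs(n, p01=0.05, p10=0.2, alpha=0.1,
                             mean=0.0, var=1.0, rng=None):
    """Draw n coefficients x[k] = s[k] * theta[k] along the frequency axis.

    s     : binary state (1 = active), first-order Markov chain (assumed form)
    theta : amplitude, stationary Gauss-Markov process (assumed form)
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = p01 / (p01 + p10)          # stationary probability of being active
    s = np.zeros(n, dtype=int)
    theta = np.zeros(n)
    s[0] = rng.random() < lam
    theta[0] = mean + np.sqrt(var) * rng.standard_normal()
    # Innovation std chosen so that Var(theta[k]) stays equal to `var`.
    innov_std = np.sqrt(var * alpha * (2.0 - alpha))
    for k in range(1, n):
        # Probability that bin k is active given the state of bin k-1.
        p_active = (1.0 - p10) if s[k - 1] == 1 else p01
        s[k] = rng.random() < p_active
        theta[k] = (mean + (1.0 - alpha) * (theta[k - 1] - mean)
                    + innov_std * rng.standard_normal())
    return s, theta, s * theta


def compress(x, m, noise_std=0.0, rng=None):
    """Take m < n random linear measurements y = A x + noise (Gaussian A assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    A = rng.standard_normal((m, x.size)) / np.sqrt(m)
    y = A @ x + noise_std * rng.standard_normal(m)
    return A, y
```

Calling `sample_structured_coeffs(256)` followed by `compress(x, 128)` gives a compression ratio of 0.5. Smaller `p01` and `p10` produce longer runs of active or inactive bins, which is the kind of continuity along the frequency axis that the algorithm exploits.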

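Cite this article

JIA Xiaoli, JIANG Xiaobo, JIANG Sanxin, LIU Peilin. A Reconstruction Algorithm for Speech Compressive Sensing Using Structural Features[J]. Journal of Shanghai Jiao Tong University, 2017, 51(9): 1111-1116. DOI: 10.16183/j.cnki.jsjtu.2017.09.014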

Abstract

Speech signals are difficult to reconstruct after compressive sampling because their coefficients in the transform domain are not sparse enough. In this paper, the speech signal is recovered from compressed samples in the frequency domain using structural features. Two hidden variables, amplitude and state, are defined for each modified discrete cosine transform (MDCT) coefficient of the speech signal. The probability density function of the amplitude of each MDCT coefficient is represented by a Gaussian mixture model. The continuity of the states along the frequency axis is modeled by a first-order Markov chain, and the continuity of the amplitudes along the frequency axis is modeled by a Gauss-Markov process. The joint posterior distribution of the coefficients, amplitudes, and states is represented by a factor graph, on which the posterior mean of each coefficient is obtained by Turbo message passing; the speech is then reconstructed from these posterior means. After compressively sampling the MDCT coefficients of a speech segment, we reconstructed the signal with the proposed algorithm and with other state-of-the-art algorithms for comparison. The proposed algorithm achieved the best reconstruction quality across different frame lengths and compression ratios. The spectrograms showed that the energy distribution of the signal reconstructed by the proposed algorithm was the closest to that of the original signal. These results indicate that exploiting continuity along the frequency axis through Turbo message passing improves reconstruction accuracy.

Keywords: speech signal; compressive sensing; Gaussian mixture model; Markov chain; message passing
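As a rough illustration of the Markov-chain side of Turbo message passing, the Python sketch below runs a forward-backward (sum-product) pass over a two-state chain along the frequency axis. It takes per-coefficient local beliefs that each coefficient is active, such as might come from the measurement side of the factor graph, and returns beliefs smoothed by the chain. The function name `smooth_support` and the transition probabilities `p01` and `p10` are hypothetical; the paper's actual message schedule and parameters are not given in the abstract.

```python
import numpy as np


def smooth_support(pi_in, p01, p10):
    """Posterior P(s[k] = 1) on a two-state Markov chain given local beliefs.

    pi_in : local belief that coefficient k is active (values in [0, 1])
    p01   : P(s[k] = 1 | s[k-1] = 0), assumed known, strictly between 0 and 1
    p10   : P(s[k] = 0 | s[k-1] = 1), assumed known, strictly between 0 and 1
    """
    pi_in = np.asarray(pi_in, dtype=float)
    n = pi_in.size
    # T[i, j] = P(s[k] = j | s[k-1] = i)
    T = np.array([[1.0 - p01, p01],
                  [p10, 1.0 - p10]])
    lam = p01 / (p01 + p10)                  # stationary activity probability
    prior = np.array([1.0 - lam, lam])
    like = np.stack([1.0 - pi_in, pi_in], axis=1)   # local evidence per state

    fwd = np.zeros((n, 2))
    bwd = np.ones((n, 2))
    f = prior * like[0]
    fwd[0] = f / f.sum()
    for k in range(1, n):                    # forward pass, low to high frequency
        f = (fwd[k - 1] @ T) * like[k]
        fwd[k] = f / f.sum()
    for k in range(n - 2, -1, -1):           # backward pass, high to low frequency
        b = T @ (like[k + 1] * bwd[k + 1])
        bwd[k] = b / b.sum()

    post = fwd * bwd
    post /= post.sum(axis=1, keepdims=True)
    return post[:, 1]
```

For example, `smooth_support([0.9, 0.9, 0.4, 0.9, 0.9], 0.1, 0.1)` raises the belief for the weak middle coefficient because its neighbors are very likely active. In a full Turbo scheme the chain would return extrinsic messages that exclude each coefficient's own local evidence; this sketch returns full posteriors for simplicity.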
