Journal of Shanghai Jiao Tong University (Science), 2020, Vol. 25, Issue (1): 70-75. doi: 10.1007/s12204-019-2147-6
ZHU Tao (朱涛), CHENG Chunling* (程春玲)
Online: 2020-01-15
Published: 2020-01-12
Contact: CHENG Chunling (程春玲), E-mail: chengcl@njupt.edu.cn
ZHU Tao (朱涛), CHENG Chunling (程春玲). Joint CTC-Attention End-to-End Speech Recognition with a Triangle Recurrent Neural Network Encoder[J]. Journal of Shanghai Jiao Tong University (Science), 2020, 25(1): 70-75.
[1] ANUSUYA M A, KATTI S K. Speech recognition by machine: A review [J]. International Journal of Computer Science and Information Security, 2009, 6(3): 181-205.
[2] RABINER L R. A tutorial on hidden Markov models and selected applications in speech recognition [J]. Proceedings of the IEEE, 1989, 77(2): 257-286.
[3] HINTON G, DENG L, YU D, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups [J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
[4] GRAVES A, FERNÁNDEZ S, GOMEZ F, et al. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks [C]//23rd International Conference on Machine Learning. Pittsburgh, Pennsylvania, USA: ACM, 2006: 369-376.
[5] GRAVES A, JAITLY N. Towards end-to-end speech recognition with recurrent neural networks [C]//31st International Conference on Machine Learning. Beijing, China: W&CP, 2014: 1764-1772.
[6] BAHDANAU D, CHO K H, BENGIO Y. Neural machine translation by jointly learning to align and translate [C]//International Conference on Learning Representations. San Diego, CA, USA: Computational and Biological Learning Society, 2015: 0473.
[7] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]//31st Conference on Neural Information Processing Systems. Long Beach, CA, USA: NIPS, 2017: 5998-6008.
[8] MARKOVNIKOV N, KIPYATKOVA I, LYAKSO E. End-to-end speech recognition in Russian [C]//International Conference on Speech and Computer. Leipzig, Germany: Springer, 2018: 377-386.
[9] SAK H, SENIOR A, BEAUFAYS F. Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition [C]//15th Annual Conference of the International Speech Communication Association. Singapore: ISCA, 2014: 1128.
[10] HANNUN A Y, MAAS A L, JURAFSKY D, et al. First-pass large vocabulary continuous speech recognition using bi-directional recurrent DNNs [EB/OL]. (2014-08-12) [2018-11-08]. https://arxiv.org/pdf/1408.2873.pdf.
[11] MIAO Y, GOWAYYED M, METZE F. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding [C]//IEEE Workshop on Automatic Speech Recognition and Understanding. Scottsdale, AZ, USA: IEEE, 2015: 167-174.
[12] MOHRI M, PEREIRA F, RILEY M. Weighted finite-state transducers in speech recognition [J]. Computer Speech & Language, 2002, 16(1): 69-88.
[13] CHOROWSKI J K, BAHDANAU D, SERDYUK D, et al. Attention-based models for speech recognition [C]//29th Conference on Advances in Neural Information Processing Systems. Montreal, Canada: NIPS, 2015: 577-585.
[14] BAHDANAU D, CHOROWSKI J, SERDYUK D, et al. End-to-end attention-based large vocabulary speech recognition [C]//41st IEEE International Conference on Acoustics, Speech and Signal Processing. Shanghai, China: IEEE, 2016: 4945-4949.
[15] LU L, ZHANG X, CHO K, et al. A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition [C]//Proceedings of the Interspeech. Dresden, Germany: ISCA, 2015: 3249-3253.
[16] ZEILER M D. Adadelta: An adaptive learning rate method [EB/OL]. (2012-12-22) [2018-11-08]. https://arxiv.org/pdf/1212.5701.pdf.
[17] WATANABE S, HORI T, KARITA S, et al. ESPnet: End-to-end speech processing toolkit [C]//Proceedings of the Interspeech. Hyderabad, India: ISCA, 2018: 2207-2211.
[18] POVEY D, GHOSHAL A, BOULIANNE G, et al. The Kaldi speech recognition toolkit [C]//IEEE Workshop on Automatic Speech Recognition and Understanding. Hawaii, USA: IEEE, 2011: 1-4.