J Shanghai Jiaotong Univ Sci ›› 2026, Vol. 31 ›› Issue (2): 282-288. doi: 10.1007/s12204-024-2725-0
Special Issue: Human-Machine Speech Communication
• Automation & Computer Technologies •
Xiao Sujie (肖素杰), Hao Ruipeng (郝锐朋), Cheng Gaofeng (程高峰), Xu Xiaoyan (徐晓艳), Li Ta (黎塔)
Received: 2023-12-19
Accepted: 2024-01-05
Online: 2026-04-01
Published: 2024-04-22
Xiao Sujie, Hao Ruipeng, Cheng Gaofeng, Xu Xiaoyan, Li Ta. EC-BERT: A BERT Language Model with Error Correction for Mandarin Chinese Speech Recognition[J]. J Shanghai Jiaotong Univ Sci, 2026, 31(2): 282-288.
| [1] | Chen Chengxin, Zhang Pengyuan. DSNet: Disentangled Siamese Network with Neutral Calibration for Speech Emotion Recognition [J]. J Shanghai Jiaotong Univ Sci, 2026, 31(2): 248-257. |
| [2] | WU Yalei, LI Jinghua, KONG Dehui, LI Qianxing, YIN Baocai. 3D Hand Pose Estimation Using Semantic Dynamic Hypergraph Convolutional Networks [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(5): 855-865. |
| [3] | DONG Zhaoxian, YU Shuo, SHEN Yanming. Multi-Scale Dynamic Hypergraph Convolutional Network for Traffic Flow Forecasting [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(5): 880-888. |
| [4] | Ma Jin, Ren Ze, Zhang Tongtong, Ding Ying, Lu Yilei, Peng Yinghong. Transformer-Based Contrastive Learning Method for Automated Sleep Stages Classification [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(4): 720-732. |
| [5] | Xiao Wenbo, Xiong Jiakai, Yu Lesheng, He Yinshui, Ma Guohong. Weld Defect Monitoring Based on Two-Stage Convolutional Neural Network [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(2): 291-299. |
| [6] | KE Jing (柯晶), ZHU Junchao (朱俊超), YANG Xin (杨鑫), ZHANG Haolin (张浩林), SUN Yuxiang (孙宇翔), WANG Jiayi (王嘉怡), LU Yizhou (鲁亦舟), SHEN Yiqing (沈逸卿), LIU Sheng (刘晟), JIANG Fusong (蒋伏松), HUANG Qin (黄琴). TshFNA-Examiner: A Nuclei Segmentation and Cancer Assessment Framework for Thyroid Cytology Image [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(6): 945-957. |
| [7] | LI Mingai (李明爱), WEI Lina (魏丽娜). Motor Imagery Classification Based on Plain Convolutional Neural Network and Linear Interpolation [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(6): 958-966. |
| [8] | GENG Zongsheng (耿宗盛), ZHAO Dongdong (赵东东), ZHOU Xingwen (周兴文), YAN Lei (闫磊), YAN Shi (阎石). Leader-Following Consensus of Multi-Agent Systems via Fully Distributed Event-Based Control [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 640-645. |
| [9] | LIU Zengmin (刘增敏), WANG Shentao (王申涛), YAO Lixiu (姚莉秀), CAI Yunze (蔡云泽). Online Multi-Object Tracking Under Moving Unmanned Aerial Vehicle Platform Based on Object Detection and Feature Extraction Network [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(3): 388-399. |
| [10] | ZHANG Yanjun (张彦军), WANG Biyun (王碧云), CAI Yunze (蔡云泽). Multi-Channel Based on Attention Network for Infrared Small Target Detection [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(3): 414-427. |
| [11] | WANG Yujuan (王玉娟), LI Wengang (李文刚), LIU Jianyong (刘建勇), CHEN Guangxue (陈广学), WANG Jun (汪军). Color Prediction Model of Gray Hybrid Multifilament Fabric [J]. J Shanghai Jiaotong Univ Sci, 2023, 28(6): 802-808. |
| [12] | LIU Zhuoran (刘卓然), ZHAO Xu (赵旭). Multilevel Disparity Reconstruction Network for Real-Time Stereo Matching [J]. J Shanghai Jiaotong Univ Sci, 2022, 27(5): 715-722. |
| [13] | SU Chong (宿翀), LÜ Jing (吕晶), ZHANG Danyang (张丹阳), LI Hongguang (李宏光). Affective Preferences Mining Approach with Applications in Process Control [J]. J Shanghai Jiaotong Univ Sci, 2022, 27(5): 737-746. |