Unraveling Predictive Mechanism in Speech Perception and Production: Insights from EEG Analyses of Brain Network Dynamics

doi:10.1007/s12204-024-2729-9

Abstract

How neural networks coordinate to support speech perception and speech production is a frontier research topic in both contemporary neuroscience and artificial intelligence. Although artificial neural networks have successfully incorporated the hierarchical and predictive attributes of biological neural networks (BNNs), substantial disparities persist, particularly in real-time feedback and nonlinear regulation. To better understand how BNNs manifest these attributes, the present study used electroencephalography (EEG) to examine the spatiotemporal brain network dynamics involved in listening to and orally reading identical sentences. These two tasks engage distinct sensorimotor modalities while sharing high-level semantic and syntactic representations. According to a hierarchical feedforward model, low-level auditory and visual inputs would be progressively transformed into abstract representations of sentence meaning, so that brain network patterns converge in higher cognitive regions. However, our findings challenge this view by revealing an early resemblance of network activation in the prefrontal and parietal areas across the two tasks. This implies a top-down predictive mechanism operating alongside the bottom-up progression. This bidirectional interaction could potentially be implemented through frequency-specific synchronization and desynchronization between functionally specific cortical regions, laying the foundation of the speech chain system with common neural substrates.

Key words: speech perception and production, electroencephalography (EEG) techniques, brain network dynamics, predictive coding, frequency multiplexing

Zhao Bin, Dang Jianwu, Li Aijun. Unraveling Predictive Mechanism in Speech Perception and Production: Insights from EEG Analyses of Brain Network Dynamics[J]. J Shanghai Jiaotong Univ Sci, 2026, 31(2): 273-281.
