J Shanghai Jiaotong Univ Sci ›› 2025, Vol. 30 ›› Issue (1): 1-9. doi: 10.1007/s12204-024-2789-x

• Medicine-Engineering Interdisciplinary •

Video-Based Detection of Epileptic Spasms in IESS: Modeling, Detection, and Evaluation

DING Lihui1,2 (丁黎辉), FU Lijun1,3* (付立军), YANG Guang4 (杨光), WAN Lin4,5 (万林), CHANG Zhijun7 (常志军)

1. Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. Laboratory of Big Data and Artificial Intelligence Technology, Shandong University, Jinan 250100, China; 4. Senior Department of Pediatrics, the Seventh Medical Center of Chinese PLA General Hospital, Beijing 100700, China; 5. Department of Pediatrics, the First Medical Center of Chinese PLA General Hospital, Beijing 100853, China; 6. The Second School of Clinical Medicine, Southern Medical University, Guangzhou 510280, China; 7. The National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Accepted: 2024-08-25   Online: 2025-01-28   Published: 2025-01-28

Abstract: Behavioral scoring based on clinical observation remains the gold standard for screening, diagnosing, and evaluating infantile epileptic spasm syndrome (IESS). Accurate identification of seizures is crucial for clinical diagnosis and assessment. In this study, we propose a seizure detection method based on recognizing video features of patient spasms. To capture the temporal characteristics of spasm behavior in the videos, we incorporate asymmetric convolution and convolution–batch normalization–ReLU (CBR) modules. Specifically, within the 3D-ResNet residual blocks, we split the larger convolutional kernels into two asymmetric 3D convolutional kernels connected in series, which enhances the ability of the convolutional layers to extract key local features both horizontally and vertically. In addition, we introduce a 3D convolutional block attention module (3D-CBAM) to strengthen the spatial correlations between video frame channels. To improve generalization, we design a composite loss function that combines cross-entropy loss with triplet loss to balance the classification and similarity requirements. We train and evaluate our method on the PLA IESS-VIDEO dataset, achieving an average seizure recognition accuracy of 90.59%, precision of 90.94%, and recall of 87.64%. To further validate its generalization capability, we conducted external validation on six different patient monitoring videos and compared the results with assessments by six human experts from various medical centers. In this test, our method achieved a recall of 0.6476, surpassing the average achieved by the human experts (0.5595), with an F1-score of 0.7219. These findings are significant for the long-term assessment of patients with IESS.

Key words: infantile epileptic spasm syndrome, video-based seizure analysis, asymmetric convolution, 3D-ResNet, attention mechanism
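
The composite loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract states only that cross-entropy loss is combined with triplet loss; the weighted-sum form, the `weight` and `margin` values, the Euclidean distance, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def composite_loss(logits, label, anchor, positive, negative,
                   weight=0.5, margin=1.0):
    """Assumed form: cross-entropy plus a weighted triplet term."""
    ce = cross_entropy(logits, label)
    tri = triplet_loss(anchor, positive, negative, margin)
    return ce + weight * tri


# Hypothetical inputs: two-class logits and 2-D clip embeddings.
loss = composite_loss(
    logits=[2.0, 0.5], label=0,
    anchor=[0.0, 0.0], positive=[0.1, 0.0], negative=[3.0, 4.0],
)
```

In this sketch the triplet term vanishes once the negative clip is farther from the anchor than the positive clip by at least the margin, leaving only the classification term; this is one way the two requirements could be balanced.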
