Surgical site infections (SSIs) are the most common healthcare-associated infections in patients with lung cancer. Building an SSI risk prediction model for lung cancer requires extracting the relevant risk factors from lung cancer case texts, which involves two text structuring tasks: attribute discrimination and attribute extraction. This article proposes a joint model, Multi-BGLC, for these two tasks: bidirectional encoder representations from transformers (BERT) serves as the encoder, and a decoder composed of a graph convolutional neural network (GCNN), a long short-term memory (LSTM) network, and a conditional random field (CRF) is fine-tuned on cancer case data. The GCNN performs attribute discrimination, whereas the LSTM and CRF perform attribute extraction. Experiments verify the effectiveness and accuracy of the model in comparison with other baseline models.
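The two-branch architecture described in the abstract can be outlined, purely as an illustrative sketch, as follows. It assumes a PyTorch implementation with the HuggingFace transformers library for the BERT encoder and the pytorch-crf package for the CRF layer; the graph construction, layer sizes, bert-base-chinese checkpoint, and the simple summing of the two task losses are placeholder assumptions, not the paper's actual Multi-BGLC configuration.

    # Illustrative sketch of a Multi-BGLC-style joint model (assumptions noted above).
    import torch
    import torch.nn as nn
    from transformers import BertModel   # HuggingFace transformers
    from torchcrf import CRF             # pytorch-crf package


    class SimpleGraphConv(nn.Module):
        """One graph-convolution step: aggregate neighbor features through an
        adjacency matrix over tokens, then apply a linear projection."""

        def __init__(self, in_dim: int, out_dim: int):
            super().__init__()
            self.linear = nn.Linear(in_dim, out_dim)

        def forward(self, x, adj):
            # x:   (batch, seq_len, in_dim) token representations
            # adj: (batch, seq_len, seq_len) token adjacency (construction is an assumption,
            #      e.g. from dependency parses or co-occurrence windows)
            return torch.relu(self.linear(torch.bmm(adj, x)))


    class MultiBGLCSketch(nn.Module):
        def __init__(self, num_attr_classes: int, num_bio_tags: int,
                     bert_name: str = "bert-base-chinese", hidden: int = 256):
            super().__init__()
            self.bert = BertModel.from_pretrained(bert_name)
            dim = self.bert.config.hidden_size
            # Branch 1: GCNN head for attribute discrimination (classification).
            self.gcn = SimpleGraphConv(dim, hidden)
            self.attr_head = nn.Linear(hidden, num_attr_classes)
            # Branch 2: BiLSTM + CRF for attribute extraction (sequence labeling).
            self.lstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
            self.emission = nn.Linear(2 * hidden, num_bio_tags)
            self.crf = CRF(num_bio_tags, batch_first=True)

        def forward(self, input_ids, attention_mask, adj,
                    attr_labels=None, bio_tags=None):
            enc = self.bert(input_ids=input_ids, attention_mask=attention_mask)
            tokens = enc.last_hidden_state               # (B, L, dim)

            # Attribute discrimination: graph convolution + mean pooling + linear head.
            graph_feats = self.gcn(tokens, adj)          # (B, L, hidden)
            attr_logits = self.attr_head(graph_feats.mean(dim=1))

            # Attribute extraction: BiLSTM emissions scored by a CRF.
            lstm_out, _ = self.lstm(tokens)
            emissions = self.emission(lstm_out)          # (B, L, num_bio_tags)
            mask = attention_mask.bool()

            if attr_labels is not None and bio_tags is not None:
                cls_loss = nn.functional.cross_entropy(attr_logits, attr_labels)
                crf_loss = -self.crf(emissions, bio_tags, mask=mask, reduction="mean")
                return cls_loss + crf_loss               # joint training objective
            return attr_logits, self.crf.decode(emissions, mask=mask)

In this sketch the classification and sequence-labeling losses are simply summed for joint training; the paper may weight or schedule the two objectives differently.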
Mi Linhui, Yuan Junyi, Zhou Yankang, Hou Xumin. Text Structured Algorithm of Lung Cancer Cases Based on Deep Learning [J]. Journal of Shanghai Jiaotong University (Science), 2025, 30(4): 778-789.
DOI: 10.1007/s12204-025-2825-5