To address the high feature dimensionality, data redundancy, and low accuracy of single classifiers in ground-glass lung nodule recognition, a recognition method based on CatBoost feature selection and Stacking ensemble learning is proposed. First, a feature selection algorithm retains the important features and removes those with little impact, thereby reducing the data dimensionality. Second, random forest, decision tree, K-nearest neighbor, and light gradient boosting machine classifiers are used as base classifiers, and a support vector machine is used as the meta-classifier to fuse their outputs into the ensemble learning model. This design increases the accuracy of the classification model while maintaining the diversity of the base classifiers. Experimental results show that the recognition accuracy of the proposed method reaches 94.375%, which is 1.875% higher than that of the random forest algorithm, the best-performing single classifier. Compared with recent deep learning methods for ground-glass pulmonary nodule recognition (ResNet+GBM+Attention and MVCSNet), the proposed method performs better or comparably. The experiments indicate that the proposed model can effectively select features and recognize ground-glass pulmonary nodules.
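The two-stage pipeline described in the abstract (importance-based feature selection followed by a Stacking ensemble) can be sketched roughly as follows. This is a minimal illustration using scikit-learn only, not the authors' implementation: the data are synthetic, scikit-learn's `GradientBoostingClassifier` stands in for both the CatBoost feature ranker and the LightGBM base classifier, and the median importance threshold and all hyperparameters are assumptions chosen for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a high-dimensional, redundant feature table
# (assumption: the real nodule features are not reproduced here).
X, y = make_classification(
    n_samples=400, n_features=60, n_informative=10,
    n_redundant=20, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Stage 1: keep features whose importance is at least the median importance.
# Assumption: a scikit-learn gradient-boosting model replaces CatBoost here.
selector = SelectFromModel(
    GradientBoostingClassifier(random_state=0), threshold="median"
)

# Stage 2: four heterogeneous base classifiers fused by an SVM meta-classifier.
# Assumption: GradientBoostingClassifier replaces LightGBM.
base_classifiers = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
    ("gbm", GradientBoostingClassifier(random_state=0)),
]
stack = StackingClassifier(
    estimators=base_classifiers,
    final_estimator=SVC(),
    cv=5,  # out-of-fold base predictions are used to train the meta-classifier
)

model = make_pipeline(selector, stack)
model.fit(X_train, y_train)

n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())
accuracy = model.score(X_test, y_test)
print(f"selected {n_selected} of {X.shape[1]} features, accuracy = {accuracy:.3f}")
```

Training the meta-classifier on out-of-fold predictions (`cv=5`) keeps it from seeing base-classifier outputs on data those classifiers were fitted on, which is the usual way Stacking limits overfitting. The accuracy printed on this synthetic data is not comparable to the figures reported in the abstract.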
Miao Jun, Chang Yiru, Chen Chen, Zhang Maoyuan, Liu Yan, Qi Honggang, Guo Zhijun, Xu Qian. Ground-Glass Lung Nodules Recognition Based on CatBoost Feature Selection and Stacking Ensemble Learning[J]. Journal of Shanghai Jiaotong University (Science), 2025, 30(4): 790-799.
DOI: 10.1007/s12204-024-2761-9