A mortality prediction model is established for a small cohort of acute myocardial infarction (AMI) patients
with a low death rate. In total, 1 639 AMI patients who received treatment in seven tertiary and secondary
hospitals in Shanghai between January 1, 2016 and January 1, 2018 were selected as research subjects.
Among them, 72 patients died during the two-year follow-up. Models are built with ensemble learning
frameworks and machine learning algorithms based on 51 physiological indicators of the patients. The Shapley
additive explanations (SHAP) algorithm and univariate tests with point-biserial and phi correlation coefficients
are employed to determine significant features and rank feature importance. Based on 5-fold cross-validation
experiments and external validation, the prediction model combining the self-paced ensemble framework with the
random forest algorithm achieves the best performance, with an area under the receiver operating characteristic
curve (AUROC) of 0.911 and a recall of 0.864. Both feature ranking methods show that ejection fraction, serum
creatinine (at admission), hemoglobin, and Killip class are the most important features. With these top-ranked
features, the simplified prediction model achieves comparable results, with an AUROC of 0.872 and a recall of
0.818. This work proposes a new method for establishing mortality prediction models for AMI patients based on
the self-paced ensemble framework, which allows models to achieve high performance on a small patient cohort
with a low death rate. It can serve as a new reference to assist medical decision-making and prognosis.
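The univariate feature ranking mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of the point-biserial coefficient (continuous feature against a 0/1 outcome) and the phi coefficient (binary feature against a 0/1 outcome). The function names, the feature names, and the toy values are hypothetical and are not taken from the study's data or code.

```python
import math


def point_biserial(values, labels):
    """Point-biserial correlation between a continuous feature and a 0/1 label."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    n, n1, n0 = len(values), len(pos), len(neg)
    if n1 == 0 or n0 == 0:
        return 0.0  # undefined when one class is empty; report 0 by convention
    mean_all = sum(values) / n
    # Population standard deviation of the feature over all samples.
    std = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if std == 0:
        return 0.0  # a constant feature carries no information
    mean1, mean0 = sum(pos) / n1, sum(neg) / n0
    return (mean1 - mean0) / std * math.sqrt(n1 * n0 / n ** 2)


def phi(binary_feature, labels):
    """Phi coefficient between a 0/1 feature and a 0/1 label (2x2 table)."""
    pairs = list(zip(binary_feature, labels))
    n11 = sum(1 for x, y in pairs if x == 1 and y == 1)
    n10 = sum(1 for x, y in pairs if x == 1 and y == 0)
    n01 = sum(1 for x, y in pairs if x == 0 and y == 1)
    n00 = sum(1 for x, y in pairs if x == 0 and y == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # a margin of the table is empty
    return (n11 * n00 - n10 * n01) / denom


def rank_features(features, labels):
    """Rank features by absolute correlation with the outcome, strongest first.

    `features` maps a name to a (kind, values) pair, where kind is
    "continuous" or "binary" (a hypothetical layout chosen for this sketch).
    """
    scores = []
    for name, (kind, values) in features.items():
        corr = point_biserial(values, labels) if kind == "continuous" else phi(values, labels)
        scores.append((name, abs(corr)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up toy data: 1 = died during follow-up, 0 = survived.
toy_labels = [1, 1, 0, 0]
toy_features = {
    "ejection_fraction": ("continuous", [30, 35, 60, 65]),
    "binary_flag": ("binary", [1, 0, 1, 0]),
}
toy_ranking = rank_features(toy_features, toy_labels)
```

Ranking by absolute value treats strong negative associations (for example, a lower ejection fraction among patients who died) the same as strong positive ones. Any significance testing applied alongside the coefficients is outside this sketch.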
颜铭萱1, 苗雨桐2,3, 盛淑茜1, 甘小莺1, 何奔2, 沈兰2,3. Ensemble Learning-Based Mortality Prediction After Acute Myocardial Infarction [J]. Journal of Shanghai Jiaotong University (Science), 2025, 30(1): 153-165.
DOI: 10.1007/s12204-023-2611-1