Classification and Diagnosis of Cardiac Arrhythmia using an ECG-based Ensemble Approach
Keywords: computer-aided diagnosis, arrhythmia, AI-based clinical decision making
Cardiovascular disease (CVD) remains the leading cause of death both worldwide and in the United States. Approximately 30% of global deaths can be attributed to some form of CVD, a category that includes heart disease, stroke, heart attack, and arrhythmia. In diagnosing CVD, electrocardiograms (ECGs) are commonly used to measure and record the electrical activity of the heart. Because ECGs are non-invasive, informative, and relatively simple, they can be deployed rapidly. However, because ECG interpretation depends solely on the reading physician, it is subjective, which adds a potential source of error to patient care. Studies indicate that physicians often misread ECGs and disagree with each other's interpretations. To develop an accurate and objective method for ECG analysis, this study evaluates several ensemble algorithms for building a supervised classification model, with the aim of identifying one that classifies CVD with sufficiently high accuracy. A boosted decision tree ensemble built to evaluate cardiac condition performs best, with an overall accuracy of 84.6% and an AUC of 0.828.
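For readers less familiar with the two metrics reported above, the following is a minimal sketch of how overall accuracy and AUC can be computed for a binary classifier. It is not the study's evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUC is computed here as the fraction of positive/negative pairs in which the positive example receives the higher score.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 1 = arrhythmia, 0 = normal.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes how well the scores rank positives above negatives across all thresholds, which is why the two numbers can differ noticeably for the same model.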