Detecting hemorrhagic stroke from computed tomographic scans using machine learning models comparison
DOI:
https://doi.org/10.56294/dm2024.548
Keywords:
Hemorrhagic Strokes, Machine learning, Healthcare, Computed tomography
Abstract
Introduction: Stroke is a leading cause of death and disability worldwide, and hemorrhagic stroke is its most dangerous form because it involves bleeding in the brain. Early detection is crucial for timely intervention and effective management. This motivates our research, which aims to develop a reliable and rapid diagnostic support system.
Methods: In this study, the authors combined machine learning (ML) models to detect stroke using a real database of computed tomography (CT) images collected from Moroccan patients. The data were first organized and preprocessed, and features such as intensity, grayscale, and histogram characteristics were then extracted from each image. The extracted features were compressed using several algorithms, including Principal Component Analysis (PCA). The processed data were fed into the machine learning classifiers that the existing literature identifies as the most robust.
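The feature extraction and compression steps described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the synthetic 8-bit images, the choice of four intensity statistics, the 16-bin histogram, and the function names `extract_features` and `pca_fit_transform` are all assumptions made for the example, and PCA is computed directly from a singular value decomposition with NumPy.

```python
import numpy as np

def extract_features(image, n_bins=16):
    """Return intensity statistics plus a normalized histogram for one grayscale image.

    Assumes 8-bit pixel values in [0, 255]; the feature choice is illustrative only.
    """
    pixels = np.asarray(image, dtype=np.float64).ravel()
    stats = np.array([pixels.mean(), pixels.std(), pixels.min(), pixels.max()])
    hist, _ = np.histogram(pixels, bins=n_bins, range=(0.0, 255.0))
    hist = hist / hist.sum()  # normalize so the bins sum to 1
    return np.concatenate([stats, hist])

def pca_fit_transform(features, n_components):
    """Project the rows of `features` onto the top principal components via SVD."""
    centered = features - features.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]

# Synthetic stand-ins for CT slices (hypothetical data, not the study's database).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(20, 64, 64))

feature_matrix = np.stack([extract_features(img) for img in images])
reduced, explained_ratio = pca_fit_transform(feature_matrix, n_components=5)
print(feature_matrix.shape, reduced.shape)
```

In practice the features would usually be standardized before PCA, since intensity statistics and histogram bins live on very different scales. That step is left out here to keep the sketch short.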
Results: The XGBoost model performed best among the compared classifiers, reaching 93% precision under a Leave-One-Subject-Out (LOSO) validation scheme.
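A Leave-One-Subject-Out scheme holds out every image from one patient per fold, so no subject contributes to both training and testing. The sketch below illustrates the idea under stated assumptions: the data are synthetic, the helper names are hypothetical, and a simple nearest-centroid rule stands in for XGBoost so that the example depends only on NumPy. It also computes accuracy and precision separately, since the two metrics are not interchangeable.

```python
import numpy as np

def leave_one_subject_out(subject_ids):
    """Yield (train_idx, test_idx) pairs, holding out all samples of one subject per fold."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier (NOT XGBoost): assign each test row to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    distances = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[distances.argmin(axis=1)]

def precision_score(y_true, y_pred, positive=1):
    """Precision = true positives / all predicted positives."""
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    return tp / (tp + fp) if (tp + fp) else 0.0

# Synthetic data: 6 hypothetical subjects with 10 feature vectors each.
rng = np.random.default_rng(0)
subject_ids = np.repeat(np.arange(6), 10)
labels = (subject_ids >= 3).astype(int)          # subjects 3-5 are the positive class
features = rng.normal(size=(60, 5)) + 3.0 * labels[:, None]

folds = list(leave_one_subject_out(subject_ids))
predictions = np.empty_like(labels)
for train_idx, test_idx in folds:
    predictions[test_idx] = nearest_centroid_predict(
        features[train_idx], labels[train_idx], features[test_idx]
    )

accuracy = float(np.mean(predictions == labels))
precision = float(precision_score(labels, predictions))
print(f"folds={len(folds)} accuracy={accuracy:.2f} precision={precision:.2f}")
```

With scikit-learn available, `sklearn.model_selection.LeaveOneGroupOut` produces the same kind of split when the subject identifiers are passed as `groups`, and a gradient-boosted classifier can replace the stand-in.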
Conclusion: This study is a step forward in improving patient healthcare by enabling early detection, which could lead to timely, potentially life-saving interventions.
License
Copyright (c) 2024 Zaynab Boujelb, Ahmed Idrissi, Achraf Benba, El Mahjoub Chakir (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Unless otherwise stated, associated published material is distributed under the same licence.