An efficient prediction system for diabetes disease based on machine learning algorithms

Authors

  • Mariame Oumoulylte, Laboratory of Applied Sciences; Team: SDIC; National School of Applied Sciences Al-Hoceima, Abdelmalek Esaadi University, Tétouan, Morocco
  • Abdelkhalak Bahri, Laboratory of Applied Sciences; Team: SDIC; National School of Applied Sciences Al-Hoceima, Abdelmalek Esaadi University, Tétouan, Morocco
  • Yousef Farhaoui, L-STI, T-IDMS, FST Errachidia, Moulay Ismail University of Meknes, Morocco
  • Ahmad El Allaoui, L-STI, T-IDMS, FST Errachidia, Moulay Ismail University of Meknes, Morocco

DOI:

https://doi.org/10.56294/dm2023173

Keywords:

Machine Learning, Healthcare, Diabetes Disease, KNN, Naive Bayes, SVM, Decision Tree, Random Forest, Logistic Regression

Abstract

Diabetes is a chronic medical condition that arises when the pancreas loses its ability to produce insulin or when the body cannot effectively use the insulin it produces. Diabetes is today one of the most prevalent diseases and, because of its complications, one of the deadliest. Timely detection plays a crucial role in facilitating treatment and preventing the disease from advancing further. In this study, we developed a diabetes prediction model using several machine learning classification algorithms, namely K-Nearest Neighbors (KNN), Naive Bayes, Support Vector Machine (SVM), Decision Tree, Random Forest, and Logistic Regression, to determine which algorithm yields the most accurate predictions. We used the well-known PIMA Indians Diabetes dataset, which comprises 768 instances with nine attributes (eight diagnostic measurements and one outcome label). The purpose of this dataset is to determine whether a patient has diabetes based on the diagnostic measurements it contains. To prepare the data for analysis, we applied a series of preprocessing steps. We evaluated performance using accuracy, precision, recall, and the F1 score. Our experimental results indicate that the K-Nearest Neighbors (KNN) algorithm outperforms the other algorithms at distinguishing between individuals with and without diabetes in the PIMA dataset.
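As an illustration of the evaluation described in the abstract, the following minimal Python sketch classifies a few points with a hand-written K-Nearest Neighbors vote and then computes accuracy, precision, recall, and the F1 score. It is not the authors' implementation: the function names `knn_predict` and `binary_metrics`, the choice `k=3`, and the two-feature data points are all hypothetical and chosen only to keep the example small. The study itself used the PIMA dataset and its own preprocessing, neither of which is reproduced here.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 computed from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data (NOT taken from the PIMA dataset): label 1 = positive class.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.5), (7.0, 6.0), (6.5, 7.5)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(1.2, 1.4), (6.8, 6.9), (2.2, 2.0), (5.5, 6.0), (4.2, 4.2)]
test_y = [0, 1, 0, 1, 0]

y_pred = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
metrics = binary_metrics(test_y, y_pred)
print(y_pred)
print(metrics)
```

In this toy run the last test point lies between the two clusters and is voted into the positive class, so it counts as one false positive. This lowers precision while leaving recall at 1.0, which shows why the four metrics are reported together rather than accuracy alone.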

Published

2023-12-20

Section

Original

How to Cite

1. Oumoulylte M, Bahri A, Farhaoui Y, El Allaoui A. An efficient prediction system for diabetes disease based on machine learning algorithms. Data and Metadata [Internet]. 2023 Dec. 20 [cited 2024 Dec. 21];2:173. Available from: https://dm.ageditor.ar/index.php/dm/article/view/119