Predictive analytics in education: machine learning approaches and performance metrics for student success – a systematic literature review
DOI:
https://doi.org/10.56294/dm2025730
Keywords:
Student Performance, Educational Data, Machine Learning, Deep Learning, Ensemble Models
Abstract
Higher education institutions rely on student performance to improve grades and enhance academic outcomes. Universities face challenges in evaluating student achievement, providing high-quality instruction, and analyzing performance in a dynamic and competitive context. Accurate forecasting remains difficult, however, because research on prediction techniques and on the critical factors influencing performance is limited. Educational data combined with machine learning has the potential to improve the learning environment. Ensemble models in educational data mining enhance accuracy and robustness by combining predictions from multiple models, and approaches such as bagging and boosting can help mitigate the risk of overfitting. Machine learning techniques employed in performance prediction include Support Vector Machines, Random Forests, K-Nearest Neighbors, Artificial Neural Networks, Decision Trees, and Convolutional Neural Networks. In this study, we examined 85 papers on student performance prediction using machine learning, data mining, and deep learning techniques. The analysis underscores the importance of various factors in forecasting academic performance and offers insights for improving educational strategies and interventions in higher education.
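The bagging idea mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from any of the reviewed papers: the data are synthetic (one hypothetical exam score per student, with a pass at 50 or above), and the helper names (`make_records`, `fit_stump`, `fit_bagged_stumps`, `predict`, `accuracy`) are assumptions introduced here for illustration. Each model is a one-threshold decision stump trained on a bootstrap resample, and the ensemble predicts by majority vote.

```python
import random


def make_records(n=100):
    """Synthetic, hypothetical data: (score, label) with label 1 if score >= 50."""
    return [(score, int(score >= 50)) for score in range(n)]


def fit_stump(sample):
    """Return the threshold, chosen among the sampled scores, with the fewest errors."""
    best_threshold, best_errors = None, None
    for threshold in sorted({score for score, _ in sample}):
        # The stump predicts "pass" (1) when score >= threshold.
        errors = sum(int(score >= threshold) != label for score, label in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def fit_bagged_stumps(records, n_models=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_models):
        sample = [rng.choice(records) for _ in records]
        stumps.append(fit_stump(sample))
    return stumps


def predict(stumps, score):
    """Majority vote over the individual stump predictions."""
    votes_for_pass = sum(score >= threshold for threshold in stumps)
    return int(2 * votes_for_pass > len(stumps))


def accuracy(stumps, records):
    """Fraction of records the ensemble classifies correctly."""
    return sum(predict(stumps, s) == y for s, y in records) / len(records)


records = make_records()
stumps = fit_bagged_stumps(records)
print(f"bagged accuracy on the synthetic data: {accuracy(stumps, records):.2f}")
```

An odd `n_models` avoids tied votes. Boosting differs in that the models are trained sequentially, with each one giving more weight to the examples its predecessors misclassified; it is not shown here.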
License
Copyright (c) 2025 Shoukath T K, Midhun Chakkaravarthy (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
The article is distributed under the Creative Commons Attribution 4.0 License. Unless otherwise stated, associated published material is distributed under the same licence.
