Development of a new predictive hiring system with multi-model voting sets and advanced stacking techniques for assessing semantic soft skills

Authors

  • Asmaa Lamjid Department of Computer Science, Intelligent Processing Systems & Security Team, Faculty of Sciences, Mohammed V University in Rabat, Morocco https://orcid.org/0009-0004-2548-6526
  • Anass Ariss Department of Computer Science, Intelligent Processing Systems & Security Team, Faculty of Sciences, Mohammed V University in Rabat, Morocco https://orcid.org/0000-0002-5215-6862
  • Jamal Mabrouki Laboratory of Spectroscopy, Molecular Modelling, Materials, Nanomaterial, Water and Environment, CERNE2D, Faculty of Science, Mohammed V University in Rabat, Rabat, Morocco https://orcid.org/0000-0002-3841-7755
  • Karim El Bouchti Department of Computer Science, Intelligent Processing Systems & Security Team, Faculty of Sciences, Mohammed V University in Rabat, Morocco
  • Soumia Ziti Department of Computer Science, Intelligent Processing Systems & Security Team, Faculty of Sciences, Mohammed V University in Rabat, Morocco https://orcid.org/0000-0002-5357-9170

DOI:

https://doi.org/10.56294/dm20261287

Keywords:

predictive hiring, soft skills, machine learning, embedding models, semantic database, AI in HR, automation of the recruitment process, BERT

Abstract

Human resources departments face a major challenge in extracting and identifying semantic correspondences in heterogeneous data, in particular the soft skills that most recruiters seek. The complexity lies in identifying the relationships between textual descriptions in CVs, keywords, descriptions in professional networks, and relevant soft skills such as communication, persuasion, negotiation, relationship building, empathy, teamwork, conflict resolution, emotional intelligence, time management, and work ethic. After analysis and research, we chose these soft skills as input data because they encompass all the soft skills that a recruiter might look for in a candidate for any position. The present study introduces a predictive hiring system that assesses candidate performance based on soft skills extracted from three main sources: resumes, professional social network profiles, and psychometric assessments. A dataset of over one million candidate records was processed. Data analysis relied on state-of-the-art NLP techniques, including word embeddings and contextual language models, to build a semantic database linking keywords, phrases, and descriptions to the targeted soft skills. Machine learning and deep learning models were applied, followed by an ensemble approach integrating KNN, Decision Tree, and Random Forest. To address the limitations of these models in prediction accuracy and overfitting, an XGBoost meta-model was developed, which achieved an accuracy of 98%. The results indicate that the proposed meta-model outperforms the baseline approaches, delivering high predictive accuracy and robust generalization. These findings highlight the potential of combining semantic analysis with advanced machine learning to support more reliable and scalable predictive recruitment systems.
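
The ensemble design described in the abstract (a voting set over KNN, Decision Tree, and Random Forest, then a stacked meta-model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic (ten features standing in for per-candidate soft-skill scores), the hyperparameters are arbitrary, and scikit-learn's GradientBoostingClassifier is swapped in for XGBoost so that the example depends on scikit-learn only. It is not the authors' implementation and does not reproduce the reported accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for soft-skill scores (assumption: 10 features,
# binary hire / no-hire label). This is NOT the study's dataset.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def base_learners():
    """Fresh copies of the three base models named in the abstract."""
    return [
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
    ]


# Hard voting: each base model casts one vote and the majority class wins.
voting = VotingClassifier(estimators=base_learners(), voting="hard")
voting.fit(X_train, y_train)
voting_acc = accuracy_score(y_test, voting.predict(X_test))

# Stacking: out-of-fold predictions of the base models (5-fold CV) become
# the input features of a meta-model. A gradient-boosted tree classifier
# stands in for XGBoost here (assumption).
stacking = StackingClassifier(
    estimators=base_learners(),
    final_estimator=GradientBoostingClassifier(random_state=0),
    cv=5,
)
stacking.fit(X_train, y_train)
stacking_acc = accuracy_score(y_test, stacking.predict(X_test))

print(f"hard voting accuracy: {voting_acc:.3f}")
print(f"stacking accuracy:    {stacking_acc:.3f}")
```

Training the meta-model on out-of-fold predictions rather than on in-sample predictions is the usual way a stacked ensemble limits overfitting, which is the concern the abstract raises. If the xgboost package is available, its scikit-learn-compatible XGBClassifier could presumably be passed as final_estimator instead.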

Published

2026-01-01

Section

Original

How to Cite

Lamjid A, Ariss A, Mabrouki J, El Bouchti K, Ziti S. Development of a new predictive hiring system with multi-model voting sets and advanced stacking techniques for assessing semantic soft skills. Data and Metadata [Internet]. 2026 Jan. 1 [cited 2026 Feb. 25];5:1287. Available from: https://dm.ageditor.ar/index.php/dm/article/view/1287