Transformative Progress in Document Digitization: An In-Depth Exploration of Machine and Deep Learning Models for Character Recognition

Ali Benaissa; Abdelkhalak Bahri; Ahmad El Allaoui; My Abdelouahab Salahddine

doi:10.56294/dm2023174

Authors

Ali Benaissa, ENSAH, Laboratory of Applied Science - Data Science and Competitive Intelligence Team (DSCI), Abdelmalek Essaadi University (UAE), Tetouan, Morocco. ORCID: https://orcid.org/0009-0000-8944-5708
Abdelkhalak Bahri, ENSAH, Laboratory of Applied Science - Data Science and Competitive Intelligence Team (DSCI), Abdelmalek Essaadi University (UAE), Tetouan, Morocco. ORCID: https://orcid.org/0000-0002-8527-7281
Ahmad El Allaoui, Faculty of Sciences and Techniques Errachidia, Engineering Sciences and Techniques, STI-Laboratory - Decisional Computing and Systems Modelling Team, Moulay Ismail University of Meknes, Morocco. ORCID: https://orcid.org/0000-0002-8897-3565
My Abdelouahab Salahddine, The National School of Management Tangier, Governance and Performance of Organizations Laboratory - Finance and Governance of Organizations Team, Abdelmalek Essaadi University, Tangier, Morocco. ORCID: https://orcid.org/0009-0001-0997-6099

Keywords:

Character Recognition, Machine Learning/Deep Learning Models, Document Digitization

Abstract

Introduction: This paper explores the effectiveness of character recognition models for document digitization using a range of machine learning and deep learning techniques. Motivated by the growing relevance of image classification across applications, the study evaluates Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and VGG16 with transfer learning. It uses a challenging French alphabet dataset comprising 82 classes to assess each model's capacity to discern intricate patterns and generalize across diverse characters.
Objective: This study investigates the effectiveness of character recognition models for document digitization using diverse machine learning and deep learning techniques.
Methods: The methodology begins with data preparation, in which a merged dataset is built from distinct sections covering digits, French special characters, symbols, and the French alphabet. The dataset is then partitioned into training, test, and evaluation sets. Each model is trained and evaluated over a specific number of epochs. Accuracy, precision, recall, and F1-score are recorded for CNN, RNN, and VGG16, while SVM and KNN are evaluated on accuracy together with macro-averaged and weighted-averaged scores.
Results: The outcomes highlight distinct strengths and areas for improvement across the evaluated models. SVM achieves an accuracy of 98.63%, underscoring its efficacy in character recognition. KNN is highly reliable, with an overall accuracy of 97%, while the RNN model faces challenges in training and generalization. The CNN model excels with an accuracy of 97.268%, and VGG16 with transfer learning shows notable improvement, reaching accuracy rates of 94.83% on test images and 94.55% on evaluation images.
Conclusion: Our study evaluates the performance of five models on character recognition tasks: Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and VGG16 with transfer learning. SVM and KNN demonstrate high accuracy, while RNN faces challenges in training. CNN excels in image classification, and VGG16 with transfer learning improves accuracy significantly. This comparative analysis supports informed model selection for character recognition applications.
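
The Methods summary above reports accuracy alongside macro and weighted averages for SVM and KNN. The sketch below illustrates how those two averages can differ on imbalanced classes. It is a minimal illustration, not the authors' evaluation code: the function names `per_class_scores` and `summarize` and the toy labels are hypothetical, and only the Python standard library is used. The names "macro avg" and "weighted avg" match rows printed by scikit-learn's `classification_report`, but the abstract does not state which tool produced the reported figures.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for each label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but the true label was t
            fn[t] += 1  # a sample of class t was missed
    support = Counter(y_true)
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # Undefined ratios (no predictions or no samples) are set to 0.0.
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, support[label])
    return scores


def summarize(y_true, y_pred):
    """Compute accuracy, macro-averaged F1, and support-weighted F1."""
    scores = per_class_scores(y_true, y_pred)
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Macro average: every class counts equally, regardless of its size.
    macro_f1 = sum(s[2] for s in scores.values()) / len(scores)
    # Weighted average: each class counts in proportion to its support.
    weighted_f1 = sum(s[2] * s[3] for s in scores.values()) / n
    return {"accuracy": accuracy, "macro_f1": macro_f1, "weighted_f1": weighted_f1}


# Hypothetical toy labels: a frequent letter, an accented letter, and a digit.
y_true = ["a"] * 6 + ["é"] * 3 + ["7"]
y_pred = ["a"] * 6 + ["é", "é", "a"] + ["é"]

result = summarize(y_true, y_pred)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

In this toy case the rare class "7" is never predicted correctly, so the macro average falls well below the weighted average. With 82 classes of potentially uneven size, reporting both figures gives a fuller picture than accuracy alone.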
