Toward Innovative Recognition of Handwritten Arabic Characters: A Hybrid Approach with SIFT, BoVW, and SVM classification
DOI:
https://doi.org/10.56294/dm2023176
Keywords:
Character Recognition, Arabic Handwritten Characters, SIFT, K-means
Abstract
For more than thirty years, handwriting recognition has been a priority for anyone who needs to enter data into computer systems, and the technology is eagerly awaited in several fields. In recent years, optical character recognition (OCR) has made it possible for computers to treat characters as visual objects and to collect data about their distinguishing characteristics. Several studies in this field have focused specifically on Arabic writing. The character identification process begins with the machine examination of handwritten documents, and the main goal of this study is the identification of individual Arabic characters. In computer vision, Arabic character recognition is important because Arabic letters and characters in manuscripts must be recognized and classified correctly. This research proposes an approach that identifies the characteristics of Arabic characters using SIFT (Scale-Invariant Feature Transform) features and a BoVW (bag of visual words) representation. The SIFT features are clustered with k-means to produce a dictionary of visual words. An SVM (Support Vector Machine) is then used to classify the images, which are represented with the visual codebook built from these visual words. The approach is intended to address the difficulties associated with Arabic handwriting recognition, and the use of BoVW and SIFT features is expected to make the system more robust in recognizing and classifying Arabic characters. The approach will be evaluated experimentally on a dataset containing a variety of Arabic characters written in different styles. The results of this study will offer new perspectives on the effectiveness and practicality of the suggested approach.
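The dictionary-building and encoding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `bovw_histogram`, `nearest`) are hypothetical, the toy 2-D vectors stand in for the 128-dimensional SIFT descriptors that a feature extractor would supply, and both SIFT extraction and SVM training are assumed to happen elsewhere.

```python
import math
import random


def nearest(centroids, vec):
    """Return the index of the centroid closest to vec (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], vec))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; the k centroids act as the visual vocabulary."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(centroids, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bovw_histogram(descriptors, centroids):
    """Count how often each visual word occurs in one image, L1-normalized."""
    counts = [0] * len(centroids)
    for d in descriptors:
        counts[nearest(centroids, d)] += 1
    total = sum(counts) or 1  # avoid division by zero for images with no keypoints
    return [c / total for c in counts]


# Assumption: toy 2-D "descriptors" for two images, in place of real SIFT output.
image_a = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]]
image_b = [[5.1, 5.0], [4.9, 5.0], [0.0, 0.0]]

# Build the codebook from all descriptors, then encode each image.
codebook = kmeans(image_a + image_b, k=2)
hist_a = bovw_histogram(image_a, codebook)
hist_b = bovw_histogram(image_b, codebook)
```

Each image, whatever its number of keypoints, ends up as a fixed-length histogram over the codebook. That fixed-length vector is what a classifier such as an SVM would take as input.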
License
Copyright (c) 2023 Othmane Farhaoui, Mohamed Rida Fethi, Imad Zeroual, Ahmad El Allaoui (Author)
This work is licensed under a Creative Commons Attribution 4.0 International License.
The article is distributed under the Creative Commons Attribution 4.0 License. Unless otherwise stated, associated published material is distributed under the same licence.