Development of a Hybrid CNN-BiLSTM Architecture to Enhance Text Classification Accuracy

Ade Oktarino; Sarjon Defit; Yuhandri

doi:10.56294/dm2025726

Authors

Ade Oktarino. Universitas Adiwangsa Jambi, Information System. Jambi, Indonesia. https://orcid.org/0000-0002-8747-4676
Sarjon Defit. Universitas Putra Indonesia YPTK Padang, Technology Information. Padang, Indonesia. https://orcid.org/0000-0001-7538-9274
Yuhandri. Universitas Putra Indonesia YPTK Padang, Technology Information. Padang, Indonesia. https://orcid.org/0000-0002-8576-5488

DOI:

https://doi.org/10.56294/dm2025726

Keywords:

Hybrid CNN-BiLSTM, CNN, BiLSTM, FastText, Early Stopping

Abstract

Introduction: Natural Language Processing (NLP) has advanced considerably in response to the growing demand for efficient and accurate text classification. Despite the many methods available, balancing high accuracy against model stability remains a critical challenge. This study examines a hybrid architecture that integrates Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) with FastText embeddings for text classification.
Methods: The proposed hybrid architecture combines the CNN's ability to capture local patterns with the BiLSTM's ability to extract sequential features in both directions, and uses FastText embeddings for richer word representations. Regularization mechanisms, namely Dropout and Early Stopping, were employed to mitigate overfitting. Comparative experiments evaluated the model's performance with and without Early Stopping.
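To make the layer ordering described in the Methods concrete, the following is a minimal sketch that traces tensor shapes through an embedding, 1-D convolution, max-pooling, BiLSTM, Dropout, and softmax stack. It does not reproduce the authors' implementation: the class `HybridConfig`, the function `trace_shapes`, and every hyperparameter value (sequence length, filter count, kernel size, pool size, LSTM units, dropout rate, class count) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class HybridConfig:
    # All values are illustrative assumptions, not the paper's settings.
    seq_len: int = 100        # tokens per padded input text
    embed_dim: int = 300      # dimensionality of the word vectors
    conv_filters: int = 128   # number of 1-D convolution filters
    kernel_size: int = 5      # n-gram width seen by each filter
    pool_size: int = 2        # max-pooling window
    lstm_units: int = 64      # hidden units per LSTM direction
    dropout_rate: float = 0.5
    num_classes: int = 3


def trace_shapes(cfg: HybridConfig) -> list:
    """Return (layer name, output shape without the batch axis) per stage."""
    shapes = [("embedding", (cfg.seq_len, cfg.embed_dim))]

    # Unpadded convolution: each filter slides over kernel_size adjacent tokens.
    conv_len = cfg.seq_len - cfg.kernel_size + 1
    shapes.append(("conv1d", (conv_len, cfg.conv_filters)))

    # Non-overlapping max pooling keeps the strongest response per window.
    pool_len = conv_len // cfg.pool_size
    shapes.append(("max_pool", (pool_len, cfg.conv_filters)))

    # The BiLSTM reads the pooled sequence forwards and backwards and
    # concatenates the two final hidden states.
    shapes.append(("bilstm", (2 * cfg.lstm_units,)))

    # Dropout changes no shapes; it only zeroes activations during training.
    shapes.append(("dropout", (2 * cfg.lstm_units,)))
    shapes.append(("softmax", (cfg.num_classes,)))
    return shapes


for name, shape in trace_shapes(HybridConfig()):
    print(f"{name:10s} {shape}")
```

The sketch shows why the two components complement each other: the convolution and pooling stages shorten the sequence while summarizing local n-gram patterns, and the BiLSTM then models order over that shorter sequence.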
Results: The model trained without Early Stopping reached an accuracy of 99% but was more susceptible to overfitting. With Early Stopping, the model reached a stable accuracy of 73%, showing better generalization while preventing overfitting. Dropout further contributed to model regularization and stability.
Conclusions: This study underscores the importance of balancing accuracy and stability in deep learning models for text classification. The proposed hybrid architecture combines the strengths of CNN, BiLSTM, and FastText embeddings, and the results illustrate the trade-off between achieving high accuracy and ensuring robust generalization. Future work could explore additional optimization techniques and datasets to broaden applicability.
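The Early Stopping mechanism compared in this study can be sketched as a patience rule on validation loss. This is a generic illustration, not the authors' training code: the function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the loss values are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Epochs are 0-indexed. Training stops at the first epoch where the loss
    has failed to improve by more than min_delta for `patience` consecutive
    epochs. If that never happens, stop_epoch is the last epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improvement stalls after epoch 3.
losses = [0.90, 0.70, 0.62, 0.60, 0.61, 0.63, 0.66, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(f"stopped at epoch {stop_epoch}, best weights from epoch {best_epoch}")
```

Stopping on validation loss rather than training accuracy is what trades a lower final training score for better generalization: training ends once held-out performance stops improving, even if the model could still fit the training set more closely.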

