Investigating the Influence of Convolutional Operations on LSTM Networks in Video Classification

Manal Benzyane; Mourade Azrour; Imad Zeroual; Said Agoujil

doi:10.56294/dm2023152

Authors

Manal Benzyane, STI, IDMS, FST Errachidia, Moulay Ismail University of Meknes, Morocco
Mourade Azrour, STI, IDMS, FST Errachidia, Moulay Ismail University of Meknes, Morocco
Imad Zeroual, STI, IDMS, FST Errachidia, Moulay Ismail University of Meknes, Morocco
Said Agoujil, MMIS, MAIS, FST Errachidia, Moulay Ismail University, Meknes, Morocco

DOI:

https://doi.org/10.56294/dm2023152

Keywords:

Video Classification, Convolution, LSTM, ConvLSTM, LRCN

Abstract

Video classification is a core task in computer vision: it assigns category labels to videos based on their content. It underpins applications such as video surveillance, content recommendation, action recognition, and video indexing. The primary objective is to automatically analyze and interpret the visual information in videos so that large video collections can be organized, retrieved, and interpreted efficiently. Combining convolutional neural networks (CNNs) with long short-term memory (LSTM) networks has substantially advanced video classification. The combination captures both spatial and temporal dependencies in video sequences, using CNNs to extract spatial features and LSTMs to model sequential and temporal information. ConvLSTM and LRCN (Long-term Recurrent Convolutional Networks) are two widely used architectures that embody this combination. This paper investigates the impact of convolutions on LSTM networks in video classification by comparing the performance of ConvLSTM and LRCN.
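
The structural difference between the two architectures named in the abstract can be sketched with a small parameter and shape calculation. This is only an illustration, not the authors' implementation: the frame size, filter count, kernel size, pooling factor, and LSTM width are all assumed values, the helper function names are hypothetical, and the ConvLSTM formula assumes "same" padding and no peephole terms.

```python
# Illustrative sketch only. All sizes below (64x64 RGB frames, 16 filters,
# 3x3 kernels, 4x4 pooling, 32 LSTM units) are assumptions chosen for the
# example; none of them comes from the paper.

def conv2d_params(in_channels: int, filters: int, kernel: int) -> int:
    """Weights plus biases of one 2D convolution layer."""
    return filters * (kernel * kernel * in_channels) + filters


def lstm_params(input_dim: int, units: int) -> int:
    """Weights plus biases of a standard LSTM layer.

    Each of the 4 gates multiplies the concatenated [input, hidden] vector
    by a dense matrix, so the input must be flattened first.
    """
    return 4 * (units * (input_dim + units) + units)


def conv_lstm_params(in_channels: int, filters: int, kernel: int) -> int:
    """Weights plus biases of a ConvLSTM layer (no peephole terms).

    Each of the 4 gates convolves the concatenated [input, hidden] feature
    maps, so the hidden state keeps its height x width layout.
    """
    return 4 * conv2d_params(in_channels + filters, filters, kernel)


height, width, channels = 64, 64, 3   # assumed frame size
filters, kernel, pool, units = 16, 3, 4, 32

# LRCN-style pipeline: per-frame CNN -> pooling -> flatten -> LSTM over time.
features = (height // pool) * (width // pool) * filters
lrcn_total = conv2d_params(channels, filters, kernel) + lstm_params(features, units)
lrcn_state_shape = (units,)                       # spatial layout is flattened away

# ConvLSTM-style pipeline: convolution inside the recurrent gates.
conv_lstm_total = conv_lstm_params(channels, filters, kernel)
conv_lstm_state_shape = (height, width, filters)  # spatial layout preserved

print("LRCN:    ", lrcn_total, "parameters, hidden state", lrcn_state_shape)
print("ConvLSTM:", conv_lstm_total, "parameters, hidden state", conv_lstm_state_shape)
```

Under these assumed sizes, the LRCN-style stack spends most of its parameters on the dense LSTM gates that consume the flattened frame features, while the ConvLSTM-style layer shares small kernels across spatial positions and keeps a 2D hidden state. Parameter counts alone say nothing about classification accuracy, which is what the paper sets out to compare.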

