Deep Spatiotemporal Analysis of Cardiac Motion from Video and Range Data Images for Early Detection of Heart Diseases
DOI:
https://doi.org/10.56294/dm20261227
Keywords:
Cardiac analysis, Classification, Data fusion, Deep learning, Disease detection, Efficiency, Heart monitoring, Interpretability, Medical imaging, Spatiotemporal
Abstract
Introduction: Early detection of motion-based cardiac abnormalities is difficult because single-modality systems do not capture how cardiac motion relates to depth and structural change. A multimodal spatiotemporal system could improve diagnostic accuracy.
Objective: To develop a real-time system that combines cardiac video with range/depth data to detect cardiac conditions at an early stage.
Method: The system uses independent encoders whose outputs are combined by a gated fusion module. The input data are denoised, then statistically normalized and geometrically transformed. Beat-level temporal attention identifies the time segments most relevant for clinical evaluation. Performance was compared against video transformers and traditional temporal analysis methods.
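The gated fusion and beat-level attention steps named in the Method can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the per-dimension sigmoid gate, and the softmax pooling over beats are all assumptions chosen to show the general technique, and the abstract does not specify the actual layer design.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(video_feat, depth_feat, gate_weights, gate_bias):
    """Blend two equal-length feature vectors with a per-dimension gate.

    Hypothetical form: g_i = sigmoid(w_v * v_i + w_d * d_i + b_i),
    fused_i = g_i * v_i + (1 - g_i) * d_i.
    gate_weights is a list of (w_v, w_d) pairs, one per dimension.
    """
    fused = []
    for v, d, (w_v, w_d), b in zip(video_feat, depth_feat, gate_weights, gate_bias):
        g = sigmoid(w_v * v + w_d * d + b)
        fused.append(g * v + (1.0 - g) * d)
    return fused


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def beat_attention_pool(beat_features, beat_scores):
    """Weight per-beat feature vectors by softmax(beat_scores) and sum them.

    Returns the pooled vector and the attention weights, which indicate
    which beats contributed most to the pooled representation.
    """
    weights = softmax(beat_scores)
    dim = len(beat_features[0])
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, beat_features))
        for i in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    # Toy values only; real features would come from the two encoders.
    fused = gated_fusion([1.0, 2.0], [3.0, 4.0], [(0.0, 0.0), (0.0, 0.0)], [0.0, 0.0])
    pooled, weights = beat_attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
    print(fused, pooled, weights)
```

In this sketch a gate value near 1 favors the video feature and a value near 0 favors the depth feature, while the attention weights give a simple per-beat importance score that could support interpretability.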
Result: The model reached an F1 of 0.945 and an AUROC of 0.9978, with sensitivity of 0.950, specificity of 0.940, precision of 0.940, and AUPRC of 0.972. Calibration was excellent, with ECE and Brier values approaching ideal results (slope ≈ 1.01, ≈ 0). At screening thresholds of 10% and 20%, the system yielded values of 0.142 and 0.118, respectively. The system ran in real time at 4.9 GFLOPs with a processing time of ~98 ms.
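As a reading aid for the reported metrics, the sketch below shows how F1 follows from precision and sensitivity, and how the Brier score and a binned expected calibration error (ECE) are commonly computed for binary predictions. The function names, the 10-bin equal-width scheme, and the toy inputs are assumptions; the abstract does not state which ECE variant or binning the authors used.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (recall is sensitivity)."""
    return 2.0 * precision * recall / (precision + recall)


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE with equal-width bins over the predicted probability.

    For each non-empty bin, take |mean predicted probability - observed
    positive rate| and weight it by the fraction of samples in the bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(mean_p - pos_rate)
    return ece


if __name__ == "__main__":
    # Precision and sensitivity as reported in the abstract.
    print(round(f1_from_precision_recall(0.940, 0.950), 3))
    # Toy predictions, for illustration only.
    toy_probs = [0.9, 0.8, 0.2, 0.1]
    toy_labels = [1, 1, 0, 0]
    print(brier_score(toy_probs, toy_labels))
    print(expected_calibration_error(toy_probs, toy_labels))
```

With the reported precision (0.940) and sensitivity (0.950), the harmonic mean rounds to 0.945, which agrees with the stated F1. Lower Brier and ECE values indicate better calibration, with 0 as the ideal for both.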
Conclusion: Combining intensity dynamics with depth-derived geometry enables accurate, well-calibrated, real-time cardiac prediction. The proposed method outperforms single-signal systems and conventional temporal methods, making it a useful advance for early detection and point-of-care cardiology.
License
Copyright (c) 2026 Ahmed A.F Osman, Asma Abdulmana Alhamadi, Rajit Nair, Mosleh Hmoud Al-Adhaileh, Sultan Ahmad, Theyazn H.H Aldhyani, Hikmat A. M. Abdeljaber, Mohammed Ataelfadiel (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
The article is distributed under the Creative Commons Attribution 4.0 License. Unless otherwise stated, associated published material is distributed under the same licence.
