Computer Science ›› 2026, Vol. 53 ›› Issue (2): 236-244. doi: 10.11896/jsjkx.250300103

• Computer Graphics & Multimedia •

Constrained Multi-loss Video Anomaly Detection with Dual-branch Feature Fusion

HAN Lei1, SHANG Haoyu1, QIAN Xiaoyan2, GU Yan2, LIU Qingsong2, WANG Chuang1   

  1 School of Computer Science, School of Software, School of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  2 College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Received: 2025-03-20 Revised: 2025-05-27 Published: 2026-02-10
  • About author: HAN Lei, born in 1979, Ph.D., senior engineer. His main research interests include computer vision and data center networking.
  • Supported by:
    National Key Research and Development Program of China (2024YFB2906704), National Natural Science Foundation of China (62202237, 62372248, U2033207) and Key Program of Natural Science Foundation of Jiangsu, China (24KJA520006).

Abstract: Because spatiotemporal correlation learning strongly affects video anomaly detection performance, this paper proposes a constrained multi-loss video anomaly detection method based on dual-branch feature fusion (DBF-CML-transMIL). The method accounts for both the saliency and the correlation of segments in multiple instance learning (MIL). A multi-layer linear neural network learns the spatial saliency features of each segment, and a cascaded Transformer fusion module captures multi-level temporal correlations among instances. A multi-loss model then performs supervised learning on the fused features, enriching prediction diversity. To address the discreteness of the existing top-k method, in which the selected segments may be scattered across the video, a constrained sliding window top-k mechanism is introduced to strengthen the correlation of anomalous events. Comparative and ablation experiments on the ShanghaiTech and UCF-Crime datasets show that DBF-CML-transMIL achieves AUC scores of 97.33% and 83.82%, respectively, and that each module improves video anomaly detection performance.
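
The contrast between plain top-k selection and a sliding window variant can be illustrated with a minimal sketch. The abstract does not specify the exact constraint used by DBF-CML-transMIL, so the sketch assumes the simplest reading: the k selected segments must be contiguous in time. The function names `topk_mean` and `sliding_window_topk` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def topk_mean(scores: List[float], k: int) -> float:
    """Plain top-k: mean of the k highest segment scores, wherever they occur."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of segments")
    return sum(sorted(scores, reverse=True)[:k]) / k


def sliding_window_topk(scores: List[float], k: int) -> Tuple[int, float]:
    """Assumed variant: pick the k contiguous segments with the highest mean.

    Returns (start_index, window_mean). This is an illustrative reading of
    "sliding window top-k", not the paper's confirmed formulation.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of segments")
    window_sum = sum(scores[:k])
    best_sum, best_start = window_sum, 0
    for start in range(1, len(scores) - k + 1):
        # Slide by one segment: add the entering score, drop the leaving one.
        window_sum += scores[start + k - 1] - scores[start - 1]
        if window_sum > best_sum:
            best_sum, best_start = window_sum, start
    return best_start, best_sum / k


# Hypothetical per-segment anomaly scores for one video (one bag of instances).
segment_scores = [0.9, 0.1, 0.2, 0.7, 0.8, 0.6, 0.1, 0.85]
video_score_plain = topk_mean(segment_scores, 3)
window_start, video_score_window = sliding_window_topk(segment_scores, 3)
```

With these made-up scores, plain top-k draws on the isolated segments 0, 4 and 7, whereas the windowed variant selects the consecutive segments 3 to 5, which is the kind of temporally coherent selection the abstract motivates.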

Key words: Video anomaly detection, Multiple instance learning, Cascaded attention mechanism, Multi-loss function, Sliding window top-k

CLC Number: TP183