Computer Science (计算机科学), 2021, Vol. 48, Issue 7: 206-212. doi: 10.11896/jsjkx.200900093
卿来云1, 张建功1, 苗军2
QING Lai-yun1, ZHANG Jian-gong1, MIAO Jun2
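Abstract: Weakly supervised anomalous event detection is a highly challenging task: given only video-level labels (normal or abnormal) as supervision, the goal is to localize the temporal interval in which the anomaly occurs. This paper adopts a multiple-instance ranking network for the task. After each video is split into a fixed number of segments, the video is treated as a bag and each segment as an instance in that bag; multiple-instance learning then trains an instance classifier given only the bag labels. Because video carries rich temporal information, the paper focuses on temporal relations for online detection in surveillance video. From both global and local perspectives, a self-attention module learns a weight for each instance; the video-level anomaly score is obtained as a linear weighting of the instance anomaly scores by the self-attention values, and the self-attention module is trained with a mean squared error loss. In addition, two forms of temporal modeling are introduced, LSTM and temporal convolution, where temporal convolution is further divided into temporal dilated convolution of a single type and multi-scale pyramid temporal dilated convolution that fuses several dilation rates. Experimental results show that multi-scale temporal convolution outperforms single-type temporal convolution, and that temporal convolution combined with complementary inner-bag and outer-bag losses raises the AUC on the UCF-Crime dataset by 3.2% over a baseline method without a temporal module.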
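The two mechanisms named in the abstract, attention-weighted pooling of instance scores into a video-level score and pyramid temporal dilated convolution, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The softmax normalization of the attention values, the causal (left-padded) convolution, the averaging rule used to fuse dilation rates, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def video_score(attention_logits, segment_scores):
    """Video-level anomaly score as an attention-weighted sum of
    per-segment (instance) anomaly scores.
    Assumption: attention values are normalized with softmax."""
    weights = softmax(attention_logits)
    return sum(w * s for w, s in zip(weights, segment_scores))


def mse_loss(prediction, label):
    """Squared error between the video score and the bag label (0 or 1)."""
    return (prediction - label) ** 2


def dilated_conv1d(seq, kernel, dilation):
    """Causal 1-D dilated convolution with implicit zero padding on the left.
    Assumption: causal, so each output uses only current and past segments."""
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * seq[idx]
        out.append(acc)
    return out


def pyramid_dilated_conv(seq, kernel, dilations=(1, 2, 4)):
    """Fuse several dilation rates by averaging the branch outputs.
    Assumption: averaging is a hypothetical fusion rule."""
    branches = [dilated_conv1d(seq, kernel, d) for d in dilations]
    return [sum(vals) / len(vals) for vals in zip(*branches)]


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.3, 0.4]          # per-segment anomaly scores
    bag = video_score([0.0] * 4, scores)   # uniform attention
    print(bag, mse_loss(bag, 1.0))
    print(pyramid_dilated_conv([1, 2, 3, 4], [1, 1], dilations=(1, 2)))
```

With uniform attention the video score reduces to the mean of the segment scores, which is a quick sanity check. In a trained model the attention logits would come from a learned module and the convolution would operate on feature vectors, not on scalars.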