Computer Science ›› 2020, Vol. 47 ›› Issue (8): 195-201. doi: 10.11896/jsjkx.190600148
王教金1, 蹇木伟1, 刘翔宇1, 林培光1, 耿蕾蕾1, 崔超然1, 尹义龙2
WANG Jiao-jin1, JIAN Mu-wei1, LIU Xiang-yu1, LIN Pei-guang1, GENG Lei-lei1, CUI Chao-ran1, YIN Yi-long2
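Abstract: Vision is one of the most important ways in which humans perceive the world. Video saliency detection aims to simulate the human visual attention mechanism by computer and to detect the salient objects in a video intelligently. Video saliency detection based on traditional methods has reached a certain level of performance, but its use of spatiotemporal consistency remains unsatisfactory. This paper therefore proposes a video saliency detection method based on a full temporal convolutional neural network. First, full temporal convolution is applied to the input video to extract spatiotemporal features that carry both spatial and temporal information. Then, 3D pooling layers reduce the dimensionality. Next, in the decoding layers, 3D deconvolution and 3D upsampling decode the features produced by the front end. Finally, extracting and fusing the spatiotemporal information jointly improves the quality of the saliency maps. Experimental results show that the proposed algorithm outperforms current mainstream video saliency detection methods on three widely used video saliency detection datasets (DAVIS, FBMS, SegTrack).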
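The encoder-decoder pipeline outlined in the abstract can be illustrated with simple shape bookkeeping. The following is a minimal sketch, not the authors' network: the layer order, kernel sizes, strides, padding, the helper names (conv3d_shape, pool3d_shape, deconv3d_shape, upsample3d_shape, trace), and the 8-frame 64x64 clip are all assumptions chosen for illustration. It tracks only the (frames, height, width) shape of a feature volume as it passes through 3D convolution, 3D pooling, 3D deconvolution, and 3D upsampling, using the standard output-size formulas for these layers.

```python
# Shape bookkeeping for a hypothetical 3D encoder-decoder.
# Every layer size here is an illustrative assumption, not the paper's configuration.


def conv3d_shape(shape, kernel=3, stride=1, padding=1):
    """Output (T, H, W) of a 3D convolution with the same settings on each axis."""
    return tuple((d + 2 * padding - kernel) // stride + 1 for d in shape)


def pool3d_shape(shape, kernel=2, stride=2):
    """Output (T, H, W) of a 3D pooling layer without padding."""
    return tuple((d - kernel) // stride + 1 for d in shape)


def deconv3d_shape(shape, kernel=2, stride=2, padding=0):
    """Output (T, H, W) of a 3D transposed convolution (deconvolution)."""
    return tuple((d - 1) * stride - 2 * padding + kernel for d in shape)


def upsample3d_shape(shape, scale=2):
    """Output (T, H, W) of a 3D upsampling layer with an integer scale factor."""
    return tuple(d * scale for d in shape)


def trace(shape):
    """Return the list of (layer name, shape) pairs for the assumed pipeline."""
    steps = [("input", shape)]
    # Encoder: spatiotemporal convolution followed by 3D pooling, applied twice.
    for _ in range(2):
        shape = conv3d_shape(shape)
        steps.append(("conv3d", shape))
        shape = pool3d_shape(shape)
        steps.append(("pool3d", shape))
    # Decoder: one deconvolution and one upsampling undo the two pooling stages.
    shape = deconv3d_shape(shape)
    steps.append(("deconv3d", shape))
    shape = upsample3d_shape(shape)
    steps.append(("upsample3d", shape))
    return steps


if __name__ == "__main__":
    for name, shape in trace((8, 64, 64)):
        print(f"{name:<11}{shape}")
```

Under these assumed settings, a clip of shape (8, 64, 64) is reduced to (2, 16, 16) by the two pooling stages and restored to (8, 64, 64) by the deconvolution and upsampling stages, so the output saliency volume has one map per input frame at the input resolution.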