Computer Science, 2020, Vol. 47, Issue (8): 195-201. doi: 10.11896/jsjkx.190600148

• Computer Graphics & Multimedia •

Video Saliency Detection Based on 3D Full ConvLSTM Neural Network

WANG Jiao-jin1, JIAN Mu-wei1, LIU Xiang-yu1, LIN Pei-guang1, GENG Lei-lei1, CUI Chao-ran1, YIN Yi-long2

  1 School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014, China
  2 School of Software Engineering, Shandong University, Jinan 250101, China
  • Online: 2020-08-15 Published: 2020-08-10
  • Corresponding author: JIAN Mu-wei (jianmuweihk@163.com)
  • Author e-mail: 125453468@qq.com
  • About author: WANG Jiao-jin, born in 1993, postgraduate. His main research interests include image processing and visual saliency detection.
    JIAN Mu-wei, professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include image processing, pattern recognition and multimedia computing.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61601427, 61976123, 61771230) and the Taishan Young Scholars Program of Shandong Province.

Abstract: Vision is one of the main ways in which humans perceive the world. Video saliency detection aims to mimic the human visual attention mechanism by computer and to detect the salient objects in a video automatically. Traditional video saliency detection methods have reached a certain level of performance, but their use of consistent spatio-temporal information remains unsatisfactory. To address this issue, this paper proposes a video saliency detection model based on a 3D full ConvLSTM neural network. First, full ConvLSTM layers extract spatio-temporal features from the input video, and 3D pooling layers then reduce their dimensionality. Next, the decoding layers decode the extracted features with 3D deconvolution and 3D upsampling, and an interpolation algorithm restores the saliency map to the size of the original image. By extracting and fusing temporal and spatial information jointly, the proposed method improves the quality and completeness of the saliency map. Experimental results on three widely used video saliency detection datasets (DAVIS, FBMS, SegTrack) show that the proposed algorithm outperforms state-of-the-art video saliency detection methods.

Key words: Saliency detection, Spatio-temporal feature, ConvLSTM, Neural network

CLC Number: TP391
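
The abstract describes an encoder that reduces feature size with 3D pooling and a decoder that restores it with 3D deconvolution and 3D upsampling. The following is a minimal sketch of how the (frames, height, width) shape of a clip might change along such a pipeline, using the standard output-size formulas for pooling and transposed convolution. The number of layers, the kernel and stride values, and the 8 x 224 x 224 input are illustrative assumptions and are not taken from the paper; the ConvLSTM layers are assumed to preserve spatial size.

```python
def pool3d_shape(shape, kernel, stride=None, padding=(0, 0, 0)):
    """Output (T, H, W) of a 3D pooling layer: floor((n + 2p - k) / s) + 1 per axis."""
    stride = stride or kernel
    return tuple((n + 2 * p - k) // s + 1
                 for n, k, s, p in zip(shape, kernel, stride, padding))


def deconv3d_shape(shape, kernel, stride, padding=(0, 0, 0)):
    """Output (T, H, W) of a 3D transposed convolution: (n - 1) * s - 2p + k per axis."""
    return tuple((n - 1) * s - 2 * p + k
                 for n, k, s, p in zip(shape, kernel, stride, padding))


def upsample3d_shape(shape, scale):
    """Output (T, H, W) of 3D upsampling by an integer factor per axis."""
    return tuple(n * f for n, f in zip(shape, scale))


def trace(shape):
    """Trace a clip's shape through a hypothetical two-stage encoder and decoder."""
    stages = [("input", shape)]
    # Encoder (assumed): ConvLSTM layers keep the size, so only pooling changes it.
    shape = pool3d_shape(shape, kernel=(1, 2, 2))
    stages.append(("pool1", shape))
    shape = pool3d_shape(shape, kernel=(1, 2, 2))
    stages.append(("pool2", shape))
    # Decoder (assumed): one transposed convolution, then one upsampling step.
    shape = deconv3d_shape(shape, kernel=(1, 2, 2), stride=(1, 2, 2))
    stages.append(("deconv1", shape))
    shape = upsample3d_shape(shape, scale=(1, 2, 2))
    stages.append(("upsample1", shape))
    return stages


if __name__ == "__main__":
    for name, shape in trace((8, 224, 224)):
        print(f"{name:10s} {shape}")
```

With these assumed settings an 8 x 224 x 224 clip returns to its original size after decoding. With an odd input such as 8 x 225 x 225, the floor in the pooling formula drops a pixel and the decoder output is 8 x 224 x 224, which illustrates why a final interpolation step back to the original image size can be needed.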