Video Saliency Detection Based on 3D Full ConvLSTM Neural Network

doi:10.11896/jsjkx.190600148

Abstract

Video saliency detection aims to mimic the human visual attention mechanism by extracting the most attractive regions or objects in an input video. It remains a challenging task. Traditional video saliency detection models have achieved reasonable performance, but they do not exploit the consistency of spatio-temporal information well. To address this issue, this paper proposes a video saliency detection model based on a 3D full ConvLSTM neural network. First, full temporal convolution (ConvLSTM) is used to extract spatio-temporal features from the input video, and a 3D pooling layer then reduces their dimensionality. Second, the extracted features are decoded by 3D deconvolution in the decoding layer, and an interpolation algorithm restores the saliency map to the size of the original image. The proposed method extracts temporal and spatial information jointly, which effectively improves the completeness of the saliency map. Experimental results on three widely used video saliency detection datasets (DAVIS, FBMS, SegTrack) show that the proposed algorithm outperforms state-of-the-art video saliency detection methods.
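The encode, pool, decode, and interpolate steps described in the abstract can be followed as a sequence of tensor shapes. The sketch below is only an illustration under stated assumptions: the clip size, the kernel and stride values, and the helper names (`pool3d_shape`, `deconv3d_shape`, `trace_shapes`) are hypothetical and are not taken from the paper. It uses the standard output-size formulas for unpadded pooling and transposed convolution.

```python
from typing import Dict, Tuple

Shape = Tuple[int, int, int]  # (frames, height, width)


def pool3d_shape(shape: Shape, kernel: Shape, stride: Shape) -> Shape:
    """Output size of an unpadded 3D pooling layer: floor((n - k) / s) + 1 per axis."""
    return tuple((n - k) // s + 1 for n, k, s in zip(shape, kernel, stride))


def deconv3d_shape(shape: Shape, kernel: Shape, stride: Shape) -> Shape:
    """Output size of an unpadded 3D transposed convolution: (n - 1) * s + k per axis."""
    return tuple((n - 1) * s + k for n, k, s in zip(shape, kernel, stride))


def trace_shapes(clip: Shape) -> Dict[str, Shape]:
    """Trace (frames, height, width) through the four stages named in the abstract."""
    # Assumption: the ConvLSTM encoder uses "same" padding, so the size is unchanged.
    encoded = clip
    # Assumption: pooling halves height and width and keeps every frame.
    pooled = pool3d_shape(encoded, kernel=(1, 2, 2), stride=(1, 2, 2))
    # Assumption: the 3D deconvolution doubles height and width.
    decoded = deconv3d_shape(pooled, kernel=(1, 2, 2), stride=(1, 2, 2))
    # Interpolation resizes each decoded frame back to the input resolution.
    restored = (decoded[0], clip[1], clip[2])
    return {"encoded": encoded, "pooled": pooled, "decoded": decoded, "restored": restored}


if __name__ == "__main__":
    for stage, shape in trace_shapes((4, 225, 225)).items():
        print(stage, shape)
```

With an odd frame size such as 225 x 225, the pooled map is 112 x 112 and the deconvolution returns 224 x 224, one pixel short of the input. This is one plausible reason for a final interpolation step: it guarantees that the saliency map matches the original image size whatever the rounding in the earlier layers.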

Key words: ConvLSTM, Neural network, Saliency detection, Spatio-temporal feature

CLC Number: TP391

WANG Jiao-jin, JIAN Mu-wei, LIU Xiang-yu, LIN Pei-guang, GEN Lei-lei, CUI Chao-ran, YIN Yi-long. Video Saliency Detection Based on 3D Full ConvLSTM Neural Network[J]. Computer Science, 2020, 47(8): 195-201.
