Computer Science ›› 2024, Vol. 51 ›› Issue (4): 217-228. doi: 10.11896/jsjkx.231000051

• Computer Graphics & Multimedia •

Video and Image Salient Object Detection Based on Multi-task Learning

LIU Zeyu, LIU Jianwei   

  1. College of Information Science and Engineering, China University of Petroleum, Beijing 102249, China
  • Received: 2023-10-09 Revised: 2024-01-04 Online: 2024-04-15 Published: 2024-04-10

Abstract: Salient object detection (SOD) quickly identifies high-value salient objects in complex scenes, simulating human attention and laying the foundation for further visual understanding tasks. Currently, mainstream image-based salient object detection methods are usually trained on the DUTS-TR dataset, while video-based salient object detection (VSOD) methods are trained on the DAVIS, DAVSOD, and DUTS-TR datasets. Because the image and video salient object detection tasks have both shared and task-specific characteristics, independent models must be deployed and trained separately, which greatly increases computational resource consumption and training time. Current research typically focuses on independent solutions for a single task, and a unified method for both image and video salient object detection remains under-researched. To address these issues, this paper proposes a multi-task learning-based method for image and video salient object detection, aiming to build a universal framework that adapts to both tasks simultaneously with a single training process and further bridges the performance gap between image and video salient object detection models. Qualitative and quantitative experimental results on 12 datasets show that the proposed method not only adapts to both tasks but also achieves better detection results than single-task models.

Key words: Video-based salient object detection, Image-based salient object detection, Multi-task learning, Performance gaps

CLC Number: 

  • TP391