Computer Science ›› 2021, Vol. 48 ›› Issue (6): 103-109. doi: 10.11896/jsjkx.200600068

• Computer Graphics & Multimedia •

Learning Global Guided Progressive Feature Aggregation Lightweight Network for Salient Object Detection

PAN Ming-yuan, SONG Hui-hui, ZHANG Kai-hua, LIU Qing-shan

  1. Collaborative Innovation Center on Atmospheric Environment and Equipment Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China
    Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China
  • Received: 2020-06-12 Revised: 2020-09-22 Online: 2021-06-15 Published: 2021-06-03
  • Corresponding author: SONG Hui-hui (songhuihui@nuist.edu.cn)
  • About author: PAN Ming-yuan, born in 1995, postgraduate. His main research interest is salient object detection. (pan_mingyuan@foxmail.com)
    SONG Hui-hui, born in 1986, Ph.D., professor, is a member of China Computer Federation. Her main research interests include saliency detection and image super-resolution.
  • Supported by:
    National Major Project of China for New Generation of AI (2018AAA0100400), National Natural Science Foundation of China (61872189, 61876088) and Natural Science Foundation of Jiangsu Province (BK20191397, BK20170040).

Abstract: To address insufficient feature fusion and model redundancy in existing salient object detection algorithms, this paper proposes a lightweight salient object detection network based on globally guided progressive feature aggregation. First, the lightweight feature extraction network MobileNetV3 extracts multi-scale features from the image at different levels. Then, a lightweight multi-scale receptive field enhancement module is applied to the high-level semantic features extracted by MobileNetV3 to further strengthen their global representation. Finally, a progressive feature aggregation module fuses the multi-level, multi-scale features step by step from top to bottom, and the common cross-entropy loss function is used to optimize the fused features at multiple stages, yielding saliency maps that go from coarse to fine. The whole network is an end-to-end framework that requires no pre-processing or post-processing. Extensive experiments on six benchmark datasets, evaluated with the PR curve, F-measure, S-measure and MAE, show that the proposed method clearly outperforms 10 state-of-the-art methods. In addition, the model is only about 10MB in size and runs at 46 FPS on a GTX2080Ti GPU when processing 400×300-pixel images.

Key words: Convolutional neural network, Fast, Feature fusion, Lightweight, Salient object detection
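
The abstract names a cross-entropy loss applied at multiple stages and two pixel-wise metrics, F-measure and MAE. The following is a minimal pure-Python sketch of those three quantities on flattened saliency maps, not the authors' implementation. All function names are hypothetical. The value beta_sq = 0.3 and the fixed threshold 0.5 are assumptions that the abstract does not state, and every stage prediction is assumed to have already been resized to the ground-truth resolution.

```python
import math

BETA_SQ = 0.3  # assumption: weighting often used in saliency benchmarks


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two flat lists of values in [0, 1]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def multi_stage_loss(stage_preds, target):
    """Sum the cross-entropy over the prediction of every stage."""
    return sum(bce(pred, target) for pred in stage_preds)


def mae(pred, target):
    """Mean absolute error between a saliency map and its ground truth."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def f_measure(pred, target, threshold=0.5, beta_sq=BETA_SQ):
    """F-measure of the map binarized at a fixed threshold (an assumption)."""
    binary = [1 if p >= threshold else 0 for p in pred]
    true_pos = sum(b * t for b, t in zip(binary, target))
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(binary)
    recall = true_pos / sum(target)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy 2x2 example, flattened: a coarse early-stage map and a finer late-stage map.
target = [1, 1, 0, 0]
coarse = [0.6, 0.4, 0.4, 0.2]
fine = [0.9, 0.8, 0.1, 0.1]

print("multi-stage loss:", multi_stage_loss([coarse, fine], target))
print("MAE coarse/fine:", mae(coarse, target), mae(fine, target))
print("F-measure coarse/fine:", f_measure(coarse, target), f_measure(fine, target))
```

In this toy example the finer map scores a lower MAE and a higher F-measure than the coarse one, which is the direction of improvement that coarse-to-fine refinement aims for.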

CLC Number:

  • TP391