Computer Science ›› 2020, Vol. 47 ›› Issue (11): 186-191. doi: 10.11896/jsjkx.191200063

• Computer Graphics & Multimedia •

Low-level CNN Feature Aided Image Instance Segmentation

FAN Wei1, LIU Ting1, HUANG Rui1, GUO Qing2, ZHANG Bao2

  1 College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  2 College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
  • Received: 2019-12-07 Revised: 2020-05-19 Online: 2020-11-15 Published: 2020-11-05
  • Corresponding author: HUANG Rui (rhuang@cauc.edu.cn)
  • Author e-mail: wfancauc@163.com
  • About author: FAN Wei, born in 1968, Ph.D., professor, is a member of China Computer Federation. His main research interests include machine learning and revenue management.
    HUANG Rui, born in 1987, Ph.D., lecturer, is a member of China Computer Federation. His main research interests include computer vision and machine learning.
  • Supported by:
    This work was supported by the Scientific Research Project of Tianjin Education Commission (2019KJ126) and the Foundation Project for Central Universities of CAUC (3122018C021, 3122018C020).

Abstract: The popular instance segmentation network Mask R-CNN produces coarse object boundaries and contours, which lowers its segmentation accuracy. To address this problem, this paper proposes a high-precision instance segmentation method that introduces the network's low-level convolutional features into the segmentation branch of Mask R-CNN. The method first selects features from the lower layers of the feature extraction network and rescales them by interpolation to a fixed scale (1/8 of the input image) to form the low-level features. It then applies RoI Align to these low-level features to extract the features of the current target and concatenates them with the corresponding features in the original Mask R-CNN segmentation branch; the concatenated features are used to refine the segmentation of that target. Because the low-level features carry more low-level texture and contour information, they effectively improve segmentation accuracy. On the COCO2017 dataset, with ResNet-101-FPN as the feature extraction network, the proposed method improves average precision (AP) by 1.2% over Mask R-CNN. Experimental results show that the method is robust and effective across different feature extraction networks.

Key words: Deep learning, Deep neural network, Feature fusion, Instance segmentation, Low-level feature
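
The three steps named in the abstract (rescale a lower-layer feature map to 1/8 of the input, extract per-target features with RoI Align, concatenate them with the mask-branch features) can be illustrated with a small dependency-free sketch. This is not the authors' implementation: the function names `sample`, `resize`, `roi_align` and `fuse`, the toy tensor shapes, the box coordinates, and the choice of bilinear interpolation with a single sample per RoI bin are all assumptions made for illustration.

```python
from typing import List, Sequence

Channel = List[List[float]]   # H x W grid of activations
FeatureMap = List[Channel]    # C x H x W


def sample(channel: Channel, y: float, x: float) -> float:
    """Bilinear read at continuous (y, x); pixel (i, j) is centred at (i + 0.5, j + 0.5)."""
    h, w = len(channel), len(channel[0])
    y = min(max(y - 0.5, 0.0), h - 1.0)
    x = min(max(x - 0.5, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = channel[y0][x0] * (1 - dx) + channel[y0][x1] * dx
    bottom = channel[y1][x0] * (1 - dx) + channel[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def resize(features: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Step 1: rescale a backbone feature map to a fixed size by interpolation."""
    out = []
    for ch in features:
        sy, sx = len(ch) / out_h, len(ch[0]) / out_w
        out.append([[sample(ch, (i + 0.5) * sy, (j + 0.5) * sx)
                     for j in range(out_w)] for i in range(out_h)])
    return out


def roi_align(features: FeatureMap, box: Sequence[float],
              stride: int, size: int) -> FeatureMap:
    """Step 2 (simplified RoI Align): one bilinear sample at the centre of each bin."""
    x1, y1, x2, y2 = (c / stride for c in box)   # image coords -> feature coords
    bh, bw = (y2 - y1) / size, (x2 - x1) / size
    return [[[sample(ch, y1 + (i + 0.5) * bh, x1 + (j + 0.5) * bw)
              for j in range(size)] for i in range(size)] for ch in features]


def fuse(mask_feats: FeatureMap, low_level: FeatureMap,
         box: Sequence[float], stride: int = 8) -> FeatureMap:
    """Step 3: concatenate mask-branch RoI features with low-level RoI features."""
    low_roi = roi_align(low_level, box, stride, size=len(mask_feats[0]))
    return mask_feats + low_roi   # channel-wise concatenation


# Toy data (all shapes are illustrative assumptions): a 64x64 input image,
# a 2-channel backbone feature map at 1/4 scale, and 3-channel 4x4
# mask-branch RoI features for one target box given as (x1, y1, x2, y2).
image_size = 64
backbone_feats = [[[float(c + 1)] * 16 for _ in range(16)] for c in range(2)]
low_level = resize(backbone_feats, image_size // 8, image_size // 8)  # 1/8 of input
mask_feats = [[[0.0] * 4 for _ in range(4)] for _ in range(3)]
fused = fuse(mask_feats, low_level, box=(16, 16, 48, 48))
print(len(fused), len(fused[0]), len(fused[0][0]))  # prints "5 4 4"
```

In this sketch the fused tensor simply gains the low-level channels (3 + 2 = 5) at the same spatial size as the mask-branch features. In a deep learning framework, the same steps would normally use the framework's own interpolation, RoI Align and concatenation operators rather than nested lists.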

CLC Number:

  • TP391.4