Computer Science, 2022, Vol. 49, Issue (8): 136-142. doi: 10.11896/jsjkx.220100132

• Computer Graphics & Multimedia •

Incremental Object Detection Method Based on Border Distance Measurement

LIU Dong-mei, XU Yang, WU Ze-bin, LIU Qian, SONG Bin, WEI Zhi-hui

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 2022-01-17 Revised: 2022-02-17 Published: 2022-08-02
  • Corresponding author: XU Yang (xuyangth90@gmail.com)
  • About author: LIU Dong-mei (dongmei@njust.edu.cn), born in 1996, postgraduate. Her main research interests include image processing and deep learning.
    XU Yang, born in 1990, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include image processing and machine learning.
  • Supported by:
    National Natural Science Foundation of China (61772274, 62071233, 61971223, 61976117), Natural Science Foundation of Jiangsu Province (BK20211570, BK20180018, BK20191409), Fundamental Research Funds for the Central Universities (30917015104, 30919011103, 30919011402, 30921011209) and China Postdoctoral Science Foundation (2017M611814, 2018T110502).

Abstract: Incremental learning has achieved good results in image classification, but applying it directly to multi-class object detection remains challenging. Object detection is a more complex task than image classification because it combines classification with bounding-box regression. Most state-of-the-art incremental object detectors rely on knowledge distillation with externally computed, fixed region proposals, which consumes a lot of time and cost. Single-stage detectors lack annotations and region proposals for the old classes, so they usually treat old-class objects as background, which leads to catastrophic forgetting. To address this, a label selection algorithm based on a bounding-box distance metric is proposed. It takes the detection results of the old model and the existing labels of the dataset, then selects and merges them by measuring the overlap between bounding boxes, which compensates for the missing old-class annotations in the new dataset and alleviates catastrophic forgetting. In addition, an attention residual module that combines an attention module with a residual module is designed so that discriminative features can be extracted at different depths of the feature extraction network, further improving detection accuracy on both new and old classes. The method is implemented in a single-stage detection framework and validated on the PASCAL VOC dataset. Compared with the best existing method, the mean average precision (mAP) is 2.8% higher on old classes and 2.1% higher overall. The pseudo-labels produced by the proposed method effectively alleviate forgetting, and the attention residual module improves detection accuracy.

Key words: Attention module, Catastrophic forgetting, Incremental learning, Label selection, Object detection, Pseudo label
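
The label selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact distance metric or any thresholds, so this sketch assumes plain intersection over union (IoU) as the overlap measure, and the names `iou`, `merge_labels`, `score_thresh`, and `overlap_thresh` and their default values are hypothetical. An old-model detection is kept as a pseudo-label only if it is confident enough and does not overlap a label that has already been kept.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), corner format
Label = Tuple[Box, str]                   # (box, class name)
Detection = Tuple[Box, str, float]        # (box, class name, confidence)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_labels(
    new_labels: List[Label],
    old_detections: List[Detection],
    score_thresh: float = 0.5,     # assumed value, not from the paper
    overlap_thresh: float = 0.7,   # assumed value, not from the paper
) -> List[Label]:
    """Add confident old-model detections to the new-class ground truth.

    A detection is skipped when its score is below score_thresh or when
    it overlaps an already kept label by at least overlap_thresh.
    """
    merged = list(new_labels)
    for box, cls, score in old_detections:
        if score < score_thresh:
            continue
        if any(iou(box, kept_box) >= overlap_thresh for kept_box, _ in merged):
            continue
        merged.append((box, cls))
    return merged


# Toy example: one new-class label and three old-model detections.
new_labels = [((10, 10, 50, 50), "sofa")]
old_detections = [
    ((12, 11, 49, 52), "chair", 0.90),      # overlaps the sofa label, dropped
    ((100, 100, 160, 180), "dog", 0.95),    # confident and separate, kept
    ((200, 200, 220, 220), "cat", 0.20),    # low confidence, dropped
]
merged = merge_labels(new_labels, old_detections)
print(merged)
```

In this toy example the merged set contains the original sofa label and the dog pseudo-label. Ground-truth labels always take priority here because they are placed in the kept list first; a distance-aware overlap measure could be substituted for `iou` without changing the rest of the sketch.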

CLC Number:

  • TP391