Computer Science ›› 2023, Vol. 50 ›› Issue (2): 267-274. doi: 10.11896/jsjkx.220900212

• Artificial Intelligence •

Incremental Object Detection Inspired by Memory Mechanisms in Brain

SHANG Di, LYU Yanfeng, QIAO Hong

  1. State Key Laboratory of Multimodal Artificial Intelligence System, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
    School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
    State Key Laboratory of Complex System Management and Control, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2022-09-22 Revised: 2022-10-28 Online: 2023-02-15 Published: 2023-02-22
  • Corresponding author: LYU Yanfeng (yanfeng.lv@ia.ac.cn)
  • About author: (shangdi2020@ia.ac.cn)
  • Supported by:
    Natural Science Foundation of Beijing (L211023), National Key Research and Development Plan of China (Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, 2020AAA0105900) and National Natural Science Foundation of China (91948303)

Abstract: Incremental learning is key to narrowing the gap between current artificial intelligence and human intelligence: an agent learns multiple tasks sequentially from a non-stationary data stream without forgetting, as humans do. Object detection is one of the core tasks in computer vision and a cornerstone of image understanding, so incremental object detection has both research and practical significance. Although incremental learning has achieved good results in image classification, research on incremental learning for object detection is still in its infancy, because object detection is more complex than image classification: it must solve classification and bounding box regression at the same time. Many researchers have worked on this problem, but most existing work focuses only on retaining performance on previously learned tasks and ignores fast adaptation to new tasks, which is a critical requirement of incremental learning. Thanks to the memory mechanisms of the brain, humans continually extract knowledge while learning, which lets them learn new tasks better and faster without forgetting. Inspired by this, an incremental meta-learning method that integrates a codec (encode-decode) memory replay mechanism is proposed. The method encodes and stores the feature vectors of learned samples, then decodes and replays them, so that the non-stationary data stream is approximated as a dynamically stable dataset (a locally stationary environment), which alleviates forgetting. In addition, a double-loop online meta-learning strategy is designed: in the inner loop the model is updated by stochastic gradient descent (SGD) on multiple batches of mixed old and new data, and in the outer loop it is meta-updated. This extracts the structure shared across tasks, improves generalization, and lets the model adapt quickly to new tasks encountered during learning. The proposed approach is evaluated in three incremental object detection settings defined on the large public datasets PASCAL VOC and MS COCO. Experimental results show that it is competitive with state-of-the-art methods, indicating that it helps the model resist forgetting and generalize better to new tasks. Because the proposed algorithm is gradient-based and model-agnostic, it is highly adaptable and can be combined with other detection frameworks.

Key words: Incremental learning, Object detection, Brain inspiration, Meta-learning, Resistance to forgetting, Generalization
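
The two mechanisms named in the abstract, codec memory replay and the double-loop meta-update, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a two-weight linear regressor in place of a detector, integer quantization as the "codec", reservoir sampling for the memory, and a Reptile-style interpolation as the outer-loop update. All names (`encode`, `decode`, `ReplayMemory`, `meta_step`, `demo`) and all hyperparameters are hypothetical.

```python
import random

SCALE = 0.1  # quantization step: an illustrative assumption, not from the paper


def encode(feature):
    """Compress a feature vector into integer codes (lossy)."""
    return [round(x / SCALE) for x in feature]


def decode(code):
    """Reconstruct an approximate feature vector from its codes."""
    return [c * SCALE for c in code]


class ReplayMemory:
    """Fixed-size store of encoded features, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity, self.items, self.seen = capacity, [], 0
        self.rng = random.Random(seed)

    def store(self, feature, target):
        self.seen += 1
        item = (encode(feature), target)
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def replay(self, k):
        """Decode up to k stored samples chosen at random."""
        picks = self.rng.sample(self.items, min(k, len(self.items)))
        return [(decode(code), target) for code, target in picks]


def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def sgd_step(w, batch, lr):
    """One gradient step on the mean squared error of a linear model."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = predict(w, x) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(batch)
    return [wi - lr * g for wi, g in zip(w, grad)]


def meta_step(w, new_batches, memory, inner_lr=0.1, meta_lr=0.5, replay_k=4):
    fast = list(w)
    for batch in new_batches:                    # inner loop: SGD on mixed data
        mixed = batch + memory.replay(replay_k)  # new samples + replayed old ones
        fast = sgd_step(fast, mixed, inner_lr)
    for batch in new_batches:                    # remember what was just seen
        for x, y in batch:
            memory.store(x, y)
    # Outer loop (assumed Reptile-style): move w toward the adapted weights.
    return [wi + meta_lr * (fi - wi) for wi, fi in zip(w, fast)]


def demo(steps=30, seed=1):
    rng = random.Random(seed)

    def sample(n):
        xs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
        return [(x, 2.0 * x[0] - 1.0 * x[1]) for x in xs]

    memory = ReplayMemory(capacity=32)
    held_out = sample(50)
    w = [0.0, 0.0]
    before = mse(w, held_out)
    for _ in range(steps):
        w = meta_step(w, [sample(4) for _ in range(3)], memory)
    return before, mse(w, held_out)
```

Calling `demo()` returns the held-out error before and after training. Storing codes rather than raw features keeps the memory compact at the cost of a reconstruction error of at most half of `SCALE` per component. A real system would presumably encode intermediate detector features, which this sketch does not attempt.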

CLC Number: 

  • TP391