Computer Science, 2023, Vol. 50, Issue (9): 75-81. doi: 10.11896/jsjkx.230400204
杨溢1, 申昇2, 窦知阳3, 李元1, 韩振军1
YANG Yi1, SHEN Sheng2, DOU Zhiyang3, LI Yuan1, HAN Zhenjun1
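Abstract: Human detection is of great practical significance for social governance and urban security, and surveillance data is an important source of data for security applications. Tiny object detection is a challenging task among the security detection problems that currently receive wide attention; its targets are objects smaller than 20 pixels in large images. The features of tiny objects are hard to represent. One major challenge is the scale mismatch between the dataset used to pre-train or co-train a detector (e.g., COCO) and the dataset used to fine-tune it (e.g., TinyPerson), which degrades the performance of tiny object detectors. To address this problem, the paper proposes an optimization strategy for matching the scales of different datasets, called Scale Distribution Search (SDS), which balances the information gain of images (closer scales between datasets) against the information loss (a lower signal-to-noise ratio, SNR). The strategy models the scale distribution of objects in a dataset with a Gaussian model and searches iteratively for the optimal distribution parameters; it compares the feature distributions of objects in the datasets and the performance of the detector to find the best scale distribution. With SDS, mainstream object detection methods achieve better performance on TinyPerson, which demonstrates the effectiveness of SDS in improving the efficiency of pre-training and co-training.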
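The abstract describes SDS only at a high level, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes object scale is a single size value in pixels, that the target distribution is a plain Gaussian N(mu, sigma), and that the search is a simple coordinate-wise hill climb. The function names (`fit_gaussian`, `rescale_factors`, `search_scale_distribution`) and the `score` callback are hypothetical; in practice `score` would stand for an expensive evaluation such as fine-tuned detector performance.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def fit_gaussian(sizes: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and (population) standard deviation of object sizes."""
    n = len(sizes)
    mu = sum(sizes) / n
    var = sum((s - mu) ** 2 for s in sizes) / n
    return mu, math.sqrt(var)


def rescale_factors(
    src_sizes: Sequence[float],
    mu: float,
    sigma: float,
    rng: random.Random,
    min_size: float = 2.0,
) -> List[float]:
    """For each source object, draw a target size from N(mu, sigma) and
    return the factor by which its image would be resized.

    min_size is an assumed floor that keeps sampled sizes positive.
    """
    factors = []
    for s in src_sizes:
        target = max(min_size, rng.gauss(mu, sigma))
        factors.append(target / s)
    return factors


def search_scale_distribution(
    init: Tuple[float, float],
    score: Callable[[float, float], float],
    step: Tuple[float, float] = (2.0, 1.0),
    rounds: int = 10,
) -> Tuple[Tuple[float, float], float]:
    """Hill-climb over (mu, sigma), keeping the best-scoring neighbour each round.

    `score` is a hypothetical stand-in for whatever criterion is used to
    judge a candidate distribution; higher is better.
    """
    best = init
    best_score = score(*best)
    for _ in range(rounds):
        mu, sigma = best
        candidates = [
            (mu + step[0], sigma),
            (mu - step[0], sigma),
            (mu, sigma + step[1]),
            (mu, max(0.1, sigma - step[1])),
        ]
        improved = False
        for cand in candidates:
            s = score(*cand)
            if s > best_score:
                best, best_score, improved = cand, s, True
        if not improved:
            break  # no neighbour is better: local optimum reached
    return best, best_score
```

A typical use would fit the Gaussian on the fine-tuning dataset to obtain a starting point, run the search with a task-specific `score`, and then apply `rescale_factors` to the pre-training images. Hill climbing is chosen here only for brevity; the abstract does not specify the search procedure beyond it being iterative.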