Tiny Person Detection for Intelligent Video Surveillance

doi:10.11896/jsjkx.230400204

Abstract

Abstract: Person detection has significant practical implications for social governance and urban security, and surveillance data is an important source of security data. Tiny object detection, which focuses on objects smaller than 20 pixels in large-scale images, is a challenging task. One of the main challenges is the scale mismatch between the dataset used for pre-training or co-training the detectors, such as COCO, and the dataset used for fine-tuning them, such as TinyPerson, which degrades detector performance on tiny objects. To address this challenge, this paper proposes an optimization strategy called scale distribution searching (SDS), which matches the object scales of different datasets for tiny object detection while balancing information gain and loss. A Gaussian model is used to model the scale distribution of targets in the dataset, and the optimal distribution parameters are found through iteration. The feature distributions and the detector performance are compared to find the best scale distribution. With the SDS strategy, mainstream object detection methods achieve better performance on TinyPerson, demonstrating the effectiveness of SDS in improving pre-training and co-training efficiency.
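The abstract gives only the outline of SDS, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes that object size is measured as the square root of box area, that a single Gaussian over sizes is fitted or proposed, and that a hypothetical `score` callback stands in for the expensive step of pre-training a detector on the rescaled data and evaluating it. All function names here are illustrative.

```python
import math
import random
import statistics
from typing import Callable, Iterable, List, Tuple


def object_size(width: float, height: float) -> float:
    """Size of a box as the square root of its area, in pixels (assumed definition)."""
    return math.sqrt(width * height)


def fit_gaussian(sizes: List[float]) -> Tuple[float, float]:
    """Estimate the (mean, std) of a list of object sizes."""
    return statistics.fmean(sizes), statistics.pstdev(sizes)


def rescale_factor(src_size: float, mu: float, sigma: float,
                   rng: random.Random, min_size: float = 1.0) -> float:
    """Sample a target size from N(mu, sigma), clamp it to min_size, and
    return the image scale factor that maps src_size to that target."""
    target = max(min_size, rng.gauss(mu, sigma))
    return target / src_size


def search_scale_distribution(
    candidates: Iterable[Tuple[float, float]],
    score: Callable[[float, float], float],
) -> Tuple[float, float]:
    """Return the candidate (mu, sigma) with the highest score.

    `score` is a hypothetical callback; in practice it would pre-train a
    detector on data rescaled with (mu, sigma) and measure its accuracy.
    """
    return max(candidates, key=lambda p: score(p[0], p[1]))
```

In this sketch, `fit_gaussian` would be applied to the sizes of annotated objects in the fine-tuning dataset to obtain a starting point, and `search_scale_distribution` would then compare candidate parameters around that starting point. An exhaustive comparison over a small candidate set is used here only for simplicity; the abstract describes the actual search as iterative.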

Key words: Intelligent video surveillance, Tiny object detection, Scale distribution search, Pre-train

CLC Number: TP391.41

YANG Yi, SHEN Sheng, DOU Zhiyang, LI Yuan, HAN Zhenjun. Tiny Person Detection for Intelligent Video Surveillance [J]. Computer Science, 2023, 50(9): 75-81.


