Computer Science ›› 2023, Vol. 50 ›› Issue (9): 235-241. doi: 10.11896/jsjkx.220800067

• Database & Big Data & Data Science •

Crowd Counting Based on Multi-scale Feature Aggregation in Dense Scenes

LIU Peigang1, SUN Jie1, YANG Chaozhi1, LI Zongmin1,2

  1 School of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
  2 Shengli College of China University of Petroleum, Dongying, Shandong 257061, China
  • Received: 2022-08-06 Revised: 2022-12-07 Online: 2023-09-15 Published: 2023-09-01
  • Corresponding author: LIU Peigang (dongfangwy@upc.edu.cn)
  • About author: LIU Peigang, born in 1979, Ph.D., postgraduate supervisor, is a member of China Computer Federation. His main research interests include graphical image processing and data science and applications.
  • Supported by:
    National Key R&D Program of China (2019YFF0301800), National Natural Science Foundation of China (61379106) and Shandong Provincial Natural Science Foundation (ZR2013FM036, ZR2015FM011).

Abstract: In dense scenes, individuals appear at widely varying scales, and this scale variation limits crowd counting accuracy. To address this problem, a crowd counting method based on multi-scale feature aggregation in dense scenes is proposed. The method examines how well different feature levels represent individuals at different scales and obtains multi-scale features through hierarchical connections. In addition, a multi-scale feature aggregation module is proposed; it uses multiple columns of dilated convolutions with different dilation rates and automatically adjusts the receptive field through a dynamic feature selection mechanism, so that features of individuals at different scales are extracted effectively. The method can further enlarge the receptive field while preserving the feature information of small-scale individuals, which strengthens the detection of large-scale individuals and lets the model adapt better to multi-scale variation among individuals in a crowd. Experiments on three public crowd counting datasets show that the proposed model further improves counting accuracy, with an MAE of 51.21 and an MSE of 83.70 on ShanghaiTech Part_A.

Key words: Dense scenes, Crowd counting, Dilated convolution, Dynamic feature selection, Point prediction
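
As a rough illustration of the idea in the abstract, the sketch below shows how the dilation rate changes the span covered by one convolution kernel, and how softmax weights could blend the responses of parallel branches. It is a minimal sketch under stated assumptions, not the authors' module: the dilation rates, the 3x3 kernel size, the function names, and the softmax-based blending are all hypothetical choices made for illustration.

```python
import math


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered along one axis by a single dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def softmax(scores):
    """Numerically stable softmax over a list of branch scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_features(branch_outputs, scores):
    """Blend per-branch responses with softmax weights (a soft selection).

    Assumption: the paper's dynamic feature selection is stood in for here
    by a plain softmax-weighted sum over scalar branch responses.
    """
    weights = softmax(scores)
    return sum(w * x for w, x in zip(weights, branch_outputs))


# Hypothetical dilation rates for three parallel 3x3 branches.
dilation_rates = [1, 2, 3]
spans = [effective_kernel_size(3, d) for d in dilation_rates]
print(spans)
```

A branch with a larger dilation rate covers a wider span with the same number of weights, which is why mixing branches lets a model respond to both small and large individuals.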

CLC Number: 

  • TP391.41
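
For readers unfamiliar with the MAE and MSE figures quoted in the abstract, the sketch below computes both over per-image counts. It assumes the convention common in crowd counting work, where "MSE" denotes the square root of the mean squared count error; the abstract itself does not state the formula, and the sample counts are made up for illustration.

```python
import math


def mae(predicted, ground_truth):
    """Mean absolute error between predicted and true per-image counts."""
    errors = [abs(p - g) for p, g in zip(predicted, ground_truth)]
    return sum(errors) / len(errors)


def mse(predicted, ground_truth):
    """Root of the mean squared count error (assumed crowd counting convention)."""
    squared = [(p - g) ** 2 for p, g in zip(predicted, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-image counts, not data from the paper.
predicted_counts = [105.0, 480.0, 52.0]
true_counts = [100.0, 500.0, 50.0]

print(mae(predicted_counts, true_counts))
print(mse(predicted_counts, true_counts))
```

MAE reflects average counting accuracy, while the squared term in MSE penalizes large errors on individual images more heavily.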