Computer Science ›› 2021, Vol. 48 ›› Issue (6): 118-124. doi: 10.11896/jsjkx.200700107

• Computer Graphics & Multimedia •

Crowd Counting Method Based on Cross-column Features Fusion

LI Jia-qian, YAN Hua

  1. School of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
  • Received: 2020-07-17 Revised: 2020-10-20 Online: 2021-06-15 Published: 2021-06-03
  • Corresponding author: YAN Hua (yanhua@scu.edu.cn)
  • About author: LI Jia-qian, born in 1996, postgraduate. Her main research interests include computer vision and deep learning. (jiiaqian@outlook.com)
    YAN Hua, born in 1971, Ph.D., professor. His main research interests include intelligent information systems.
  • Supported by:
    National Natural Science Foundation of China (11872069).

Abstract: Crowd counting is a challenging problem in computer vision and machine learning. Crowd scale variation and scene occlusion reduce counting accuracy. To address this, this paper proposes a crowd counting method based on cross-column feature fusion, called the Cross-column Features Fusion Network (CCFNet). CCFNet fuses features from multiple columns with different receptive fields and combines them with dilated convolutions that use coprime dilation rates. It therefore enlarges the receptive field while preserving the continuity of information, and so adapts better to large variations in crowd scale. An attention model is also introduced to guide the network to focus on head positions in the image: different weights are assigned to different positions according to the attention score map, which highlights the crowd and suppresses the background, yielding a high-quality density map. In comparative experiments on mainstream crowd counting datasets, the mean absolute error (MAE) of the proposed method reaches 63.2 and 8.9 on the A and B subsets of the ShanghaiTech dataset, 222.1 on the UCF_CC_50 dataset, and 7.1 on the WorldExpo’10 dataset. These results indicate that the proposed method counts more accurately and adapts well to different scenes; for scenes with large scale variation in particular, it outperforms most previous algorithms.

Key words: Attention model, Cross-column features fusion, Crowd counting, Dilated convolution
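
The abstract's point about dilation rates and continuity of information can be illustrated with a small sketch. It is not the CCFNet implementation: the rates (2, 2, 2) and (1, 2, 3), the 1-D setting, and the helper names `covered_offsets` and `has_gaps` are assumptions chosen for illustration only. The sketch lists which input offsets can influence one output position after stacking 3-tap dilated convolutions, showing that rates sharing a common factor leave holes in the receptive field while this particular coprime set does not.

```python
from itertools import product


def covered_offsets(dilation_rates, kernel_size=3):
    """Sorted input offsets that reach one output position after stacking
    1-D dilated convolutions with the given rates (illustrative helper)."""
    half = kernel_size // 2
    # Each layer samples the input at multiples of its dilation rate.
    taps_per_layer = [
        [rate * k for k in range(-half, half + 1)] for rate in dilation_rates
    ]
    # A stacked receptive field is every sum of one tap per layer.
    return sorted({sum(combo) for combo in product(*taps_per_layer)})


def has_gaps(offsets):
    """True if some input position inside the receptive field is skipped."""
    return any(b - a > 1 for a, b in zip(offsets, offsets[1:]))


shared_factor = covered_offsets((2, 2, 2))  # hypothetical rates, common factor 2
coprime = covered_offsets((1, 2, 3))        # hypothetical pairwise-coprime rates

print(shared_factor)                               # [-6, -4, -2, 0, 2, 4, 6]
print(has_gaps(shared_factor), has_gaps(coprime))  # True False
```

Both stacks span offsets from -6 to 6, but only the second touches every position in between. Coprimality alone does not guarantee full coverage for every choice of rates, so the check is worth running on whichever rates are actually used.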

CLC number:

  • TP391