Computer Science, 2020, Vol. 47, Issue (4): 150-156. doi: 10.11896/jsjkx.190400034

• Computer Graphics & Multimedia •

Crowd Counting Based on Single-column Multi-scale Convolutional Neural Network

PENG Xian, PENG Yu-xu, TANG Qiang, SONG Yan-qi   

  1. School of Computer and Communication Engineering, Changsha University of Science & Technology, Changsha 410000, China
  • Received: 2019-04-05 Online: 2020-04-15 Published: 2020-04-15
  • Contact: PENG Yu-xu, born in 1977, Ph.D., associate professor, CCF member. His main research area is signal and information processing.
  • About author: PENG Xian, born in 1994, master's degree. His main research area is deep learning.
  • Supported by:
    This work was supported by the Research Foundation of Education Bureau of Hunan Province, China (18B162) and the Young Teacher Development Foundation of Changsha University of Science & Technology (2019QJCZ014).

Abstract: Crowd counting in single images and surveillance videos has received increasing attention in recent years. Scale variation and crowd occlusion make crowd counting a very challenging problem, but deep convolutional neural networks have proved effective in solving it. This paper proposes a single-column multi-scale convolutional neural network, a data-driven deep learning method that can handle a variety of scenes and estimate counts accurately. The network consists of a front end and a middle end, which extract two-dimensional features, and a back end, which restores the density map. Stacked pooling replaces the max-pooling layers, which increases the scale invariance of the model without introducing additional parameters. The front end adopts part of the VGG-16 structure, and the middle end adopts a feature aggregation module (FME) that breaks the independence between different columns to better extract multi-scale feature information. The back end adopts three-column, five-layer dilated convolutions with different dilation rates to enlarge the receptive field while keeping the resolution unchanged, producing a higher-quality crowd density map. A relative head loss is introduced to improve model performance when the crowd is sparse. The model performs well on two of the most challenging crowd counting datasets. On the two subsets of ShanghaiTech and on UCF_CC_50, the mean absolute error (MAE) and mean square error (MSE) of the proposed method are 66.2 and 103.0, 8.7 and 13.4, and 251.0 and 329.5, respectively, which is better than traditional crowd counting methods. Compared with other models, the proposed model has higher accuracy, better robustness, and better counting performance on images with sparse crowds.
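The MAE and MSE figures quoted in the abstract are computed over per-image head counts, where the predicted count of an image is the sum of its density map. The sketch below illustrates that computation in plain Python. It is not the authors' evaluation code: the function names are hypothetical, and the `root` switch reflects an assumption that, as is common in crowd counting papers, the value reported as "MSE" is the square root of the mean squared error. The abstract does not say which form is used.

```python
import math


def count_from_density(density_map):
    """Estimated head count: the sum of all values in a 2-D density map."""
    return sum(sum(row) for row in density_map)


def _check(true_counts, pred_counts):
    # Both sequences must be non-empty and aligned image by image.
    if not true_counts or len(true_counts) != len(pred_counts):
        raise ValueError("count lists must be non-empty and of equal length")


def mae(true_counts, pred_counts):
    """Mean absolute error between true and predicted counts."""
    _check(true_counts, pred_counts)
    total = sum(abs(t - p) for t, p in zip(true_counts, pred_counts))
    return total / len(true_counts)


def mse(true_counts, pred_counts, root=True):
    """Mean squared error; with root=True (an assumption about the
    convention) the square root is returned instead."""
    _check(true_counts, pred_counts)
    mean_sq = sum((t - p) ** 2 for t, p in zip(true_counts, pred_counts))
    mean_sq /= len(true_counts)
    return math.sqrt(mean_sq) if root else mean_sq


if __name__ == "__main__":
    truth = [10, 20]          # made-up ground-truth counts
    predicted = [12, 17]      # made-up predicted counts
    print(mae(truth, predicted))
    print(mse(truth, predicted))
```

Because squaring emphasizes large errors, the MSE-style metric is usually read as an indicator of robustness and the MAE as an indicator of accuracy.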

Key words: Convolutional neural networks, Crowd counting, Dilated convolution, Feature combination, Relative head loss, Stacked-pooling
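The abstract says that dilated convolution at the back end enlarges the area of the input each output pixel can see while keeping the resolution unchanged. The arithmetic behind that claim can be sketched as follows. The 3x3 kernel size, the dilation rates 1, 2 and 3, and the 64-pixel input are illustrative assumptions, since the abstract does not give the actual values, and the function names are hypothetical.

```python
def effective_kernel(kernel, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def output_size(size, kernel, dilation, padding, stride=1):
    """Spatial output size of one convolution layer along one axis."""
    return (size + 2 * padding - effective_kernel(kernel, dilation)) // stride + 1


def receptive_field(layers):
    """Receptive field of a stack of stride-1 layers.

    `layers` is a list of (kernel, dilation) pairs; each layer adds
    (kernel - 1) * dilation pixels to the field.
    """
    field = 1
    for kernel, dilation in layers:
        field += (kernel - 1) * dilation
    return field


if __name__ == "__main__":
    # Assumed setup: three columns, five 3x3 layers each, one dilation rate per column.
    for dilation in (1, 2, 3):
        column = [(3, dilation)] * 5
        # Padding equal to the dilation rate keeps a 3x3 layer size-preserving.
        size = 64
        for kernel, rate in column:
            size = output_size(size, kernel, rate, padding=rate)
        print(dilation, receptive_field(column), size)
```

Under these assumptions the three columns see 11, 21 and 31 pixels respectively with the same number of weights per layer, and the feature map stays at 64 pixels, which is why no pooling or upsampling is needed to grow the field.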

CLC Number: 

  • TP391