Crowd Counting Model of Convolutional Neural Network Based on Multi-task Learning and Coarse to Fine

doi:10.11896/jsjkx.200300012

Abstract

Crowd counting refers to estimating the number of people in a single image or a single video frame. To address the insufficient counting accuracy of existing crowd counting methods, a crowd counting model based on multi-task learning and a coarse-to-fine convolutional neural network is proposed. First, multi-task learning introduces auxiliary tasks related to the original task to guide the learning of the main task. Crowd density estimation is the main task of the model, and crowd segmentation serves as an auxiliary task to improve network performance. Second, the proposed model predicts the density map from coarse to fine: a rough, inaccurate crowd density map is generated first and is then combined with the crowd segmentation map to obtain an accurate crowd density map. Experiments on ShanghaiTech Part A and Part B and on the UCF_CC_50 dataset show that the proposed model outperforms the state-of-the-art CSRNet model by 4.55%, 14.15% and 19.09%, respectively, and reduces the mean square error by 10.00%, 19.09% and 19.47%, respectively. The proposed model significantly improves the accuracy and robustness of crowd counting.
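The coarse-to-fine idea and the multi-task objective described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how the coarse density map and the segmentation map are combined, nor how the two losses are weighted. Element-wise multiplication as the fusion step, the squared-error plus binary cross-entropy loss, and the names `refine_density`, `multitask_loss`, `seg_weight` and `relative_reduction` are all assumptions made for illustration.

```python
import numpy as np


def count_from_density(density):
    """The estimated head count is the sum of the density map."""
    return float(np.sum(density))


def refine_density(coarse_density, seg_prob):
    """Assumed fusion step: weight the coarse density by the predicted
    foreground probability so background pixels are suppressed."""
    return coarse_density * seg_prob


def multitask_loss(pred_density, gt_density, seg_prob, gt_mask, seg_weight=0.1):
    """Assumed objective: pixel-wise squared error for the main task plus a
    weighted binary cross-entropy for the auxiliary segmentation task."""
    eps = 1e-7
    p = np.clip(seg_prob, eps, 1.0 - eps)
    density_loss = np.mean((pred_density - gt_density) ** 2)
    seg_loss = -np.mean(gt_mask * np.log(p) + (1.0 - gt_mask) * np.log(1.0 - p))
    return float(density_loss + seg_weight * seg_loss)


def relative_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy 4x4 scene: a 2x2 crowd region holding two people in total.
gt_mask = np.zeros((4, 4))
gt_mask[1:3, 1:3] = 1.0
gt_density = gt_mask * 0.5

# Coarse prediction: the true density plus spurious density everywhere.
coarse = gt_density + 0.05
# Hypothetical segmentation output: confident foreground and background.
seg_prob = np.where(gt_mask == 1.0, 0.9, 0.1)

fine = refine_density(coarse, seg_prob)

coarse_error = abs(count_from_density(coarse) - count_from_density(gt_density))
fine_error = abs(count_from_density(fine) - count_from_density(gt_density))
print("coarse count error:", coarse_error)
print("fine count error:", fine_error)
print("relative reduction (%):", relative_reduction(coarse_error, fine_error))
```

In this toy case the segmentation map removes most of the density that the coarse prediction places on background pixels, which is the intuition behind using segmentation as the auxiliary task. The percentages reported in the abstract are relative improvements of this kind over CSRNet.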

Key words: Convolutional neural network, Crowd counting, Crowd density estimation, Crowd segmentation, Multi-task learning

CLC Number: TP391

CHEN Xun-min, YE Shu-han, ZHAN Rui. Crowd Counting Model of Convolutional Neural Network Based on Multi-task Learning and Coarse to Fine [J]. Computer Science, 2020, 47(11A): 183-187.


