Computer Science (计算机科学), 2020, Vol. 47, Issue 11A: 183-187. doi: 10.11896/jsjkx.200300012
陈训敏, 叶书函, 詹瑞
CHEN Xun-min, YE Shu-han, ZHAN Rui
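Abstract: Crowd counting estimates the number of people in a single image or a single video frame. To address the insufficient accuracy of existing crowd counting methods, this paper proposes a convolutional neural network crowd counting model based on multi-task learning and a coarse-to-fine strategy. First, multi-task learning introduces an auxiliary task related to the original task to guide the learning of the main task: crowd density estimation is the main task of the model, and crowd segmentation serves as the auxiliary task to improve network performance. Second, the coarse-to-fine strategy treats density map prediction as a process that moves from coarse to fine: the model first generates a coarse, inaccurate crowd density map and then combines it with the crowd segmentation map to obtain an accurate crowd density map. Experiments on Part A and Part B of the ShanghaiTech dataset and on the UCF_CC_50 dataset show that, compared with CSRNet, the previous best model, the proposed model reduces the absolute error by 4.55%, 14.15%, and 19.09%, respectively, and the mean squared error by 10.00%, 19.09%, and 19.47%, respectively, significantly improving the accuracy and robustness of crowd counting.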
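The abstract does not say how the coarse density map and the segmentation map are combined, or how the two task losses are weighted. The sketch below is therefore only one plausible reading, not the authors' architecture: it assumes an element-wise product for the fusion step and a weighted sum for the multi-task loss. The names `refine`, `multitask_loss`, and `aux_weight` are hypothetical.

```python
# Illustrative sketch only. The element-wise fusion and the weighted loss are
# assumptions made for this example; the abstract does not specify either.

def crowd_count(density_map):
    """A density map integrates to the head count, so the count is its sum."""
    return sum(sum(row) for row in density_map)


def refine(coarse_density, segmentation):
    """Assumed fusion: keep density only where the segmentation map marks crowd."""
    return [
        [d * s for d, s in zip(d_row, s_row)]
        for d_row, s_row in zip(coarse_density, segmentation)
    ]


def multitask_loss(density_loss, segmentation_loss, aux_weight=0.1):
    """Main-task loss plus a weighted auxiliary loss (aux_weight is hypothetical)."""
    return density_loss + aux_weight * segmentation_loss


def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and true counts."""
    pairs = zip(predicted_counts, true_counts)
    return sum(abs(p - t) for p, t in pairs) / len(true_counts)


# Toy 2x2 example: the coarse map leaks density into background pixels.
coarse = [[0.5, 0.25], [0.25, 1.0]]
mask = [[1, 0], [0, 1]]  # 1 = crowd, 0 = background
refined = refine(coarse, mask)

print(crowd_count(coarse))   # count before fusion
print(crowd_count(refined))  # count after background density is suppressed
```

In this toy case the segmentation mask removes the density that the coarse map assigned to background pixels, which is the intuition behind using segmentation as an auxiliary signal.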