Computer Science (计算机科学) ›› 2018, Vol. 45 ›› Issue (8): 17-21. doi: 10.11896/j.issn.1002-137X.2018.08.004
LI Yun-bo (李云波)1, TANG Si-qi (唐斯琪)1, ZHOU Xing-yu (周星宇)2, PAN Zhi-song (潘志松)1
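Abstract: This paper aims to estimate crowd density in real scenes from images taken at arbitrary viewing angles and with arbitrary crowd density. Projecting a three-dimensional scene onto a two-dimensional image, however, introduces perspective distortion and crowd occlusion, which make it hard to distinguish one individual from another and individuals from the background. To address this, a flexible and efficient scalable modular convolutional neural network (CNN) architecture is proposed. It accepts images of arbitrary size and resolution directly as input, requires no additional computation of perspective information, and estimates the crowd count by generating a density map. Each module of the architecture uses a multi-column structure with different convolution kernels, which can fit individuals at different distances from the camera. Each module also combines feature information from two consecutive layers, which reduces the loss of accuracy caused by vanishing gradients. Experiments show that, on the ShanghaiTech PartA and PartB datasets, the accuracy of the proposed method is 14.58% and 40.53% higher, respectively, than that of MCNN, the previous best method, and the root mean square error is 23.89% and 33.90% lower, respectively.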
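The density-map idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the density maps are constructed, so the following is an assumption: it uses the common approach of placing one normalized Gaussian kernel at each annotated head position, so that the sum over the map equals the number of people. The function names, the kernel size, and the value of `sigma` are hypothetical choices for illustration, not the authors' settings.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def density_map(height, width, heads, size=5, sigma=1.0):
    """Build a density map with one unit of mass per annotated head.

    heads is a list of (row, col) positions (hypothetical annotations).
    """
    dmap = [[0.0] * width for _ in range(height)]
    half = size // 2
    kernel = gaussian_kernel(size, sigma)
    for r, c in heads:
        # Keep only the kernel cells that fall inside the image.
        cells = []
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy][dx]))
        # Renormalize so a head near the border still contributes exactly 1.
        mass = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / mass
    return dmap


def estimated_count(dmap):
    """The crowd count is the integral (sum) of the density map."""
    return sum(map(sum, dmap))


heads = [(2, 3), (10, 12), (0, 0)]  # three hypothetical head annotations
dmap = density_map(16, 16, heads)
print(round(estimated_count(dmap), 6))
```

In a setup like this, a network trained to regress the density map yields a count by summing its output, which is why no explicit detection of individuals is needed.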