Computer Science ›› 2020, Vol. 47 ›› Issue (12): 205-209. doi: 10.11896/jsjkx.191000132
邵阳雪1,2, 孟伟1,2, 孔德珍2,3, 韩林轩2,3, 刘扬1,2,3
SHAO Yang-xue1,2, MENG Wei1,2, KONG Deng-zhen2,3, HAN Lin-xuan2,3, LIU Yang1,2,3
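Abstract: Guaranteeing road priority for special vehicles on duty is a prerequisite for allocating urban traffic resources rationally and for carrying out emergency rescue. Cross-modal recognition of special vehicles is a core technology for intelligent transportation. It matters especially while the intelligent Internet of Vehicles remains immature and mixed traffic of driverless and human-driven vehicles persists over the long term, where driverless vehicles must yield appropriately to special vehicles on duty. To meet the need of driverless vehicles to recognize special vehicles, this paper constructs a Cross-Modal Retrieval and Recognition Net (CMR2Net) and proposes a deep-learning-based method for cross-modal retrieval and recognition of special vehicles. CMR2Net consists of two convolutional sub-networks and one feature fusion network. The convolutional sub-networks extract the image and audio features of special vehicles respectively, and the features are matched in a high-level semantic space using a similarity measure to achieve cross-modal retrieval and recognition. Cross-modal recognition experiments on a special-vehicle cross-modal dataset show that the proposed method attains a high recognition rate on cross-modal retrieval and recognition tasks and can recognize special vehicles accurately even when one modality is missing. This study offers theoretical guidance for improving the performance of the "City Brain" and has considerable engineering value for designing, implementing and improving future intelligent transportation.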
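The matching step described in the abstract, comparing image and audio features in a shared semantic space, can be illustrated with a minimal sketch. The abstract does not state which similarity measure CMR2Net uses, so cosine similarity is an assumption here. The embedding values, the labels and the function names `cosine_similarity` and `retrieve` are all hypothetical, standing in for the outputs of the two convolutional sub-networks.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query, gallery):
    """Rank (label, embedding) gallery items by similarity to the query, best first."""
    scored = [(label, cosine_similarity(query, emb)) for label, emb in gallery]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical audio embeddings in the shared space (not taken from the paper).
audio_gallery = [
    ("ambulance_siren", [0.9, 0.1, 0.0]),
    ("fire_truck_siren", [0.1, 0.9, 0.1]),
    ("police_siren", [0.0, 0.2, 0.9]),
]

# Hypothetical image embedding of a vehicle, used as the cross-modal query.
image_query = [0.8, 0.2, 0.1]

ranking = retrieve(image_query, audio_gallery)
best_label = ranking[0][0]
print(best_label)
```

With these made-up vectors the image query ranks the ambulance siren first. In a trained system both embeddings would come from the learned sub-networks, and recognition with one modality missing would rely on the remaining modality's embedding alone.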