Computer Science, 2022, Vol. 49, Issue (7): 142-147. doi: 10.11896/jsjkx.210600198

• Computer Graphics & Multimedia •

Person Re-identification Method Based on GoogLeNet-GMP Based on Vector Attention Mechanism

MENG Yue-bo1,2, MU Si-rong1, LIU Guang-hui1, XU Sheng-jun1,2, HAN Jiu-qiang1,2

  1 School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China
  2 Artificial Intelligence and Digital Economy Guangdong Provincial Laboratory, South China University of Technology, Guangzhou 510000, China
  • Received: 2021-06-24 Revised: 2021-12-26 Online: 2022-07-15 Published: 2022-07-12
  • Corresponding author: MU Si-rong (m_srong0413@163.com)
  • About author: MENG Yue-bo (mengyuebo@163.com), born in 1979, Ph.D., associate professor. Her main research interests include computer vision perception and understanding, intelligent architecture, and artificial intelligence.
    MU Si-rong, born in 1995, postgraduate. Her main research interests include visual processing, artificial intelligence, and image processing.
  • Supported by:
    National Natural Science Foundation of China (51678470), Natural Science Foundation of Shaanxi, China (2020JM-473, 2020JM-472), Natural Science Basic Research of Xi'an University of Architecture and Technology (JC1703), and Natural Science Foundation of Xi'an University of Architecture and Technology (ZR19046).

Abstract: To improve the accuracy and applicability of person re-identification (Re-ID), a Re-ID method based on GoogLeNet with a vector attention mechanism is proposed. First, image triplets (anchor, positive, and negative) are fed into the GoogLeNet-GMP network to obtain segmented feature vectors. Then, spatial pyramid pooling (SPP) is used to aggregate the features from different pyramid levels, and an attention mechanism is introduced: by integrating the multi-scale pooling regions that represent the visual information of the target, discriminative features at multiple semantic levels are obtained. In addition, a combination of two different loss functions is used as the final loss function. Experiments on the Market-1501 and Duke-MTMC datasets show that the proposed method performs better than other strong methods on the Rank-1 and mAP metrics.
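
The pooling and attention steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the pyramid levels `(1, 2, 4)`, the use of max pooling, the single-channel feature map, and the softmax scoring rule in `attend` are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map over an n x n grid for each pyramid level.

    Returns one list per level, holding n * n pooled values in row-major order.
    The levels (1, 2, 4) are an assumption, not taken from the paper.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        cells = []
        for i in range(n):
            # Floor/ceil bin edges guarantee non-empty bins that cover the map.
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                cells.append(
                    max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1))
                )
        pooled.append(cells)
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(pooled):
    """Re-weight every pooled region by a softmax over the region activations.

    This scoring rule is a placeholder assumption; the paper's attention
    mechanism may compute its weights differently.
    """
    flat = [v for level in pooled for v in level]
    weights = softmax(flat)
    return [w * v for w, v in zip(weights, flat)]


# Toy 4 x 4 feature map standing in for one channel of a backbone output.
fmap = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
pooled = spatial_pyramid_pool(fmap)
print([len(level) for level in pooled])  # [1, 4, 16]
print(len(attend(pooled)))               # 21
```

Concatenating the per-level outputs gives a fixed-length vector (1 + 4 + 16 = 21 values here) regardless of the input map size, which is the property that makes pyramid pooling convenient for aggregating multi-scale regions.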

Key words: Attention mechanism, GoogLeNet, Loss function, Person re-identification, Spatial pyramid pooling
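
The abstract states that two different loss functions are combined but does not name them. As a hedged sketch only, the example below assumes a triplet margin loss (suggested by the anchor/positive/negative input) mixed with a softmax cross-entropy identity loss. The choice of losses, the `margin` of 0.3, the mixing `weight`, and all function names are assumptions, not details from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mixed_loss(anchor, positive, negative, logits, label, weight=0.5, margin=0.3):
    """Convex combination of the two losses; `weight` is a hypothetical knob."""
    metric_term = triplet_loss(anchor, positive, negative, margin)
    identity_term = cross_entropy(logits, label)
    return weight * metric_term + (1.0 - weight) * identity_term


# A well-separated triplet contributes no metric penalty.
print(triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0]))  # 0.0
```

Under these assumptions, the triplet term pulls same-identity features together in the embedding space while the cross-entropy term keeps the features separable by identity label; the weighting between the two would need tuning for a real dataset.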

CLC Number:

  • TP391