Computer Science ›› 2023, Vol. 50 ›› Issue (6A): 220200084-8. doi: 10.11896/jsjkx.220200084

• Image Processing & Multimedia Technology •

Person Re-identification Method Based on Progressive Attention Pyramid

ZHANG Shuaiyu1, PENG Li1, DAI Feifei2

  1. 1 Engineering Research Center of Internet of Things Technology Applications (Ministry of Education), School of IoT Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China;
    2 Taizhou Institute of Product Quality and Safety Testing, Taizhou, Zhejiang 318000, China
  • Online: 2023-06-10 Published: 2023-06-12
  • Corresponding author: PENG Li (jnpengli@outlook.com)
  • Author e-mail: 1206856688@qq.com
  • About author: ZHANG Shuaiyu, born in 1998, postgraduate. His main research interests include computer vision and person re-identification. PENG Li, born in 1967, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include visual Internet of Things and deep learning.
  • Supported by:
    National Natural Science Foundation of China (61873112, 61802107).

Abstract: Existing person re-identification algorithms extract person features insufficiently, which lowers their accuracy in scenes with occlusion and pose changes. To address this problem, a person re-identification method based on a progressive attention pyramid is proposed. The method designs a progressive feature pyramid structure based on the attention mechanism, embedding channel and spatial attention modules into the pyramid and applying them to the channel and spatial dimensions of the features, respectively. The channel attention pyramid aggregates the noteworthy features in different channel dimensions at each level of the backbone network, and the spatial attention pyramid extracts the noteworthy features in different spatial dimensions. Each level of the pyramid follows the "split-attend-concat" principle and, from the bottom up, continuously learns the attention of the person feature map under different split levels, allowing the network to fully mine key features from different channel and spatial dimensions. In addition, multi-level feature alignment is realized through a cascade structure and deformable convolution, which further improves the re-identification accuracy of the model. The method is evaluated on two mainstream datasets, Market-1501 and DukeMTMC-reID. Experimental results show that it enables the model to focus on richer person features: compared with the baseline network, the Rank-1 index increases by 3.2% and 5.8%, and the mAP index increases by 6.8% and 6.6%, respectively.

Key words: Person re-identification, Attention mechanism, Feature pyramid, Feature alignment, Pooling, Metric learning
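
The "split-attend-concat" principle named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the attention module, so `channel_gate` below is a hypothetical, parameter-free stand-in (global average pooling followed by a sigmoid), and the split levels `(4, 2, 1)` and their bottom-up ordering are assumptions chosen only for illustration. The spatial attention pyramid, the cascade structure, and deformable convolution are omitted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_gate(group):
    """Hypothetical stand-in for a channel attention module (assumption)."""
    # group has shape (C_g, H, W); pooling yields one descriptor per channel.
    desc = group.mean(axis=(1, 2), keepdims=True)
    # Centre the descriptors so each channel is weighted relative to its group.
    weights = sigmoid(desc - desc.mean())
    return group * weights


def split_attend_concat(feat, num_splits):
    """One pyramid level: split the channels, attend per group, concatenate."""
    groups = np.split(feat, num_splits, axis=0)       # split
    attended = [channel_gate(g) for g in groups]      # attend
    return np.concatenate(attended, axis=0)           # concat


def channel_attention_pyramid(feat, split_levels=(4, 2, 1)):
    """Apply the levels in order, from the finest split to the whole map."""
    out = feat
    for num_splits in split_levels:
        out = split_attend_concat(out, num_splits)
    return out


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 2))   # (channels, height, width)
out = channel_attention_pyramid(feat)
print(out.shape)
```

Because every gate weight lies strictly between 0 and 1, each level only rescales the channels and leaves the feature map's shape unchanged; a learned attention module would take the place of `channel_gate` in a real model.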

CLC Number:

  • TP391.4