Pedestrian Re-identification Based on Spatial Transformation and Multi-scale Feature Fusion

doi:10.11896/jsjkx.240800156

Abstract

To address the insufficient representation of pedestrian information caused by spatial misalignment of pedestrian features and by occlusion, a network combining spatial transformation and multi-scale feature fusion is designed. First, a pedestrian retrieval enhancement method is proposed to improve the network's ability to recognize special samples. Second, a self-attention spatial transformation network is introduced to address inconsistent spatial semantic information across pedestrian image regions. Then, features at different scales are extracted from the network and fused separately according to the characteristics of each branch, incorporating coordinate attention and instance batch normalization. Finally, the features of the different branches are fused to obtain highly representative fused features. Experiments on multiple datasets show that the proposed method outperforms existing methods in re-identification performance.

Key words: Pedestrian re-identification, Spatial transformation, Feature fusion, Multiple scale, Side window filtering

CLC Number: TP391.41

JIN Lu, LIU Mingkun, ZHANG Chunhong, CHEN Kefei, LUO Yaqiong, LI Bo. Pedestrian Re-identification Based on Spatial Transformation and Multi-scale Feature Fusion[J]. Computer Science, 2025, 52(6A): 240800156-7.


