Triplet Interaction Mechanism in Cross-view Geo-localization

doi:10.11896/jsjkx.240500020

Abstract

Cross-view geo-localization infers a geographic location from images taken from different viewpoints, and it is usually treated as an image retrieval task. However, most existing methods neglect global position information and feature completeness, which hinders the model from capturing deep semantic information. In addition, current two-dimensional interaction methods do not fully exploit the relationships between dimensions, so cross-dimensional interaction is insufficient. To address these issues, this paper designs a triplet interaction mechanism for cross-view geo-localization. The method uses ConvNeXt as the feature extraction network, applies the proposed triplet interaction mechanism to enrich the extracted features, and is trained under the guidance of a joint loss function. By performing interactions across multiple dimensions within the model, it reduces the information loss caused by two-dimensional feature projection. The triplet interaction mechanism applies a different attention mechanism in each of its three channels, which makes the model robust to translation, scaling, and rotation across cross-view images. Experimental results show that the proposed method significantly outperforms other methods on both the drone-view localization and drone navigation tasks of the University-1652 dataset.
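
The abstract gives no architectural detail, so the following is only a toy sketch of what interaction across three dimension pairs of a feature map can look like, not the paper's implementation. It assumes a feature map stored as nested lists of shape C x H x W. The function name `triplet_interaction`, the pooling rule (mean of max and average), the parameter-free sigmoid gate standing in for a learned attention layer, and the equal-weight averaging of the three branches are all assumptions made for illustration.

```python
import math
from itertools import product


def triplet_interaction(x):
    """Toy cross-dimension gating on a C x H x W feature map (nested lists).

    Hypothetical illustration only: each branch pools over one axis and
    produces a sigmoid gate over the remaining two axes. A real model
    would use learned layers where this sketch uses a fixed rule.
    """
    dims = (len(x), len(x[0]), len(x[0][0]))  # (C, H, W)

    def pooled_gate(axis):
        # Gate over the two axes that are NOT pooled.
        others = [d for d in range(3) if d != axis]
        gates = {}
        for i, j in product(range(dims[others[0]]), range(dims[others[1]])):
            vals = []
            for k in range(dims[axis]):
                idx = [0, 0, 0]
                idx[others[0]], idx[others[1]], idx[axis] = i, j, k
                vals.append(x[idx[0]][idx[1]][idx[2]])
            # Assumed pooling: blend of max and mean along the pooled axis.
            z = 0.5 * (max(vals) + sum(vals) / len(vals))
            gates[(i, j)] = 1.0 / (1.0 + math.exp(-z))
        return others, gates

    # Three branches: pool over C (H-W gate), over H (C-W gate), over W (C-H gate).
    branches = [pooled_gate(axis) for axis in range(3)]

    C, H, W = dims
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for c, h, w in product(range(C), range(H), range(W)):
        idx = (c, h, w)
        # Assumed fusion: equal-weight average of the three branch gates.
        g = sum(gates[(idx[o[0]], idx[o[1]])] for o, gates in branches) / 3.0
        out[c][h][w] = x[c][h][w] * g
    return out


if __name__ == "__main__":
    feature_map = [[[1.0] * 4 for _ in range(3)] for _ in range(2)]  # C=2, H=3, W=4
    enriched = triplet_interaction(feature_map)
    print(len(enriched), len(enriched[0]), len(enriched[0][0]))
```

The output keeps the input shape, and every element is scaled by a gate in (0, 1) that depends on its channel, row, and column context at once. That is the sense in which such a scheme interacts across dimensions instead of only within the spatial plane.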

Key words: Cross-view, Geo-localization, Interaction mechanism, Feature attention

CLC Number: TP391

ZHOU Bowen, LI Yang, WANG Jiabao, MIAO Zhuang, ZHANG Rui. Triplet Interaction Mechanism in Cross-view Geo-localization[J]. Computer Science, 2025, 52(3): 86-94.

