Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240800014-7. doi: 10.11896/jsjkx.240800014

• Image Processing & Multimedia Technology •

Material SEM Image Retrieval Method Based on Multi-scale Features and Enhanced Hybrid Attention Mechanism

ZENG Fanyun (曾凡运), LIAN Hechun (廉贺淳), FENG Shanshan (冯珊珊), WANG Qingmei (王庆梅)

  1. National Center for Materials Service Safety, University of Science and Technology Beijing, Beijing 100083, China
  • Online: 2025-06-16 Published: 2025-06-12
  • Corresponding author: WANG Qingmei (qmwang@ustb.edu.cn)
  • About author: (m202221220@xs.ustb.edu.cn)
    ZENG Fanyun, born in 2000, postgraduate. His main research interests include data mining and image retrieval.
    WANG Qingmei, born in 1975, Ph.D., associate researcher, master's supervisor. Her main research interests include deep learning and computer vision.
  • Supported by:
    National Major Science and Technology Infrastructure Operation Program (GJFG2024001).

Abstract: Material SEM images are rich in content. When extracting image features, traditional retrieval methods and general-domain retrieval methods are easily disturbed by factors such as image distortion and complex textures, so key features are extracted poorly. To address the shortcomings of conventional methods in feature extraction and efficient retrieval for material SEM images, this paper proposes an image retrieval method based on multi-scale feature information that integrates Atrous Spatial Pyramid Pooling (ASPP) with an enhanced hybrid attention mechanism (ECBAM). The method uses the ConvNeXt network for feature extraction. ConvNeXt combines the large receptive field of dilated convolutions with the strength of residual networks in extracting semantic features, which helps capture more details and complex textures and extract both local and global features. In addition, the recently proposed Mamba module is modified into a bidirectional architecture and integrated into CBAM to form ECBAM, and ASPP is used together with ECBAM to fuse and enhance features stably and efficiently. Experimental results show that the method achieves good retrieval performance on a material SEM image dataset, improving average retrieval precision by 1.5% over mainstream retrieval methods.

Key words: Micro image, Image retrieval, ASPP, Hybrid attention mechanism, Mamba
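
The abstract names ASPP as the multi-scale component. The following is a minimal one-dimensional toy sketch of that idea, not the authors' implementation: several dilated (atrous) convolutions with different rates run in parallel over the same input, a global-average branch is added, and the branch outputs are stacked as channels. The function names (`dilated_conv1d`, `receptive_field`, `aspp_1d`), the rates `(1, 2, 4)`, and the all-ones kernel are assumptions chosen for illustration; the paper's actual rates, kernels, and 2D layout are not given on this page.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Same-length 1D atrous convolution (cross-correlation) with zero padding."""
    center = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * rate  # kernel taps are spaced `rate` samples apart
            if 0 <= idx < n:               # taps outside the signal read as zero padding
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, rate: int) -> int:
    """Span covered by a single dilated kernel: (k - 1) * rate + 1."""
    return (kernel_size - 1) * rate + 1


def aspp_1d(signal: Sequence[float], kernel: Sequence[float],
            rates: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Toy ASPP: parallel dilated branches plus a global-average branch.

    Returns one list per branch (the 'channels' that a real ASPP would
    concatenate and then mix with a 1x1 convolution).
    """
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    mean = sum(signal) / len(signal)
    branches.append([mean] * len(signal))  # global pooling branch, broadcast to full length
    return branches


# Hypothetical input: a single impulse, so each branch shows where its taps land.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, 1.0, 1.0]
features = aspp_1d(signal, kernel)

for rate, branch in zip((1, 2, 4), features):
    print(f"rate={rate} receptive_field={receptive_field(len(kernel), rate)} -> {branch}")
```

With the same three-tap kernel, the receptive field grows from 3 to 5 to 9 samples as the rate increases, which is how the parallel branches capture context at several scales without adding parameters.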

CLC Number: 

  • TP391