Computer Science ›› 2025, Vol. 52 ›› Issue (11): 196-205. doi: 10.11896/jsjkx.240900088

• Computer Graphics & Multimedia •

Medical Image Target Detection Method Based on Multi-branch Attention and Deep Down-sampling

GU Chengjie1, MENG Yi2, ZHU Dongjun1, ZHANG Junjun1

  1 School of Public Safety and Emergency Management, Anhui University of Science and Technology, Hefei 231100, China
  2 School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232000, China
  • Received: 2024-09-13  Revised: 2024-12-19  Online: 2025-11-15  Published: 2025-11-06
  • Corresponding author: ZHU Dongjun (djzhu_aust@163.com)
  • About author: GU Chengjie (cjgu@aust.edu.cn), born in 1985, Ph.D, professor. His main research interests include trusted network architecture and security in cyberspace.
    ZHU Dongjun, born in 1991, Ph.D, lecturer. His main research interests include deep learning and machine vision.
  • Supported by: National Natural Science Foundation of China Major Scientific Research Instrument Development Project (52227901), National Key Research and Development Program of the Ministry of Science and Technology (Yangtze River Delta Science and Technology Innovation Community Joint Research Project) (2023CSJGG1103), Natural Science Research Project of Colleges and Universities in Anhui Province (2023AH051197) and Scientific Research Foundation for High-level Talents of Anhui University of Science and Technology (2023yjrc33).

Abstract: With the development of artificial intelligence technology, medical image detection based on deep learning has broad application prospects in clinical practice. However, target detection in medical images such as tumors and plaques faces several difficulties: the regions to be labeled are small, few target features can be extracted, and extraction is difficult. To address these problems, this paper proposes a medical image target detection method based on multi-branch attention and deep down-sampling (MD-Det). The method introduces a feature extraction module (C2f-DWR) to extract multi-scale features and enhance the feature representation of targets. To capture contextual information in the image more effectively and strengthen feature extraction, a deep down-sampling module (D-down) is designed; its core idea is to fuse multiple sampling paths, combining average pooling and max pooling operations so that the advantages of each are fully exploited, which improves detection accuracy while maintaining computational efficiency. A multi-branch attention (MA) mechanism is then proposed to extract and fuse features of different dimensions: each branch extracts features of a different dimension of the input tensor, including spatial and channel features, and the generated attention weights emphasize important features before weighted fusion, enhancing the feature extraction capability of the network and improving the detection performance of the model. Finally, a new joint optimization strategy is proposed that weights the Wise-IoU loss and the NWD loss to form a joint regression loss function, further improving the accuracy of target recognition. Experiments show that the proposed method effectively improves the detection accuracy of medical image targets: compared with the baseline model YOLOv8n, mAP0.5 on the medical datasets Tumor and Liver increases by 2.5 and 1.1 percentage points, respectively.
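
To make the D-down idea above concrete, the following PyTorch-style sketch shows one way a dual-pooling down-sampling block could fuse average and max pooling before a pointwise fusion convolution. The class name, channel sizes, and fusion layers are illustrative assumptions, not the paper's exact D-down design.

```python
import torch
import torch.nn as nn

class DDown(nn.Module):
    """Hypothetical sketch of a dual-pooling down-sampling block: it halves the
    spatial resolution with average pooling and max pooling in parallel, then
    fuses the two paths with a 1x1 convolution (details assumed, not from the paper)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.avg = nn.AvgPool2d(kernel_size=2, stride=2)  # smooth, context-preserving path
        self.max = nn.MaxPool2d(kernel_size=2, stride=2)  # salient-detail path
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * in_ch, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.SiLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate both pooled views along the channel axis and fuse them.
        return self.fuse(torch.cat([self.avg(x), self.max(x)], dim=1))


if __name__ == "__main__":
    y = DDown(64, 128)(torch.randn(1, 64, 80, 80))
    print(y.shape)  # torch.Size([1, 128, 40, 40])
```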

Key words: Deformable convolution, Target detection, YOLO, Attention mechanism
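
For the multi-branch attention (MA) mechanism summarized in the abstract, a minimal two-branch sketch is given below, with one channel branch and one spatial branch whose re-weighted outputs are fused by learnable weights. The branch layouts, reduction ratio, and fusion scheme are assumptions for illustration rather than the published MA module.

```python
import torch
import torch.nn as nn

class MultiBranchAttention(nn.Module):
    """Illustrative two-branch attention (assumed structure): a channel branch and a
    spatial branch each produce attention weights, and their re-weighted outputs
    are fused with learnable scalar weights."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.channel_branch = nn.Sequential(      # squeeze-excite style channel weights
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.SiLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.spatial_branch = nn.Sequential(      # per-location spatial weights
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        self.alpha = nn.Parameter(torch.ones(2))  # learnable fusion weights for the two branches

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.alpha, dim=0)
        channel_out = x * self.channel_branch(x)  # emphasize informative channels
        spatial_out = x * self.spatial_branch(x)  # emphasize informative locations
        return w[0] * channel_out + w[1] * spatial_out
```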

CLC Number: TP391
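
The joint regression objective is described as a weighted combination of a Wise-IoU term and an NWD (Normalized Wasserstein Distance) term. The sketch below uses a plain IoU loss as a stand-in for Wise-IoU (whose dynamic focusing details are omitted), a simplified NWD similarity, and an assumed mixing weight lam and constant c; it illustrates the weighting scheme only, not the paper's exact implementation.

```python
import torch

def iou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Plain IoU loss over (x1, y1, x2, y2) boxes; stands in for the Wise-IoU term."""
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    inter = (rb - lt).clamp(min=0).prod(dim=1)
    area_p = (pred[:, 2:] - pred[:, :2]).clamp(min=0).prod(dim=1)
    area_t = (target[:, 2:] - target[:, :2]).clamp(min=0).prod(dim=1)
    iou = inter / (area_p + area_t - inter + eps)
    return 1.0 - iou

def nwd_loss(pred: torch.Tensor, target: torch.Tensor, c: float = 12.8, eps: float = 1e-7) -> torch.Tensor:
    """NWD models each box as a 2-D Gaussian (centre, half width/height) and turns the
    Wasserstein distance into a similarity via exp(-W2 / c); c is an assumed constant."""
    def to_gauss(b: torch.Tensor) -> torch.Tensor:
        cxcy = (b[:, :2] + b[:, 2:]) / 2
        wh_half = (b[:, 2:] - b[:, :2]).clamp(min=eps) / 2
        return torch.cat([cxcy, wh_half], dim=1)
    g_p, g_t = to_gauss(pred), to_gauss(target)
    w2 = torch.sqrt(((g_p - g_t) ** 2).sum(dim=1) + eps)
    return 1.0 - torch.exp(-w2 / c)

def joint_regression_loss(pred: torch.Tensor, target: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    """Weighted combination of the two terms; lam is an assumed hyper-parameter."""
    return (lam * iou_loss(pred, target) + (1.0 - lam) * nwd_loss(pred, target)).mean()
```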