Computer Science, 2025, Vol. 52, Issue (11): 196-205. doi: 10.11896/jsjkx.240900088

• Computer Graphics & Multimedia •

Medical Image Target Detection Method Based on Multi-branch Attention and Deep Down-sampling

GU Chengjie1, MENG Yi2, ZHU Dongjun1, ZHANG Junjun1   

  1. School of Public Safety and Emergency Management, Anhui University of Science and Technology, Hefei 231100, China
  2. School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232000, China
  • Received: 2024-09-13 Revised: 2024-12-19 Online: 2025-11-15 Published: 2025-11-06
  • About author: GU Chengjie, born in 1985, Ph.D., professor. His main research interests include trusted network architecture and security in cyberspace.
    ZHU Dongjun, born in 1991, Ph.D., lecturer. His main research interests include deep learning and machine vision.
  • Supported by:
    National Natural Science Foundation of Special Fund for Research on National Major Research Instrument (52227901), National Key Research and Development Plan of the Ministry of Science and Technology (Yangtze River Delta Science and Technology Innovation Community Joint Research Special Fund) (2023CSJGG1103), Natural Science Research Project of Colleges and Universities in Anhui Province (2023AH051197), and Scientific Research Foundation for High-level Talents of Anhui University of Science and Technology (2023yjrc33).

Abstract: With the development of artificial intelligence, deep-learning-based medical image detection has broad application prospects in clinical practice. However, for targets such as tumors and plaques, detection is hampered by small annotated regions and by features that are few and difficult to extract. To address these problems, this paper proposes MD-Det, a medical image target detection method based on multi-branch attention and deep down-sampling. First, a feature extraction module (C2f-DWR) is introduced to extract multi-scale features and enhance the feature representation of the target. Second, a deep down-sampling module (D-down) is designed to capture the context information in the image more effectively and enhance feature extraction. Its core idea is to combine average pooling and maximum pooling so that the fusion of multiple sampling methods exploits the advantages of each, improving detection accuracy while maintaining computational efficiency. Then, a multi-branch attention (MA) mechanism is proposed in which each branch extracts features along a different dimension of the input tensor, including spatial and channel features; the generated attention weights emphasize important features, which are then combined by weighting. This strengthens the network's feature extraction capability and improves detection performance. Finally, a new joint optimization strategy is proposed that weights the Wise-IoU loss and the NWD loss to form a joint regression loss function, further improving the accuracy of target recognition. Experiments show that the proposed method effectively improves detection accuracy on medical image targets: mAP@0.5 on the medical datasets Tumor and Liver increases by 2.5 and 1.1 percentage points, respectively.
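
The joint regression loss described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives neither the weighting coefficient nor the loss details, so the function names, the weight `alpha`, and the constant `c` below are hypothetical. A plain IoU loss stands in for Wise-IoU, and NWD is computed in its commonly used form, where each box (cx, cy, w, h) is modeled as a 2-D Gaussian.

```python
import math

def iou(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between two boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). `c` is a dataset-dependent normalizer; the
    default here is only an illustrative assumption.
    """
    d2 = ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
          + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)
    return math.exp(-math.sqrt(d2) / c)

def joint_regression_loss(pred, target, alpha=0.5, c=12.8):
    """Weighted sum of an IoU-based loss and an NWD loss.

    Assumption: plain (1 - IoU) replaces Wise-IoU, and `alpha` is a
    hypothetical weight not taken from the paper.
    """
    iou_loss = 1.0 - iou(pred, target)
    nwd_loss = 1.0 - nwd(pred, target, c)
    return alpha * iou_loss + (1.0 - alpha) * nwd_loss

if __name__ == "__main__":
    target = (10.0, 10.0, 4.0, 4.0)
    for pred in [(10.0, 10.0, 4.0, 4.0), (16.0, 10.0, 4.0, 4.0), (20.0, 10.0, 4.0, 4.0)]:
        print(pred, round(joint_regression_loss(pred, target), 3))
```

For small targets, two boxes that do not overlap at all have the same IoU loss regardless of how far apart they are, whereas the NWD term still decreases as the predicted box moves toward the target. That is the usual motivation for pairing the two terms in small-object detection.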

Key words: Deformable convolution, Target detection, YOLO, Attention mechanism

CLC Number: 

  • TP391