Computer Science ›› 2025, Vol. 52 ›› Issue (4): 231-239. doi: 10.11896/jsjkx.240700039

• Computer Graphics & Multimedia •

Long-tail Distributed Medical Image Classification Based on Large Selective Kernel Bilateral-branch Networks

SUN Tanghui1, ZHAO Gang1,2, GUO Meiqian1

  1 School of Mathematics and Information Science, Nanchang Hangkong University, Nanchang 330063, China
  2 Key Laboratory of Nondestructive Testing Technology of the Ministry of Education, Nanchang Hangkong University, Nanchang 330063, China
  • Received: 2024-07-08 Revised: 2024-10-21 Online: 2025-04-15 Published: 2025-04-14
  • Corresponding author: ZHAO Gang (zhaogang0209@163.com)
  • About author: SUN Tanghui (sunth666@163.com), born in 1996, postgraduate, is a member of CCF (No. U7258G). Her main research interests include deep learning and computer vision.
    ZHAO Gang, born in 1976, Ph.D., associate professor, master's supervisor. His main research interests include machine learning.
  • Supported by:
    National Natural Science Foundation of China (62366033), Open Fund of the Key Laboratory of Nondestructive Testing Technology of the Ministry of Education (EW202107216), and Natural Science Foundation of Jiangxi Province, China (20242BAB25121).

Abstract: Datasets in medical scenarios often follow a long-tailed distribution. This imbalance can bias a model toward head classes and degrade its recognition of tail classes, which lowers overall accuracy. A common remedy is to augment the original data until its distribution is balanced, but the augmented tail-class samples are often of poor quality and do not genuinely improve tail-class classification accuracy. To address this issue, this paper proposes a large selective kernel bilateral-branch network model (LSKBB). The model consists mainly of two parts, a conventional learning branch and a re-balancing branch. It adopts the LSK module to capture key information and attend to contextual information, and a dynamic loss function is designed so that the model transitions gradually from one focus to the other, thereby improving classification accuracy. Image classification experiments are conducted on medical datasets whose long-tailed distribution is left unaltered. Compared with existing methods, at imbalance ratios of 10, 50, and 100 the LSKBB model improves accuracy by 1.41%, 1.25%, and 1.25%, respectively, on the BreaKHis dataset, and by 6.10%, 3.15%, and 2.47%, respectively, on the ChestX-ray dataset. The experimental results indicate that the LSKBB model performs well under different imbalance ratios and is suitable for classification and detection on long-tailed medical datasets.

Key words: Long-tail distribution, Deep learning, Double branch network, LSK module, Image classification

CLC Number:

  • TP391.4
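The abstract does not give the form of the dynamic loss, so the following is only a minimal sketch of one way a two-branch loss can shift its focus during training. It assumes the cumulative-learning scheme of the earlier bilateral-branch network (BBN): a weight `alpha` decays parabolically from 1 to 0, the two branches' logits are mixed by `alpha`, and the loss weights the conventional-branch label and the re-balancing-branch label by the same factor. The function names, the schedule, and the pure-Python formulation are illustrative assumptions, not the LSKBB implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given raw logits and an integer label."""
    return -math.log(softmax(logits)[label])


def alpha_schedule(epoch, total_epochs):
    """Assumed BBN-style schedule: 1.0 at the start, 0.0 at the end."""
    return 1.0 - (epoch / total_epochs) ** 2


def dynamic_loss(logits_conv, logits_rebal, label_conv, label_rebal, alpha):
    """Blend the two branches (hypothetical helper).

    logits_conv / label_conv come from a sample drawn by the conventional
    (uniform) sampler; logits_rebal / label_rebal come from a sample drawn
    by a re-balancing sampler that favours tail classes.
    """
    mixed = [alpha * a + (1.0 - alpha) * b
             for a, b in zip(logits_conv, logits_rebal)]
    return (alpha * cross_entropy(mixed, label_conv)
            + (1.0 - alpha) * cross_entropy(mixed, label_rebal))


if __name__ == "__main__":
    # Early epochs weight the conventional branch, late epochs the re-balancing one.
    for epoch in (0, 50, 100):
        a = alpha_schedule(epoch, 100)
        loss = dynamic_loss([2.0, 0.5, 0.1], [0.3, 0.2, 1.5], 0, 2, a)
        print(f"epoch={epoch} alpha={a:.2f} loss={loss:.4f}")
```

With `alpha` near 1 the loss reduces to ordinary cross-entropy on the conventional branch, and with `alpha` near 0 it reduces to cross-entropy on the re-balancing branch, which is the gradual change of focus the abstract describes in words.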