Computer Science, 2021, Vol. 48, Issue (6A): 67-73. doi: 10.11896/jsjkx.201000188

• Image Processing & Multimedia Technology •

Research on Classification of Breast Cancer Pathological Tissues with Adaptive Small Data Set

HE Qing-fang, WANG Hui, CHENG Guang

  1. Institute of Computer Technology, Beijing Union University, Beijing 100101, China
  • Online: 2021-06-10 Published: 2021-06-17
  • Corresponding author: HE Qing-fang (qingfang@buu.edu.cn)
  • About author: HE Qing-fang, born in 1968, Ph.D., associate professor. Her main research interests include pattern recognition, image understanding and image enhancement.
  • Supported by:
    Beijing Natural Science Foundation (L191006) and Academic Research Projects of Beijing Union University (XP202021).

Abstract: Breast cancer pathological tissue image data sets are typically small, have unevenly distributed benign and malignant samples, and yield low automatic recognition accuracy. To address these problems, a lightweight pathological tissue image classification model with reasonable depth and width, suited to small data sets, is designed using depthwise separable convolution, stacked small convolution kernels, and increased depth with reduced dimensionality, combined with the "SoftMax+WF" proposed in the paper. On top of traditional data augmentation methods such as image rotation and distortion, a random non-repeated cropping method is used to balance the numbers of benign and malignant samples and to expand the data set. For training-set samples that are difficult to cluster, the concept of a "weak feature" is proposed, together with a "weak feature" sample extraction algorithm and an adaptive adjustment and secondary training algorithm, to improve model training. Under identical parameter settings and running environment, eight groups of comparative experiments are carried out, and the accuracy, sensitivity and specificity of the model can all reach above 97%. The experimental results show that the designed model performs stably and has good tolerance of and adaptability to small and unbalanced data sets.

Key words: Adaptive small data sets, Breast cancer pathological tissue images, Convolutional neural networks, Deep learning, Depthwise separable convolution, Weak features
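
For readers less familiar with the three metrics reported in the abstract, the following is a minimal sketch of how accuracy, sensitivity and specificity are conventionally computed for a binary classifier. It assumes that malignant is treated as the positive class (label 1) and benign as the negative class (label 0); the function name `binary_metrics` and the toy labels are illustrative and do not come from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels.

    Assumption: `positive` marks the malignant class; every other
    label value is treated as benign (negative).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # malignant correctly flagged
            else:
                fn += 1  # malignant missed
        else:
            if pred == positive:
                fp += 1  # benign wrongly flagged
            else:
                tn += 1  # benign correctly cleared

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy example with hypothetical labels (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

On an unbalanced data set, accuracy alone can look high even when most malignant samples are missed, which is why sensitivity and specificity are usually reported alongside it.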

CLC Number:

  • TP391