Computer Science ›› 2014, Vol. 41 ›› Issue (Z6): 103-109.

• Intelligent Computing •

ADST: Differentiating Sarcoidosis from Pulmonary Tuberculosis Using Machine Learning Methods

CHEN Ai-xiang, CHEN Zhi-feng

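  1. School of Mathematics and Statistics, Guangdong University of Finance and Economics, Guangzhou 510320; The First People's Hospital of Zhaoqing, Zhaoqing 526021
  • Publication date: 2018-11-14 Release date: 2018-11-14
  • Funding:
    This work was supported by the National Natural Science Foundation of China (60773201), the Natural Science Foundation of Guangdong Province (10451032001006140,5), the Applied Basic Research Project of the Guangzhou Science and Technology and Information Bureau (10C12140131), the Yumiao Project for Universities of the Guangdong Provincial Department of Education (LYM10081), and the Zhaoqing Science and Technology Innovation Program (2011E241).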

ADST: Approach of Automated Differentiating Sarcoidosis from Tuberculosis Based on Statistical Learning Theory

CHEN Ai-xiang and CHEN Zhi-feng   

  • Online: 2018-11-14 Published: 2018-11-14

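Abstract: The clinical differential diagnosis of sarcoidosis and pulmonary tuberculosis remains difficult. We collected comparative data on 106 cases of sarcoidosis and pulmonary tuberculosis, selected the clinical indicators that are informative for classification as features, and quantified and scaled them as needed to form the training data. We then trained models with three different methods, support vector machine (SVM), decision classification tree (DCT), and naive Bayes (NB), and evaluated the effectiveness of each model with 5-fold cross-validation. The experimental results show that, for identifying sarcoidosis, the areas under the ROC curve for the three methods are 0.978, 0.96, and 0.690 respectively; the test accuracies reach 100%, 96.15%, and 96.15%, and the training accuracies are 95.28%, 90.57%, and 92.38%. The classifiers obtained with the three methods were then used to predict 19 patients who could not be diagnosed clinically: the DCT predictions agree closely with the SVM predictions (differing in only 1 of the 19 cases), while the NB predictions are slightly worse (3 of the 19 cases disagree with the SVM predictions). The experimental results show that, of the three methods, SVM has the highest classification capability and accuracy. The clinical results show that all 19 clinically undiagnosed patients recovered after being treated according to the predictions of the SVM algorithm.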
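The evaluation described in the abstract relies on two standard ingredients: 5-fold cross-validation and the area under the ROC curve. The following is a minimal standard-library sketch of both, not the authors' implementation. The function names `k_fold_indices` and `roc_auc`, the random seed, and the toy labels and scores are assumptions introduced here for illustration; the paper's data are not reproduced.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    for held_out in range(k):
        test = folds[held_out]
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, test


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (positive class) or 0 (negative class); tied scores count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy usage with made-up scores (not data from the paper).
for train, test in k_fold_indices(106, k=5):
    print(len(train), len(test))
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # 0.75
```

In each round, a classifier would be trained on the `train` indices and scored on the held-out `test` indices. The pairwise AUC is quadratic in the number of cases, which is acceptable for a data set of about a hundred records.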

Keywords: Sarcoidosis, Pulmonary tuberculosis, DCT, NB, SVM. CLC number: TP181. Document code: A

Abstract: Differentiating sarcoidosis from tuberculosis is still difficult. The support vector machine (SVM) is a powerful tool in statistical learning. In this paper, we collected 106 cases of sarcoidosis and tuberculosis and used an SVM to build a disease classifier named ADST (Automated Differentiating Sarcoidosis from Tuberculosis). To get the raw medical data into a form usable by the SVM, we extracted feature vectors by converting qualitative features into numeric ones and dropping features with little classification value. ADST then applies simple scaling to the data, uses cross-validation to find the best model parameters, and trains on the whole training set with those parameters to obtain the SVM model. Finally, ADST uses the resulting SVM model to predict new patient cases. The experimental results show that the areas under the ROC curve for SVM, decision classification tree (DCT), and naive Bayes (NB) are 0.978, 0.96, and 0.690 respectively, the training accuracies are 95.28%, 90.57%, and 92.38%, and the test accuracies are 100%, 96.15%, and 96.15%. Clinical practice shows that the classification results are correct: 19 previously undiagnosed patients recovered after treatment based on the diagnoses given by ADST.
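The preprocessing steps named in the abstract, turning qualitative features into numbers and then applying simple scaling, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the ADST code: the feature names (`age`, `fever`), the category codes, and the choice of min-max scaling to [0, 1] are hypothetical, since the abstract does not specify the features or the scaling range.

```python
def encode(record, category_codes):
    """Turn one raw record (a dict) into a numeric feature vector.

    Qualitative values are looked up in category_codes; other values are
    converted with float(). Features are ordered by name so every record
    yields the same column order.
    """
    vector = []
    for name in sorted(record):
        value = record[name]
        if name in category_codes:
            vector.append(float(category_codes[name][value]))
        else:
            vector.append(float(value))
    return vector


def fit_min_max(rows):
    """Return the (min, max) of each column, computed on the training rows only."""
    return [(min(column), max(column)) for column in zip(*rows)]


def apply_min_max(row, ranges):
    """Scale one row with ranges from fit_min_max; constant columns map to 0.0."""
    scaled = []
    for value, (low, high) in zip(row, ranges):
        scaled.append(0.0 if high == low else (value - low) / (high - low))
    return scaled


# Hypothetical codes and records, for illustration only (not the paper's data).
CODES = {"fever": {"no": 0, "yes": 1}}
training = [
    {"age": 30, "fever": "no"},
    {"age": 50, "fever": "yes"},
    {"age": 40, "fever": "no"},
]
train_rows = [encode(r, CODES) for r in training]
ranges = fit_min_max(train_rows)
new_case = encode({"age": 45, "fever": "yes"}, CODES)
print(apply_min_max(new_case, ranges))  # [0.75, 1.0]
```

The ranges are fitted on the training rows and reused for a new case, so a new patient is scaled exactly as the training data were. A new value outside the training range will fall outside [0, 1].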

Key words: Sarcoidosis, Tuberculosis, Statistical learning, SVM

