Computer Science ›› 2025, Vol. 52 ›› Issue (9): 80-87. doi: 10.11896/jsjkx.250100150

• Intelligent Medical Engineering •

Selective Ensemble Learning Method for Optimal Similarity Based on LLaMa3 and Choquet Integrals

FU Chao, YU Liangju, CHANG Wenjun

  1. School of Management, Hefei University of Technology, Hefei 230009, China
  • Received: 2025-01-24 Revised: 2025-05-13 Online: 2025-09-15 Published: 2025-09-11
  • Corresponding author: CHANG Wenjun (changwenjun@hfut.edu.cn)
  • About author: FU Chao (chaofu@hfut.edu.cn), born in 1978, Ph.D., professor. His main research interests are intelligent decision-making methods and applications.
    CHANG Wenjun, born in 1991, Ph.D., lecturer. His main research interests are intelligent decision-making methods and applications.
  • Supported by:
    National Natural Science Foundation of China (72171066, 72471073).

Abstract: To select and weight correlated base classifiers when combining multiple classifiers, this paper proposes a selective ensemble learning method for optimal similarity based on LLaMa3 and Choquet integrals (LCOS-SELM). First, building on the open-source large language model LLaMa3, the method applies prompt learning with a small amount of labeled sample data to extract key features from unstructured text. Next, it fuses the predictions of correlated classifiers with the Choquet integral and evaluates their correlation to optimize classifier selection. Finally, it uses an optimal similarity strategy to learn classifier weights, ensuring sample consistency while improving the performance of the ensemble method. LCOS-SELM is applied to the auxiliary diagnosis of Crohn’s disease in experiments based on 297 examination reports from a Grade-A tertiary hospital in Hefei, and its effectiveness is validated by comparison with endoscopic examination reports. Under the same experimental conditions, LCOS-SELM improves accuracy (Acc), F1-score, and AUC by approximately 8% over single classifiers and by approximately 2% in all three metrics over traditional ensemble models, further confirming its performance advantage.

Key words: Selective ensemble learning, LLaMa3, Choquet integral, Weight learning, Similar case learning
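
The abstract describes fusing the predictions of correlated classifiers with the Choquet integral. As a rough illustration of that idea only, the following minimal sketch computes a generic discrete Choquet integral over classifier scores. It is not the LCOS-SELM implementation: the function names `choquet_integral` and `additive_measure`, the example scores, and all fuzzy-measure values are hypothetical, and scores are assumed to be non-negative (for example, class probabilities in [0, 1]).

```python
from itertools import combinations
from typing import Dict, FrozenSet, Sequence

# A fuzzy measure maps every subset of classifier indices to a value in [0, 1].
Measure = Dict[FrozenSet[int], float]


def additive_measure(weights: Sequence[float]) -> Measure:
    """Build an additive measure: mu(S) is the sum of the weights in S (hypothetical helper)."""
    n = len(weights)
    return {
        frozenset(subset): sum(weights[i] for i in subset)
        for size in range(n + 1)
        for subset in combinations(range(n), size)
    }


def choquet_integral(scores: Sequence[float], measure: Measure) -> float:
    """Discrete Choquet integral of non-negative scores with respect to `measure`."""
    # Visit classifiers from the lowest score to the highest.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    total = 0.0
    previous = 0.0
    for rank, idx in enumerate(order):
        # Coalition of classifiers whose score is at least the current one.
        coalition = frozenset(order[rank:])
        total += (scores[idx] - previous) * measure[coalition]
        previous = scores[idx]
    return total


# Illustrative values only: two classifiers treated as partly redundant,
# so mu({0}) + mu({1}) exceeds mu({0, 1}).
redundant: Measure = {
    frozenset(): 0.0,
    frozenset({0}): 0.6,
    frozenset({1}): 0.6,
    frozenset({0, 1}): 1.0,
}
fused_redundant = choquet_integral([0.8, 0.4], redundant)

# With an additive measure the integral reduces to an ordinary weighted average.
fused_additive = choquet_integral([0.9, 0.6, 0.7], additive_measure([0.5, 0.3, 0.2]))
```

Unlike a weighted average, a non-additive measure lets the value assigned to a group of classifiers differ from the sum of their individual values, which is how interaction between correlated classifiers can be represented.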

CLC Number: 

  • TP391.9