Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241200119-9. doi: 10.11896/jsjkx.241200119

• Database & Big Data & Data Science •

Fairness-enhancing Decision Tree Algorithm

JIANG Wenhui (姜文慧), YE Jianhong (叶剑虹), GAO Lingting (高灵婷), HUANG Yifan (黄一凡)

  1. College of Computer Science and Technology, Huaqiao University, Xiamen, Fujian 361021, China
  • Online: 2025-11-15  Published: 2025-11-10
  • Corresponding author: YE Jianhong (leafever@163.com)
  • About the author: jwh1010@qq.com
  • Supported by:
    Science and Technology Planning Project of Fujian Province, China (2024H0014(2024H01010100)).

Abstract: In machine learning, the problem of intrinsic bias in models has received increasing attention. Such biases often originate from imbalances in the training data or from flaws in algorithm design, and they lead to unfair treatment of certain groups in the prediction results. To address this problem, this paper proposes a fairness-enhancing decision tree algorithm. It introduces a fairness preprocessing step that effectively reduces imbalance in the data, and it modifies the traditional decision tree splitting criterion so that both classification accuracy and fairness are taken into account when a split is chosen. The proposed method aims to distribute prediction outcomes fairly across different groups, reduce bias in model decisions, and ensure that all individuals are treated fairly. Experimental results show that the proposed method performs well under multiple fairness metrics, significantly reduces the prediction disparity between groups, and corrects unfairness more effectively than existing traditional algorithms.

Key words: Machine learning, Classification, Decision tree, Fairness, Preprocessing
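
The abstract describes two components: a fairness-oriented preprocessing step and a splitting criterion that balances classification accuracy against fairness. The Python fragment below is a minimal illustrative sketch of that idea, not the paper's actual implementation: it scores a candidate split with classic information gain minus a demographic-parity penalty, and adds a reweighing-style preprocessing function in the spirit of Kamiran and Calders. The trade-off weight alpha, the choice of demographic parity as the fairness term, and all function names here are assumptions made purely for illustration.

# Illustrative sketch only -- NOT the paper's implementation.
# Assumed pieces: binary labels y, a binary sensitive attribute s, a trade-off
# weight alpha, demographic parity as the fairness term, and a
# reweighing-style preprocessing step.
import numpy as np

def entropy(y):
    # Shannon entropy of a binary label vector.
    if len(y) == 0:
        return 0.0
    p = float(np.mean(y))
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def information_gain(y, mask):
    # Classic splitting gain: parent entropy minus weighted child entropies.
    n = len(y)
    left, right = y[mask], y[~mask]
    return entropy(y) - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)

def demographic_parity_gap(y_hat, s):
    # Absolute difference in positive-prediction rates between the two groups.
    rates = [float(np.mean(y_hat[s == g])) for g in (0, 1) if np.any(s == g)]
    return abs(rates[0] - rates[-1]) if len(rates) == 2 else 0.0

def fairness_aware_gain(y, s, mask, alpha=0.5):
    # Score a candidate split by mixing accuracy (information gain) with
    # fairness (penalty for uneven majority-vote predictions across groups).
    left, right = y[mask], y[~mask]
    left_pred = int(np.mean(left) >= 0.5) if len(left) else 0
    right_pred = int(np.mean(right) >= 0.5) if len(right) else 0
    y_hat = np.where(mask, left_pred, right_pred)
    return alpha * information_gain(y, mask) - (1.0 - alpha) * demographic_parity_gap(y_hat, s)

def reweigh(y, s):
    # Reweighing-style preprocessing: weight each (group, label) cell by
    # expected / observed frequency so that s and y look independent.
    w = np.ones(len(y), dtype=float)
    for g in np.unique(s):
        for c in np.unique(y):
            cell = (s == g) & (y == c)
            observed = float(np.mean(cell))
            expected = float(np.mean(s == g)) * float(np.mean(y == c))
            if observed > 0:
                w[cell] = expected / observed
    return w

# Toy usage: pick the best threshold on a single numeric feature.
rng = np.random.default_rng(0)
x = rng.normal(size=200)                 # one numeric feature
s = rng.integers(0, 2, size=200)         # binary sensitive attribute
y = ((x > 0) | (s == 1)).astype(int)     # labels correlated with both
thresholds = np.quantile(x, np.linspace(0.1, 0.9, 9))
best_t = max(thresholds, key=lambda t: fairness_aware_gain(y, s, x <= t))
weights = reweigh(y, s)
print("best threshold:", best_t, "sample weight range:", weights.min(), weights.max())

In a full algorithm of the kind the abstract outlines, the preprocessing weights would be folded into the gain computation and into the recursive tree construction, and the reported fairness metrics (e.g., demographic parity, equal opportunity) would follow the paper itself; the sketch above only shows the general shape of an accuracy-fairness split score.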

CLC Number: TP391