Computer Science ›› 2025, Vol. 52 ›› Issue (11): 90-97. doi: 10.11896/jsjkx.240900061
王文鹏, 葛洪伟, 李婷
WANG Wenpeng, GE Hongwei, LI Ting
Abstract: To address the problems of eliminating correlations between sensitive and non-sensitive attributes in data, mitigating the loss of model accuracy incurred by enforcing fairness, and debiasing with respect to multiple sensitive attributes, an adversarial generative data debiasing method for multiple sensitive attributes is proposed. For multi-sensitive-attribute debiasing, the method partitions the data into groups according to the combined values of multiple sensitive attributes, and improves the fairness of each group's predictions by eliminating the correlation between the groups and the sensitive-attribute combinations. To eliminate correlations between sensitive and non-sensitive attributes, an autoencoder is trained adversarially against a network that predicts the sensitive attributes; this training scheme uncovers and removes the sensitive-attribute-related information hidden within each group, substantially reducing bias while preserving the usefulness of the data. To mitigate the accuracy loss caused by enforcing fairness and to maximize the balance between accuracy and fairness, a prediction network is introduced and its loss function is used as a constraint to optimize the encoder's information extraction, ensuring that key information is captured accurately during encoding and preventing the debiasing process from excessively sacrificing the model's predictive performance. Debiasing experiments on three real-world datasets, in which the encoder-transformed data are fed to a logistic regression model, show fairness improvements of 50.5% to 84%, verifying the effectiveness of the proposed data debiasing method. Considering fairness, accuracy, and the balance between the two, the method outperforms other debiasing algorithms.
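The sketch below illustrates the training scheme described in the abstract: an autoencoder whose latent code is pushed, via an adversary, to carry no information about the combined sensitive-attribute group, while a label predictor's loss keeps the code useful for the downstream task. It is a minimal PyTorch sketch; the layer sizes, loss weights (lambda_adv, lambda_pred), and the alternating update schedule are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of adversarial multi-sensitive-attribute debiasing (assumed details).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self, z_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))
    def forward(self, z):
        return self.net(z)

class Adversary(nn.Module):
    # Tries to recover the combined sensitive-attribute group from the code z.
    def __init__(self, z_dim, n_groups):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, n_groups))
    def forward(self, z):
        return self.net(z)

class Predictor(nn.Module):
    # Downstream label predictor whose loss keeps task-relevant information in z.
    def __init__(self, z_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, 1))
    def forward(self, z):
        return self.net(z)

def group_index(sens):
    # Combine several binary sensitive attributes (e.g., sex, race) into one
    # group id: each unique combination of values defines a group.
    weights = 2 ** torch.arange(sens.shape[1])
    return (sens.long() * weights).sum(dim=1)

def train_step(x, y, sens, enc, dec, adv, pred, opt_main, opt_adv,
               lambda_adv=1.0, lambda_pred=1.0):
    # x: features, y: binary labels (float), sens: multiple sensitive attributes.
    g = group_index(sens)

    # 1) Update the adversary: predict the sensitive group from a detached code.
    z = enc(x).detach()
    adv_loss = nn.functional.cross_entropy(adv(z), g)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # 2) Update encoder/decoder/predictor: reconstruct x, predict y, and fool
    #    the adversary (maximize its loss) to strip sensitive-group information.
    z = enc(x)
    rec_loss = nn.functional.mse_loss(dec(z), x)
    pred_loss = nn.functional.binary_cross_entropy_with_logits(pred(z).squeeze(1), y)
    fool_loss = -nn.functional.cross_entropy(adv(z), g)
    main_loss = rec_loss + lambda_pred * pred_loss + lambda_adv * fool_loss
    opt_main.zero_grad(); main_loss.backward(); opt_main.step()
    return main_loss.item(), adv_loss.item()
```

After training, the encoder output would serve as the debiased representation fed to a downstream classifier such as logistic regression, as the abstract describes; the two lambda weights are the assumed knobs for trading off fairness against reconstruction and predictive accuracy.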
CLC Number: