Computer Science ›› 2025, Vol. 52 ›› Issue (11): 90-97. doi: 10.11896/jsjkx.240900061

• Database & Big Data & Data Science •

Adversarial Generative Multi-sensitive Attribute Data Biasing Method

WANG Wenpeng, GE Hongwei, LI Ting   

  1. Engineering Research Center of Intelligent Technology for Healthcare, Ministry of Education, Jiangnan University, Wuxi, Jiangsu 214122, China
    School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Received: 2024-09-10 Revised: 2024-12-16 Online: 2025-11-15 Published: 2025-11-06
  • About author: WANG Wenpeng, born in 1998, postgraduate. His main research interests include recommendation systems and machine learning.
    GE Hongwei, born in 1967, Ph.D., professor, Ph.D. supervisor. His main research interests include artificial intelligence, pattern recognition, machine learning, image processing and analysis.
  • Supported by:
    National Natural Science Foundation of China (61806006).

Abstract: This paper proposes a data debiasing method for multiple sensitive attributes. It combines adversarial learning with an auto-encoder to eliminate correlations between sensitive and non-sensitive attributes while limiting the accuracy lost in pursuit of fairness. To handle multiple sensitive attributes, the method groups samples by the combined values of those attributes and improves the fairness of each group's predictions by eliminating the correlation between groups and these sensitive attribute combinations. To eliminate correlations between sensitive and non-sensitive attributes, the auto-encoder is trained adversarially against networks that predict the sensitive attributes. This training uncovers and removes latent information related to sensitive attributes within the groups, significantly reducing bias while retaining data utility. To balance accuracy and fairness, a prediction network is introduced, and its loss function is used as a constraint that strengthens the encoder's ability to extract information, so that key information is captured more precisely during encoding and predictive performance is not excessively sacrificed during debiasing. Debiasing experiments are conducted on three real datasets, with the encoded data applied to logistic regression models. Fairness improves by 50.5% to 84%, validating the effectiveness of the method. Considering fairness, accuracy, and the balance between them, the method outperforms other debiasing algorithms.
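
The grouping step and the way the three losses might interact can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `combined_groups` and `encoder_objective`, the toy attribute names, and the weights `alpha` and `beta` are all hypothetical, and the additive form of the objective (reconstruction plus prediction loss, minus the adversary's loss) is an assumption based on common adversarial debiasing setups, since the abstract does not give the exact formula.

```python
def combined_groups(rows, sensitive_keys):
    """Assign each row a group id based on its combination of sensitive values.

    Every distinct tuple of sensitive attribute values becomes one group,
    which mirrors the abstract's "combined values of multiple sensitive
    attributes". Group ids follow the sorted order of the tuples.
    """
    combos = sorted({tuple(row[k] for k in sensitive_keys) for row in rows})
    index = {combo: i for i, combo in enumerate(combos)}
    group_ids = [index[tuple(row[k] for k in sensitive_keys)] for row in rows]
    return group_ids, index


def encoder_objective(recon_loss, pred_loss, adv_loss, alpha=1.0, beta=1.0):
    """Hypothetical encoder objective (an assumption, not the paper's formula).

    The encoder is rewarded for reconstructing the input and supporting the
    prediction network, and penalised when the adversary can recover the
    sensitive group from the encoding (a low adversary loss raises the total).
    """
    return recon_loss + alpha * pred_loss - beta * adv_loss


# Toy records; "sex" and "race" stand in for arbitrary sensitive attributes.
rows = [
    {"sex": "F", "race": "A", "age": 31},
    {"sex": "M", "race": "A", "age": 45},
    {"sex": "F", "race": "B", "age": 28},
    {"sex": "F", "race": "A", "age": 52},
]
group_ids, index = combined_groups(rows, ["sex", "race"])
print(group_ids)  # one id per row, shared by rows with the same combination
print(encoder_objective(0.5, 0.25, 0.5, alpha=2.0, beta=1.0))
```

In a full training loop the three loss values would come from the decoder, the prediction network, and the sensitive-attribute predictor respectively, with the adversary updated in alternation to minimise its own loss.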

Key words: Data depolarization, Machine learning, Adversarial learning, Auto-encoder

CLC Number: 

  • TP391