Computer Science, 2022, Vol. 49, Issue (6A): 184-190. doi: 10.11896/jsjkx.210400234

• Intelligent Computing •

Data Debiasing Method Based on Constrained Optimized Generative Adversarial Networks

XU Guo-ning1, CHEN Yi-peng1, CHEN Yi-ming1, CHEN Jin-yin1,2, WEN Hao3   

    1 College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
    2 Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou 310023, China
    3 Chongqing Zhongke Yuncong Technology Limited Company, Chongqing 400000, China
  • Online: 2022-06-10 Published: 2022-06-08
  • About author: XU Guo-ning, born in 1999. His main research interests include deep learning and artificial intelligence.
    CHEN Jin-yin, born in 1982, Ph.D., professor. Her main research interests include artificial intelligence security, graph data mining and evolutionary computing.
  • Supported by:
    National Natural Science Foundation of China (62072406), Natural Science Foundation of Zhejiang Province, China (LY19F020025), Major Special Funding for “Science and Technology Innovation 2025” in Ningbo (2018B10063) and Ministry of Education Cooperative Education Project.

Abstract: Deep learning is now widely applied in image recognition, natural language processing and financial prediction. Once its analysis results are biased, they harm both individuals and groups, so it is vital to improve the fairness of a deep learning model without degrading its performance. Bias in data is carried not only by sensitive attributes: because attributes are correlated, non-sensitive attributes also contain bias, and debiasing algorithms that consider only sensitive attributes therefore cannot eliminate it. To eliminate the bias in the classification results of deep learning models that is caused by attributes correlated with sensitive attributes, this paper proposes a data debiasing method based on generative adversarial networks. The loss function of the model combines fairness constraints with an accuracy loss. The model uses adversarial coding to remove bias and generate a debiased dataset, and the alternating adversarial training of the generator and the discriminator reduces the loss of unbiased information in the dataset. In this way, classification accuracy is preserved while the bias in the data is removed, which improves the fairness of subsequent classification tasks. Finally, data debiasing experiments are carried out on several real-world datasets to verify the effectiveness of the proposed algorithm. The results show that the proposed method effectively reduces the bias information in the datasets and generates datasets with less bias.
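
As a rough illustration of a loss that combines a fairness constraint with an accuracy loss, the sketch below adds a weighted fairness penalty to a binary cross-entropy term. The abstract does not give the paper's actual loss function, so everything here is an assumption made for illustration: the demographic parity gap is used as a stand-in fairness constraint, and the function names, the weight `lam` and the toy data are hypothetical.

```python
import numpy as np


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Accuracy term: mean binary cross-entropy between labels and predicted probabilities."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)
    return float(-np.mean(losses))


def demographic_parity_gap(y_prob, sensitive):
    """Fairness term (assumed stand-in): absolute difference between the mean
    predicted positive rate of the two sensitive groups."""
    rate_group1 = y_prob[sensitive == 1].mean()
    rate_group0 = y_prob[sensitive == 0].mean()
    return float(abs(rate_group1 - rate_group0))


def combined_loss(y_true, y_prob, sensitive, lam=1.0):
    """Accuracy loss plus lam times the fairness penalty (lam is a hypothetical weight)."""
    return binary_cross_entropy(y_true, y_prob) + lam * demographic_parity_gap(y_prob, sensitive)


# Toy data: four samples, two per sensitive group.
y_true = np.array([1, 0, 1, 0])
sensitive = np.array([1, 1, 0, 0])

# Predictions that favor group 1 versus predictions that treat both groups alike.
biased_prob = np.array([0.9, 0.6, 0.4, 0.1])
fair_prob = np.array([0.8, 0.2, 0.8, 0.2])

print("biased:", combined_loss(y_true, biased_prob, sensitive))
print("fair:  ", combined_loss(y_true, fair_prob, sensitive))
```

On this toy data the predictions that treat both groups alike receive the lower combined loss. In a generative adversarial setting, a term of this kind would presumably be added to the generator objective, but how the paper weights and optimizes the two terms is not stated in the abstract.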

Key words: Adversarial training, Data debiasing, Deep learning, Generative adversarial networks

CLC Number: TP391