Privacy Preservation of Crowdsourcing Content Based on Adversarial Generative Networks

doi:10.11896/jsjkx.250200123

Abstract: Crowdsourcing is an emerging alternative to outsourcing that aims to harness the wisdom of the crowd. Because it is cheap and efficient, it is widely regarded as an ideal solution for processing tasks over massive data, such as data labeling and model training. In crowdsourcing, however, task owners who want to benefit from the wisdom of unknown workers must first make their private data publicly accessible without restriction, which is unsafe given the risk of information leakage. To address this issue, we propose PrivCS, a crowdsourcing model that protects content privacy. The essential idea of PrivCS is to synthesize new data from the task owners' private data and to publish the synthetic data to the workers in place of the real data. The tool we adopt to synthesize the new data is the generative adversarial network (GAN). Existing studies have shown that GANs are privacy-preserving, so PrivCS inherits this ability from the GAN. We also study the theoretical performance of PrivCS; our analysis shows that the outputs of PrivCS are comparable to those derived from the real data for both data labeling and model training tasks. In addition, our experimental results support the theoretical findings.

Key words: Crowdsourcing, Privacy preserving, Generative adversarial networks

CLC Number: TP181

HUANG Xiaoyu, JIANG Hemeng, LING Jiaming. Privacy Preservation of Crowdsourcing Content Based on Adversarial Generative Networks[J].Computer Science, 2025, 52(6A): 250200123-7.
