Computer Science, 2025, Vol. 52, Issue 6A: 250200123-7. doi: 10.11896/jsjkx.250200123
• Information Security •
黄晓宇1,2, 姜贺萌1, 凌嘉铭1
HUANG Xiaoyu1,2, JIANG Hemeng1, LING Jiaming1
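Abstract: Crowdsourcing, an emerging model of task outsourcing, is widely regarded as an efficient and economical way to meet the demand for labeling and analyzing massive data. For the holder of a crowdsourcing task (the task owner), however, the crowdsourcing mechanism lets crowd workers access the owner's private data without restriction, which carries a substantial risk of privacy leakage. To address this problem, the paper proposes PrivCS, a crowdsourcing model that protects the privacy of task content. The core design idea of PrivCS is to publish "synthetic data" generated by a generative adversarial network (GAN) to crowd workers in place of the original real data. The content privacy protection offered by PrivCS is guaranteed by the theoretical properties of GANs. The paper also proves that PrivCS achieves utility close to that of conventional crowdsourcing, both in label extraction and in model training. Experimental results support these theoretical claims.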
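For background, the original GAN formulation by Goodfellow et al. trains a generator G against a discriminator D through the minimax objective below. The abstract does not say which GAN variant or loss PrivCS adopts, so this is the standard textbook form under that assumption, not a statement of the authors' exact method; the symbols p_data (distribution of the real data) and p_z (noise prior) are the usual notation and do not come from the abstract.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

At the global optimum of this game, the generator's distribution equals p_data. A distributional property of this kind is what would allow synthetic samples to stand in for real ones in labeling and training; how PrivCS relies on it is established in the paper itself.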