Computer Science, 2020, Vol. 47, Issue (10): 41-47. doi: 10.11896/jsjkx.200700070

• Crowd Sensing Computing •

Reinforcement Learning Based Win-Win Game for Mobile Crowdsensing

CAI Wei, BAI Guang-wei, SHEN Hang, CHENG Zhao-wei, ZHANG Hui-li

  1. College of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China
  • Received: 2020-07-12 Revised: 2020-08-01 Online: 2020-10-15 Published: 2020-10-16
  • Corresponding author: SHEN Hang (hshen@njtech.edu.cn)
  • About author: CAI Wei (caiwei913243@163.com), born in 1997, postgraduate. His main research interests include privacy protection, mobile crowdsensing and reinforcement learning.
    SHEN Hang, born in 1984, Ph.D., associate professor, master supervisor, is a member of China Computer Federation. His main research interests include cyber security, privacy protection and 5G networks.
  • Supported by:
    National Natural Science Foundation of China (61502230), Natural Science Foundation of Jiangsu Province (BK20150960), Jiangsu Province "Six Talent Peaks" High-level Talent Project (RJFW-020) and State Key Laboratory of New Technology of Computer Software (Nanjing University) Project (KFKT2017B21)

Abstract: A mobile crowdsensing system should offer personalized protection of users' location privacy in order to attract more users to participate in tasks. However, because malicious attackers exist, stronger privacy protection on the user side degrades location availability and reduces the efficiency of task allocation. To address this problem, this paper proposes a reinforcement learning based game mechanism in which users and the platform both win. First, two virtual entities hosted by a trusted third party simulate the interaction between users and the platform: one simulates users choosing a privacy budget and adding noise to their location data, and the other simulates the platform allocating tasks according to the users' perturbed locations. Then, the interaction is modeled as a game whose two players are the two virtual entities, and its equilibrium point is derived. Finally, a reinforcement learning method repeatedly tries different location perturbation strategies and outputs an optimal location perturbation scheme. Experimental results show that the mechanism optimizes the task allocation utility while improving the users' overall utility as much as possible, so that users and the platform achieve a win-win outcome.

Key words: Mobile crowdsensing, Task allocation, Personalized privacy-preserving, Game theory, Reinforcement learning
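
The loop described in the abstract (choose a privacy budget, perturb locations, allocate a task, learn from the outcome) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the utility functions, the state space, or the noise mechanism, so everything below is an assumption made for illustration. In particular, the per-coordinate Laplace noise with scale 1/epsilon, the privacy gain 1/epsilon, the nearest-reported-location allocation rule, and the names `perturb`, `play_round` and `learn_budget` are all hypothetical, and the learner is a stateless (bandit-style) Q-learning update over a small discrete set of budgets.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(location, epsilon, rng):
    """User side: add per-coordinate Laplace noise with scale 1/epsilon (assumed)."""
    x, y = location
    scale = 1.0 / epsilon
    return (x + laplace_noise(scale, rng), y + laplace_noise(scale, rng))


def play_round(epsilon, users, task, privacy_weight, rng):
    """One simulated interaction; returns a hypothetical joint reward."""
    reported = [perturb(u, epsilon, rng) for u in users]
    # Platform side: give the task to the user whose reported location is nearest.
    chosen = min(range(len(users)), key=lambda i: math.dist(reported[i], task))
    # The real allocation cost is the chosen user's true travel distance.
    cost = math.dist(users[chosen], task)
    # Assumed privacy gain: a smaller budget means stronger privacy.
    return privacy_weight * (1.0 / epsilon) - cost


def learn_budget(budgets, users, task, episodes=2000, alpha=0.1,
                 explore=0.1, privacy_weight=1.0, seed=0):
    """Epsilon-greedy, stateless Q-learning over a discrete set of privacy budgets."""
    rng = random.Random(seed)
    q = {b: 0.0 for b in budgets}
    for _ in range(episodes):
        if rng.random() < explore:
            budget = rng.choice(budgets)      # explore a random budget
        else:
            budget = max(q, key=q.get)        # exploit the best estimate so far
        reward = play_round(budget, users, task, privacy_weight, rng)
        q[budget] += alpha * (reward - q[budget])
    return max(q, key=q.get), q


if __name__ == "__main__":
    users = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
    best, q_values = learn_budget([0.1, 0.5, 1.0, 5.0], users, task=(1.0, 1.0))
    print("learned budget:", best)
    print("value estimates:", q_values)
```

The `privacy_weight` parameter in this sketch stands in for the trade-off the abstract describes: at 0 only allocation cost matters and large budgets (little noise) are favored, while larger weights push the learner toward smaller budgets.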

CLC Number:

  • TP393