Computer Science ›› 2022, Vol. 49 ›› Issue (3): 308-312. doi: 10.11896/jsjkx.210300231

• Information Security •

User Trajectory Identification Model via Attention Mechanism

LI Hao, CAO Shu-yu, CHEN Ya-qing, ZHANG Min

  1. Laboratory of Trusted Computing and Information Assurance (TCA), Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2021-03-24 Revised: 2021-06-07 Online: 2022-03-15 Published: 2022-03-15
  • About author: LI Hao, born in 1983, Ph.D., associate professor, master’s supervisor, is a member of the China Computer Federation and the Youth Innovation Promotion Association. His main research interests include data privacy and access control.
  • Supported by:
    National Key R&D Program of China (2018YFC0809300) and Youth Innovation Promotion Association CAS (2019113).

Abstract: In recent years, location-based service applications have become increasingly popular. They bring convenience to people’s daily lives but also pose a serious threat to personal privacy. Existing research shows that, given a large amount of historical trajectory data, an attacker can identify the user who generated a trajectory in an anonymized trajectory dataset. However, these studies all face two problems: data sparsity and poor data quality. Data sparsity means that a user’s trajectories are often confined to a few local areas, and that no corpus exists on the scale of those available in natural language processing. Poor data quality means that the location points in a trajectory often suffer from a low sampling rate and heavy noise. To address these two problems, this paper proposes a user trajectory identification model based on the attention mechanism, consisting of a location embedding module, an attention-based transitional feature encoder, and a trajectory-user identification module. The location embedding module embeds the spatial relations among the location points of the raw trajectory into location vectors; the attention-based transitional feature encoder extracts the transition dependencies between location points within a trajectory and generates a representation vector for the trajectory; and the trajectory-user identification module predicts the user identity from that representation vector. Finally, experiments are carried out on the Gowalla and Geolife datasets. The results show that the proposed model effectively alleviates the problems caused by data sparsity and poor data quality, and achieves better identification accuracy than existing methods.

Key words: Attention mechanism, Deep learning, Recurrent neural network, Trajectory privacy, Trajectory-user identification
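
The three-module structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer sizes, training procedure, or attention formula, so everything below is an assumption made for illustration. In particular, the class name `TrajectoryIdentifier`, the use of integer location IDs, the single learned query vector with scaled dot-product scoring, the embedding dimension, and the random (untrained) weights are all hypothetical, and the recurrent component listed in the key words is omitted for brevity.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TrajectoryIdentifier:
    """Hypothetical pipeline: embed locations -> attention-pool -> classify user."""

    def __init__(self, num_locations, num_users, dim=8, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # Module 1 (assumed form): one vector per discrete location ID.
        self.location_embeddings = random_matrix(num_locations, dim, rng)
        # Module 2 (assumed form): a query vector that scores each trajectory point.
        self.query = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        # Module 3 (assumed form): linear layer from trajectory vector to user logits.
        self.user_weights = random_matrix(num_users, dim, rng)

    def attention_weights(self, trajectory):
        """Scaled dot-product scores of each point against the query, normalized."""
        vectors = [self.location_embeddings[loc] for loc in trajectory]
        scale = math.sqrt(self.dim)
        return softmax([dot(self.query, v) / scale for v in vectors])

    def encode(self, trajectory):
        """Attention-weighted sum of location vectors: the trajectory representation."""
        weights = self.attention_weights(trajectory)
        vectors = [self.location_embeddings[loc] for loc in trajectory]
        return [sum(w * v[i] for w, v in zip(weights, vectors))
                for i in range(self.dim)]

    def predict_proba(self, trajectory):
        """Probability distribution over candidate users for one trajectory."""
        representation = self.encode(trajectory)
        return softmax([dot(row, representation) for row in self.user_weights])

    def predict(self, trajectory):
        """Index of the most probable user."""
        probs = self.predict_proba(trajectory)
        return max(range(len(probs)), key=probs.__getitem__)


model = TrajectoryIdentifier(num_locations=50, num_users=4)
trajectory = [3, 17, 17, 42, 8]  # made-up sequence of location IDs
probs = model.predict_proba(trajectory)
print("predicted user index:", model.predict(trajectory))
```

Because the weights are random, the predicted user carries no meaning here; the sketch only shows how a variable-length trajectory is reduced to a fixed-size vector before classification. It assumes a non-empty trajectory whose IDs are all below `num_locations`.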

CLC Number:

  • TP309