Computer Science ›› 2023, Vol. 50 ›› Issue (3): 114-120. doi: 10.11896/jsjkx.211200287

• Database & Big Data & Data Science •

Cross-network User Identification Based on Multiple Spatio-Temporal Trajectory Features

LIU Hong1, ZHU Yan1, LI Chunping2

  1 School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
  2 School of Software, Tsinghua University, Beijing 100091, China
  • Received: 2021-12-27 Revised: 2022-04-14 Online: 2023-03-15 Published: 2023-03-15
  • Corresponding author: ZHU Yan (yzhu@swjtu.edu.cn)
  • About author: LIU Hong (leuh1997@my.swjtu.edu.cn), born in 1995, postgraduate. Her main research interests include cross-network user identification and data mining.
    ZHU Yan, born in 1965, Ph.D., professor, Ph.D. co-supervisor, is a member of China Computer Federation. Her main research interests include Web data mining, social networking, privacy preserving, deep learning and AI.
  • Supported by:
    Sichuan Science and Technology Project (2019YFSY0032).

Abstract: With the rapid growth of location-based social networks (LBSNs), user mobility data has become far richer, which has stimulated research on user identification from spatio-temporal data. User identification across LBSNs focuses on learning the correlation between spatio-temporal sequences from different platforms, with the aim of discovering the accounts that the same user has registered on those platforms. To address the data sparsity, low data quality, and spatio-temporal mismatch that limit existing work, this paper proposes UI-STDD, an identification algorithm that fuses bidirectional spatio-temporal dependence with spatio-temporal distribution. The algorithm consists of three main modules: the spatio-temporal sequence module characterizes user movement patterns with a bidirectional long short-term memory (BiLSTM) network combined with pairwise attention; the time preference module defines personalized user patterns at coarse and fine granularities; and the spatial location module mines the local and global information of location points to quantify spatial proximity. Based on the trajectory-pair features produced by these modules, UI-STDD uses a multi-layer feedforward network to decide whether two accounts on different networks correspond to the same person in real life. To verify the feasibility and effectiveness of UI-STDD, experiments are conducted on three publicly available datasets. The results show that the proposed algorithm improves the user identification rate on spatio-temporal data, with an average F1 score more than 10% higher than that of the best baseline.

Key words: User identification, Spatio-temporal data, Mobility pattern, Time preference, Long short-term memory network

CLC Number: 

  • TP301
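
The time preference and spatial location modules named in the abstract can be illustrated with a small hand-written sketch. This is not the UI-STDD implementation: the abstract does not define the granularities or the proximity measure, so the choices here (day-of-week as the coarse granularity, hour-of-day as the fine one, cosine similarity between histograms, mean nearest-neighbour distance as the local signal, and centroid distance as the global signal) are all assumptions, and every function name is hypothetical.

```python
import math
from datetime import datetime


def histogram(values, bins):
    """Normalized histogram of integer values drawn from range(bins)."""
    counts = [0.0] * bins
    for v in values:
        counts[v] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def time_preference_similarity(times_a, times_b):
    """Assumed coarse (day-of-week) and fine (hour-of-day) similarity of two accounts."""
    coarse = cosine(histogram([t.weekday() for t in times_a], 7),
                    histogram([t.weekday() for t in times_b], 7))
    fine = cosine(histogram([t.hour for t in times_a], 24),
                  histogram([t.hour for t in times_b], 24))
    return coarse, fine


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def centroid(points):
    """Arithmetic mean of (lat, lon) pairs; a rough approximation for nearby points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def spatial_proximity(points_a, points_b):
    """Assumed local (mean nearest-neighbour km) and global (centroid km) distances."""
    local = sum(min(haversine_km(p, q) for q in points_b)
                for p in points_a) / len(points_a)
    return local, haversine_km(centroid(points_a), centroid(points_b))
```

In a pipeline of the kind the abstract outlines, scores like these would presumably be combined with the sequence-module output into one trajectory-pair feature vector for the feedforward classifier; how UI-STDD actually builds and fuses its features is not stated on this page.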