Computer Science ›› 2025, Vol. 52 ›› Issue (2): 99-106.doi: 10.11896/jsjkx.240600031

• Database & Big Data & Data Science •

Check-in Trajectory and User Linking Based on Natural Language Augmentation

WANG Tianyi, LIN Youfang, GONG Letian, CHEN Wei, GUO Shengnan, WAN Huaiyu   

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
    Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing 100044, China
  • Received:2024-06-04 Revised:2024-08-29 Online:2025-02-15 Published:2025-02-17
  • About author: WANG Tianyi, born in 2000, postgraduate. His main research interests include spatio-temporal data mining and deep learning.
    WAN Huaiyu, born in 1981, Ph.D., professor, Ph.D. supervisor, is a member of CCF (No.17732D). His main research interests include spatio-temporal data mining, information extraction and social network mining.
  • Supported by:
    National Natural Science Foundation of China(62372031).

Abstract: With the rapid development of positioning technology and sensors, user movement trajectory data is becoming increasingly abundant but remains scattered across different platforms. To fully utilize these data and accurately reflect users' real behavior, the study of trajectory user linking has become crucial. This task aims to accurately associate user identities with massive check-in trajectory data. In recent years, researchers have used methods such as recurrent neural networks and attention mechanisms to mine trajectory data in depth. However, current methods face two major challenges when processing user check-in sequences. First, the limited spatiotemporal features in check-in data are insufficient to comprehensively model check-in point information from both subjective and objective perspectives. Second, the topic of a user check-in sequence affects how the sequence is understood and modeled. In response to these two challenges, this paper proposes a trajectory user linking model based on natural language augmentation, named NLATUL. The model designs a set of natural language templates and soft prompt tokens to describe the check-in sequence, uses a language model to understand the subjective intention behind each check-in point, and integrates the user's spatiotemporal status, providing a new representation that models check-in points from both subjective and objective aspects. On this basis, the paper infers the topic of the check-in sequence through prompt learning and performs bidirectional encoding on the trajectory represented by the modeled check-in points. Combining the sequence topic with the sequence encoding yields a more accurate understanding of the user's check-in sequence, so that trajectories can be linked to users more effectively. Experimental results on two check-in datasets show that the proposed method links check-in trajectories to their corresponding users more accurately.
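
The template idea in the abstract can be illustrated with a minimal sketch. It is not the NLATUL implementation: the `CheckIn` fields, the sentence wording, and the `[MASK]` topic slot are all assumptions made for illustration, and the learnable soft prompt tokens mentioned in the abstract are not modeled here (a fixed cloze-style prompt stands in for them).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CheckIn:
    """One check-in record; all field names are hypothetical."""
    timestamp: datetime
    category: str      # POI category, e.g. "coffee shop"
    latitude: float
    longitude: float


def describe_check_in(c: CheckIn) -> str:
    """Render a single check-in as an English sentence (illustrative template)."""
    when = c.timestamp.strftime("%A at %H:%M")
    return (f"On {when}, the user visited a {c.category} "
            f"near ({c.latitude:.3f}, {c.longitude:.3f}).")


def describe_trajectory(check_ins: list[CheckIn], mask_token: str = "[MASK]") -> str:
    """Join the per-check-in sentences and append a cloze-style topic prompt.

    The mask token is an assumed placeholder that a masked language model
    could fill in; it is a stand-in for prompt-based topic inference.
    """
    body = " ".join(describe_check_in(c) for c in check_ins)
    return f"{body} The topic of this trajectory is {mask_token}."


if __name__ == "__main__":
    trajectory = [
        CheckIn(datetime(2024, 6, 3, 8, 30), "coffee shop", 39.9512, 116.3401),
        CheckIn(datetime(2024, 6, 3, 12, 5), "noodle restaurant", 39.9488, 116.3467),
    ]
    print(describe_trajectory(trajectory))
```

The resulting text could then be passed to a pretrained language model, whose output embeddings would serve as check-in representations for a downstream sequence encoder; that step is omitted because the abstract does not specify the model or its interface.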

Key words: Trajectory user link, Check-in sequence learning, Spatiotemporal data mining, Language model, Prompt learning

CLC Number: 

  • TP391