Computer Science ›› 2021, Vol. 48 ›› Issue (9): 95-102. doi: 10.11896/jsjkx.200700097

Special topic: Intelligent Data Governance Technologies and Systems


Railway Passenger Co-travel Prediction Based on Association Analysis

LI Si-ying1, XU Yang1, WANG Xin2, ZHAO Ruo-cheng3

  1. 1 School of Economics and Management, Southwest Jiaotong University, Chengdu 610031, China
    2 School of Computer Science, Southwest Petroleum University, Chengdu 610500, China
    3 School of Business, Economics and Informatics, Birkbeck, University of London, London WC1E 7HX, UK
  • Received: 2020-07-14 Revised: 2020-11-04 Online: 2021-09-15 Published: 2021-09-10
  • Corresponding author: WANG Xin (xinwang.ed@gmail.com)
  • About author: 1049810415@qq.com
    LI Si-ying, born in 1996, postgraduate, is a member of China Computer Federation. Her main research interests include data mining.
    WANG Xin, born in 1981, Ph.D., professor, Ph.D. supervisor, is a member of ACM, IEEE, CCF and CAAI. His main research interests include knowledge discovery in databases, artificial intelligence, machine learning and data mining.

Abstract: With the rapid development of transportation technology, railways have become one of the main choices for people traveling on business, on vacation, or to visit relatives. At the same time, co-travel, i.e., passengers traveling together, has become increasingly common. Based on co-travel relationships, one can construct a co-travel network, in which each node represents a passenger and each edge records the co-travel frequency between the two passengers it connects. Predicting potential links in this network helps provide personalized services and products. In light of this, this paper proposes a novel approach to discovering potential co-travel relationships in such a network. Specifically, we first extend traditional graph pattern association rules and propose two types of co-travel graph pattern association rules, which are used to predict new co-travel relationships and future co-travel frequency, respectively. We then decompose the rule mining problem into three sub-problems, i.e., frequent co-travel pattern mining, rule generation, and association analysis, and develop parallel and centralized algorithms for them. Extensive experiments on large real-life datasets show that the approach predicts potential co-travel relationships efficiently and accurately: both types of rules achieve prediction accuracy above 50%, substantially higher than traditional methods (e.g., Jaccard, with an accuracy of 24%).

Key words: Association analysis, Co-travel network, Co-travel pattern, Co-travel prediction, Graph pattern matching
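
To make the network construction and the Jaccard baseline mentioned in the abstract concrete, the following is a minimal Python sketch. It is not the authors' rule-based method: it only illustrates a co-travel network (passengers as nodes, co-travel counts as edge weights) and the standard Jaccard neighbor-overlap score used as a traditional link predictor. The function names, the trip format, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_co_travel_network(trips):
    """Count how often each pair of passengers appears on the same trip.

    `trips` is an iterable of passenger-ID lists (a hypothetical input format).
    Returns a Counter mapping frozenset({a, b}) to the co-travel frequency.
    """
    weights = Counter()
    for passengers in trips:
        for a, b in combinations(sorted(set(passengers)), 2):
            weights[frozenset((a, b))] += 1
    return weights


def neighbors(weights):
    """Build an adjacency map {passenger: set of co-travelers} from edge weights."""
    adj = {}
    for pair in weights:
        a, b = tuple(pair)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def jaccard(adj, u, v):
    """Jaccard score: shared co-travelers divided by all co-travelers of u or v."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


# Toy data (illustrative only): each inner list is one trip.
trips = [
    ["p1", "p2"],
    ["p1", "p2", "p3"],
    ["p3", "p4"],
    ["p2", "p4"],
]
weights = build_co_travel_network(trips)
adj = neighbors(weights)
score = jaccard(adj, "p1", "p4")  # p1 and p4 have never traveled together
```

In this toy network, p1 and p4 share no edge but have identical sets of co-travelers, so a Jaccard-based predictor would rank them as a likely future co-travel pair. The approach summarized in the abstract instead mines frequent co-travel patterns and derives association rules from them.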

CLC Number: 

  • F532.8