Computer Science, 2021, Vol. 48, Issue 9: 95-102. doi: 10.11896/jsjkx.200700097
Special topic: Intelligent Data Governance Technologies and Systems
LI Si-ying (李思颖)1, XU Yang (徐杨)1, WANG Xin (王欣)2, ZHAO Ruo-cheng (赵若成)3
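Abstract: With the rapid development of transportation technology, rail has become one of the main modes of travel that people choose for business trips, vacations, and family visits. At the same time, it has become increasingly common for passengers to travel together (hereafter, co-travel). A co-travel network can be built from the co-travel relationships among passengers, and predicting the potential links in this network helps in providing personalized services and products. To this end, this paper proposes a novel method for discovering potential co-travel relationships in a passenger co-travel network. First, traditional graph pattern association rules are extended into two classes of "co-travel graph pattern association rules", which predict new co-travel relationships and future co-travel frequency, respectively. Then, the rule-mining problem is decomposed into three subproblems (frequent co-travel pattern mining, rule generation, and association analysis), and effective distributed and centralized algorithms are designed for them. Tests on a large-scale real-world dataset show that the proposed method predicts potential co-travel relationships in the passenger co-travel network efficiently and accurately. Both classes of rules achieve a prediction accuracy above 50%, far higher than traditional methods (for example, Jaccard achieves a prediction accuracy of 24%).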
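The abstract names Jaccard as the traditional baseline but does not show how it scores a candidate link. The following is a minimal sketch of that kind of baseline, not the authors' implementation. It assumes the co-travel network is an undirected graph stored as adjacency sets; the passenger labels and the function names `jaccard_score` and `rank_candidate_links` are hypothetical.

```python
from itertools import combinations


def jaccard_score(adj, u, v):
    """Jaccard coefficient of the neighbor sets of passengers u and v."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    if not union:
        # Two isolated passengers share nothing; define the score as 0.
        return 0.0
    return len(nu & nv) / len(union)


def rank_candidate_links(adj):
    """Score every pair that is not yet linked, highest Jaccard first."""
    scores = []
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            scores.append(((u, v), jaccard_score(adj, u, v)))
    # Sort by descending score, breaking ties by pair label.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical toy network: an edge means two passengers
# have traveled together at least once.
co_travel = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
    "E": set(),
}
ranking = rank_candidate_links(co_travel)
```

In this toy network the unlinked pair A and D share both of their co-travelers, so it ranks first. A neighborhood-overlap score of this kind uses only the immediate neighbors of the two passengers, which is one plausible reason rules based on richer graph patterns can outperform it.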