Computer Science, 2022, Vol. 49, Issue 11A: 211100268-10. doi: 10.11896/jsjkx.211100268
张康威1, 张敬伟1, 杨青2, 胡晓丽1, 单美静3
ZHANG Kang-wei1, ZHANG Jing-wei1, YANG Qing2, HU Xiao-li1, SHAN Mei-jing3
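Abstract: With the widespread use of positioning technology, massive volumes of spatio-temporal data are now collected in the form of trajectory streams, and mining useful information from them has attracted growing research attention. Mining companion patterns from a trajectory stream means discovering groups of objects that exhibit highly similar behavior over the same period of time, which is essential for real-time applications in traffic management and recommender systems. However, existing studies achieve only second-level response and struggle to respond within milliseconds on large-scale trajectory data. This paper therefore proposes DCPFS, a distributed framework for mining trajectory streams. Its main modules are as follows: 1) to reduce the heavy time cost that the density-based clustering algorithm DBSCAN incurs on large-scale data, a data partitioning strategy and a cluster merging algorithm are designed on top of a distributed deployment scheme, ensuring both the parallelism and the accuracy of clustering; 2) because real-world trajectory movement is directional, a direction dimension is added at the clustering stage to reduce redundant clusters; 3) since the pattern mining stage involves intersecting clustering results, a parallel intersection algorithm is designed to improve mining efficiency; 4) DCPFS is implemented on Flink, a distributed big-data stream processing platform. Experiments on the Chengdu taxi GPS dataset and the GeoLife dataset verify that the proposed framework responds faster than the baseline methods.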
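The intersection step mentioned in module 3 can be illustrated with a small sequential sketch. This is not the DCPFS algorithm itself, whose details the abstract does not give; it is a generic companion-mining loop under stated assumptions. The names `advance_candidates`, `mine_companions`, `min_size`, and `min_duration` are hypothetical, and each snapshot is assumed to be a list of clusters already produced by some clustering step.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

Group = FrozenSet[str]  # a set of moving-object IDs


def advance_candidates(
    candidates: Dict[Group, int],
    clusters: List[Group],
    min_size: int,
) -> Dict[Group, int]:
    """Intersect every running candidate with every cluster of a new snapshot.

    `candidates` maps a group of objects to the number of consecutive
    snapshots in which the group has stayed together so far.
    """
    updated: Dict[Group, int] = {}
    for group, duration in candidates.items():
        for cluster in clusters:
            common = group & cluster
            if len(common) >= min_size:
                # Keep the longest duration if two paths yield the same group.
                updated[common] = max(updated.get(common, 0), duration + 1)
    # Every sufficiently large cluster also starts a fresh candidate.
    for cluster in clusters:
        if len(cluster) >= min_size:
            updated.setdefault(cluster, 1)
    return updated


def mine_companions(
    snapshots: Iterable[List[Group]],
    min_size: int,
    min_duration: int,
) -> Set[Group]:
    """Return groups that stay clustered together for `min_duration` snapshots."""
    candidates: Dict[Group, int] = {}
    found: Set[Group] = set()
    for clusters in snapshots:
        candidates = advance_candidates(candidates, clusters, min_size)
        found.update(g for g, d in candidates.items() if d >= min_duration)
    return found
```

Each candidate-cluster intersection in `advance_candidates` is independent of the others, which is what makes this stage a natural target for parallel execution; how DCPFS actually distributes the work on Flink is not specified in the abstract.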