Computer Science ›› 2025, Vol. 52 ›› Issue (8): 136-145.doi: 10.11896/jsjkx.240600075

• Database & Big Data 2 Data Science • Previous Articles     Next Articles

Douglas-Peuker Algorithm Based Learned Index Structure for Road Network Trajectroy Data

MIAO Zhuqing1, HAN Jingyu1,2, LI Caiyun1, WANG Yanzhi1, MAO Yi1, ZHANG Yiting1   

  1. 1 College of Computer Science,Nanjing University of Posts and Telecommunications,Nanjing 210000,China
    2 Jiangsu Key Laboratory of Big Data Security and Intelligent Processing,Nanjing University of Posts and Telecommunications,Nanjing 210000,China
  • Received:2024-06-11 Revised:2024-09-04 Online:2025-08-15 Published:2025-08-08
  • About author:MIAO Zhuqing,born in 1997,postgra-duate,is a student member of CCF(No.O8027G).His main research interests include machine learning and spatio-temporal database.
    HAN Jingyu,born in 1976,Ph.D,professor,is a member of CCF(No.E20-0012101M).His main research in-terests include biomedical data proces-sing,machine learning and spatio-temporal database.
  • Supported by:
    Key Research and Development Project of Jiangsu Province(BE2022065-5).

Abstract: In recent years,the technology of location-based services has developed rapidly,which has produced a large amount of road network trajectory data.As a type of road network trajectory query,path range query is the basis to support other query types.This paper proposes a Douglas-Peuker based learned index structure(DPLI) based on Douglas-Purker algorithm,in order to efficiently index massive road network trajectory data and provide accurate path range query service.Firstly,the trajectory data is divided into multiple trajectory segments,then the midpoint of the trajectory segment is taken as the representation of the tra-jectory data,and the mapping function is used to map the sequence of one-dimensional mapping values,and then the trajectory data is divided into multiple data fragments according to the number of key values.In the fragment,the first and last data are formed into a line segment,and then the fitting error of the distance line segment of the remaining data points is calculated.The data points exceeding the error threshold are taken as the new line segment endpoints,and the original line segment is recursively divided until the fitting error of all data points is less than the threshold,so as to fit the piecewise linear function.The results show that compared with the traditional index methods,DPLI has faster construction efficiency and disk access efficiency.Compared to learning indexing methods,DPLI maintains the advantage of build efficiency and achieves a 100% query recall rate.

Key words: Location services, Road network trajectory data, Learned index, Range query, Douglas-Purker algorithm

CLC Number: 

  • TP392
[1]ZHANG S M,CAI P,LI C P,et al.High-dimensional Learned Index Based on Space Division and Dimension Reduction [J].Journal of Software,2023,34(5):2413-2426.
[2]SANDU POPA I,ZEITOUNI K,ORIA V,et al.Indexing in-network trajectory flows [J].The VLDB Journal,2011(20):643-669.
[3]MORRIS B T,TRIVEDI M M.Trajectory Learning for Activity Understanding:Unsupervised,Multilevel,and Long-Term Adaptive Approach [J].IEEE Transactions on Software Engineering,2011,33(11):2287-2301.
[4]TAO Y,PAPADIAS D.MV3R-tree:A spatio-temporal access method for timestamp and interval queries[C]//Proceedings of the 27th International Conference on Very Large Data Bases.San Francisco,CA:Morgan Kaufmann Publishers Inc.,2001:431-440.
[5]GUTTMAN A.R-trees:A dynamic index structure for sparial searching [J].ACM Sigmod Record,1984,14(2):47-57.
[6]BECKMANN N,KRIEGEL H P,SCHNEIDER R,et al.TheR*-tree:an efficient and robust access method for points and rectangles[C]//International Conference on Management of Data.New York:ACM,1990:322-331.
[7]SINGH M,ZHU Q,JAGADISH H V.SWST:A Disk Based Index for Sliding Window Spatio-Temporal Data[C]//2012 IEEE 28th International Conference on Data Engineering.IEEE,2012:342-353.
[8]ROBINSON J T.The K-D-B-tree:a search structure for large multidimensional dynamic indexes[C]//Proceedings of the 1981 ACM SIGMOD International Conference on Management of Data.New York:ACM,1981:10-18.
[9]PFOSER D,JENSEN C S.Indexing of network constrainedmoving objects[C]//CIKM03:12th International Conference on Information and Knowledge Management.New York:ACM,2003:25-32.
[10]ZHANG R,QI J,STRADLING M,et al.Towards a Painless Index for Spatial Objects [J].ACM Transactions on Database Systems,2014,39(3):1-42.
[11]LI P,LU H,ZHENG Q,et al.LISA:A Learned Index Structure for Spatial Data[C]//International Conference on Management of Data.New York:ACM,2020:2119-2133.
[12]LI R,RUAN S,BAO J,et al.Querying Massive Trajectories by Path on the Cloud[C]//the 25th ACM SIGSPATIAL International Conference.New York:ACM,2017:1-4.
[13]KRASKA T,BEUTEL A,CHI E H,et al.The Case for Learned Index Structures[C]// Proceedings of the 2018 International Conference on Management of Data.New York:ACM,2018:489-504.
[14]DING J,MINHAS U F,YU J,et al.ALEX:An UpdatableAdaptive Learned Index[C]//International Conference on Management of Data.New York:ACM,2020:969-984.
[15]GALAKAATOS A,MARKOVITCH M,BINNIG C,et al.FITing-Tree:A Data-aware Index Structure[C]//Proceedings of the 2019 International Conference on Management of Data.New York:ACM,2019:1189-1106.
[16]WANG Y T,WANG Y S,YUAN Y.Survey of Learned Index[J].Computer Science,2023,50(1):1-8.
[17]WU F,HAN J Y,LIU Y,et al.Trajectory Index Based on Piecewise Linear Regression Tree[J].Journal of Chinese Computer Systems,2022,43(6):1245-1253.
[18]ZHANG Z,JIN P Q,XIE X K.Learning index:Current situation and research prospects [J].Journal of Software,2021,32(4):1129-1150.
[19]KRASKA T,ALIZADEH M.SageDB:An Instance-Optimized Data Analytics System [J].Proceedings of the VLDB Endowment,2019,15(13):4062-4078.
[20]WANG H,FU X,XU J,et al.Learned Index for Spatial Queries[C]//2019 20th IEEE International Conference on Mobile Data Management(MDM).IEEE,2019:569-574.
[21]QI J,LIU G,JENSEN C S,et al.Effectively learning spatial indices [J].Proceedings of the VLDB Endowment,2020,13(12):2341-2354.
[22]DAVITKOVA A,MILCHEVSKI E,MICHEL S.The ML-Index:A Multidimensional,Learned Index for Point,Range,and Nearest-Neighbor Queries[C]//International Conference on Extending Database Technology.Copenhagen:Open Procee-dings.org,2020:407-410.
[23]ZHANG S,RAY S,LU R,et al.Efficient Learned Spatial Index With Interpolation Function Based Learned Model [J].IEEE Transactions on Big Data,2023,9(2):733-745.
[24]NEWSON P,KRUMM J.Hidden Markov map matchingthrough noise and sparseness[C]//Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems.New York:ACM,2009:336-343.
[25]CHRISTOPHER H.Introduction to Real Analysis[M].New York:Wiley,2019:33-86.
[26]WU S T,MARQUEZ M R G.A non-self-intersection Douglas-Peucker algorithm[C]//Computer Graphics and Image Proces-sing.IEEE,2003:60-66.
[1] XU Ruida, LI Yongkun, XU Yinlong. Performance Optimization of LSM-tree Based Key-Value Storage System Based on Fine-grained Cache andLearned Index [J]. Computer Science, 2025, 52(2): 33-41.
[2] CHEN Shanshan, GAO Jun, MA Zhenyu. GDLIN:A Learned Index By Gradient Descent [J]. Computer Science, 2023, 50(6A): 220600256-6.
[3] WANG Yitan, WANG Yishu, YUAN Ye. Survey of Learned Index [J]. Computer Science, 2023, 50(1): 1-8.
[4] XU Zhe, LIU Liang, QIN Xiao-lin, QIN Wei-meng. Distributed Spatial Keyword Query Processing Algorithm with Relational Attributes [J]. Computer Science, 2019, 46(6A): 402-406.
[5] GUO Shuai, LIU Liang and QIN Xiao-lin. Spatial Keyword Range Query with User Preferences Constraint [J]. Computer Science, 2018, 45(4): 182-189.
[6] DONG Tian-yang, SHANG Yue-hui, CHENG Qiang. Direction-aware Moving Object Range Query Algorithm in Road Network [J]. Computer Science, 2018, 45(11): 210-219.
[7] MA Yin-fang and ZHANG Lin. LBS Group Nearest Neighbor Query Method Based on Differential Privacy [J]. Computer Science, 2017, 44(Z6): 336-341.
[8] XU Jing, REN Kai-jun and LI Xiao-yong. Parallel Algorithm Design and Optimization of Range Query for Meteorological Data Retrieval [J]. Computer Science, 2017, 44(3): 42-47.
[9] LIU Huai-jin, CHEN Yong-hong, TIAN Hui, WANG Tian and CAI Yi-qiao. Privacy and Integrity Protection Range Query Processing in Two-tiered Wireless Sensor Networks [J]. Computer Science, 2016, 43(Z11): 393-397.
[10] LIU Xue-jun, CHEN Yu-feng and LI Bin. Mobile Location Privacy Protection Based on Untrusted Environment [J]. Computer Science, 2015, 42(2): 108-113.
[11] LI Yan-hong,HUANG Qun,JIANG Hong and LI Guo-hui. Research on Processing Continuous Spatial Keyword Range Queries in Road Networks [J]. Computer Science, 2014, 41(7): 232-235.
[12] WU Ling-kun TANG Yong WANG Peng SHU Ran (Department of Computer Science, Sun Yat-Sen University, Guangzhou 510275,China). [J]. Computer Science, 2009, 36(6): 133-137.
[13] FU Xiang-Hua, PENG Xiao-Gang, WANG Zhi-Qiang, MING Zhong (College of Information Engineering, Shenzhen University, Shenzhen 518060). [J]. Computer Science, 2007, 34(8): 69-71.
[14] SHI Zhi-Bin HUANG Hou-Kuan (School of Computer and IT, Beijing Jiaotong University, Beijing 100044). [J]. Computer Science, 2007, 34(12): 93-96.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!