Computer Science ›› 2019, Vol. 46 ›› Issue (12): 38-44.doi: 10.11896/jsjkx.190100240
• Big Data & Data Science • Previous Articles Next Articles
WANG Xiao-xia, SUN De-cai
CLC Number:
[1]SILVA Y N,PEARSON S S,CHON J,et al.Similarity Joins:Their implementation and interactions with other databaseopera-tors[J].Information Systems,2015,52(8/9):149-162.[2]YU M,LI G,DENG D,et al.String similarity search and join:a survey[J].Frontiers of Computer Science,2016,10(3):399-417.[3]PAGH R.Large-scale similarity joins with guarantees[C]//Proceedings of 18th International Conference on Database Theory.Brussels,Belgium:Schloss Dagstuhl,2015:15-24.[4]PANG J,YU G,XU J,et al.Similarity Joins on Massive Data Based on MapReduce Framework[J].Computer Science,2015,42(1):1-5,27.[5]LEVENSHTEIN V.Binary codes capable of correcting dele- tions,insertions,and reversals[J].Soviet Physics Doklady,1966,10(8):707-710.[6]LIN C,YU H,WENG W,et al.Large-Scale Similarity Join with Edit-Distance Constraints[C]//Proceedings of 19th InternationalConference on Database Systems for Advanced Applications.Bali Indonesia:Springer,2014:328-342.[7]DENG D,LI G L,HAO S,et al.MassJoin:A MapReduce-based Method for Scalable String Similarity Joins[C]//Proceedings of IEEE 30th International Conference on Data Engineering.New York,USA:IEEE,2014:340-351.[8]CHEN G,YANG K,CHEN L,et al.Metric similarity joins using MapReduce[J].IEEE Transactions on Knowledge and Data Engineering,2017,29(3):656-669.[9]WANG J Y,YANG X C,WANG B,et al.LS-Join:Local Similarity Join on String Collections[J].IEEE Transactions Know-ledge and Data Engineering,2017,29(9):1928-1942.[10]BAYARDO R J,MA Y,SRIKANT R.Scaling up all pairs similarity search[C]//Proceedings of 16th International Conference on World Wide Web.New York,USA:ACM,2007:131-140.[11]XIAO C,WANG W,LIN X.Ed-join:an efficient algorithm for similarity joins with edit distance constraints[J].Proceedings of the Vldb Endowment,2008,1(1):933-944.[12]XIAO C,WANG W,LIN X,et al.Efficient similarity joins for near-duplicate detection[J].ACM Transactions on Database Systems,2011,36(3):1-41.[13]LI G,DONG D,WANG J,et al.PASS-JOIN:A Partition-based Method for Similarity Joins[J].Proceedings of the Vldb Endowment,2011,5(3):253-264.[14]WANG W,QIN J,XIAO C,et al.Vchunkjoin:An efficient algorithm for edit similarity joins[J].IEEE Transactions on Know-ledge and Data Engineering,2013,25(8):1916-1929.[15]SHANG Z,LIU Y,LI G,et al.K-Join:Knowledge-Aware Similarity Join[J].IEEE Transactions on Knowledge and Data Engineering,2016,28(12):3293-3308.[16]VERNICA R,CAREY M J,LI C.Efficient parallel set-similarity joins using MapReduce[C]//Proceedings of 2010 ACM SIGMOD International Conference on Management of Data.Indianapolis,IN,USA:ACM,2010:495-506.[17]METWALLY A,FALOUTSOS C.V-SMART-Join:A Scalable MapReduce Framework for All-Pair Similarity Joins of Multisets and Vectors[J].Proceedings of the Vldb Endowment,2012,5(8):704-715.[18]AFRATI F N,SARMA A D,MENESTRINA D,et al.Fuzzy Joins Using MapReduce[C]//Proceedings of IEEE 28th International Conference on Data Engineering.Washington,DC:IEEE,2012:498-509.[19]MA Y,MENG X,WANG S.Parallel similarity joins on massive high-dimensional data using MapReduce[J].Concurrency and Computation:Practice and Experience,2016,28(1):166-183.[20]RONG C T,LIN C B,SILVA Y N,et al.Fast and Scalable Distributed Set Similarity Joins for Big Data Analytics[C]//Proceedings of IEEE 33rd International Conference on Data Engineering.New York,USA:IEEE,2017:1059-1070. |
[1] | CHEN Jing, WU Ling-ling. Mixed Attribute Feature Detection Method of Internet of Vehicles Big Datain Multi-source Heterogeneous Environment [J]. Computer Science, 2022, 49(8): 108-112. |
[2] | HE Qiang, YIN Zhen-yu, HUANG Min, WANG Xing-wei, WANG Yuan-tian, CUI Shuo, ZHAO Yong. Survey of Influence Analysis of Evolutionary Network Based on Big Data [J]. Computer Science, 2022, 49(8): 1-11. |
[3] | LIU Wei-ming, AN Ran, MAO Yi-min. Parallel Support Vector Machine Algorithm Based on Clustering and WOA [J]. Computer Science, 2022, 49(7): 64-72. |
[4] | WANG Mei-shan, YAO Lan, GAO Fu-xiang, XU Jun-can. Study on Differential Privacy Protection for Medical Set-Valued Data [J]. Computer Science, 2022, 49(4): 362-368. |
[5] | SUN Xuan, WANG Huan-xiao. Capability Building for Government Big Data Safety Protection:Discussions from Technologicaland Management Perspectives [J]. Computer Science, 2022, 49(4): 67-73. |
[6] | WANG Jun, WANG Xiu-lai, PANG Wei, ZHAO Hong-fei. Research on Big Data Governance for Science and Technology Forecast [J]. Computer Science, 2021, 48(9): 36-42. |
[7] | YU Yue-zhang, XIA Tian-yu, JING Yi-nan, HE Zhen-ying, WANG Xiao-yang. Smart Interactive Guide System for Big Data Analytics [J]. Computer Science, 2021, 48(9): 110-117. |
[8] | WANG Li-mei, ZHU Xu-guang, WANG De-jia, ZHANG Yong, XING Chun-xiao. Study on Judicial Data Classification Method Based on Natural Language Processing Technologies [J]. Computer Science, 2021, 48(8): 80-85. |
[9] | WANG Xue-cen, ZHANG Yu, LIU Ying-jie, YU Ge. Evaluation of Quality of Interaction in Online Learning Based on Representation Learning [J]. Computer Science, 2021, 48(2): 207-211. |
[10] | ZHANG Yuan-ming, YU Jia-rui, JIANG Jian-bo, LU Jia-wei, XIAO Gang. Intermediate Data Transmission Pipeline Optimization Mechanism for MapReduce Framework [J]. Computer Science, 2021, 48(2): 41-46. |
[11] | TENG Jian, TENG Fei, LI Tian-rui. Travel Demand Forecasting Based on 3D Convolution and LSTM Encoder-Decoder [J]. Computer Science, 2021, 48(12): 195-203. |
[12] | ZHANG Yu-long, WANG Qiang, CHEN Ming-kang, SUN Jing-tao. Survey of Intelligent Rain Removal Algorithms for Cloud-IoT Systems [J]. Computer Science, 2021, 48(12): 231-242. |
[13] | LIU Ya-chen, HUANG Xue-ying. Research on Creep Feature Extraction and Early Warning Algorithm Based on Satellite MonitoringSpatial-Temporal Big Data [J]. Computer Science, 2021, 48(11A): 258-264. |
[14] | ZHANG Guang-jun, ZHANG Xiang. Mechanism and Path of Optimizing Institution of Legislative Evaluation by Applying “Big Data+Blockchain” [J]. Computer Science, 2021, 48(10): 324-333. |
[15] | YE Ya-zhen, LIU Guo-hua, ZHU Yang-yong. Two-step Authorization Pattern of Data Product Circulation [J]. Computer Science, 2021, 48(1): 119-124. |
|