Similarity Join Algorithm for High-Dimensional Data Based on the △-tree

Computer Science ›› 2011, Vol. 38 ›› Issue (10): 157-160.

△-tree Based Similarity Join Algorithm for High-dimensional Data

LIU Yan, HAO Zhong-xiao

Online: 2018-11-16 Published: 2018-11-16

Abstract

Similarity joins are used in a variety of fields, such as clustering, text mining, and multimedia databases. To solve the problem of high-dimensional similarity joins in a main-memory environment, a novel similarity join algorithm called △-tree-join*, which can efficiently join two different data sets based on the △-tree, was presented. The △-tree has been proven to be an efficient index method in main memory. △-tree-join* adopts a top-down join scheme and makes full use of the properties of the △-tree to compute the distances between clusters, and between a point and a cluster, using fewer dimensions, so as to filter out unnecessary nodes or points, reduce computation, and improve join efficiency. Experiments on both a synthetic clustered dataset and real datasets were conducted, and the results demonstrate that △-tree-join* is more suitable for main-memory similarity joins and performs well compared with the two state-of-the-art similarity join methods EGO and EGO*.
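The filtering idea in the abstract, rejecting candidates with a distance computed over fewer dimensions, can be illustrated with a minimal sketch. This is not the authors' △-tree-join* implementation: it has no tree, no clusters, and no top-down traversal. It only assumes that the distance is Euclidean and that the leading coordinates are the most informative ones (for example, after a PCA-style transform). The names `partial_dist` and `similarity_join`, and the parameters `eps` and `k`, are hypothetical. The sketch relies on one fact: the Euclidean distance over a subset of coordinates never exceeds the distance over all coordinates, so a pair whose partial distance already exceeds `eps` can be discarded safely.

```python
import math


def partial_dist(p, q, k):
    """Euclidean distance restricted to the first k coordinates.

    For any k this is a lower bound on the full Euclidean distance,
    because dropping coordinates only removes non-negative terms.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p[:k], q[:k])))


def similarity_join(R, S, eps, k):
    """Return index pairs (i, j) with dist(R[i], S[j]) <= eps.

    Hypothetical nested-loop sketch: a cheap k-dimensional check
    filters pairs before the full-dimensional distance is computed.
    """
    result = []
    for i, r in enumerate(R):
        for j, s in enumerate(S):
            # Filter step: the lower bound already exceeds eps.
            if partial_dist(r, s, k) > eps:
                continue
            # Refinement step: full-dimensional distance.
            if partial_dist(r, s, len(r)) <= eps:
                result.append((i, j))
    return result
```

Because the filter only uses a lower bound, it never discards a true result; it only saves full-dimensional distance computations. An index-based method such as the one described in the abstract would apply the same kind of bound to whole clusters rather than to individual pairs.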

LIU Yan, HAO Zhong-xiao. △-tree Based Similarity Join Algorithm for High-dimensional Data[J]. Computer Science, 2011, 38(10): 157-160.