Computer Science ›› 2009, Vol. 36 ›› Issue (8): 201-204.


Fast Chinese Text Search Based on LSH

CAI Heng, LI Zhou-Jun, SUN Jian, LI Yang

Online: 2018-11-16   Published: 2018-11-16

Abstract: Querying high-dimensional data is attracting increasing attention. When the dimension of a space vector exceeds 10, R-trees, Kd-trees, SR-trees and Quadtrees perform worse than a linear scan. The Locality Sensitive Hashing (LSH) algorithm handles this problem successfully and plays an increasingly important role in high-dimensional queries. This paper first introduces the basic algorithm and principle of LSH, then improves the binary-vector LSH search algorithm by means of multi-probe. Finally, both LSH algorithms are implemented. The experiments we designed verify that the revised algorithm outperforms the original one in two respects: the retrieval recall rate grows as the setover increases, and the space complexity decreases while the time complexity remains unchanged.

Key words: High dimension data, Similarity search, Locality sensitive hashing, Near neighbor, Multi-probe
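
The abstract does not give the paper's algorithm in detail, so the following is only a minimal sketch of the general idea: bit-sampling LSH for binary vectors under Hamming distance, with a simple multi-probe step that also inspects buckets whose keys differ from the query's key in a few bits. The class name `BitSamplingLSH`, the parameters `num_tables`, `key_bits` and `max_flips`, and the probing order are all assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


class BitSamplingLSH:
    """Illustrative bit-sampling LSH index for 0/1 vectors (hypothetical design)."""

    def __init__(self, dim, num_tables=4, key_bits=8, seed=0):
        rng = random.Random(seed)
        # Each table hashes a vector by reading key_bits randomly chosen coordinates.
        self.positions = [rng.sample(range(dim), key_bits) for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.vectors = {}

    @staticmethod
    def _key(vec, positions):
        return tuple(vec[p] for p in positions)

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        for positions, table in zip(self.positions, self.tables):
            table[self._key(vec, positions)].append(item_id)

    @staticmethod
    def _probe_keys(key, max_flips):
        # The query's own bucket first, then buckets whose key differs
        # from it in 1..max_flips bit positions (the multi-probe step).
        yield key
        for r in range(1, max_flips + 1):
            for flip in combinations(range(len(key)), r):
                probe = list(key)
                for i in flip:
                    probe[i] ^= 1
                yield tuple(probe)

    def query(self, vec, max_flips=0, k=1):
        candidates = set()
        for positions, table in zip(self.positions, self.tables):
            for key in self._probe_keys(self._key(vec, positions), max_flips):
                candidates.update(table.get(key, ()))
        # Rank the collected candidates by exact Hamming distance to the query.
        ranked = sorted(candidates, key=lambda i: hamming(vec, self.vectors[i]))
        return ranked[:k]
```

With `max_flips=0` this behaves like plain LSH lookup; raising `max_flips` widens the set of buckets examined per table, which is the usual way multi-probe schemes aim to reach comparable recall with fewer hash tables.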
