Computer Science ›› 2016, Vol. 43 ›› Issue (Z6): 410-412.doi: 10.11896/j.issn.1002-137X.2016.6A.097

Previous Articles     Next Articles

Dynamic Filtering Algorithm of Connected Bit Maximum Likelihood Minwise Hash

CAO Yang, YUAN Xin-pan and LONG Jun   

  • Online:2018-11-14 Published:2018-11-14

Abstract: Maximum likelihood estimator (RMle) can improve the average accuracy.Connected bit Minwise Hash (RMinwise,c) can exponentially improve efficiency of similarity estimation.Dynamic threshold filter makes further improvement on efficiency.Combining RMinwise,c and dynamic threshold filter,the maximum likelihood dynamic filtering algorithm of Connected bit Minwise Hash was proposed.Experimental results demonstrate that R(TMle,c) can get second-best precision and efficiency,and is the most cost effective business in the estimator options (RMle,RMle,c,RMinwise,c,R(TMle,c).

Key words: Maximum likelihood estimator,Similarity measure,Hash,Connected bit,Dynamic threshold

[1] Broder A Z,Charikar M,Frieze A M,et al.Min-wise indepen-dent permutations [J].Journal of Computer Systems and Scie-nces,2000,60(3):630-659
[2] Kalpakis K,Tang S.Collaborative data gathering in wirelesssensor networks using measurement co-occurrence [J].ComputerCommunications,2008,31(10):1979-1992
[3] Dourisboure Y,Geraci F,Pellegrini M.Extraction and classification of dense implicit communities in the web graph [J].ACM Transactions on the Web (TWEB),2009,3(2):1-36
[4] Bendersky M,Croft W B.Finding text reuse on the Web [C]∥Proceedings of the Second ACM International Conference on Web Search and Data Mining(WSDM’09).New York,USA:ACM,2009:262-271
[5] Buehrer G,Chellapilla K.A scalable pattern mining approach to web graph compression with communities [C]∥Proceedings of the International Conference on Web Search and Web Data Mi-ning(WSDM’08).New York,USA:ACM,2008:95-106
[6] Indyk P.A small approximately min-wise independent family of Hash functions [J].Journal of Algorithm,2001,38(1):84-90
[7] Charikar M S.Similarity estimation techniques from roundingalgorithms [C]∥Proceedings of the Thiry-fourth Annual ACM Symposium on Theory of Computing(STOC’02).New York,USA:ACM,2002:380-388
[8] Li P,Knig A C.b-bit minwise hashing [C]∥Proceedings of the 19th International Conference on World Wide Web(WWW’10).New York,USA:ACM,2010:671-680
[9] Li P,Knig A C.Theory and applications of b-bit minwise hashing [J].Communications of the ACM,2011,54(8):101-109
[10] Yuan Xin-pan,Long Jun,Zhang Zu-ping,et al.Connected bitminwise hashing [J].Journal of Computer Research and Deve-lopment,2013,50(4):883-890

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!