计算机科学 ›› 2016, Vol. 43 ›› Issue (Z6): 410-412.doi: 10.11896/j.issn.1002-137X.2016.6A.097
曹阳,袁鑫攀,龙军
CAO Yang, YUAN Xin-pan and LONG Jun
摘要: Minwise Hash极大似然估计子RMle综合考虑所有事件的发生概率,可以提高估计精度,但降低了估计的效率。连接位Minwise Hash估计子RMinwise,c可以成倍减少比对次数,动态阈值过滤器能够进一步提高Minwise Hash算法和其变种算法的效率。结合连接位极大似然估计子和动态阈值过滤器,提出了连接位极大似然动态过滤算法R(TMle,c)。实验表明,R(TMle,c)具有精度和效率兼顾的特性,计算时间最少,并且在k>300的条件下,其准确度与RMle的近乎相等。
[1] Broder A Z,Charikar M,Frieze A M,et al.Min-wise indepen-dent permutations [J].Journal of Computer Systems and Scie-nces,2000,60(3):630-659 [2] Kalpakis K,Tang S.Collaborative data gathering in wirelesssensor networks using measurement co-occurrence [J].ComputerCommunications,2008,31(10):1979-1992 [3] Dourisboure Y,Geraci F,Pellegrini M.Extraction and classification of dense implicit communities in the web graph [J].ACM Transactions on the Web (TWEB),2009,3(2):1-36 [4] Bendersky M,Croft W B.Finding text reuse on the Web [C]∥Proceedings of the Second ACM International Conference on Web Search and Data Mining(WSDM’09).New York,USA:ACM,2009:262-271 [5] Buehrer G,Chellapilla K.A scalable pattern mining approach to web graph compression with communities [C]∥Proceedings of the International Conference on Web Search and Web Data Mi-ning(WSDM’08).New York,USA:ACM,2008:95-106 [6] Indyk P.A small approximately min-wise independent family of Hash functions [J].Journal of Algorithm,2001,38(1):84-90 [7] Charikar M S.Similarity estimation techniques from roundingalgorithms [C]∥Proceedings of the Thiry-fourth Annual ACM Symposium on Theory of Computing(STOC’02).New York,USA:ACM,2002:380-388 [8] Li P,Knig A C.b-bit minwise hashing [C]∥Proceedings of the 19th International Conference on World Wide Web(WWW’10).New York,USA:ACM,2010:671-680 [9] Li P,Knig A C.Theory and applications of b-bit minwise hashing [J].Communications of the ACM,2011,54(8):101-109 [10] Yuan Xin-pan,Long Jun,Zhang Zu-ping,et al.Connected bitminwise hashing [J].Journal of Computer Research and Deve-lopment,2013,50(4):883-890 |
No related articles found! |
|