Computer Science (计算机科学) ›› 2015, Vol. 42 ›› Issue (7): 265-269. doi: 10.11896/j.issn.1002-137X.2015.07.057
陈雅兰,胡小华,涂新辉,何婷婷
CHEN Ya-lan, HU Xiao-hua, TU Xin-hui and HE Ting-ting
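Abstract: Most existing retrieval models overlook the fact that the proximity of matched query terms within a document, and the passage-level evidence that scoring can draw on, can both be used to improve document scoring. Motivated by this, a Chinese information retrieval system based on positional language models is proposed. First, by defining the concept of a propagated count for each position, a separate language model is built for every position. Next, the KL-divergence retrieval model is introduced and combined with the positional language models to score each position individually. Finally, a multi-parameter scoring strategy yields the final document score. The experiments also compare the retrieval effectiveness of two Chinese indexing methods, word-based (dictionary-based) and bigram-based, within the positional language model. Results on the standard NTCIR5 and NTCIR6 test collections show that the proposed method significantly improves the performance of the Chinese retrieval system under both indexing methods, and that it outperforms the vector space model, the BM25 probabilistic model, and the statistical language model.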
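The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian propagation kernel, the Dirichlet-style smoothing, the parameter values `sigma` and `mu`, and all function names are assumptions made for illustration. The abstract does not spell out its multi-parameter scoring strategy, so a simple best-position rule stands in for it here.

```python
import math
from collections import Counter


def propagated_counts(doc, i, sigma):
    """Propagated count of each term at position i.

    Assumption: a Gaussian kernel; an occurrence at position j contributes
    exp(-(i - j)^2 / (2 * sigma^2)) to position i.
    """
    counts = Counter()
    for j, term in enumerate(doc):
        counts[term] += math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))
    return counts


def positional_lm(doc, i, collection_lm, sigma=2.0, mu=1.0):
    """Language model at position i, smoothed with the collection model.

    Assumption: Dirichlet-style smoothing with pseudo-count mu.
    """
    counts = propagated_counts(doc, i, sigma)
    total = sum(counts.values())
    return {t: (counts.get(t, 0.0) + mu * p) / (total + mu)
            for t, p in collection_lm.items()}


def position_score(query_lm, pos_lm):
    """Negative KL divergence between query model and position model."""
    return -sum(q * math.log(q / pos_lm[t])
                for t, q in query_lm.items() if q > 0)


def document_score(doc, query_lm, collection_lm, sigma=2.0, mu=1.0):
    """Stand-in for the final step: take the best-scoring position."""
    return max(
        position_score(query_lm, positional_lm(doc, i, collection_lm, sigma, mu))
        for i in range(len(doc))
    )


# Toy data (hypothetical): same bag of terms, different query-term proximity.
doc_near = ["x", "y", "a", "b", "c", "d", "e", "f"]
doc_far = ["x", "a", "b", "c", "d", "e", "f", "y"]

all_terms = Counter(doc_near + doc_far)
n_terms = sum(all_terms.values())
collection_lm = {t: c / n_terms for t, c in all_terms.items()}
query_lm = {"x": 0.5, "y": 0.5}

score_near = document_score(doc_near, query_lm, collection_lm)
score_far = document_score(doc_far, query_lm, collection_lm)
print(score_near > score_far)
```

Both toy documents contain exactly the same terms, so a bag-of-words language model would score them identically. The positional models score them differently because the query terms sit next to each other in `doc_near` and at opposite ends of `doc_far`.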