Computer Science (计算机科学), 2014, Vol. 41, Issue (4): 200-204.
钟敏娟,万常选,刘德喜,廖述梅,焦贤沛
ZHONG Min-juan,WAN Chang-xuan,LIU De-xi,LIAO Shu-mei and JIAO Xian-pei
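Abstract: Query expansion must solve two problems: where the expansion terms come from, and how to select expansion terms from that source set. To address both, we first use search result clustering and a ranking model to obtain a high-quality set of relevant documents, which serves as the expansion source. We then exploit the characteristics of XML documents and perform query expansion based on local co-occurrence features between terms. Experimental results show that, on the one hand, the relevant document set produced by search result clustering and the ranking model is highly relevant to the user's query and is of higher quality than the traditional pseudo-relevance feedback expansion source. On the other hand, the proposed term co-occurrence expansion scheme, which incorporates XML structural characteristics, obtains expansion information related to the user's query intent; compared with the initial query and with unstructured term expansion methods, it improves search engine retrieval performance more effectively.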
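The abstract does not give the scoring formula, so the following is only a minimal sketch of what local co-occurrence expansion over a set of feedback documents might look like. The function name `expand_query`, the distance-based score, the `window` size, and the per-element `field_weights` (standing in for "XML structural characteristics") are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def expand_query(query_terms, feedback_docs, window=5, top_k=3, field_weights=None):
    """Rank candidate expansion terms by local co-occurrence with query terms.

    feedback_docs: list of dicts mapping an XML element name (hypothetical,
    e.g. "title", "body") to that element's token list.
    field_weights: assumed per-element weights; unlisted elements get 1.0.
    """
    field_weights = field_weights or {}
    query = set(query_terms)
    scores = Counter()
    for doc in feedback_docs:
        for field, tokens in doc.items():
            weight = field_weights.get(field, 1.0)
            for i, tok in enumerate(tokens):
                if tok not in query:
                    continue
                # Look at neighbours within `window` positions of the query term.
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    cand = tokens[j]
                    if j != i and cand not in query:
                        # Assumed score: closer neighbours contribute more.
                        scores[cand] += weight / abs(j - i)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


if __name__ == "__main__":
    docs = [{
        "title": ["xml", "retrieval"],
        "body": ["xml", "query", "expansion", "xml", "index"],
    }]
    print(expand_query(["xml"], docs, top_k=2, field_weights={"title": 2.0}))
```

In this sketch the feedback documents would be the relevant set obtained from clustering and ranking; weighting an element such as a title more heavily is one simple way that document structure could influence which co-occurring terms are selected.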