Finding XML Pseudo-relevance Document Based on Search Results Clustering

Abstract

Abstract: Recent studies show that traditional pseudo-relevance feedback may cause topic drift. To avoid topic drift effectively, it is essential to identify relevant documents and use them to form the set of pseudo-relevant documents for the user's query. This paper proposes a method for finding good feedback documents based on clustering XML search results. First, a cluster-label extraction method based on equalizing weights is introduced, which fully considers the content and structure features of XML documents. Second, a two-stage ranking strategy is presented, consisting of a candidate cluster ranking model and a document ranking model. Finally, experimental results show that, compared with the original retrieval method, the ranking models achieve better performance and find more relevant XML documents.
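The two-stage ranking idea (rank candidate clusters first, then rank documents inside the selected clusters) can be illustrated with a minimal sketch. The abstract does not give the actual ranking models, so everything here is an assumption for illustration only: the function names `cosine` and `two_stage_rank`, the bag-of-words term lists, and the use of cosine similarity as the score in both stages are hypothetical and are not the authors' method.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors (assumed scoring function)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def two_stage_rank(query, clusters, top_clusters=1, top_docs=2):
    """Return ids of pseudo-relevant documents.

    query:    list of query terms
    clusters: list of (label_terms, [(doc_id, doc_terms), ...]) pairs
    """
    q = Counter(query)
    # Stage 1: rank candidate clusters by similarity between cluster label and query.
    ranked = sorted(clusters, key=lambda c: cosine(q, Counter(c[0])), reverse=True)
    # Stage 2: rank the documents of the selected clusters against the query.
    docs = [d for _, members in ranked[:top_clusters] for d in members]
    docs.sort(key=lambda d: cosine(q, Counter(d[1])), reverse=True)
    return [doc_id for doc_id, _ in docs[:top_docs]]


# Hypothetical toy data: two clusters with hand-written labels.
clusters = [
    (["xml", "retrieval", "index"],
     [("d1", ["xml", "retrieval", "xml"]),
      ("d2", ["xml", "schema"]),
      ("d3", ["index", "storage"])]),
    (["football", "league"],
     [("d4", ["football", "xml"])]),
]
feedback_docs = two_stage_rank(["xml", "retrieval"], clusters)
```

In this toy run the off-topic cluster is discarded in the first stage, so `d4` cannot enter the feedback set even though it contains a query term. This is the intuition behind using clustering to limit topic drift.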

Key words: Information retrieval, XML pseudo-relevance feedback, XML search results clustering, Cluster label, Ranking model

ZHONG Min-juan, WAN Chang-xuan, LIU De-xi and LIAO Shu-mei. Finding XML Pseudo-relevance Document Based on Search Results Clustering[J]. Computer Science, 2013, 40(10): 172-177.

