Computer Science (计算机科学), 2013, Vol. 40, Issue 11: 242-247.
QIU Yun-fei, BAO Li, and SHAO Liang-shan (邱云飞, 鲍莉, 邵良杉)
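Abstract: In traditional search engines and information retrieval, the term weights in a user query are usually obtained in a context-independent way. Most existing information retrieval techniques use bag-of-words methods, such as the Boolean model, the vector space model, and probabilistic models, none of which consider the correlations among the terms in a query. To make full use of the information in the query and improve the accuracy of term weights, a supervised machine learning method is proposed for learning the term weights of user queries. The method is classification-based and introduces syntactic parsing as an important classification feature for training the model. Taking the relationships among query terms into account avoids the information loss incurred when a query is reduced to individual terms and adds features for short texts. In addition, the classifier produces soft outputs, which gives a more accurate quantitative value for the importance of each term.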
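The idea of a classifier with soft outputs can be illustrated with a minimal sketch. The abstract does not name the classifier or list the exact features, so everything below is an assumption made for illustration: logistic regression stands in for the classifier, and each term is described by two hypothetical features, a normalized IDF score and a flag marking whether the term is a syntactic head in the parsed query. The training data are invented toy values, not results from the paper.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression with batch gradient descent.

    X: list of feature vectors, y: list of 0/1 labels
    (1 = important term, 0 = unimportant term).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def term_weights(terms, features, w, b):
    """Soft output: the predicted probability serves as the term weight."""
    return {
        t: sigmoid(sum(wj * xj for wj, xj in zip(w, f)) + b)
        for t, f in zip(terms, features)
    }


# Hypothetical training set: [normalized_idf, is_syntactic_head]
X_train = [
    [0.9, 1.0], [0.8, 1.0], [0.7, 0.0], [0.3, 1.0],
    [0.6, 1.0], [0.2, 0.0], [0.1, 0.0], [0.4, 0.0],
]
y_train = [1, 1, 1, 1, 1, 0, 0, 0]

w, b = train_logistic(X_train, y_train)

# Hypothetical query with hand-assigned feature values.
query_terms = ["cheap", "flights", "to", "beijing"]
query_feats = [[0.3, 0.0], [0.8, 1.0], [0.05, 0.0], [0.9, 0.0]]
weights = term_weights(query_terms, query_feats, w, b)

for term, weight in weights.items():
    print(f"{term}: {weight:.3f}")
```

Using the predicted probability rather than the hard 0/1 label is what makes the output "soft": each term receives a graded weight, so a retrieval function can rank terms by importance instead of only keeping or dropping them.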