Computer Science ›› 2016, Vol. 43 ›› Issue (9): 247-249, 273. doi: 10.11896/j.issn.1002-137X.2016.09.049


Improvement of Lucene Sorting Algorithm Fusing Location-related and Probabilistic Sorting

HU Bo and JIANG Zong-li   

Online: 2018-12-01   Published: 2018-12-01

Abstract: Ranking retrieved documents and classifying text are core techniques for vertical search, personalized information retrieval, information filtering and related problems. To improve the performance of retrieval systems, this paper proposes an improvement to Lucene's default sorting algorithm that integrates location-related and probabilistic sorting. Taking into account how the query's location information and probabilistic sorting affect document relevance, the scoring formula of the default algorithm is improved using two quantities: the probability of document relevance computed by a naive Bayesian classification algorithm, and the weights of the location-related query. Experimental results show that the improvement effectively increases the accuracy of vertical search and gives users a better vertical search experience.

Key words: Location-related, Probabilistic sorting, Lucene, Sorting algorithm, Vertical search
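
The re-scoring idea summarized in the abstract can be illustrated with a small sketch. The abstract does not give the actual scoring formula, so everything below is an assumption made for illustration: "location" is read as the field in which a query term occurs (title versus body), the field weights in `FIELD_WEIGHTS` are hypothetical, and the multiplicative combination in `combined_score` is one plausible way to fold a naive Bayes relevance probability and a location weight into a base score such as the one Lucene would return. It is not the authors' formula and it does not call any Lucene API.

```python
import math
from collections import Counter

# Hypothetical per-field weights: a query term found in the title
# counts more than one found in the body. Not taken from the paper.
FIELD_WEIGHTS = {"title": 2.0, "body": 1.0}


def train_naive_bayes(labeled_docs):
    """Collect class and word counts from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def relevance_probability(tokens, model, target="relevant"):
    """Posterior probability of `target` under multinomial naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_post = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        lp = math.log(n_docs / total_docs)
        for t in tokens:
            # Laplace (add-one) smoothing so unseen words do not zero out a class.
            lp += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
        log_post[label] = lp
    # Normalize in log space for numerical stability.
    m = max(log_post.values())
    exp = {k: math.exp(v - m) for k, v in log_post.items()}
    return exp[target] / sum(exp.values())


def location_weight(query_terms, doc_fields):
    """1 plus the field-weighted number of query-term hits per query term."""
    weighted_hits = 0.0
    for field, tokens in doc_fields.items():
        hits = sum(1 for t in query_terms if t in tokens)
        weighted_hits += FIELD_WEIGHTS.get(field, 1.0) * hits
    return 1.0 + weighted_hits / max(len(query_terms), 1)


def combined_score(base_score, p_relevant, loc_weight):
    """Assumed combination: boost the base score by both factors."""
    return base_score * loc_weight * (1.0 + p_relevant)
```

In this sketch a document with a slightly lower base score can still rank first when the classifier considers it relevant to the vertical domain and the query terms appear in heavily weighted fields, which is the kind of effect the abstract attributes to the improved formula.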

