Computer Science ›› 2015, Vol. 42 ›› Issue (10): 275-280.


Information Retrieval Model for Domain-specific Structural Documents and its Application in Agricultural Disease Prescription Retrieval

LIU Tong and NI Wei-jian   

  • Online:2018-11-14 Published:2018-11-14

Abstract: Unlike plain text, professional documents in many domains are mostly structural documents: they consist of several roughly fixed textual fields and embed rich domain knowledge. To exploit this inherent structural information and domain knowledge, we propose a novel retrieval model for professional documents based on structural retrieval. Specifically, we first derive a domain model from a given collection of professional documents and then use it as the basis for designing a domain-specific structural retrieval function. We apply the proposed structural retrieval model to agricultural disease prescriptions, a representative type of professional document in agriculture, and develop a prototype search engine for them. Experimental results on a real prescription collection show that the proposed model has advantages over conventional information retrieval approaches.

Key words: Information retrieval, Agricultural disease prescription, Query expansion, Structural retrieval

