Computer Science ›› 2016, Vol. 43 ›› Issue (5): 188-192, 208. doi: 10.11896/j.issn.1002-137X.2016.05.034

Study on Domain-corpus Driven Calculation Method of Sentence Relevance

LI Feng, HUANG Jin-zhu, LI Zhou-jun and YANG Wei-ming   

Online: 2018-12-01 Published: 2018-12-01

Abstract: Sentence relevance calculation plays an important role in many areas of NLP, such as public opinion monitoring, information retrieval, and statistical machine translation (SMT). After clearly defining the relationship between similarity and relevance, this paper designs a domain-specific, corpus-driven model for calculating sentence relevance. The model uses linguistic data from a single domain to construct a three-level “sentence-paragraph-article” domain semantic space. The topic relevance of words is computed from factors at each level, such as co-occurrence probability, average co-occurrence distance, and sentence length. Comparative experiments against methods based on literal features, HowNet, and Tongyici Cilin show that the model has great practical value.

Key words: Sentence relevance, Corpus driven, Topic relevance, Calculation model
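
The abstract names co-occurrence probability, average co-occurrence distance, and sentence length as factors but does not give the formulas. The following is only a minimal sketch of how such factors might be combined, restricted to the sentence level of the semantic space. The function names (`cooccurrence_stats`, `word_relevance`, `sentence_relevance`), the product-style weighting, and the toy corpus are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_stats(sentences):
    """Collect sentence-level co-occurrence statistics from a tokenised corpus."""
    word_count = defaultdict(int)       # sentences containing each word
    pair_count = defaultdict(int)       # sentences containing both words of a pair
    pair_distance = defaultdict(float)  # summed distances, normalised by sentence length
    for tokens in sentences:
        first_pos = {}
        for i, tok in enumerate(tokens):
            first_pos.setdefault(tok, i)  # keep the first position of each word
        for w in first_pos:
            word_count[w] += 1
        for a, b in combinations(sorted(first_pos), 2):
            pair_count[(a, b)] += 1
            pair_distance[(a, b)] += abs(first_pos[a] - first_pos[b]) / len(tokens)
    return word_count, pair_count, pair_distance


def word_relevance(a, b, stats):
    """Assumed score: co-occurrence probability damped by average distance."""
    if a == b:
        return 1.0
    word_count, pair_count, pair_distance = stats
    key = (a, b) if a < b else (b, a)
    n = pair_count.get(key, 0)
    if n == 0:
        return 0.0
    prob = n / min(word_count[a], word_count[b])  # co-occurrence probability
    avg_dist = pair_distance[key] / n             # length-normalised, always below 1
    return prob * (1.0 - avg_dist)


def sentence_relevance(s1, s2, stats):
    """Assumed aggregation: symmetric average of each word's best match."""
    if not s1 or not s2:
        return 0.0

    def directed(src, dst):
        return sum(max(word_relevance(a, b, stats) for b in dst) for a in src) / len(src)

    return 0.5 * (directed(s1, s2) + directed(s2, s1))


# Hypothetical toy domain corpus, already tokenised.
corpus = [
    ["stock", "market", "falls"],
    ["stock", "price", "rises"],
    ["market", "price", "index"],
    ["rain", "floods", "city"],
]
stats = cooccurrence_stats(corpus)
related = sentence_relevance(["stock", "falls"], ["market", "price"], stats)
unrelated = sentence_relevance(["stock", "falls"], ["rain", "city"], stats)
```

In this sketch, two sentences that share no words can still receive a nonzero score when their words co-occur in the domain corpus, which is the distinction between relevance and literal similarity that the abstract draws. Paragraph-level and article-level statistics could be collected the same way and combined with weights, but that weighting would be a further assumption.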

