Computer Science ›› 2009, Vol. 36 ›› Issue (10): 222-224.
ZHANG Yang-sen
Abstract: Training the parameters of a statistical language model is the key step in language modeling. How many training samples are needed to keep the parameter estimation error within a required bound is one of the central questions in language modeling theory. Applying mathematical statistics, we give a method for estimating the lower bound on the training sample capacity for Chinese statistical language models and propose a quantitative estimation formula. With this formula, the corpus sample capacity needed to train the model parameters can be calculated from the required parameter estimation error.
Key words: Chinese statistical language model, Training corpus sample, Sample capacity, Relative error
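The abstract does not reproduce the paper's estimation formula, so the following is only a minimal sketch of the general idea, not the author's method. It assumes the textbook normal-approximation bound for estimating a single event probability p by relative frequency: to keep the relative error within ε at a given confidence level, the sample size n should satisfy n ≥ z²(1 − p) / (ε²p), where z is the two-sided standard normal quantile. The function name `min_sample_size` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def min_sample_size(p: float, rel_err: float, confidence: float = 0.95) -> int:
    """Lower bound on the number of samples needed to estimate a probability p
    by relative frequency with relative error at most rel_err.

    Assumption (not taken from the paper): the standard normal approximation
    to the binomial, giving n >= z^2 * (1 - p) / (rel_err^2 * p).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if rel_err <= 0.0:
        raise ValueError("rel_err must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")

    # Two-sided quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)

    # Round up, since the sample size must be an integer.
    return math.ceil(z * z * (1.0 - p) / (rel_err ** 2 * p))


if __name__ == "__main__":
    # Rarer events need far more training samples for the same relative error.
    for prob in (0.1, 0.01, 0.001):
        print(prob, min_sample_size(prob, rel_err=0.1))
```

Under this assumed bound, the required sample size grows roughly in proportion to 1/p, which is why low-frequency events in a corpus dominate the sample capacity requirement.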
ZHANG Yang-sen. Statistical Language Model[J]. Computer Science, 2009, 36(10): 222-224.
URL: https://www.jsjkx.com/EN/
https://www.jsjkx.com/EN/Y2009/V36/I10/222