Research on Mongolian Spoken Term Detection Method Based on Segmentation Recognition

Computer Science ›› 2013, Vol. 40 ›› Issue (9): 208-211.

BAO Fei-long, GAO Guang-lai, YAN Xue-liang and WANG Wei-hua

College of Computer Science, Inner Mongolia University, Hohhot 010021 (all authors)

Online: 2018-11-16  Published: 2018-11-16

Funding: This work was supported by the National Natural Science Foundation of China (61263037, 9) and the Major Project of the Inner Mongolia Natural Science Foundation (2011ZD11).

Abstract

Abstract: Mongolian is an agglutinative language in which stems and suffixes combine to form nearly a million distinct words. The pronunciation dictionary of existing Mongolian large vocabulary continuous speech recognition (LVCSR) systems cannot cover all of them, and when the dictionary is made large, the sparseness of the training data noticeably degrades recognition performance. To address both the recognition of most Mongolian words in LVCSR and the detection of the many out-of-vocabulary (OOV) words in Mongolian spoken term detection (STD), we exploit the word-formation rules of Mongolian and propose a segmentation-based LVCSR approach, for which we build the corresponding acoustic model and language model. The approach is then integrated into a Mongolian STD system and tested on Mongolian speech data. Experimental results show that the segmentation-based LVCSR approach recognizes most Mongolian words and converts a large number of the STD system's OOV words into in-vocabulary words, greatly improving both the precision and the recall of the system.

Key words: Mongolian, Stem, Ending suffix, Spoken term detection, Out-of-vocabulary word, Confusion network

CLC number: TP391.1  Document code: A
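
The idea of turning out-of-vocabulary words into in-vocabulary units can be illustrated with a minimal sketch. It is not the authors' implementation: the stem and ending-suffix inventories are toy Latin-script strings invented for illustration, the function names `segment` and `contains_term` are hypothetical, and the search runs over a single 1-best unit sequence rather than the confusion network named in the key words.

```python
from typing import List, Optional, Set

# Toy, hypothetical inventories in Latin script; NOT a real Mongolian lexicon.
STEMS: Set[str] = {"nom", "ger", "mori"}
ENDING_SUFFIXES: Set[str] = {"iin", "d", "oos", "toi"}


def segment(word: str, stems: Set[str], suffixes: Set[str]) -> Optional[List[str]]:
    """Split a word into [stem] or [stem, "+suffix"]; None if no split is in vocabulary."""
    if word in stems:
        return [word]
    # Try the shortest stem first, i.e. the longest ending suffix first.
    for cut in range(1, len(word)):
        stem, suffix = word[:cut], word[cut:]
        if stem in stems and suffix in suffixes:
            return [stem, "+" + suffix]
    return None


def contains_term(hypothesis: List[str], term_units: List[str]) -> bool:
    """Check whether the term's unit sequence occurs contiguously in a 1-best hypothesis."""
    n = len(term_units)
    return any(hypothesis[i:i + n] == term_units
               for i in range(len(hypothesis) - n + 1))


# A word-level lexicon holding only the bare stems would treat "nomiin" as OOV,
# but its segmented form uses only in-vocabulary units.
query_units = segment("nomiin", STEMS, ENDING_SUFFIXES)
hypothesis = ["ger", "+d", "nom", "+iin", "mori"]  # hypothetical recognizer output
found = query_units is not None and contains_term(hypothesis, query_units)
```

Because the lexicon stores stems and ending suffixes separately, its size grows with the number of units rather than with the number of stem-suffix combinations, which is the property the abstract relies on to reduce both dictionary coverage gaps and data sparseness.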

BAO Fei-long, GAO Guang-lai, YAN Xue-liang and WANG Wei-hua. Research on Mongolian Spoken Term Detection Method Based on Segmentation Recognition[J]. Computer Science, 2013, 40(9): 208-211. https://doi.org/

