Computer Science (计算机科学), 2013, Vol. 40, Issue 9: 208-211.
飞龙,高光来,闫学亮,王炜华
BAO Fei-long, GAO Guang-lai, YAN Xue-liang, WANG Wei-hua
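Abstract: Mongolian is an agglutinative language in which roots and suffixes can combine to form nearly one million words. The pronunciation dictionary of existing Mongolian large-vocabulary continuous speech recognition (LVCSR) systems cannot cover all of these words. Moreover, when the pronunciation dictionary is large, the sparsity of the training corpus causes a marked drop in LVCSR performance. To address both the recognition of most Mongolian words in LVCSR systems and the detection of the many out-of-vocabulary (OOV) words in Mongolian spoken keyword detection systems, this paper draws on the word-formation characteristics of Mongolian to propose a segmentation-based Mongolian LVCSR method, and builds the corresponding acoustic model and language model. The method is then applied to a Mongolian spoken keyword detection system and tested on a Mongolian speech corpus. Experimental results show that the segmentation-based LVCSR method can recognize most Mongolian words and converts a large number of the detection system's OOV words into in-vocabulary words, substantially improving the system's precision and recall.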
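The idea of covering many surface words with a small inventory of roots and suffixes can be illustrated with a minimal sketch. This is not the authors' segmentation procedure, which the abstract does not describe; it is a simple longest-match split, and the `STEMS` and `SUFFIXES` sets, the function names, and the Latin-letter strings are all hypothetical placeholders chosen for illustration.

```python
from typing import List, Optional, Set

# Hypothetical toy inventories (placeholder strings, not data from the paper).
STEMS: Set[str] = {"ger", "nom", "sur"}
SUFFIXES: Set[str] = {"iin", "uud", "aas", "t"}


def split_suffixes(tail: str) -> Optional[List[str]]:
    """Split `tail` into a sequence of known suffixes, longest match first.

    Returns an empty list for an empty tail and None if no split exists.
    """
    if not tail:
        return []
    for end in range(len(tail), 0, -1):
        piece = tail[:end]
        if piece in SUFFIXES:
            rest = split_suffixes(tail[end:])
            if rest is not None:
                return [piece] + rest
    return None


def segment(word: str) -> Optional[List[str]]:
    """Split `word` into one known stem followed by zero or more known suffixes.

    Returns None when the word cannot be covered by the inventories.
    """
    for end in range(len(word), 0, -1):
        stem = word[:end]
        if stem in STEMS:
            rest = split_suffixes(word[end:])
            if rest is not None:
                return [stem] + rest
    return None


if __name__ == "__main__":
    # A word absent from a whole-word dictionary is still covered by its units.
    print(segment("nomuudaas"))
    print(segment("xyz"))
```

Under these assumptions, a recognizer whose dictionary lists the units rather than whole words can emit any word that decomposes into them, which is how a word that would be OOV for a whole-word dictionary becomes in-vocabulary.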