Landmark Guided Segmental Speech Decoding Algorithm for Continuous Mandarin Speech Recognition

Computer Science, 2013, Vol. 40, Issue (10): 208-212.

CHAO Hao, YANG Zhan-lei, LIU Wen-ju

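School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454000; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190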

Online: 2018-11-16 Published: 2018-11-16
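Funding:
This work was supported by the National Natural Science Foundation of China (91120303, 90820303, 90820011), the National Basic Research Program of China (973 Program) (2004CB318105), and the National High Technology Research and Development Program of China (863 Program) (20060101Z4073, 2006AA01Z194).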

Abstract

A decoding optimization algorithm for the stochastic segment model is proposed for Mandarin speech recognition. Phonetically meaningful landmarks are detected first. From these landmarks, boundary information and initial/final class information about the neighboring speech segments are derived, and this information is then used to guide the search process of the stochastic segment model. In the experiments, two types of landmarks that can be detected fairly reliably are used to guide decoding in a segment model based Mandarin LVCSR system. Continuous Mandarin speech recognition experiments on the "863-test" set show that decoding time is reduced by 12.92% with only a slight drop in recognition accuracy, which indicates that introducing phonetic knowledge into a speech recognition system is effective.

Key words: Speech recognition, Stochastic segment model, Decoding, Landmark
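The idea summarized in the abstract, using landmark-derived boundary and class information to restrict the segment search, can be illustrated with a minimal sketch. This is not the authors' implementation: the function `decode`, the toy scorer `toy_score`, the frame labels, and the two pruning rules (no segment may straddle a boundary landmark; a segment covering a class landmark must carry a unit of that class) are assumptions chosen only to show how such constraints shrink the number of segment hypotheses a segmental Viterbi search has to score.

```python
import math
from typing import Callable, Dict, FrozenSet, List, Optional, Sequence, Tuple

Segment = Tuple[int, int, str]  # (start_frame, end_frame_exclusive, unit)


def decode(
    num_frames: int,
    units: Sequence[str],
    score: Callable[[int, int, str], float],
    max_len: int,
    boundaries: FrozenSet[int] = frozenset(),
    class_hints: Optional[Dict[int, str]] = None,
    unit_class: Optional[Dict[str, str]] = None,
) -> Tuple[List[Segment], float, int]:
    """Segmental Viterbi search with optional (hypothetical) landmark pruning.

    Returns the best segmentation, its score, and the number of
    segment hypotheses that were actually scored.
    """
    class_hints = class_hints or {}
    unit_class = unit_class or {}
    best = [-math.inf] * (num_frames + 1)
    back: List[Optional[Segment]] = [None] * (num_frames + 1)
    best[0] = 0.0
    evaluated = 0
    for end in range(1, num_frames + 1):
        for start in range(max(0, end - max_len), end):
            if best[start] == -math.inf:
                continue  # no surviving path ends at this frame
            # Assumed boundary rule: a segment may not straddle a boundary landmark.
            if any(start < b < end for b in boundaries):
                continue
            # Assumed class rule: collect classes demanded by landmarks inside the segment.
            required = {class_hints[f] for f in range(start, end) if f in class_hints}
            for unit in units:
                if any(unit_class.get(unit) != c for c in required):
                    continue
                evaluated += 1
                total = best[start] + score(start, end, unit)
                if total > best[end]:
                    best[end] = total
                    back[end] = (start, end, unit)
    # Trace the best path back from the last frame.
    path: List[Segment] = []
    pos = num_frames
    while pos > 0 and back[pos] is not None:
        seg = back[pos]
        path.append(seg)
        pos = seg[0]
    path.reverse()
    return path, best[num_frames], evaluated


# Toy data (invented for illustration): 9 frames with known unit labels.
FRAMES = ["sh"] * 3 + ["an"] * 4 + ["m"] * 2
UNITS = ["sh", "an", "m"]
UNIT_CLASS = {"sh": "initial", "an": "final", "m": "initial"}


def toy_score(start: int, end: int, unit: str) -> float:
    """+1 per matching frame, -1 per mismatch, minus a per-segment penalty."""
    return sum(1 if FRAMES[i] == unit else -1 for i in range(start, end)) - 0.5


full = decode(len(FRAMES), UNITS, toy_score, max_len=5)
guided = decode(
    len(FRAMES), UNITS, toy_score, max_len=5,
    boundaries=frozenset({3, 7}),   # assumed boundary landmarks
    class_hints={5: "final"},       # assumed class landmark at frame 5
    unit_class=UNIT_CLASS,
)
print("full search:  ", full[0], "hypotheses scored:", full[2])
print("guided search:", guided[0], "hypotheses scored:", guided[2])
```

On this toy input both searches return the same segmentation, while the guided search scores fewer hypotheses. In a real system the landmarks come from a detector that can make errors, which is consistent with the slight accuracy drop reported in the abstract.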

CHAO Hao, YANG Zhan-lei, LIU Wen-ju. Landmark Guided Segmental Speech Decoding Algorithm for Continuous Mandarin Speech Recognition[J]. Computer Science, 2013, 40(10): 208-212. https://doi.org/
