Computer Science, 2020, Vol. 47, Issue 11A: 172-174. doi: 10.11896/jsjkx.200200006
ZHANG Jing (张经), YANG Jian (杨健), SU Peng (苏鹏)
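Abstract: Acoustic modeling processes the speech signal and extracts features from it. It is an indispensable foundation of speech recognition and an important factor in the overall performance of a recognizer. Choosing a suitable modeling unit gives the downstream system higher accuracy and greater robustness. The syllable is the smallest pronunciation unit of Chinese and other Sino-Tibetan languages. Given these pronunciation characteristics, it is particularly worthwhile to study the syllable as the modeling unit for Sino-Tibetan speech recognition and to extract the corresponding features for recognition. This paper reviews current progress in monosyllable recognition. It first introduces algorithms based on finite-state vector quantization and the results obtained with their improved variants in monosyllable recognition. It then introduces algorithms based on hidden Markov models (HMMs) and describes in detail syllable recognition work that combines HMMs with other algorithms. Next, it introduces neural-network-based algorithms. Finally, it summarizes the field and proposes important directions for future research on monosyllable recognition.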