Survey of Monosyllable Recognition in Speech Recognition

doi:10.11896/jsjkx.200200006

Abstract

Acoustic modeling handles speech signal processing and feature extraction. It is a foundational step in speech recognition and a major factor in overall recognition performance. Choosing appropriate modeling units lets downstream components achieve higher accuracy and greater robustness. The syllable is the smallest pronunciation unit of Sino-Tibetan languages such as Chinese. Given its pronunciation characteristics, it is worthwhile to study the syllable as the modeling unit for speech recognition in Sino-Tibetan languages and to extract the corresponding features for recognition. Surveying current progress in monosyllable recognition, this paper first introduces algorithms based on finite-state vector quantization, together with results obtained by improved variants of these algorithms in monosyllable recognition. It then introduces algorithms based on the hidden Markov model, describing in detail syllable recognition results that combine the hidden Markov model with other algorithms, followed by algorithms based on neural networks. Finally, it summarizes and proposes important directions for future research on monosyllable recognition.

Key words: Artificial neural network, Hidden Markov model, Monosyllable recognition, Speech recognition, Vector quantization

CLC Number: TN912.34

ZHANG Jing, YANG Jian, SU Peng. Survey of Monosyllable Recognition in Speech Recognition [J]. Computer Science, 2020, 47(11A): 172-174.
