Computer Science ›› 2019, Vol. 46 ›› Issue (5): 286-289. doi: 10.11896/j.issn.1002-137X.2019.05.044

Speech Recognition Combining CFCC and Teager Energy Operators Cepstral Coefficients

SHI Yan-yan, BAI Jing   

  1. (College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China)
  • Published: 2019-05-15

Abstract: Because existing features do not fully represent speech characteristics, this paper proposes a feature fusion method based on Cochlear Filter Cepstral Coefficients (CFCC) and Teager Energy Operators Cepstral Coefficients (TEOCC). First, a fusion feature that combines CFCC, which reflects human auditory characteristics, with TEOCC, which captures nonlinear energy characteristics, is applied to a speech recognition system. Then, principal component analysis (PCA) is used to select and optimize the fused features. Finally, a support vector machine (SVM) is used for speech recognition. The results show that the proposed fusion features achieve better speech recognition performance than a single feature, and that after PCA is applied, the speech recognition accuracy increases by 3.7% on average.
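
As a rough illustration of the nonlinear energy measure behind TEOCC, the sketch below computes the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], and joins two per-frame feature vectors. It is a minimal sketch and not the authors' implementation: the function names teager_energy and fuse_features are hypothetical, plain concatenation is only an assumed fusion rule (the abstract does not state how the features are combined), and the cepstral steps that turn Teager energy into TEOCC, as well as the PCA and SVM stages, are not shown.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Defined for interior samples only, so the output has len(x) - 2 values.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def fuse_features(cfcc_frame, teocc_frame):
    """Assumed fusion rule (not from the paper): concatenate per-frame vectors."""
    return list(cfcc_frame) + list(teocc_frame)


# Sanity check on a pure tone A*cos(omega*n): the operator yields the
# constant A^2 * sin^2(omega) at every interior sample.
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(64)]
energy = teager_energy(tone)
expected = (A * math.sin(omega)) ** 2

# Placeholder numbers standing in for real CFCC and TEOCC frame vectors.
fused = fuse_features([0.1, 0.2, 0.3], [0.4, 0.5])
```

For a pure tone the operator tracks both amplitude and frequency through the single value A^2 * sin^2(omega), which is why it is often described as a nonlinear energy measure. In a full system, the fused vectors would then be reduced with PCA and passed to an SVM classifier.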

Key words: CFCC, PCA, Speech recognition, TEOCC

CLC Number: 

  • TN912.34