Computer Science (计算机科学) ›› 2019, Vol. 46 ›› Issue (5): 286-289. doi: 10.11896/j.issn.1002-137X.2019.05.044
SHI Yan-yan (史燕燕), BAI Jing (白静)
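Abstract: Existing features do not fully capture the characteristics of speech. To address this, the paper proposes fusing Cochlear Filter Cepstral Coefficients (CFCC) with Teager Energy Operator Cepstral Coefficients (TEOCC). The fused feature combines CFCC, which models the auditory characteristics of the human ear, with TEOCC, which reflects nonlinear energy characteristics, and is applied to a speech recognition system. Principal Component Analysis (PCA) is then used to select and optimize the fused feature, and a support vector machine performs the final recognition. Experimental results show that the fused feature achieves better speech recognition performance than either single feature, and that adding PCA raises recognition accuracy by 3.7% on average.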
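The abstract names the Teager energy operator and feature fusion without giving formulas. As a rough illustration only, the sketch below uses the commonly cited discrete form of the operator, ψ[n] = x[n]² − x[n−1]·x[n+1], and assumes that fusion means concatenating the two per-frame feature vectors. Both the function names and the concatenation step are assumptions made for this example, not details confirmed by the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the two boundary samples
    have no neighbour on one side and are skipped.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def fuse_features(cfcc_frame, teocc_frame):
    """Assumed fusion rule: concatenate the two per-frame feature vectors."""
    return list(cfcc_frame) + list(teocc_frame)


# Sanity check on a pure tone A*cos(omega*n): the operator output is
# the constant A^2 * sin(omega)^2, which tracks amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2

# Hypothetical 3-dimensional CFCC and 2-dimensional TEOCC frames.
fused = fuse_features([0.1, 0.2, 0.3], [0.4, 0.5])
```

In a full pipeline of the kind the abstract describes, vectors like `fused` would be reduced with PCA and then passed to a support vector machine classifier; those stages are omitted here because the abstract gives no parameters for them.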