Computer Science, 2016, Vol. 43, Issue (8): 297-299. doi: 10.11896/j.issn.1002-137X.2016.08.060

• Graphics, Image and Pattern Recognition •

New Method of Robust Voiceprint Feature Extraction and Fusion

LUO Yuan and SUN Long

  1. School of Optoelectronic Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the Science and Technology Research Project of the Chongqing Municipal Education Commission (KJ130512)

Abstract: To improve the robustness of speaker verification systems in noisy environments, this paper improves the Mel frequency cepstral coefficient (MFCC) using an auditory periphery model, combines it with the perceptual linear predictive coefficient (PLPC), and fuses the two voiceprint features in the feature domain according to their inter-cluster distinctness, yielding a new voiceprint feature extraction method. The noise robustness of a speaker verification system based on this feature is then studied. Experiments comparing the fused feature with the original features were carried out on speech signals at different signal-to-noise ratios (SNR). The results show that the fused feature achieves a lower error rate in simulated restaurant noise, 2.2% and 3.1% lower than MFCC and PLPC respectively, so the robustness of the speaker verification system in noise is improved.

Key words: Gammatone feature parameters (GFCC), Perceptual linear prediction (PLP), Inter-cluster distinctness, Feature fusion, Robustness, Speaker verification
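
The abstract says the two features are fused in the feature domain according to their inter-cluster distinctness, but it does not give the formula. The sketch below is one possible reading, not the authors' method: it assumes a Fisher-style ratio (between-speaker variance over within-speaker variance) as the distinctness measure and weights each feature stream by its share of the total score before concatenation. The function names `fisher_ratio`, `fusion_weights`, and `fuse`, and the toy data, are hypothetical.

```python
from statistics import fmean, pvariance


def fisher_ratio(samples_by_speaker):
    """Per-dimension between-speaker / within-speaker variance ratio.

    samples_by_speaker maps a speaker id to a list of feature vectors.
    This Fisher-style ratio is an assumed stand-in for the paper's
    inter-cluster distinctness, which the abstract does not define.
    """
    dims = len(next(iter(samples_by_speaker.values()))[0])
    ratios = []
    for d in range(dims):
        per_speaker = [[v[d] for v in vecs] for vecs in samples_by_speaker.values()]
        means = [fmean(vals) for vals in per_speaker]
        between = pvariance(means)
        within = fmean([pvariance(vals) for vals in per_speaker])
        ratios.append(between / (within + 1e-12))  # small constant avoids division by zero
    return ratios


def fusion_weights(feats_a, feats_b):
    """Turn the mean distinctness of each stream into weights that sum to 1."""
    score_a = fmean(fisher_ratio(feats_a))
    score_b = fmean(fisher_ratio(feats_b))
    total = score_a + score_b
    if total == 0:
        return 0.5, 0.5
    return score_a / total, score_b / total


def fuse(vec_a, vec_b, w_a, w_b):
    """Feature-domain fusion: scale each stream by its weight, then concatenate."""
    return [w_a * x for x in vec_a] + [w_b * x for x in vec_b]


# Toy data: stream A separates the two speakers well, stream B barely does.
feats_a = {"s1": [[1.0, 0.0], [1.2, 0.1]], "s2": [[3.0, 0.0], [3.2, 0.1]]}
feats_b = {"s1": [[0.0], [1.0]], "s2": [[0.5], [1.5]]}

w_a, w_b = fusion_weights(feats_a, feats_b)
fused = fuse(feats_a["s1"][0], feats_b["s1"][0], w_a, w_b)
```

With this toy data the more discriminative stream receives the larger weight. In a real system the two streams would be the improved MFCC and PLPC vectors of each frame, and the weighting scheme would follow the full paper.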

