Computer Science ›› 2016, Vol. 43 ›› Issue (Z11): 45-49. doi: 10.11896/j.issn.1002-137X.2016.11A.010

• Intelligent Computing •

Research on a Speech Recognition System Based on Deep Neural Networks

LI Wei-lin, WEN Jian, MA Wen-kai

  1. School of Technology, Beijing Forestry University, Beijing 100083, China
  • Online: 2018-12-01  Published: 2018-12-01
  • Supported by:
    National Undergraduate Innovation and Entrepreneurship Training Program

Speech Recognition System Based on Deep Neural Network

LI Wei-lin, WEN Jian and MA Wen-kai   

  • Online: 2018-12-01  Published: 2018-12-01

Abstract: Speech recognition is an important topic in pattern recognition for human-computer interaction. This paper builds a speech recognition system based on a deep neural network. The model is pretrained without supervision using noise-robust contrastive divergence and a noise-robust least-squares-error criterion. Mean normalization is then used to optimize the model, which improves the network's fit to the training set and lowers the speech recognition error rate. Multi-state activation functions are applied as a further optimization, which not only reduces the error rate on both clean and noisy test sets but also alleviates overfitting to some extent. Finally, the dimensionality of the model is reduced through singular value decomposition and reconstruction. Experimental results show that the system greatly reduces model complexity without affecting the speech recognition error rate.
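The mean normalization mentioned above is a simple centering step. A minimal sketch is given below, assuming it is applied to the acoustic feature vectors fed to the network; the per-utterance scope and the function name are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mean_normalize(features):
    """Subtract the per-dimension mean from a matrix of acoustic features.

    features: array of shape (num_frames, feature_dim), e.g. MFCC or
    filter-bank frames of one utterance. Centering the inputs around zero
    is the kind of mean normalization the abstract refers to; the exact
    scope (per utterance or per corpus) is an assumption here.
    """
    mu = features.mean(axis=0, keepdims=True)
    return features - mu

# Toy usage: 100 frames of 39-dimensional features with a channel-like offset.
frames = np.random.randn(100, 39) + 5.0
normalized = mean_normalize(frames)
print(normalized.mean(axis=0)[:3])   # approximately zero in every dimension
```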

Key words: Pattern recognition, Deep neural network, Speech recognition, Hidden Markov model, Model reconstruction

Abstract: Speech recognition is an important subject in the field of human-computer interaction and pattern recognition. A speech recognition system based on a deep neural network was constructed in this paper. The model was pretrained without supervision using noise-robust contrastive divergence and a noise-robust least-squares-error criterion. Mean normalization was then applied to optimize the model, improving the network's fit to the training set and reducing the speech recognition error rate. Multi-state activation functions were used for further optimization, which lowered the error rate on both clean and noisy test sets and alleviated overfitting to some extent. Finally, the model was compressed by singular value decomposition and reconstruction. Experimental results show that the system can greatly reduce model complexity without affecting the speech recognition error rate.
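The singular value decomposition and reconstruction step matches the restructuring idea of reference [6]: a large weight matrix is replaced by the product of two thinner factors obtained from a truncated SVD, shrinking the parameter count while approximately preserving the layer's mapping. The NumPy sketch below illustrates that factorization generically; the rank k and the matrix sizes are arbitrary, and this is not the authors' code.

```python
import numpy as np

def svd_restructure(W, k):
    """Approximate an m x n weight matrix W by two low-rank factors.

    Returns (A, B) with A of shape (m, k) and B of shape (k, n) such that
    W is approximately A @ B. Replacing one layer with weight W by two
    smaller layers with weights B and A cuts the parameter count from
    m*n down to k*(m + n).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * s[:k]     # fold the singular values into the first factor
    B = Vt[:k, :]
    return A, B

# Toy usage: a 2048 x 2048 hidden layer restructured with rank 256.
W = np.random.randn(2048, 2048)
A, B = svd_restructure(W, k=256)
print(W.size, A.size + B.size)   # 4194304 parameters vs 1048576
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(round(err, 3))             # relative error of the low-rank truncation
```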

Key words: Pattern recognition, Deep neural network, Speech recognition, Hidden Markov model (HMM), Model reconstruction

[1] Vincent P, Larochelle H, Lajoie I, et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion [J]. Journal of Machine Learning Research, 2010, 11: 3371-3408
[2] Martens J. Deep learning via Hessian-free optimization [C]∥Proceedings of the 27th International Conference on Machine Learning (ICML-10). Haifa, Israel, 2010: 735-742
[3] Dean J, Corrado G, Monga R, et al. Large scale distributed deep networks [C]∥Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems (NIPS 2012). Lake Tahoe, Nevada, USA, 2012: 1232-1240
[4] Deng W, Qian Y, Fan Y, et al. Stochastic data sweeping for fast DNN training [C]∥IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014). Florence, Italy, 2014: 240-244
[5] You Zhan, Wang Xiao-rui, Xu Bo. Exploring one pass learning for deep neural network training with averaged stochastic gradient descent [C]∥IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014). 2014: 6854-6858
[6] Xue J, Li J, Gong Y. Restructuring of deep neural network acoustic models with singular value decomposition [C]∥INTERSPEECH 2013: 14th Annual Conference of the International Speech Communication Association. Lyon, France, 2013: 2365-2369
[7] He Y, Qian F Y, et al. Reshaping deep neural network for fast decoding by node-pruning [C]∥IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014). Florence, Italy, 2014: 245-249
[8] Graves A, Mohamed A R, Hinton G E. Speech recognition with deep recurrent neural networks [C]∥IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013). 2013: 6645-6649
[9] Sarikaya R, Hinton G E, Deoras A. Application of Deep Belief Networks for Natural Language Understanding [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014, 22(4): 778-784
[10] Mohamed A, Dahl G E, Hinton G E. Acoustic modeling using deep belief networks [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 14-22
