Computer Science ›› 2009, Vol. 36 ›› Issue (7): 211-214. doi: 10.11896/j.issn.1002-137X.2009.07.051

• Artificial Intelligence •

Silence Detection Using Wavelet Variable-Resolution Spectral Features and a Short-Time Adaptive Audio Mixing Algorithm

薛卫,都思丹,叶迎宪   

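  1. (Department of Computer Science, Nanjing Agricultural University, Nanjing 210002); (Department of Electronic Science and Engineering, Nanjing University, Nanjing 210093)
  • Publication date: 2018-11-16 Release date: 2018-11-16
  • Funding:
    This work was supported by the National Natural Science Foundation of China (60472026).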

Voice Activity Detection Using Wavelets Multiresolution Spectrum and Short-time Adaptive Audio Mixing Algorithm

XUE Wei,DU Si-dan,YE Ying-xian   

  • Online:2018-11-16 Published:2018-11-16

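Abstract: The silence detection (voice activity detection) algorithm combines two perceptual speech features with Mel-frequency cepstral coefficients (MFCCs) computed from a variable-resolution spectrum to form the audio feature. It makes a preliminary silence decision using a multi-threshold zero-crossing rate and then classifies the combined speech features with a two-class support vector machine. The real-time mixing algorithm uses the short-time energy of each audio channel as its mixing weight. Tests show that, at different signal-to-noise ratios, the silence detection algorithm identifies speech more accurately than the G.729b silence detection algorithm. The real-time mixing algorithm outperforms traditional algorithms in listening tests, and its mixing delay is low enough to meet the requirements of real-time network transmission. When both algorithms are applied in a video conferencing system, the computational load on the video conferencing server is lower than in a video system that uses the G.729b silence detection algorithm.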

Keywords: silence detection, wavelet, support vector machine, short-time adaptive weight
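The multi-threshold zero-crossing pre-decision named in the abstract can be illustrated with a minimal sketch. The abstract gives neither the thresholds nor the decision rule, so everything here is an assumption: the sketch counts sign flips while ignoring samples whose magnitude is below each amplitude threshold, combines the per-threshold rates with hypothetical weights, and flags a frame as probable silence when the combined rate is small. The names `threshold_crossing_rate` and `is_probably_silence`, and all numeric values, are illustrative only.

```python
def threshold_crossing_rate(frame, threshold):
    """Rate of sign flips, skipping samples with |x| <= threshold.

    With threshold 0 this reduces to an ordinary zero-crossing rate
    (exact zeros are skipped).
    """
    crossings = 0
    last_sign = 0  # 0 means "no sample has exceeded the threshold yet"
    for x in frame:
        if x > threshold:
            sign = 1
        elif x < -threshold:
            sign = -1
        else:
            continue  # sample lies inside the dead zone, ignore it
        if last_sign != 0 and sign != last_sign:
            crossings += 1
        last_sign = sign
    return crossings / max(len(frame) - 1, 1)


def is_probably_silence(frame,
                        thresholds=(0.02, 0.05, 0.1),  # assumed amplitude levels
                        weights=(0.2, 0.3, 0.5),       # assumed weights
                        min_rate=0.02):                # assumed decision level
    """Pre-decision: True if the weighted crossing rate is below min_rate."""
    score = sum(w * threshold_crossing_rate(frame, t)
                for w, t in zip(weights, thresholds))
    return score < min_rate


if __name__ == "__main__":
    quiet = [0.01, -0.01] * 80   # low-level noise, below every threshold
    loud = [0.5, -0.5] * 80      # strong alternating signal
    print(is_probably_silence(quiet), is_probably_silence(loud))
```

In a pipeline like the one the abstract describes, frames flagged here would be dropped early, and only the remaining frames would need feature extraction and SVM classification.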

Abstract: The proposed voice activity detection (VAD) algorithm combines MFCCs of a multiresolution spectrum with two classical audio parameters as the audio feature, pre-screens silence using a multi-threshold zero-crossing rate, and separates noise from voice with a support vector machine (SVM). The new speech mixing algorithm, used in the Multipoint Control Unit (MCU) of a conferencing system, takes the short-time power of each audio stream as its mixing weight vector and is designed for parallel processing. Experiments show that the proposed VAD algorithm achieves better overall performance at all SNRs than the G.729b VAD and other VAD algorithms. The output of the new mixing algorithm has excellent perceived audio quality, its computational delay is small enough to meet the needs of real-time transmission, and the MCU computation is lower than that of a system based on the G.729b VAD.

Key words: Voice activity detection, Wavelet, SVM, Short-time adaptive weighting
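The idea of using each stream's short-time power as its mixing weight can be sketched as follows. The abstract does not state how the weights are normalized, so this sketch assumes the simplest choice: each weight is the stream's frame energy divided by the total energy of all streams in that frame. The function names `short_time_energy` and `mix_frames` are hypothetical.

```python
def short_time_energy(frame):
    """Sum of squared samples over one frame."""
    return sum(x * x for x in frame)


def mix_frames(frames):
    """Mix equal-length frames, one per audio stream.

    Assumed rule: weight_i = energy_i / sum of all energies, so louder
    streams dominate the mix and silent streams contribute nothing.
    """
    if not frames:
        return []
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all frames must have the same length")

    energies = [short_time_energy(f) for f in frames]
    total = sum(energies)
    if total == 0:
        return [0.0] * length  # every stream is silent in this frame

    weights = [e / total for e in energies]
    return [sum(w * f[n] for w, f in zip(weights, frames))
            for n in range(length)]


if __name__ == "__main__":
    talker = [1.0, -1.0]
    idle = [0.0, 0.0]
    print(mix_frames([talker, idle]))
```

Because the assumed weights are non-negative and sum to 1, every output sample is a weighted average of the input samples, so the mix cannot exceed the largest input peak and needs no separate clipping step. Each frame is processed independently, which fits the parallel processing the abstract mentions.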
