Computer Science ›› 2021, Vol. 48 ›› Issue (6A): 33-37. doi: 10.11896/jsjkx.200700135

• Image Processing & Multimedia Technology •

Speech Endpoint Detection Based on Bayesian Decision of Logarithmic Power Spectrum Ratio in High and Low Frequency Band

ZHANG Zi-cheng, TAN Zhi-wei, ZHANG Chen-rui, WANG Xuan, LIU Xiao-xuan, YU Yi-biao

  1. School of Electronic and Information Engineering, Soochow University, Suzhou, Jiangsu 215006, China
  • Online: 2021-06-10 Published: 2021-06-17
  • Corresponding author: YU Yi-biao (yuyb@suda.edu.cn)
  • Author e-mail: 494725223@qq.com
  • About author: ZHANG Zi-cheng, born in 1999, undergraduate. His main research interests include digital signal processing and speech processing.
    YU Yi-biao, born in 1962, Ph.D., professor. His main research interests include speech and image processing, pattern recognition, and multimedia systems.

Abstract: Based on an analysis of how the power spectra of speech and noise differ between the high and low frequency bands, this paper proposes a speech endpoint detection method for low-SNR conditions that applies a Bayesian decision to the logarithmic power spectrum ratio of the two bands. First, the logarithmic power spectrum ratio between the high and low frequency bands is computed from samples of speech and of background noise, the statistical distribution of the ratio for each of the two classes is obtained by maximum likelihood estimation, and the optimal decision threshold is derived from the Bayesian decision criterion. At detection time, the ratio is computed frame by frame and compared with the threshold to classify each frame as speech or background noise, which yields the speech endpoints. Experimental results show that, compared with the traditional double-threshold method and the spectral entropy method, the proposed method detects speech endpoints more accurately at low SNR and significantly improves both the accuracy and the speed of endpoint detection.

Key words: Bayesian decision, Logarithmic power spectrum ratio, Low SNR, Speech endpoint detection
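
The pipeline summarized in the abstract (band-ratio feature, maximum likelihood fit per class, Bayesian threshold, frame-wise decision) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian model for the ratio, the 2 kHz band split, the low-over-high orientation of the ratio, the equal priors, the frame sizes, the synthetic training signals, and all function names are introduced here and are not specified in the abstract.

```python
import numpy as np


def log_band_ratio(signal, fs, frame_len=256, hop=128, split_hz=2000.0):
    """Per-frame log(low-band energy / high-band energy).

    split_hz and the low-over-high orientation are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    low = freqs < split_hz
    eps = 1e-12  # guards against log(0) on silent frames
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    ratios = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        ratios[i] = np.log(power[low].sum() + eps) - np.log(power[~low].sum() + eps)
    return ratios


def fit_gaussian(samples):
    """Maximum likelihood mean and standard deviation (assumed Gaussian model)."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(), samples.std()  # std with ddof=0 is the ML estimate


def bayes_threshold(noise_ratios, speech_ratios, p_speech=0.5):
    """Point where the two prior-weighted Gaussian densities are equal."""
    m0, s0 = fit_gaussian(noise_ratios)
    m1, s1 = fit_gaussian(speech_ratios)
    # Equating the log posteriors gives a*x^2 + b*x + c = 0.
    a = 1.0 / (2 * s0**2) - 1.0 / (2 * s1**2)
    b = m1 / s1**2 - m0 / s0**2
    c = (m0**2 / (2 * s0**2) - m1**2 / (2 * s1**2)
         + np.log(p_speech * s0 / ((1.0 - p_speech) * s1)))
    if np.isclose(a, 0.0):
        return -c / b  # equal variances: a single crossing point
    roots = np.roots([a, b, c])
    real = roots[np.isreal(roots)].real
    if real.size == 0:
        raise ValueError("class densities never cross; one class always wins")
    # Unequal variances give two crossings; keep the one nearest the class means.
    return real[np.argmin(np.abs(real - (m0 + m1) / 2))]


def detect_speech(signal, fs, threshold, speech_above=True):
    """Boolean mask per frame: True where the frame is classified as speech."""
    ratios = log_band_ratio(signal, fs)
    return ratios > threshold if speech_above else ratios < threshold


# Synthetic stand-ins for training data: white noise as "background noise",
# a 300 Hz tone in weak noise as a crude low-band-dominant "speech" proxy.
fs = 8000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
noise_ratios = log_band_ratio(rng.normal(0.0, 1.0, fs), fs)
speech_ratios = log_band_ratio(np.sin(2 * np.pi * 300 * t) + rng.normal(0.0, 0.1, fs), fs)
threshold = bayes_threshold(noise_ratios, speech_ratios)

# One second of noise followed by one second of the "speech" proxy.
test_signal = np.concatenate([
    rng.normal(0.0, 1.0, fs),
    np.sin(2 * np.pi * 300 * t) + rng.normal(0.0, 0.1, fs),
])
mask = detect_speech(test_signal, fs, threshold)
```

The first frame where `mask` turns `True` and the last frame before it returns to `False` give the detected endpoints. In practice the two ratio distributions would be estimated from labeled speech and noise recordings rather than synthetic signals, and the priors could be set to reflect the expected share of speech frames.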

CLC Number:

  • TN912.34