Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230600115-5. doi: 10.11896/jsjkx.230600115
BAI Jie (白洁)1, TIAN Ruili (田瑞丽)2, REN Yifu (任一夫)1, YUAN Jianxia (员建厦)1
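Abstract: Low speech-detection accuracy in noisy environments is an open challenge for shortwave voice communication. Existing methods see limited use, chiefly because it is hard to extract accurate and efficient speech features reliably under noise. To address this problem, a low-rank histogram of oriented gradient (LHOG) voice detection method for shortwave communication is proposed. First, the target audio source data are preprocessed to obtain a visual representation of the speech information in a noisy environment. Next, a low-rank structure is embedded in the HOG feature extractor to reduce redundant information in the features and suppress noise interference, yielding accurate and efficient features. Finally, a commonly used SVM classifier suffices to separate voice from noise accurately and quickly in a noisy environment. Test results show that the method reaches an accuracy of 95.12%, with a false alarm rate of only 0.96% and a miss rate of 13.14%. Comparative experiments against existing mainstream methods show that the method achieves high voice-detection accuracy with low resource usage and can effectively improve the efficiency of shortwave communication reconnaissance and control.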
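The three stages named in the abstract (visual representation, low-rank HOG features, SVM classification) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say how the low-rank structure is embedded in the HOG extractor, so a truncated SVD of the cell-histogram matrix is assumed here. The log-magnitude spectrogram as the visual representation, all function names, and all parameter values (`frame_len`, `hop`, `cell`, `n_bins`, `rank`) are likewise hypothetical. The metric definitions in `detection_rates` are the conventional ones and are also an assumption, since the abstract does not define them.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT: a 2-D 'image' (frequency x time) of the audio."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1)).T  # (frame_len // 2 + 1, n_frames)
    return np.log1p(mag)

def hog_cells(image, cell=8, n_bins=9):
    """Per-cell histograms of unsigned gradient orientation, weighted by magnitude."""
    gy, gx = np.gradient(image)                  # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)      # unsigned orientation
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows * cols, n_bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            hist[r * cols + c] = np.bincount(
                bins[sl].ravel(), weights=mag[sl].ravel(), minlength=n_bins)
    return hist                                  # one row per cell

def low_rank(matrix, rank=2):
    """Best rank-`rank` approximation (Frobenius norm) via truncated SVD.
    ASSUMPTION: stands in for the paper's unspecified low-rank structure."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

def lhog_like_features(signal, rank=2):
    """Spectrogram -> HOG cell histograms -> low-rank approximation -> unit vector."""
    h = low_rank(hog_cells(spectrogram(signal)), rank)
    norm = np.linalg.norm(h)
    return (h / norm).ravel() if norm > 0 else h.ravel()

def detection_rates(y_true, y_pred):
    """Accuracy, false-alarm rate FP/(FP+TN), miss rate FN/(FN+TP).
    Labels: 1 = voice, 0 = noise. Assumes both classes occur in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {"accuracy": (tp + tn) / len(y_true),
            "false_alarm": fp / (fp + tn),
            "miss": fn / (fn + tp)}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(8000) / 8000.0
    noise_only = rng.standard_normal(8000)
    # Synthetic stand-in for voiced audio: two tones buried in noise.
    voiced = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 900 * t) + noise_only
    print(lhog_like_features(voiced).shape, lhog_like_features(noise_only).shape)
```

The resulting feature vectors could then be passed to any off-the-shelf SVM implementation. Because accuracy is computed over both classes while the false alarm and miss rates are each computed within one class, an overall accuracy above 95% alongside a 13.14% miss rate is arithmetically consistent when noise segments far outnumber voice segments in the test set.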