Computer Science ›› 2024, Vol. 51 ›› Issue (9): 338-345.doi: 10.11896/jsjkx.230700200
• Computer Network • Previous Articles Next Articles
YAO Yao, YANG Jibin, ZHANG Xiongwei, LI Yihao, SONG Gongkunkun
CLC Number:
[1]WANG Y P,WEI G H,PAN X D,et al.Prediction model and experiment of out-of-band dual-band interference of communication station[J].Acta Electronica Sinica,2019,47(4):826-831. [2]LI S,CAO F.Research on end-to-end framework model analysis and trend of intelligent speech technology[J].Computer Science,2022,49(S1):331-336. [3]PASCUAL S,BONAFONTE A,SERRA J.SEGAN:Speech Enhancement Generative Adversarial Network[C]//Conference of the International Speech Communication Association.2017:3642-3646. [4]PANDEY A,WANG D.TCNN:Temporal Convolutional Neural Network for Real-time Speech Enhancement in the Time Domain[C]//2019 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2019).Brighton,UK,2019:6875-6879. [5]PANDEY A,WANG D L.Densely connected neural networkwith dilated convolutions for real-time speech enhancement in the time domain[C]//2020 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2020).IEEE,2020:6629-6633. [6]FAN J Y,YANG J B,ZHANG X W,et al.Single-channel speech enhancement based on multi-head attention mechanism in U-net network[J].Acta Acoustica Sinica,2022,47(6):703-716. [7]LI L,ZHU Y,ZHU Z.Automatic Modulation ClassificationUsing ResNeXt-GRU With Deep Feature Fusion[J].IEEE Tran-sactions on Instrumentation and Measurement,2023,72:1-10. [8]CHOLLET F.Xception:Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:1251-1258. [9]BENGIO Y,SIMARD P,FRASCONI P,Learning long-term dependencies with gradient descent is difficult[J].IEEE Transactions on Instrumentation and Measurement,1994,5(2):157-166. [10]HOCHREITER S,SCHMIDHUBER J.Long short-term memory[J].Neural Computation,1997,9(8):1735-1780. [11]BANG J Y,SUN M,ZHANG X W,et al.Lightweight Model for Bone-Conducted Speech Enhancement Based on Convolution Network and Residual Long Short-Time Memory Network[J].Journal of Data Acquisition & Processing,2021,36(5):921-931. [12]ZHANG Q,SONG Q,NI Z,et al.Time-frequency attention for monaural speech enhancement[C]//2022 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2022).IEEE,2022:7852-7856. [13]WOO S,PARK J,LEE J Y,et al.Cbam:Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:3-19. [14]TOLOOSHAMS B,GIRI R,SONG A H,et al.Channel-atten-tion dense u-net for multichannel speech enhancement[C]//2020 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2020).Barcelona,Spain.IEEE,2020:836-840. [15]ZHU X,LI J,LIU Y,et al.A Survey on Model Compression for Large Language Models[J].arXiv:2308.07633,2023. [16]ANDREW G H,MENGLONG Z,BO C,et al.Mobilenets:Efficient convolutional neural networks for mobile vision applications[J].arXiv:1704.04861,2017. [17]ZHANG X,ZHOU X,LIN M,et al.Shufflenet:An extremelyefficient convolutional neural network for mobile devices[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2018:6848-6856. [18]ZENG Y,LI Y,ZHOU Z,et al.Domestic activities classification from audio recordings using multi-scale dilated depthwise separable convolutional network[C]//2021 IEEE 23rd International Workshop on Multimedia Signal Processing(MMSP).IEEE,2021:1-5. [19]TAN K,WANG D L.A convolutional recurrent neural network for real-time speech enhancement[C]//Interspeech 2018.2018:3229-3233. [20]LE X,CHEN H,CHEN K,et al.DPCRN:Dual-path convolution recurrent network for single channel speech enhancement[C]//Interspeech 2021,22nd Annual Conference of the International Speech Communication Association.Brno,Czechia,2021:2811-2815. [21]DEFOSSEZ A,SYNNAEVE G,ADI Y.Real time speech en-hancement in the waveform domain[C]//Interspeech 2020,21st Annual Conference of the International Speech Communication Association,Virtual Event.2020:3291-3295. [22]HU J,SHEN L,SUN G.Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2018:7132-7141. [23]FU J,LIU J,TIAN H,et al.Dual attention network for scenesegmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:3146-3154. [24]PARK H J,KANG B H,SHIN W,et al.Manner:Multi-view attention network for noise erasure[C]//2022 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2022).Singapore,IEEE,2022:7842-7846. [25]LI Y,WANG W,CHEN H,et al.Few-shot speaker identifica-tion using depthwise separable convolutional network with channel attention[J].arXiv:2204.11180,2022. [26]VALENTINI-BOTINHAO C,WANG X,TAKAKI S,et al.Investigating RNN-based speech enhancement methods for noise-robust text-to-speech[C]//SSW.2016:146-152. [27]WANG D,ZHANG X.Thchs-30:A free chinese speech corpus[J].arXiv:1512.01882,2015. [28]RIX A W,BEERENDS J G,HOLLIER M P,et al.Perceptualevaluation of speech quality(PESQ)-a new method for speecn quality assessment of telephone networks and codecs[C]//Proceedings of the 26th International Conference on Acoustics,Speech,and Signal Processing.Utah:IEEE,2001:749-752. [29]TAAL C H,HENDRIKS R C,HEUSDENS R,et al.An algorithm for intelligibility prediction of time-frequency weighted noisy speech[J].IEEE Transactions on Audio,Speech,and Language Processing,2011,19(7):2125-2136. [30]HU Y,LOIZOU P C.Evaluation of objective quality measuresfor speech enhancement[J].IEEE Transactions on Audio,Speech,and Language Processing,2007,16(1):229-238. [31]MACARTNEY C,WEYDE T.Improved speech enhancementwith the Wave-U-Net[J].arXiv:1811.11307,2018. [32]FU S W,LIAO C F,TSAO Y,et al.Metricgan:Generative adversarial networks based black-box metric scores optimization for speech enhancement[C]//International Conference on Machine Learning.PMLR,2019:2031-2041. [33]YIN D,LUO C,XIONG Z,et al.Phasen:A phase-and-harmo-nics-aware speech enhancement network[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2020,34(5):9458-9465. [34]ZHANG Q Q,AARON M N,WANG M J,et al.Deepmmse:A deep learning approach to mmse-based noise power spectral density estimation[J].IEEE/ACM Transactions on Audio,Speech,and Language Processing,IEEE,2020,28(1):1404-1415. [35]WANG K,HE B,ZHU W P.TSTNN:Two-Stage Transformer Based Neural Network for Speech Enhancement in the Time Domain[C]//2021 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2021).Toronto,ON,Canada,2021:7098-7102. [36]KONG Z,PING W,DANTREY A,et al.Speech denoising in the waveform domain with self-attention[C]//2022 IEEE International Conference on Acoustics,Speech and Signal Processing(ICASSP 2022).IEEE,2022:7867-7871. |
[1] | LI Yunchen, ZHANG Rui, WANG Jiabao, LI Yang, WANG Ziqi, CHEN Yao. Re-parameterization Enhanced Dual-modal Realtime Object Detection Model [J]. Computer Science, 2024, 51(9): 162-172. |
[2] | HU Pengfei, WANG Youguo, ZHAI Qiqing, YAN Jun, BAI Quan. Night Vehicle Detection Algorithm Based on YOLOv5s and Bistable Stochastic Resonance [J]. Computer Science, 2024, 51(9): 173-181. |
[3] | LIU Qian, BAI Zhihao, CHENG Chunling, GUI Yaocheng. Image-Text Sentiment Classification Model Based on Multi-scale Cross-modal Feature Fusion [J]. Computer Science, 2024, 51(9): 258-264. |
[4] | LI Zhe, LIU Yiyang, WANG Ke, YANG Jie, LI Yafei, XU Mingliang. Real-time Prediction Model of Carrier Aircraft Landing Trajectory Based on Stagewise Autoencoders and Attention Mechanism [J]. Computer Science, 2024, 51(9): 273-282. |
[5] | LIU Qilong, LI Bicheng, HUANG Zhiyong. CCSD:Topic-oriented Sarcasm Detection [J]. Computer Science, 2024, 51(9): 310-318. |
[6] | LIU Sichun, WANG Xiaoping, PEI Xilong, LUO Hangyu. Scene Segmentation Model Based on Dual Learning [J]. Computer Science, 2024, 51(8): 133-142. |
[7] | ZHANG Rui, WANG Ziqi, LI Yang, WANG Jiabao, CHEN Yao. Task-aware Few-shot SAR Image Classification Method Based on Multi-scale Attention Mechanism [J]. Computer Science, 2024, 51(8): 160-167. |
[8] | WANG Qian, HE Lang, WANG Zhanqing, HUANG Kun. Road Extraction Algorithm for Remote Sensing Images Based on Improved DeepLabv3+ [J]. Computer Science, 2024, 51(8): 168-175. |
[9] | XIAO Xiao, BAI Zhengyao, LI Zekai, LIU Xuheng, DU Jiajin. Parallel Multi-scale with Attention Mechanism for Point Cloud Upsampling [J]. Computer Science, 2024, 51(8): 183-191. |
[10] | PU Bin, LIANG Zhengyou, SUN Yu. Monocular 3D Object Detection Based on Height-Depth Constraint and Edge Fusion [J]. Computer Science, 2024, 51(8): 192-199. |
[11] | ZHANG Junsan, CHENG Ming, SHEN Xiuxuan, LIU Yuxue, WANG Leiquan. Diversified Label Matrix Based Medical Image Report Generation [J]. Computer Science, 2024, 51(8): 200-208. |
[12] | WANG Chao, TANG Chao, WANG Wenjian, ZHANG Jing. Infrared Human Action Recognition Method Based on Multimodal Attention Network [J]. Computer Science, 2024, 51(8): 232-241. |
[13] | ZHANG Lu, DUAN Youxiang, LIU Juan, LU Yuxi. Chinese Geological Entity Relation Extraction Based on RoBERTa and Weighted Graph Convolutional Networks [J]. Computer Science, 2024, 51(8): 297-303. |
[14] | CHEN Shanshan, YAO Subin. Study on Recommendation Algorithms Based on Knowledge Graph and Neighbor PerceptionAttention Mechanism [J]. Computer Science, 2024, 51(8): 313-323. |
[15] | BAI Wenchao, BAI Shuwen, HAN Xixian, ZHAO Yubo. Efficient Query Workload Prediction Algorithm Based on TCN-A [J]. Computer Science, 2024, 51(7): 71-79. |
|