Computer Science, 2024, Vol. 51, Issue (6A): 230500174-7. doi: 10.11896/jsjkx.230500174
Image Processing & Multimedia Technology
ZHANG Xinrui, YANG Jian, WANG Zhan