Computer Science, 2022, Vol. 49, Issue (4): 221-226. doi: 10.11896/jsjkx.210300071
• Computer Graphics & Multimedia •
AN Xin, DAI Zi-biao, LI Yang, SUN Xiao, REN Fu-ji