Computer Science, 2026, Vol. 53, Issue (5): 59-67. doi: 10.11896/jsjkx.250600187
• Intelligent Education Technology •
WANG Chencai1, YANG Siyan2, MIAO Qiguang1,3