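Abstract: This paper proposes a speech-driven visual speech synthesis system based on a double-layer codebook. Building on the idea of vector quantization, the system establishes a coarsely coupled mapping from the speech feature space to the visual speech feature space. To strengthen the correlation between speech and visual speech, the system automatically clusters the sample data twice, once by the similarity of the speech features and once by the similarity of the visual speech features, and constructs a double-layer mapping codebook that reflects similarity both among speech samples and among visual speech samples. In the data preprocessing stage, a joint feature model is proposed that captures both the geometric shape of visual speech and the visibility of the teeth, and a genetic algorithm is applied on top of the LPCC and MFCC speech features to extract a speech feature model correlated with visual speech. A comparison between the image data in the synthesized video and the image data in the original video shows that the synthesized results approximate the original data to a certain extent and that the method performs well.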
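The two-pass clustering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `build_codebook`, `synthesize`), the use of plain k-means for both passes, the cluster counts, and the rule for choosing a second-layer entry (nearest mean speech feature within the selected speech cluster) are all assumptions made for illustration. The inputs are assumed to be frame-aligned arrays of speech features and visual speech features; the genetic-algorithm feature selection is not shown.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    k = min(k, len(x))
    centroids = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def build_codebook(audio, visual, k_audio=4, k_visual=2):
    """First layer: cluster by speech features. Second layer: within each
    speech cluster, cluster the paired frames by visual speech features."""
    audio_centroids, a_labels = kmeans(audio, k_audio)
    entries = []
    for j in range(len(audio_centroids)):
        mask = a_labels == j
        sub = []
        if mask.any():
            v_centroids, v_labels = kmeans(visual[mask], k_visual)
            for m in range(len(v_centroids)):
                sel = v_labels == m
                if sel.any():
                    # store (mean speech feature, visual codeword) per sub-cluster
                    sub.append((audio[mask][sel].mean(axis=0), v_centroids[m]))
        entries.append(sub)
    return audio_centroids, entries


def synthesize(codebook, audio_frame):
    """Map one speech feature vector to a visual codeword (assumed lookup rule)."""
    audio_centroids, entries = codebook
    order = np.linalg.norm(audio_centroids - audio_frame, axis=1).argsort()
    for j in order:  # nearest non-empty first-layer cluster
        if entries[j]:
            best = min(entries[j], key=lambda e: np.linalg.norm(e[0] - audio_frame))
            return best[1]
    raise ValueError("codebook is empty")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.normal(size=(200, 12))   # placeholder for speech features
    visual = rng.normal(size=(200, 5))   # placeholder for visual speech features
    codebook = build_codebook(audio, visual)
    print(synthesize(codebook, audio[0]).shape)
```

Storing a mean speech feature alongside each visual codeword is one simple way to let the second layer discriminate between visually different frames that fall into the same speech cluster; the paper may use a different selection rule.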