Computer Science (计算机科学), 2023, Vol. 50, Issue 11: 177-184. doi: 10.11896/jsjkx.221000024
WANG Xueguang (王学光)1, ZHU Junwen (诸珺文)1, ZHANG Aixin (张爱新)2
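Abstract: The emergence of AI voice-cloning technology will deal a fatal blow to the legal order of modern society. In recent years, researchers have studied only the case in which the AI-synthesized speech and the sample speech have the same content; there has been very little work on identifying questioned recordings whose content differs from the sample, so such material cannot currently be identified. To address this, a three-dimensional model based on improved MFCC features is proposed for identifying the source of AI-cloned speech. First, the characteristics of AI-cloned speech that earlier researchers found by manual analysis are verified, and two recognizable features are summarized: "abnormally active formant F5" and "abnormal abrupt changes in the energy, formant, and pitch curves". Second, based on these characteristics, the MFCC coefficients are corrected with a second-order difference, and an "inverse-difference logical deduction method" is used to quantify and sample the abrupt changes in the energy, formant, and pitch curves, which are defined as a feature-vector triple for speech identification. Third, with this triple as input, the D-S evidence combination rule is used to fuse the three sets of comparison results between the questioned recording and the sample. Finally, these steps yield a three-dimensional evaluation model for questioned recordings based on improved MFCC feature parameters. Experiments on randomly sampled speakers show that, for AI-cloned speech synthesized from the same source speaker, the method's mean identification probability is 67.324% with a standard deviation of 7.32%, which indicates good identification performance.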
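The two computational steps named in the abstract, a second-order difference over per-frame coefficients and D-S evidence fusion of three comparison results, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the central second difference is a common textbook definition and may differ from the correction the paper applies to MFCC coefficients, the labels `same` and `different` are a hypothetical frame of discernment, and all mass values are invented for the example.

```python
from functools import reduce
from itertools import product

# Hypothetical frame of discernment: is the questioned recording from the
# same source speaker as the sample, or a different one?
SAME = frozenset({"same"})
DIFF = frozenset({"different"})
EITHER = SAME | DIFF  # mass assigned here means "cannot tell"


def second_order_difference(frames):
    """Central second difference of a per-frame coefficient sequence.

    Assumption: x[t+1] - 2*x[t] + x[t-1]; the paper's exact formula
    is not given in the abstract.
    """
    return [frames[t + 1] - 2 * frames[t] + frames[t - 1]
            for t in range(1, len(frames) - 1)]


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass on contradictory hypotheses is discarded and
            # accounted for by renormalization below.
            conflict += wa * wb
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Invented masses for the three evidence sources named in the abstract.
energy = {SAME: 0.6, DIFF: 0.1, EITHER: 0.3}
formant = {SAME: 0.5, DIFF: 0.2, EITHER: 0.3}
pitch = {SAME: 0.7, DIFF: 0.1, EITHER: 0.2}

fused = reduce(dempster_combine, [energy, formant, pitch])

if __name__ == "__main__":
    print(second_order_difference([0, 1, 4, 9, 16]))
    for hypothesis, mass in fused.items():
        print(sorted(hypothesis), round(mass, 4))
```

Because Dempster's rule is associative, folding the three sources pairwise with `reduce` gives the same result as combining them in any other order. With the invented inputs above, agreement among the sources raises the fused mass on `same` above any single source's value.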