Computer Science, 2022, Vol. 49, Issue (5): 92-97. doi: 10.11896/jsjkx.210400071

• Computer Graphics & Multimedia

Identification Method of Voiceprint Identity Based on ARIMA Prediction of MFCC Features

WANG Xue-guang1, ZHU Jun-wen1, ZHANG Ai-xin2

  1 College of Criminal Justice, East China University of Political Science and Law, Shanghai 200052, China
  2 School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2021-04-07 Revised: 2021-07-01 Online: 2022-05-15 Published: 2022-05-06
  • Corresponding author: WANG Xue-guang (wangxueguang@ecupl.edu.cn)
  • About author: WANG Xue-guang, born in 1975, Ph.D., professor, is a member of China Computer Federation. His main research interests include computer networks, big data applications and electronic data.
  • Supported by: National Key R&D Program of China (2017YFB0802103).

Abstract: The key to voiceprint recognition is extracting, from the speech signal, speech feature parameters that characterize the speaker. At present, the judgment of whether two speech segments come from the same person relies mostly on the experience of the forensic examiner. Building on the authors' previous work and combining MFCC features, this paper proposes a voiceprint identity identification method based on ARIMA prediction, with the aim of improving the accuracy of comparisons between questioned material and reference samples that were recorded years apart. Starting from the voiceprint identity identification method based on Mel-frequency cepstral coefficients (MFCC), the method applies a seasonal autoregressive integrated moving average (ARIMA) series to make a linear minimum mean square estimate, predicts the voiceprint features, and improves the formant characteristics that cover vowels and sonorant consonants. Experiments show that the ARIMA time series predictions are good, and that text-independent identity identification based on Mel cepstral coefficients improved with ARIMA achieves relatively high accuracy, with a similarity of more than 60%.

Key words: ARIMA prediction, Mel cepstral coefficient, MFCC features, Identity identification
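The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract mentions a seasonal ARIMA series, whereas the code below uses a plain, non-seasonal ARIMA(1,1,0) fitted by ordinary least squares, written with the Python standard library only. The function names (`fit_ar1`, `forecast_arima_110`, `cosine_similarity`), the idea of tracking one MFCC coefficient per recording year, and the use of cosine similarity as the comparison score are all assumptions made for illustration.

```python
from math import sqrt


def fit_ar1(diffs):
    """Least-squares fit of d[t] = c + phi * d[t-1]; returns (c, phi)."""
    x, y = diffs[:-1], diffs[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0.0:
        # Constant differences: no slope can be estimated, use the mean.
        return mean_y, 0.0
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = cov_xy / var_x
    return mean_y - phi * mean_x, phi


def forecast_arima_110(series, steps):
    """Forecast `steps` future values with an ARIMA(1,1,0) model.

    The series is differenced once (the "I" part), an AR(1) model is
    fitted to the differences, and the predicted differences are summed
    back onto the last observed level.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    c, phi = fit_ar1(diffs)
    level, last_diff, forecasts = series[-1], diffs[-1], []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # linear predictor of the next difference
        level += last_diff               # undo the differencing
        forecasts.append(level)
    return forecasts


def cosine_similarity(u, v):
    """Cosine similarity between two equally long feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))
```

Under these assumptions, each MFCC coefficient of a speaker would be forecast separately from its per-year history to the year of the questioned recording, and the resulting predicted vector would then be compared with the MFCC vector of the questioned recording using `cosine_similarity`. At least three observations per coefficient are needed for the fit.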

CLC Number: TP391
[1] WANG Xue-guang, ZHU Jun-wen, ZHANG Ai-xin.
Identification Method of Voiceprint Identity Based on MFCC Features
Computer Science, 2021, 48(12): 343-348. https://doi.org/10.11896/jsjkx.210100038
[2] SHA Yi, YANG Yan, HUANG Ye, ZHU Li-chun, ZHANG Zhi-wei.
ARIMA-based Weighted Clustering Algorithm for Prediction of Nodes' Location in Ad-hoc Network
Computer Science, 2012, 39(3): 47-50.