Computer Science, 2021, Vol. 48, Issue (12): 343-348. doi: 10.11896/jsjkx.210100038

• Information Security •

Identification Method of Voiceprint Identity Based on MFCC Features

WANG Xue-guang1, ZHU Jun-wen1, ZHANG Ai-xin2

  1 Criminal Justice College, East China University of Political Science and Law, Shanghai 200052, China
  2 School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2021-01-06 Revised: 2021-04-23 Online: 2021-12-15 Published: 2021-11-26
  • Corresponding author: WANG Xue-guang (wangxueguang@ecupl.edu.cn)
  • About author: WANG Xue-guang, born in 1975, Ph.D, professor, is a member of China Computer Federation. His main research interests include computer networks, big data application and electronic data.
  • Supported by:
    National Key R&D Program of China (2017YFB0802103).

Abstract: Voiceprint analysis, a product of modern forensic technology, plays an important role in the examination of audio-visual evidence. Traditional voiceprint analysis is carried out manually with sound-processing tools. Because it is strictly text-dependent and the comparison relies on subjective judgment, the probative force of the resulting expert opinion needs to be strengthened. This paper proposes an identity identification method based on Mel frequency cepstrum coefficients (MFCC): the envelope that contains the formants of the original sound and their time-axis information is extracted and quantified as the voiceprint feature for identity comparison. The method addresses shortcomings of traditional MFCC by extracting abrupt changes in the formants and adding the transition characteristics between vowels and sonorant consonants to the voiceprint feature, which improves its discriminability. Experiments show that, when the text of the questioned recording is independent of the text of the reference sample, the accuracy of same-speaker identification reaches 85% with a variance of about 9%, indicating good identity recognition. In different-speaker (non-identity) cases, the method also gives fairly accurate results when combined with manual analysis.

Key words: Identity identification, Mel frequency cepstrum coefficient, MFCC features, Reinforcement of probative force

CLC Number:

  • TP391
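The abstract builds on Mel frequency cepstrum coefficients without spelling out how they are obtained. As a rough illustration only, the following sketch computes baseline MFCCs for a single frame using the commonly used textbook pipeline (Hamming window, power spectrum, triangular mel filterbank, logarithm, DCT-II). It is not the improved method of the paper: the formant-mutation and vowel/sonorant-transition features are not modeled here. All function names, the 20-filter bank, the 13 coefficients, and the 8 kHz sample rate are assumptions chosen for the example, and pre-emphasis is omitted for brevity.

```python
import math

def hz_to_mel(f):
    # A widely used mel-scale formula (assumed here; variants exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame via a direct DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """First n_coeffs DCT-II coefficients (including the 0th) of the log mel energies."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energies so that an empty filter cannot produce log(0).
    log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Usage: one 256-sample frame of a synthetic 440 Hz tone at 8 kHz.
sr = 8000
frame = [math.sin(2 * math.pi * 440 * t / sr) for t in range(256)]
coeffs = mfcc(frame, sr)
print(len(coeffs), [round(c, 2) for c in coeffs[:3]])
```

In practice a recording would be split into overlapping frames and one coefficient vector computed per frame, which yields the time-ordered sequence of spectral-envelope descriptors that a comparison method can then work on. The direct DFT is used only to keep the sketch dependency-free; an FFT routine would normally replace it.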