Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 240400021-4. doi: 10.11896/jsjkx.240400021

• Image Processing & Multimedia Technology •

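MLP-Based Gender Identification of Speakers in Disguised Speech

ZHANG Xiao, GUAN Linyu

  1. The Third Research Institute of the Ministry of Public Security, Shanghai 201204, China
  • Online: 2024-11-16 Published: 2024-11-13
  • Corresponding author: ZHANG Xiao (526993512@qq.com)
  • Supported by:
    National Key Research and Development Program of China (2021YFC3320105); Humanities and Social Sciences Research Project of the Ministry of Education of China (23YJA820015)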

Gender Recognition of Electronic Disguised Voices Based on MLP

ZHANG Xiao, GUAN Linyu   

  1. The Third Research Institute of Public Security, Shanghai 201204, China
  • Online: 2024-11-16 Published: 2024-11-13
  • About author: ZHANG Xiao, born in 1987, holds a master's degree, is an associate professor, and is a member of CCF (No. 37630M). Her main research interests include network information security, electronic data and audio-visual information, and computer forensics.
  • Supported by:
    National Key Research and Development Program of China (2021YFC3320105) and Program for the Humanities and Social Science of Ministry of Education of China (23YJA820015).

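Abstract: This paper proposes a neural-network-based model for recognizing speakers of disguised speech, which identifies the gender of a disguised-voice speaker from formant parameters such as center frequency, bandwidth, and sound intensity. The model is built on a multi-layer perceptron (MLP) and obtains the recognition result through stacked, fully connected nonlinear computations; in the training stage, L-BFGS is used to solve for the optimized parameters. In the experiments, SoundTouch is used to disguise natural male and female speech, and the effects of network structure and activation function on the model are examined, along with the model's adaptability to different electronic disguise methods. The experimental results show that the MLP-based model can efficiently distinguish the gender of the speaker behind speech disguised by different electronic disguise methods.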
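As a rough illustration of the stacked, fully connected nonlinear computation the abstract describes, the sketch below runs a forward pass through a small MLP in plain Python. The 3-4-1 layer sizes, the tanh and sigmoid activations, the feature values, and all weights are hypothetical placeholders chosen for illustration; they are not taken from the paper, and the L-BFGS training step is not shown.

```python
import math


def dense(x, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def mlp_forward(x, layers):
    """Run x through stacked dense layers.

    `layers` is a list of (weights, biases) pairs. Hidden layers use tanh;
    the last layer has one unit followed by a logistic sigmoid, so the
    result lies in (0, 1) and can be read as a class probability.
    """
    h = x
    for weights, biases in layers[:-1]:
        h = [math.tanh(v) for v in dense(h, weights, biases)]
    weights, biases = layers[-1]
    (z,) = dense(h, weights, biases)  # single output unit
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, already-normalized features for one utterance:
# [formant center frequency, formant bandwidth, sound intensity]
features = [0.2, -0.5, 0.1]

# Hypothetical weights for a 3-4-1 network (untrained, not from the paper).
layers = [
    ([[0.5, -0.3, 0.1], [0.2, 0.4, -0.6], [-0.7, 0.1, 0.3], [0.0, 0.2, 0.5]],
     [0.0, 0.1, -0.1, 0.0]),
    ([[0.6, -0.4, 0.3, 0.2]], [0.05]),
]

p = mlp_forward(features, layers)
label = "class 1" if p >= 0.5 else "class 0"
print(f"p = {p:.3f} -> {label}")
```

In a trained model the weights would come from minimizing a loss over labeled male and female recordings; a quasi-Newton optimizer such as L-BFGS is a common choice for small networks like this because the whole training set fits in one batch.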

Keywords: Multi-layer perceptron, Electronically disguised voice, Gender identification, Formant, Speaker

Abstract: This paper proposes a neural-network-based recognition model for disguised voices that identifies the speaker's gender from formant parameters such as center frequency, bandwidth, and sound intensity. The model uses a multi-layer perceptron (MLP) as its framework, obtains the gender recognition result through stacked, fully connected nonlinear computations, and uses L-BFGS to optimize the parameters during training. SoundTouch is used to disguise the original voices of male and female speakers; linear predictive coding (LPC) then extracts the formant parameters, including center frequency, bandwidth, and sound intensity, and outliers are eliminated. Experiments explore the influence of network structure and activation function on the model, as well as the model's adaptability to different electronic disguise methods. The experimental results show that the MLP-based model can effectively distinguish the gender of the speaker behind voices disguised by different methods, laying a foundation for speaker recognition on electronically disguised voices.
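The abstract does not say how the formant center frequency and bandwidth are read off the LPC model. One common textbook approach, assumed here, treats each complex pole of the LPC filter as a resonance: the pole angle gives the center frequency and the pole radius gives the bandwidth. The sketch below shows only that mapping; the function names and the 16 kHz sample rate are illustrative assumptions, and the step that finds the poles from LPC coefficients is omitted.

```python
import cmath
import math


def pole_to_formant(pole, sample_rate):
    """Map one LPC pole (complex, 0 < |pole| < 1) to (center_hz, bandwidth_hz).

    center    = fs / (2*pi) * angle(pole)
    bandwidth = -fs / pi * ln(|pole|)
    """
    center_hz = sample_rate / (2.0 * math.pi) * cmath.phase(pole)
    bandwidth_hz = -sample_rate / math.pi * math.log(abs(pole))
    return center_hz, bandwidth_hz


def formant_to_pole(center_hz, bandwidth_hz, sample_rate):
    """Inverse mapping, used here only to construct an example pole."""
    radius = math.exp(-math.pi * bandwidth_hz / sample_rate)
    angle = 2.0 * math.pi * center_hz / sample_rate
    return cmath.rect(radius, angle)


sample_rate = 16000  # Hz, assumed for illustration

# Build a pole for a resonance at 500 Hz with an 80 Hz bandwidth,
# then recover both values from the pole alone.
pole = formant_to_pole(500.0, 80.0, sample_rate)
center, bandwidth = pole_to_formant(pole, sample_rate)
print(f"center = {center:.1f} Hz, bandwidth = {bandwidth:.1f} Hz")
```

Poles closer to the unit circle give narrower bandwidths, which is why very wide candidates are often discarded as outliers rather than treated as formants.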

Key words: Multi-layer perceptron(MLP), Electronic disguised voice, Gender recognition, Formant, Speaker

CLC Number: 

  • TP391