Computer Science (计算机科学), 2024, Vol. 51, Issue (11A): 240400021-4. doi: 10.11896/jsjkx.240400021
ZHANG Xiao (张晓), GUAN Linyu (管林玉)
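Abstract: This paper proposes a neural-network-based speaker recognition model for disguised speech, which identifies the gender of the speaker of a disguised voice from formant parameters such as center frequency, bandwidth, and intensity. The model is built on a multi-layer perceptron (MLP): stacked fully connected layers with nonlinear activations produce the recognition result, and L-BFGS is used during training to solve for the optimal parameters. In the experiments, SoundTouch is used to disguise natural male and female speech, and the effects of network structure and activation function on the model are examined, along with the model's adaptability to different electronic disguise methods. The results show that the MLP-based model efficiently distinguishes the gender of speakers whose voices have been disguised with different electronic disguise methods.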
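The pipeline described in the abstract (formant features fed to an MLP trained with L-BFGS) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn's `MLPClassifier` as the MLP, and the feature values, cluster means, hidden-layer size, and `tanh` activation are hypothetical placeholders chosen only so the example runs. The data are synthetic and do not reproduce any result from the paper.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)


def synthetic_formants(n, f1_mean, label):
    """Draw n made-up feature vectors for one class.

    Columns: formant centre frequency (Hz), bandwidth (Hz), intensity (dB).
    All means and spreads are illustrative assumptions, not measured values.
    """
    X = np.column_stack([
        rng.normal(f1_mean, 30.0, n),  # centre frequency
        rng.normal(90.0, 15.0, n),     # bandwidth
        rng.normal(60.0, 5.0, n),      # intensity
    ])
    y = np.full(n, label)
    return X, y


# Two synthetic classes (0 and 1) standing in for the two speaker genders.
X0, y0 = synthetic_formants(200, 500.0, 0)
X1, y1 = synthetic_formants(200, 650.0, 1)
X = np.vstack([X0, X1])
y = np.concatenate([y0, y1])

# Shuffle, then hold out the last 25% of samples for evaluation.
order = rng.permutation(len(y))
X, y = X[order], y[order]
split = int(0.75 * len(y))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Standardise features, then fit an MLP whose weights are found by L-BFGS.
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(
        hidden_layer_sizes=(16,),  # assumed network structure
        activation="tanh",         # assumed activation function
        solver="lbfgs",            # L-BFGS optimiser, as named in the abstract
        max_iter=500,
        random_state=0,
    ),
)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Changing `hidden_layer_sizes` and `activation` is one way to explore the kind of structure and activation-function comparison the abstract mentions. With real data, the feature matrix would hold formant measurements extracted from natural and SoundTouch-disguised recordings.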