Computer Science ›› 2019, Vol. 46 ›› Issue (8): 183-188.doi: 10.11896/j.issn.1002-137X.2019.08.030

• Information Security •

Study on Restoration of Electronic Disguised Voice Based on DC-CNN

WANG Yong-quan1,2, SHI Zheng-yu1,2,3, ZHANG Xiao4

  1. School of Criminal Justice, East China University of Political Science and Law, Shanghai 201620, China
    2. Department of Information Science and Technology, East China University of Political Science and Law, Shanghai 201620, China
    3. School of Data Science, Fudan University, Shanghai 200433, China
    4. Key Laboratory of Information Network Security of Ministry of Public Security, The Third Research Institute of the Ministry of Public Security, Shanghai 200120, China
  • Received:2018-10-05 Online:2019-08-15 Published:2019-08-15

Abstract: Since no breakthrough has been made in modeling the restoration of electronically disguised voice, this paper proposes a new restoration model based on the Dilated Causal Convolutional Neural Network (DC-CNN). The DC-CNN serves as the framework of the restoration model, and convolution and nonlinear mapping are performed on the historical acoustic samples of the electronically disguised voice together with the restoring factors. The network uses skip-connections for deep transmission and outputs the restored voice after a companding transformation. The model is characterized by nonlinear mapping, expansibility, adaptability, conditionality and concurrency. In the experiments, the original voice was processed with three basic disguise functions: pitch, tempo and rate. The restored voice was then compared with the original voice by voiceprint feature comparison, LPC analysis and human auditory identification of speaker identity. The voiceprint of the restored voice fits that of the original voice closely, and high-quality restoration of the formant waveforms is achieved. The overall fitting rates of the formant parameters of the restored piano music and English speech are 79.03% and 79.06% respectively, which are much higher than the similarity between the electronically disguised voice and the original voice. The results show that the model can effectively reduce the electronic disguise characteristics and is effective for restoring electronically disguised piano music and English speech.
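The full text is not reproduced on this page, so the following is only a minimal PyTorch-style sketch of the kind of gated, dilated causal convolution block with skip-connections and restoring-factor conditioning that the abstract describes; the layer sizes, the conditioning shape and the stacking scheme are illustrative assumptions, not the authors' implementation.

# A minimal sketch (assumed structure, not the paper's released code) of one
# gated, dilated causal convolution block with skip-connections and conditioning.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCausalBlock(nn.Module):
    def __init__(self, channels: int, dilation: int, cond_channels: int):
        super().__init__()
        self.dilation = dilation
        # Dilated convolutions for the filter and gate branches.
        self.conv_filter = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)
        self.conv_gate = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)
        # 1x1 projections of the conditioning signal (e.g. a restoring factor).
        self.cond_filter = nn.Conv1d(cond_channels, channels, kernel_size=1)
        self.cond_gate = nn.Conv1d(cond_channels, channels, kernel_size=1)
        # 1x1 convolutions producing the residual and skip outputs.
        self.res_out = nn.Conv1d(channels, channels, kernel_size=1)
        self.skip_out = nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, x, cond):
        # Pad only on the left so each output depends on current and past samples.
        pad = (self.dilation, 0)
        f = self.conv_filter(F.pad(x, pad)) + self.cond_filter(cond)
        g = self.conv_gate(F.pad(x, pad)) + self.cond_gate(cond)
        # Gated activation unit: tanh filter modulated by a sigmoid gate.
        z = torch.tanh(f) * torch.sigmoid(g)
        return x + self.res_out(z), self.skip_out(z)

# Toy usage: stack blocks with doubling dilations and sum the skip paths.
x = torch.randn(1, 32, 16000)              # (batch, channels, samples), toy input
cond = torch.randn(1, 4, 16000)            # hypothetical restoring-factor conditioning
blocks = [DilatedCausalBlock(32, 2 ** i, 4) for i in range(8)]
skips = 0
for block in blocks:
    x, skip = block(x, cond)
    skips = skips + skip                   # skip-connections carried to the output stage

In such a stack the dilation typically doubles at each layer, so the receptive field grows exponentially with depth while the causal padding keeps each output sample dependent only on current and past inputs.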

Key words: DC-CNN, Electronic disguised voice, Gated activation units, Restoring voice, Restoring factor

CLC Number: 

  • TP391
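The abstract also states that the restored voice is output after a companding transformation. As a rough illustration only, a standard mu-law companding/expanding pair in Python is sketched below; the mu = 255 setting is an assumption, since the paper's exact quantization configuration is not given on this page.

# Minimal sketch of a mu-law companding transformation (assumed mu = 255).
import numpy as np

def mu_law_compress(x: np.ndarray, mu: int = 255) -> np.ndarray:
    """Compress waveform samples in [-1, 1] to the mu-law domain."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y: np.ndarray, mu: int = 255) -> np.ndarray:
    """Invert the companding to recover waveform samples in [-1, 1]."""
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu

# Round trip on a toy signal: expansion recovers the original samples.
x = np.linspace(-1.0, 1.0, 5)
assert np.allclose(mu_law_expand(mu_law_compress(x)), x)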
[1]ZHANG C L,ZHAO X B.Acoustic Study of Electro-acoustically Disguised Voices[C]∥The 7th Phonetics Conference of China and International Forum on the Frontiers of Phonetics.Beijing,2006.(in Chinese) 张翠玲,赵晓波.电声伪装语音的声学研究[C]∥第七届中国语音学学术会议暨语音学前沿问题国际论坛.北京,2006.
[2]ZHANG C L,TAN T J,LIU S.Study on Automatic Speaker Recognition of Disguised Voices [J].Forensic Science and Technology,2007(2):18-21.(in Chinese) 张翠玲,谭铁军,刘昇.伪装语音的自动话者识别研究[J].刑事技术,2007(2):18-21.
[3]GONZALEZ R,KANERVISTO A,HAUTAMÄKI V,et al. Perceptual Evaluation of the Effectiveness of Voice Disguise by Age Modification[J].arXiv:1804.08910,2018.
[4]TAO D Y.Study on Speaker Recognition Under Electronic Disguised Voices[D].Nanjing:Nanjing University of Posts and Telecommunications,2016.(in Chinese) 陶定元.电子伪装语音下的说话人识别方法研究[D].南京:南京邮电大学,2016.
[5]LI Y P,TAO D Y,LIN L.Study on Electronic Disguised Voice Speaker Recognition Based on DTW Model Compensation [J].Computer Technology and Development,2017(1):93-96.(in Chinese) 李燕萍,陶定元,林乐.基于DTW模型补偿的伪装语音说话人识别研究[J].计算机技术与发展,2017(1):93-96.
[6]ZHANG G Q,JIN Y Z,LIU H W,et al.Study on Changing Rules of Electronic Disguised Voice [J].Evidence Science,2010,18(4):503-509.(in Chinese) 张桂清,金怡珠,刘红伟,等.电子伪装语音的变声规律研究[J].证据科学,2010,18(4):503-509.
[7]OORD A,KALCHBRENNER N,VINYALS O,et al.Conditional Image Generation with PixelCNN Decoders[J].arXiv:1606.05328,2016.
[8]OORD A,DIELEMAN S,ZEN H,et al.WaveNet:A Generative Model for Raw Audio[J].arXiv:1609.03499,2016.
[9]CHEN K,ZHANG W,DUBNOV S,et al.The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation[J].arXiv:1811.08380,2018.
[10]YIN W,KANN K,YU M,et al.Comparative Study of CNN and RNN for Natural Language Processing[J].arXiv:1702.01923,2017.
[11]FU W B,SUN T,LIANG J,et al.Review of Principle and Application of Deep Learning[J].Computer Science,2018,45(s1):24-28,53.(in Chinese) 付文博,孙涛,梁藉,等.深度学习原理及应用综述[J].计算机科学,2018,45(s1):24-28,53.
[12]WU H C,GU Y,LING Z H.A Speech Parameter Synthesizer Based on Deep Convolutional Neural Networks[C]∥The 14th National Conference on Man-Machine Speech Communication.Jiangsu,2017.(in Chinese) 伍宏传,顾宇,凌震华.基于深度卷积神经网络的语音参数合成器[C]∥第十四届全国人机语音通讯学术会议.江苏,2017.
[13]YU F,KOLTUN V.Multi-Scale Context Aggregation by Dilated Convolutions [C]∥International Conference on Learning Representations.2016.
[14]WANG Z,JI S.Smoothed Dilated Convolutions for Improved Dense Prediction[C]∥ACM SIGKDD Conference on Knowledge Discovery and Data Mining.London,2018.
[15]TANAKA M.Weighted Sigmoid Gate Unit for an Activation Function of Deep Neural Network[J].arXiv:1810.01829,2018.
[16]WANG Y Q.Practice of Judicial Authentication of Audio-visual Materials[M].Beijing:Law Press,2013.(in Chinese) 王永全.声像资料司法鉴定实务[M].北京:法律出版社,2013.
[17]MCCANE B,SZYMANSKI L.Some Approximation Bounds for Deep Networks[J].arXiv:1803.02956,2018.
[18]LIU G,XU C,CHEN S Y,et al.Image Classification with Stacked Restricted Boltzmann Machines and Hybrid Neural Network [J].Journal of Chinese Computer Systems,2017,38(9):2146-2151.(in Chinese) 刘罡,徐超,陈思义,等.结合深度置信网络与混合神经网络的图像分类方法[J].小型微型计算机系统,2017,38(9):2146-2151.
[19]ZHAO L.Speech Signal Processing[M].Beijing:China Machine Press,2009:72.(in Chinese) 赵力.语音信号处理[M].北京:机械工业出版社,2009:72.