Computer Science ›› 2020, Vol. 47 ›› Issue (6A): 187-195. doi: 10.11896/jsjkx.190900064

• Computer Graphics & Multimedia •

Automatic Voice Detection Algorithm for Schizophrenic Combining EHHT and CI

TIAN Wei-wei1, ZHOU Yue1, YIN Wang1, HE Ling1, DENG Li-hua1 and LI Yuan-yuan2

  1 College of Electrical Engineering, Sichuan University, Chengdu 610065, China
  2 Mental Health Center of West China Hospital, Sichuan University, Chengdu 610041, China
  • Published: 2020-07-07
  • Corresponding author: LI Yuan-yuan (guoJipangxie@126.com)
  • Author contact: 2285675739@qq.com
  • About author: TIAN Wei-wei, born in 1998, postgraduate. Her main research interests include speech signal processing.
    LI Yuan-yuan, born in 1984, Ph.D., attending doctor. Her main research interests include psychiatry and mental health.
  • Supported by:
    This work was supported by the Chengdu Science and Technology Benefiting People Technology Research and Development Project, Sichuan Province, China (2015-HM01-00430-SF), the Young Scientists Fund of the National Natural Science Foundation of China (61503264), the Sichuan University Innovation Spark Bank Project, China (2082604401189) and the Science and Technology Department Project of Sichuan Province, China (2019YFS0236).

Abstract: Based on a study of the clinical characteristics of schizophrenic speech, a pathological voice database was built from 686 vowel samples collected from 14 patients with schizophrenia and 793 vowel samples collected from 14 healthy controls matched for gender, age and education level. An improved formant extraction algorithm combining the Ensemble Hilbert-Huang Transform (EHHT) and Cepstrum Interpolation (CI) is used to obtain a set of acoustic feature parameters that reflect voice-quality-related emotional changes in schizophrenic speech. These features are then fed to a Support Vector Machine (SVM) classifier, which automatically distinguishes the speech of patients with schizophrenia from that of healthy controls. Experiments examine how four factors affect detection performance: the number of white-noise additions, the variance of the white noise, the number of intrinsic mode function (IMF) components and the window length. The proposed algorithm is also compared with classical formant estimation methods. Experimental results show that the detection accuracy of the proposed algorithm can reach 98.8%. Patients with schizophrenia differ significantly from healthy controls in the formant-based acoustic parameters that characterize voice quality, and these parameters may provide a new, objective, quantitative and efficient indicator for research on clinically assisted diagnosis of schizophrenia.

Key words: Cepstrum interpolation, Ensemble Hilbert-Huang transform, Formant, Schizophrenic voice, Sound quality feature

CLC Number:

  • TP391.9