Computer Science (计算机科学) ›› 2020, Vol. 47 ›› Issue (1): 144-152. doi: 10.11896/jsjkx.180701349

Special topic: Medical Imaging

• Computer Graphics & Multimedia •

Automatic Detection Algorithm of Pharyngeal Fricative in Cleft Palate Speech Based on Multi-delay Fourth-order Cumulant Octave Spectral Line

HE Fei1, MENG Yu-xuan1, TIAN Wei-wei1, WANG Xi-yue1, HE Ling1, YIN Heng2

  1. School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, China
  2. State Key Laboratory of Oral Diseases, Chengdu 610041, China
  • Received: 2018-07-21 Published: 2020-01-19
  • Corresponding author: YIN Heng (yinheng@scu.edu.cn)
  • About author: HE Fei, born in 1998, postgraduate. Her main research interests include speech signal processing and image processing. YIN Heng, born in 1971, master. Her main research interests include evaluation of cleft palate speech.
  • Supported by:
    This work was supported by the Young Scientists Fund of the National Natural Science Foundation of China (61503264).

Abstract: To automatically distinguish pharyngeal fricatives in cleft palate speech from normal fricatives, this paper studies the articulation characteristics of cleft palate patients who produce pharyngeal fricatives and proposes a detection algorithm based on the multi-delay fourth-order cumulant one-third octave spectral line (FTSL). Existing studies of pharyngeal fricatives rely mostly on consonant duration and on the distribution of energy in the frequency domain, and few have achieved automatic classification of pharyngeal and normal fricatives. The proposed method computes the multi-delay fourth-order cumulant of each speech frame and then applies a one-third octave analysis to extract the FTSL feature. In the experiment, FTSL features were extracted from 200 normal fricative consonants and 194 pharyngeal fricative consonants and classified with a support vector machine (SVM). Comparative experiments between FTSL and traditional acoustic features were also conducted, and the results are analyzed and discussed. The experimental results show that FTSL reaches a classification accuracy of 92.7% for pharyngeal fricatives and can provide an effective, objective and non-invasive aid for the clinical assessment of velopharyngeal function.

Key words: Fourth-order cumulant, FTSL spectral line, One-third octave spectral line, Pharyngeal fricative

CLC number: TP391.9
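The feature pipeline outlined in the abstract (fourth-order cumulant at several delays, followed by one-third octave pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define "multi-delay", so the sketch assumes it means the diagonal slice c4(tau, tau, tau) = E[x(n) x(n+tau)^3] - 3 R(tau) R(0) evaluated at several lags, and it assumes base-2 band centres at 1000 * 2^(k/3) Hz. The function names, the 8 kHz sampling rate, and the 250 to 3150 Hz range are all hypothetical, and the SVM stage is omitted.

```python
import cmath
import math


def fourth_order_cumulant_diag(frame, lags):
    """Diagonal slice c4(tau, tau, tau) of a real frame, one value per lag.

    Uses c4 = E[x(n) x(n+tau)^3] - 3 R(tau) R(0) for a zero-mean signal,
    with biased sample averages (division by the frame length).
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]        # remove the mean first
    r0 = sum(v * v for v in x) / n       # R(0), the variance
    values = []
    for tau in lags:
        m4 = sum(x[i] * x[i + tau] ** 3 for i in range(n - tau)) / n
        r_tau = sum(x[i] * x[i + tau] for i in range(n - tau)) / n
        values.append(m4 - 3.0 * r_tau * r0)
    return values


def third_octave_bands(f_low, f_high):
    """(lower edge, centre, upper edge) triples, centres at 1000 * 2**(k/3) Hz."""
    k = math.ceil(3 * math.log2(f_low / 1000.0))
    bands = []
    while 1000.0 * 2 ** (k / 3) <= f_high:
        centre = 1000.0 * 2 ** (k / 3)
        bands.append((centre * 2 ** (-1 / 6), centre, centre * 2 ** (1 / 6)))
        k += 1
    return bands


def band_energies(seq, fs, bands):
    """Sum the power spectrum of seq (naive DFT) inside each band."""
    n = len(seq)
    energies = [0.0] * len(bands)
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        coeff = sum(seq[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power = abs(coeff) ** 2
        for b, (lo, _, hi) in enumerate(bands):
            if lo <= freq < hi:
                energies[b] += power
    return energies


if __name__ == "__main__":
    fs = 8000  # assumed sampling rate in Hz
    # Synthetic stand-in for one speech frame: a 1 kHz sine, 256 samples.
    frame = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(256)]
    c4 = fourth_order_cumulant_diag(frame, range(64))
    features = band_energies(c4, fs, third_octave_bands(250, 3150))
    print([round(v, 3) for v in features])
```

The resulting per-band vector plays the role of a spectral-line feature that a classifier such as an SVM could consume. The naive DFT keeps the sketch dependency-free; a real system would use an FFT and, most likely, standardized band definitions.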