Computer Science ›› 2015, Vol. 42 ›› Issue (9): 24-28. doi: 10.11896/j.issn.1002-137X.2015.09.005
金琴,陈师哲,李锡荣,杨 刚,许洁萍
JIN Qin, CHEN Shi-zhe, LI Xi-rong, YANG Gang and XU Jie-ping
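Abstract: Speech emotion recognition is a challenging research topic in speech processing with broad application prospects. This paper explores one of its key problems: generating effective feature representations for emotion recognition. Emotion feature representations are derived from the speech signal from four perspectives: (1) low-level acoustic features, including energy, fundamental frequency (F0), voice quality, and spectral features, together with statistical features computed over these low-level features; (2) features obtained by transforming cepstral acoustic features into distances with respect to emotion-specific Gaussian mixture models (GMMs); (3) features obtained by transforming acoustic features according to an acoustic dictionary; (4) features obtained by transforming acoustic features into Gaussian supervectors. Experiments compare the performance of each feature type on its own, evaluate the fusion of different features, and finally compare the different acoustic features on emotion datasets in several languages (the IEMOCAP English emotion corpus, the CASIA Chinese emotion corpus, and the Berlin German emotion corpus). On the IEMOCAP dataset, the system achieves a recognition accuracy of 71.9%, exceeding the best result previously reported on this dataset.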
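To make representations (1) and (3) from the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes an utterance is already available as a list of frame-level feature vectors. The function names `functionals` and `codebook_histogram`, the choice of statistics (mean, standard deviation, min, max), and the toy codebook are illustrative assumptions; the abstract does not specify them.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]


def functionals(frames: List[Frame]) -> List[float]:
    """Map a variable-length frame sequence to a fixed-length vector.

    For each feature dimension, emit [mean, std, min, max] over all frames.
    The set of statistics is an assumption chosen for illustration.
    """
    dims = len(frames[0])
    out: List[float] = []
    for d in range(dims):
        col = [f[d] for f in frames]
        mean = sum(col) / len(col)
        var = sum((x - mean) ** 2 for x in col) / len(col)
        out.extend([mean, math.sqrt(var), min(col), max(col)])
    return out


def codebook_histogram(frames: List[Frame], codebook: List[Frame]) -> List[float]:
    """Acoustic-dictionary style representation (hypothetical variant).

    Each frame is assigned to its nearest codeword (squared Euclidean
    distance); the utterance becomes a normalized histogram of assignments.
    """
    counts = [0] * len(codebook)
    for f in frames:
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(f, codebook[k])),
        )
        counts[nearest] += 1
    return [c / len(frames) for c in counts]


# Toy data: four 2-dimensional frames and a 2-word codebook (made up).
frames = [[0.0, 1.0], [1.0, 1.0], [4.0, 5.0], [5.0, 5.0]]
codebook = [[0.5, 1.0], [4.5, 5.0]]

stats = functionals(frames)
hist = codebook_histogram(frames, codebook)
```

Both functions return a fixed-length vector whatever the utterance duration, so the output can be passed to a standard classifier. In practice the codebook would be learned from training frames (for example by clustering) rather than written by hand.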