Computer Science, 2025, Vol. 52, Issue (6): 219-227. doi: 10.11896/jsjkx.240400150
• Computer Graphics & Multimedia •
SHEN Xinyang, WANG Shanmin, SUN Yubao