Model of Music Theme Recommendation Based on Attention LSTM

Abstract

Abstract: Traditional music recommendation suffers from low classification accuracy and long cycles, and it has difficulty meeting people's everyday demand for theme music. To address these problems, a neural network model that combines an attention mechanism with Long Short-Term Memory (LSTM) is designed. It consists of a music theme model and a music recommendation model. Building on music emotion classification performed with the attention mechanism and the LSTM network, the music theme model combines an audio codebook with a topic model to discriminate the music theme subcategory under a given emotion. The music recommendation model uses low-level descriptors (LLD) and spectrograms to build a joint representation of handcrafted features and Convolutional Recurrent Neural Network (CRNN) features, obtains the emotion expressed in the user's speech, and recommends a matching music theme. Experiments were designed separately for the two models, with two different traditional models used as baselines. The results show that, compared with a traditional single model, the proposed model improves theme classification accuracy and accurately judges the emotion in the user's speech data, so that theme music can be recommended in a targeted way.

Key words: Attention mechanism, Convolutional recurrent neural network, Long short-term memory network, Low-level descriptor, Music theme recommendation, Topic model
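
As a rough illustration of how an attention mechanism can sit on top of LSTM outputs, the sketch below collapses a sequence of per-frame hidden states into a single vector by weighting each frame. The abstract does not specify the attention form used in the paper, so the dot-product scoring, the `context` query vector, and the function names here are assumptions made for illustration only; the toy `frames` list stands in for real LSTM outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Collapse per-frame hidden states into one weighted-average vector.

    hidden_states: list of T vectors (e.g. LSTM outputs), each of length D.
    context: hypothetical learned query vector of length D (an assumption,
             not taken from the paper).
    Returns (pooled_vector, attention_weights).
    """
    # Score each frame by its dot product with the context vector.
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context))
              for h in hidden_states]
    # Normalize the scores so the weights sum to 1.
    weights = softmax(scores)
    # Weighted sum of the frames, one output dimension at a time.
    dim = len(context)
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights


# Toy stand-in for LSTM outputs: 3 frames, 2 dimensions each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(frames, context=[1.0, 1.0])
```

In a full classifier the pooled vector would typically be passed to a dense layer that outputs emotion classes, and `context` would be learned jointly with the LSTM rather than fixed by hand.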

CLC number:

TP183

JIA Ning, ZHENG Chun-jun. Model of Music Theme Recommendation Based on Attention LSTM[J]. Computer Science, 2019, 46(11A): 230-235. https://doi.org/
