Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 221000217-7. doi: 10.11896/jsjkx.221000217

• Artificial Intelligence •

Multi-feature Fusion Based New Personalized Sentiment Classification Method for Comment Texts

WANG Youwei1, LIU Ao1, FENG Lizhou2

  1. 1 School of Information, Central University of Finance and Economics, Beijing 100081, China
    2 School of Science and Engineering, Tianjin University of Finance and Economics, Tianjin 300222, China
  • Published: 2023-11-09
  • Corresponding author: LIU Ao (liuaohit@163.com)
  • About author: WANG Youwei (ywwang15@126.com), born in 1987, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include machine learning, data mining and NLP.
    LIU Ao, born in 1997, postgraduate. His main research interests include data mining and NLP.
  • Supported by:
    National Natural Science Foundation of China (61906220), Humanities and Social Sciences Project of the Ministry of Education (19YJCZH178), National Social Science Foundation of China (18CTJ008) and Emerging Interdisciplinary Project of CUFE.

Abstract: Existing research on sentiment classification does not fully consider how the personal characteristics expressed in a user's historical comments affect classification results, nor does it jointly consider the combined effect of the user's social relations, personal attributes, historical comments and current comment. To address this, a new personalized sentiment classification method for comment texts based on multi-feature fusion is proposed. First, the user's personalized expressions are mined from a large number of unlabeled historical comments, and a user feature vector is extracted by combining the historical comments with the user's attribute information. Then, the node2vec algorithm, which is well suited to learning graph node representations, is applied to the user's social relationship network to obtain the social relationship vector, and a pre-trained word2vec model is used to obtain the current comment vector. Finally, the user feature vector, the social relationship vector and the labeled current comment vector are fed into a fully connected neural network, which is trained to obtain the final classification model. Experimental results on a real data set crawled from a Chinese stock forum (Guba) show that, compared with typical methods such as support vector machine, naive Bayes, TextCNN and BERT, the proposed method effectively improves the accuracy and F1 score of sentiment classification.

Key words: Sentiment classification, Stock comments, Social relations, Historical comments, Fully connected neural network

CLC Number:

  • TP391
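
The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vector values, dimensions, mean pooling of word vectors, fusion by concatenation, the single hidden layer and the two output classes are all assumptions made for illustration, and the weights are random rather than trained. All function names are hypothetical.

```python
import math
import random

def mean_pool(word_vectors):
    """Average word vectors into one comment vector (assumed pooling choice)."""
    dim = len(word_vectors[0])
    n = len(word_vectors)
    return [sum(vec[i] for vec in word_vectors) / n for i in range(dim)]

def fuse(user_vec, social_vec, comment_vec):
    """Fuse the three feature vectors by concatenation (assumed fusion choice)."""
    return list(user_vec) + list(social_vec) + list(comment_vec)

def dense(x, weights, bias):
    """One fully connected layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(rng, n_in, n_out):
    """Small random weights; a real model would learn these from labeled comments."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(fused, layers):
    """Forward pass: ReLU on hidden layers, softmax on the output layer."""
    x = fused
    for weights, bias in layers[:-1]:
        x = relu(dense(x, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(x, weights, bias))

rng = random.Random(0)
user_vec = [0.2, -0.1, 0.4]                         # placeholder user feature vector
social_vec = [0.05, 0.3]                            # placeholder node2vec-style embedding
comment_vec = mean_pool([[0.1, 0.2], [0.3, 0.4]])   # placeholder word2vec-style vectors
fused = fuse(user_vec, social_vec, comment_vec)

# One hidden layer of 4 units and 2 output classes (both assumed).
layers = [init_layer(rng, len(fused), 4), init_layer(rng, 4, 2)]
probs = forward(fused, layers)
print(len(fused), probs)
```

In practice the placeholder vectors would come from the historical-comment and attribute features, a node2vec model of the social network and a pre-trained word2vec model, and the layer weights would be fitted on the labeled current comments.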