Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230600111-8. doi: 10.11896/jsjkx.230600111
杨俊哲1, 宋莹2, 陈逸菲2
YANG Junzhe1, SONG Ying2, CHEN Yifei2
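Abstract: With the rapid development of large language models, reducing the number of model parameters while preserving model performance has become an important challenge in natural language processing. However, existing parameter compression techniques often struggle to maintain both stability and generalization ability. To address this, a new sentiment analysis architecture that incorporates topic features is proposed, which uses topic information to strengthen the model's judgment of the sentiment polarity of text. Specifically, a method combining LDA and K-means is used to extract topic features from the text, and these features are concatenated, as fixed-dimension vectors, with the word embeddings to obtain new word vector representations. Average pooling is then applied to build a sentence-level representation vector, which is fed into a fully connected layer for sentiment classification. To verify the effectiveness of the proposed model, comparative experiments against several baseline algorithms are conducted on public sentiment analysis datasets. The experimental results show that the proposed model clearly outperforms ALBERT on multiple datasets, improving accuracy by about 3.5%, while maintaining high stability and generalization ability with only a slight increase in the number of parameters.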
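The fusion pipeline described in the abstract (concatenate a fixed-dimension topic vector with each word embedding, average-pool into a sentence vector, then classify with a fully connected layer) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how LDA and K-means are combined, so the topic vector is simply taken as a given input, and all names, dimensions, and the randomly initialized weights (`concat_topic`, `mean_pool`, `classify`, `EMB_DIM`, `TOPIC_DIM`, `NUM_CLASSES`) are assumptions made for the example.

```python
import math
import random


def concat_topic(word_embeddings, topic_vector):
    """Append the same fixed-dimension topic vector to every word embedding."""
    return [list(w) + list(topic_vector) for w in word_embeddings]


def mean_pool(vectors):
    """Average the token vectors element-wise into one sentence vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sentence_vec, weights, bias):
    """Fully connected layer followed by softmax over sentiment classes."""
    logits = [
        sum(w * x for w, x in zip(row, sentence_vec)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy sizes, chosen only for illustration (assumptions, not from the paper).
EMB_DIM, TOPIC_DIM, NUM_CLASSES = 4, 3, 2
random.seed(0)

# Stand-in for the word embeddings of a 5-token sentence.
word_embeddings = [
    [random.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(5)
]
# Stand-in for the topic feature; the paper derives it with LDA and K-means.
topic_vector = [0.7, 0.2, 0.1]

fused = concat_topic(word_embeddings, topic_vector)  # 5 vectors of dim 7
sentence_vec = mean_pool(fused)                      # one vector of dim 7

# Untrained, randomly initialized classifier weights (hypothetical).
weights = [
    [random.uniform(-0.1, 0.1) for _ in range(EMB_DIM + TOPIC_DIM)]
    for _ in range(NUM_CLASSES)
]
bias = [0.0] * NUM_CLASSES
probs = classify(sentence_vec, weights, bias)
print(probs)
```

Because the same topic vector is appended to every token, average pooling leaves the topic components unchanged in the sentence vector, so the classifier sees the topic signal directly; the only extra parameters in this sketch are the `TOPIC_DIM` additional input weights per class, which is consistent with the abstract's remark that the parameter count grows only slightly.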