Computer Science ›› 2014, Vol. 41 ›› Issue (12): 133-137. doi: 10.11896/j.issn.1002-137X.2014.12.028

• Artificial Intelligence •

Convolution Tree Kernel Based Sentiment Element Recognition Approach for Chinese Microblog

CHEN Feng, CHAO Wen-han, ZHOU Qing and LI Zhou-jun

  1. Beihang University, Beijing 100191
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    the National Natural Science Foundation of China (61003111, 9, 61370126) and the Specialized Research Fund for the Doctoral Program of Higher Education (20101102120016)

Abstract: Sentiment element recognition is one of the key subtasks of sentiment analysis. Its goal is to identify the sentiment targets in a text, that is, the objects on which the expressed sentiment acts. It is the most fine-grained form of sentiment analysis and has attracted considerable research attention. Chinese microblog text is short, flexible, often non-standard and noisy, which poses new challenges for sentiment analysis. Most current sentiment element recognition methods are either rule-based or statistical learning methods that use flat features. They distinguish noise from sentiment elements poorly, which limits their performance. To address these characteristics of Chinese microblogs, a sentiment element recognition approach based on the convolution tree kernel is proposed. First, part-of-speech (POS) tagging and dependency parsing are applied to each sentence, and the nouns in the sentence are taken as candidate sentiment elements. Second, the dependency tree is pruned under two different pruning strategies to obtain structured information for each candidate. Finally, the convolution tree kernel is used to compute the similarity between dependency trees, and sentiment elements are recognized on that basis. Experiments on the NLP&CC 2012 and NLP&CC 2013 Chinese microblog sentiment analysis evaluation tasks show that the approach achieves significantly higher accuracy than traditional methods.

Key words: Sentiment target recognition, Chinese microblog, Convolution tree kernel, Pruning strategy
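The similarity step in the abstract can be illustrated with a minimal sketch of the subset-tree convolution kernel of Collins and Duffy. This is an illustration under stated assumptions, not the authors' implementation: the nested-tuple tree encoding, the function names, the `decay` parameter, and the toy trees with `root`/`nsubj` labels are all hypothetical, and the abstract does not say how the pruned dependency trees are encoded before the kernel is applied.

```python
from functools import lru_cache
from math import sqrt

# Assumed encoding: a tree is a nested tuple (label, child, child, ...);
# a leaf (a word) is the one-element tuple (label,).

def production(node):
    """Return the node label together with the labels of its children."""
    return (node[0], tuple(child[0] for child in node[1:]))

def nodes(tree):
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

def tree_kernel(t1, t2, decay=0.5):
    """Count the tree fragments shared by t1 and t2, down-weighted by size."""
    @lru_cache(maxsize=None)
    def delta(n1, n2):
        # Leaves root no fragment; nodes with different productions share none.
        if len(n1) == 1 or len(n2) == 1 or production(n1) != production(n2):
            return 0.0
        score = decay
        for c1, c2 in zip(n1[1:], n2[1:]):
            # Each child is either cut off (1) or expanded further (delta).
            score *= 1.0 + delta(c1, c2)
        return score
    return sum(delta(a, b) for a in nodes(t1) for b in nodes(t2))

def normalized_kernel(t1, t2, decay=0.5):
    """Scale the kernel into [0, 1] so that tree size does not dominate."""
    denom = sqrt(tree_kernel(t1, t1, decay) * tree_kernel(t2, t2, decay))
    return tree_kernel(t1, t2, decay) / denom if denom else 0.0

# Toy trees (hypothetical labels): they differ only in the candidate noun.
t1 = ("root", ("nsubj", ("NN", ("screen",))), ("JJ", ("great",)))
t2 = ("root", ("nsubj", ("NN", ("battery",))), ("JJ", ("great",)))

print(tree_kernel(t1, t2, decay=1.0))        # 6.0 shared fragments
print(normalized_kernel(t1, t2, decay=1.0))  # 0.6
```

With `decay=1.0` the kernel simply counts shared fragments: each toy tree has 10 fragments of its own and the two share 6, which gives the normalized similarity of 0.6. A kernel of this kind is typically plugged into a kernel-based classifier such as an SVM to decide whether a candidate noun is a sentiment element.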

