Computer Science, 2021, Vol. 48, Issue (12): 319-323. doi: 10.11896/jsjkx.201100105

• Artificial Intelligence •

基于双嵌入卷积神经网络的涉案微博评价对象抽取

王晓涵, 谭陈琛, 相艳, 余正涛   

  1. 昆明理工大学信息工程与自动化学院 昆明650500
    昆明理工大学云南省人工智能重点实验室 昆明650500
  • 收稿日期:2020-11-13 修回日期:2021-04-16 出版日期:2021-12-15 发布日期:2021-11-26
  • Corresponding author: XIANG Yan (50691012@qq.com)
  • About the author: 1097942784@qq.com
  • 基金资助:
    国家重点研发计划(2018YFC0830105,2018YFC0830101,2018YFC0830100);云南省基础研究专项面上项目(202001AT070047,202001AT070046);国家自然科学基金(61762056,61972186);云南省高新技术产业专项(201606)

Aspect Extraction of Case Microblog Based on Double Embedded Convolutional Neural Network

WANG Xiao-han, TAN Chen-chen, XIANG Yan, YU Zheng-tao   

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
    Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China
  • Received: 2020-11-13 Revised: 2021-04-16 Online: 2021-12-15 Published: 2021-11-26
  • About author: WANG Xiao-han, born in 1995, master. Her main research interests include natural language processing and emotion analysis.
    XIANG Yan, born in 1979, Ph.D. Her main research interests include natural language processing, text mining and emotion analysis.
  • Supported by:
    National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830101, 2018YFC0830100), General Projects of Basic Research in Yunnan Province (202001AT070047, 202001AT070046), National Natural Science Foundation of China (61762056, 61972186) and Special Project of New and High-tech Industry in Yunnan Province (201606).

摘要: 涉案微博的评价对象抽取是一个特定领域的任务,其评价对象词表达多样且含义与通用领域不同,仅依赖于通用领域的词嵌入无法很好地表征这些评价对象词。为此,提出了一种综合利用领域词嵌入和通用词嵌入的涉案微博评价对象抽取方法。首先对涉案微博文本进行预训练,得到具有涉案领域特征的嵌入层,其次将微博评论分别输入两个嵌入层,得到不同领域对评价对象的表征结果并进行拼接操作,然后通过卷积层抽取出与案件相关的特征,最后利用分类器对序列进行标记,以提取涉案微博评价对象。实验结果表明,所提方法的F1值在#重庆公交车坠江案#和#奔驰女司机维权案#的两个数据集上分别达到了72.36%和71.02%,较现有的基准模型有所提升,验证了不同领域词嵌入对涉案微博评价对象抽取的影响。

关键词: 微博, 评价对象抽取, 双嵌入, 卷积神经网络

Abstract: Aspect extraction from case-related microblogs is a domain-specific task. Aspect words in this domain are expressed in diverse ways and carry meanings that differ from those in the general domain, so general-domain word embeddings alone cannot represent them well. This paper proposes an aspect extraction method for case-related microblogs that uses both domain word embeddings and general word embeddings. First, word embeddings are pre-trained on case-related microblog text to obtain an embedding layer that captures the characteristics of the case domain. Second, microblog comments are fed into the two embedding layers, and the resulting representations of the aspect words from the two domains are concatenated. Then, a convolutional layer extracts the case-related features. Finally, a classifier labels the sequence to extract the aspect words of the case-related microblogs. Experimental results show that the proposed method achieves F1 scores of 72.36% and 71.02% on the #Chongqing bus falling into the river# and #Mercedes Benz female driver rights protection# datasets, respectively, improving on the existing baseline models and confirming that word embeddings from different domains affect aspect extraction for case-related microblogs.

Key words: Microblog, Aspect extraction, Double embedding, Convolutional neural network
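
The pipeline summarized in the abstract (two embedding lookups, concatenation, a convolutional layer, and a per-token classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary size, embedding dimensions, kernel width, B/I/O tag set, and the randomly initialized weights standing in for pre-trained embeddings and trained parameters are all assumptions made for illustration.

```python
import random

random.seed(0)

# All sizes below are illustrative assumptions, not values from the paper.
VOCAB_SIZE = 6          # toy vocabulary; id 0 is reserved for padding
GEN_DIM, DOM_DIM = 4, 3  # general-domain and case-domain embedding sizes
IN_DIM = GEN_DIM + DOM_DIM
HIDDEN, KERNEL = 5, 3    # convolution output channels and window width
TAGS = ["B", "I", "O"]   # assumed sequence-labeling scheme


def rand_matrix(rows, cols):
    """Random matrix standing in for pre-trained or trained weights."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


general_emb = rand_matrix(VOCAB_SIZE, GEN_DIM)   # stand-in: general-domain embeddings
domain_emb = rand_matrix(VOCAB_SIZE, DOM_DIM)    # stand-in: case-domain embeddings
conv_w = rand_matrix(HIDDEN, KERNEL * IN_DIM)    # one row per output channel
out_w = rand_matrix(len(TAGS), HIDDEN)           # per-token classifier weights


def double_embed(token_ids):
    """Look up each token in both tables and concatenate the two vectors."""
    return [general_emb[t] + domain_emb[t] for t in token_ids]


def conv1d_same(seq):
    """1-D convolution over the token axis with zero padding and ReLU."""
    pad = KERNEL // 2
    zero = [0.0] * IN_DIM
    padded = [zero] * pad + seq + [zero] * pad
    features = []
    for i in range(len(seq)):
        # Flatten the KERNEL neighbouring token vectors into one window.
        window = [x for vec in padded[i:i + KERNEL] for x in vec]
        features.append(
            [max(0.0, sum(w * x for w, x in zip(row, window))) for row in conv_w]
        )
    return features


def tag(token_ids):
    """Assign one tag per token by taking the highest classifier score."""
    labels = []
    for h in conv1d_same(double_embed(token_ids)):
        scores = [sum(w * x for w, x in zip(row, h)) for row in out_w]
        labels.append(TAGS[scores.index(max(scores))])
    return labels


if __name__ == "__main__":
    print(tag([1, 4, 2, 4, 5]))  # untrained weights, so the tags are arbitrary
```

Because the weights are random, the predicted tags carry no meaning; the sketch only shows how the concatenated representation flows through the convolution into a token-level classifier. In practice the two embedding tables would be loaded from general-domain and case-domain pre-training, and the convolution and classifier weights would be learned from labeled data.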

CLC Number:

  • TP311