Computer Science ›› 2020, Vol. 47 ›› Issue (2): 195-200. doi: 10.11896/jsjkx.181202410

• Artificial Intelligence •

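Product Review Summarization Based on Discourse Hierarchical Structure

ZHANG Yi-fei, WANG Zhong-qing, WANG Hong-ling

  1. (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006)
  • Received: 2018-12-15 Online: 2020-02-15 Published: 2020-03-18
  • Corresponding author: WANG Hong-ling (hlwang@suda.edu.cn)
  • Funding:
    Young Scientists Fund of the National Natural Science Foundation of China (61806137, 61702518); General Program of Natural Science Research of Jiangsu Higher Education Institutions (18KJB520043)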

Product Review Summarization Using Discourse Hierarchical Structure

ZHANG Yi-fei, WANG Zhong-qing, WANG Hong-ling

  1. (School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China)
  • Received: 2018-12-15 Online: 2020-02-15 Published: 2020-03-18
  • About author: ZHANG Yi-fei, born in 1995, postgraduate, is a member of China Computer Federation (CCF). Her main research interests include natural language processing and product review summarization; WANG Hong-ling, born in 1975, assistant professor, is a member of China Computer Federation (CCF). Her main research interests include natural language processing and text summarization.
  • Supported by:
    This work was supported by the Young Scientists Fund of the National Natural Science Foundation of China (61806137, 61702518) and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (18KJB520043).

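Abstract: Product review summarization extracts, from all reviews of a product, an ordered series of sentences that represent the prevailing opinions in those reviews and together serve as a comprehensive review of the product. Discourse hierarchical structure analysis aims to analyze the hierarchical structure and semantic relations among the semantic units within a discourse. Analyzing this structure therefore helps determine more accurately the semantic information and importance of each semantic unit, which greatly aids the extraction of a discourse's important content. Accordingly, this paper proposes a product review summarization method based on discourse hierarchical structure. The method builds an extractive product review summarization model on an LSTM (Long Short-Term Memory) neural network and uses an attention mechanism to incorporate discourse hierarchical structure information into the model as a reference for judging the importance of discourse units, so that the important content of product reviews is extracted more accurately and the performance of the overall task improves. The method is evaluated on the Yelp 2013 dataset using the ROUGE metrics. Experimental results show that, with discourse hierarchical structure information added, the model reaches a ROUGE-1 score of 0.3608, an improvement of 1.57% over a standard LSTM method that considers only review sentence information. This indicates that introducing discourse hierarchical structure information into the product review summarization task effectively improves its performance.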

Keywords: LSTM, Discourse hierarchical structure, Product review summarization, Neural network, Attention mechanism
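
The abstract describes using attention to weigh each sentence against discourse structure information before extraction. The following is a minimal sketch of that idea only, not the authors' model: the dot-product scoring function, the toy vectors, and the names `score_sentences`, `extract_summary`, and `structure_vec` are all assumptions made for illustration. In the paper the sentence representations come from an LSTM; here they are hand-written numbers.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def score_sentences(sentence_vecs, structure_vec):
    """Attention weight of each sentence vector against a structure vector.

    Assumption: plain dot-product attention; the abstract does not state
    which scoring function the paper uses.
    """
    return softmax([dot(h, structure_vec) for h in sentence_vecs])


def extract_summary(sentences, sentence_vecs, structure_vec, k=2):
    """Pick the k highest-weighted sentences, kept in original order."""
    weights = score_sentences(sentence_vecs, structure_vec)
    ranked = sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


# Toy data (hypothetical): three review sentences with 2-d "encodings".
sentences = ["Great pasta.", "Parking was hard.", "Service was fast and friendly."]
sentence_vecs = [[0.9, 0.1], [0.1, 0.2], [0.8, 0.7]]
structure_vec = [1.0, 0.5]

summary = extract_summary(sentences, sentence_vecs, structure_vec, k=2)
```

Returning the selected sentences in their original order reflects the abstract's description of the summary as an ordered series of sentences; how the paper itself orders them is not stated there.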

Abstract: Product review summarization aims to extract a series of relevant sentences that represent the overall opinions of a product. Analysis of discourse hierarchical structure aims to analyze the hierarchical structure and semantic relationships among the semantic units in a discourse. Such analysis therefore helps determine the semantic information and importance of each semantic unit, which is very useful for extracting the important content of the discourse. Accordingly, this paper proposes a product review summarization method based on discourse hierarchical structure. The method builds an extractive product review summarization model based on LSTM and applies an attention mechanism to integrate discourse hierarchical structure into the model, so that the important content of product reviews is extracted more accurately. Experiments were conducted on the Yelp 2013 dataset and evaluated with the ROUGE metrics. The experimental results show that the ROUGE-1 value of the model after adding the discourse hierarchical structure is 0.3608, which is 1.57% higher than that of the standard LSTM method using only the sentence information of product reviews. This shows that introducing discourse hierarchical structure into the product review summarization task can effectively improve the performance of the task.

Key words: Attention mechanism, Discourse hierarchical structure, LSTM, Neural network, Product review summarization
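
For readers unfamiliar with the ROUGE-1 score reported above, the sketch below shows how unigram overlap between a candidate summary and a reference summary can be computed. It is a simplified illustration under stated assumptions: whitespace tokenization, lowercasing, and no stemming or stopword handling. The abstract does not say which ROUGE-1 variant (recall, precision, or F1) or which preprocessing the reported 0.3608 uses, and the function name `rouge_1` and the example strings are hypothetical.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Unigram-overlap recall, precision, and F1 between two texts.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it
    # appears in both texts.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


scores = rouge_1("the food was great", "the food was really great here")
```

In this toy pair, four of the six reference words appear in the candidate, so recall is 4/6 while precision is 4/4.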

CLC Number:

  • TP391