Computer Science (计算机科学) ›› 2020, Vol. 47 ›› Issue (6A): 45-48. doi: 10.11896/jsjkx.190500028

• Artificial Intelligence •

基于专利结构的中文专利摘要研究
Research on Chinese Patent Summarization Based on Patented Structure

束云峰, 王中卿
SHU Yun-feng and WANG Zhong-qing

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Published: 2020-07-07
  • Corresponding author: WANG Zhong-qing (wangzq@suda.edu.cn)
  • Author e-mail: 1627405102@stu.suda.edu.cn
  • Funding:
    National Natural Science Foundation of China (61806137, 61702518)
  • About author: SHU Yun-feng, born in 1998, undergraduate. His main research interest is natural language processing. WANG Zhong-qing, born in 1987, Ph.D., lecturer, is a member of the China Computer Federation. His main research interest is natural language processing.

Abstract: Text summarization aims to produce a concise description of a document by compressing and refining the original text. For Chinese patent texts, this paper proposes an algorithm that generates patent summaries based on the PatentRank algorithm. First, redundancy is removed from the candidate sentence set by discarding sentences that are highly similar to others. Then, three different similarity calculation methods are constructed over the claims and the description of a patent to compute the influence weights between sentences. Finally, the sentences with the highest weights are selected and output as the summary of the patent. Experimental results on the selected dataset show that the proposed algorithm achieves substantially higher ROUGE scores than existing methods.

Key words: Chinese information processing, Patent, PatentRank, Similarity calculation, Text summarization
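
The abstract names three steps (redundancy removal, similarity-weighted ranking, top-sentence selection) but gives no formulas for PatentRank or for its three similarity methods. The sketch below is therefore only a generic TextRank-style illustration of that pipeline, not the authors' algorithm. Jaccard token overlap stands in for the unspecified similarity measures, and every function name, threshold, and parameter value is a hypothetical choice made for this example.

```python
import re


def tokenize(sentence):
    """Lowercased word tokens as a set. Assumption: whitespace-delimited text;
    Chinese patents would need a word segmenter in place of this regex."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a placeholder for the paper's measures."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def remove_redundant(sentences, threshold=0.8):
    """Step 1: keep a sentence only if it is not too similar to one already kept."""
    kept = []
    for s in sentences:
        if all(jaccard(s, k) < threshold for k in kept):
            kept.append(s)
    return kept


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Step 2: PageRank-style scores over a weighted sentence-similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    weights = [[0.0 if i == j else jaccard(sentences[i], sentences[j])
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the edge weight between j and i.
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, top_k=2, threshold=0.8):
    """Step 3: return the top_k highest-scoring sentences in document order."""
    candidates = remove_redundant(sentences, threshold)
    scores = rank_sentences(candidates)
    top = sorted(range(len(candidates)),
                 key=lambda i: scores[i], reverse=True)[:top_k]
    return [candidates[i] for i in sorted(top)]
```

Calling `summarize(sentences, top_k=2)` on a list of sentence strings first drops near-duplicates and then returns the two sentences most strongly connected to the rest of the text. Replacing `jaccard` with other similarity functions is the natural place to experiment with measures specific to claims and descriptions.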

CLC Number:

  • TP391