Computer Science ›› 2014, Vol. 41 ›› Issue (Z6): 87-90.


Keyphrase-based Chinese Tags Generation Hybrid Algorithm

LIU Dong and ZHANG Cai-huan   

  • Online: 2018-11-14   Published: 2018-11-14

Abstract: This work proposes HTGA (Hybrid Tags Generation Algorithm), an algorithm that generates tags for Chinese documents. It extracts phrase chunks as candidate keywords and also considers other factors such as TF-IDF and word span. Experiments show that the algorithm improves the accuracy of keyword extraction and performs stably across various texts. Some samples were extracted and compared with the standard answers: more than 60% of the results reflect the document topics as well as or better than the standard answers.
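The abstract names the ingredients of the score (phrase-chunk candidates, TF-IDF, word span) but not how they are combined. The following is a minimal sketch of one possible combination, not the HTGA formula itself. It assumes the document is already segmented and chunked into tokens, uses a smoothed IDF, takes word span as the distance from first to last occurrence divided by document length, and mixes the two with a hypothetical weight `alpha`. The function names and the toy corpus are illustrative only.

```python
import math

def tf_idf(term, doc_tokens, corpus):
    """TF-IDF with a smoothed document frequency (an assumed variant)."""
    tf = doc_tokens.count(term) / len(doc_tokens)
    df = sum(1 for d in corpus if term in d)
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1
    return tf * idf

def word_span(term, doc_tokens):
    """First-to-last occurrence distance, normalised by document length."""
    positions = [i for i, t in enumerate(doc_tokens) if t == term]
    if not positions:
        return 0.0
    return (positions[-1] - positions[0] + 1) / len(doc_tokens)

def rank_candidates(candidates, doc_tokens, corpus, alpha=0.7):
    """Rank candidates by alpha * TF-IDF + (1 - alpha) * span.

    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    scores = {
        c: alpha * tf_idf(c, doc_tokens, corpus)
        + (1 - alpha) * word_span(c, doc_tokens)
        for c in candidates
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy data: each token stands in for a segmented word or phrase chunk.
doc = ["tag", "generation", "uses", "keyphrase", "extraction", "tag"]
corpus = [
    doc,
    ["keyphrase", "extraction", "survey"],
    ["image", "generation", "model"],
]
ranking = rank_candidates(["keyphrase", "tag", "uses"], doc, corpus)
print(ranking)
```

In this toy document, "tag" occurs at both ends, so it gets the largest span as well as the highest term frequency and is ranked first. A term that occurs only once gets the minimum span of one token, so the span component favours terms spread across the whole text.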

Key words: Keyword extraction, Tag generation, Keyphrase, Chinese tags, Algorithm

