Short Text Classification Based on Domain Word Ontology

Computer Science (计算机科学), 2009, Vol. 36, Issue (3): 142-145.

Published: 2018-11-16  Online: 2018-11-16
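Funding:
This work was supported by the National Natural Science Foundation of China (60703010), the Natural Science Foundation of Chongqing (2006BB2374), the Science and Technology Research Project of the Chongqing Municipal Education Commission (KJ070519), and the Ministry of Education's Start-up Fund for Returned Overseas Scholars (教外司留[2007]1109号).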

Abstract

Abstract: Short texts are brief and describe concepts only weakly, so conventional text classification methods are poorly suited to them. This paper proposes a short text classification method based on a domain word ontology. High-frequency domain words are first extracted as feature words; with the help of HowNet, each feature word is then expanded semantically into concepts and sememes. Word similarity is measured by computing the information content of the sememes shared by different concepts, and classification is performed on that basis. Comparative experiments show that the method compensates to some extent for the sparse features of short texts and improves both precision and recall.

Key words: short text, ontology, HowNet, text classification, semantics, sememe
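
The pipeline summarized in the abstract can be sketched roughly as follows. The abstract does not state the paper's formulas, so this is only an illustrative sketch under loudly stated assumptions: a toy sememe lexicon stands in for HowNet lookups, the concept level is collapsed so that each word maps directly to a set of sememes, the information content of a sememe is taken as the negative log of the share of words that carry it, and word similarity is the information of the shared sememes divided by the information of all sememes of both words. All names (`SEMEMES`, `sememe_information`, `word_similarity`, `top_feature_words`, `classify`) are hypothetical and do not come from the paper.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: word -> set of sememes. A real system would
# look these up in HowNet; none of these entries are taken from HowNet.
SEMEMES = {
    "goal": {"sport", "score"},
    "striker": {"sport", "human"},
    "match": {"sport", "event"},
    "stock": {"finance", "asset"},
    "bond": {"finance", "asset", "debt"},
    "bank": {"finance", "institution"},
}


def sememe_information(lexicon):
    """Assumed information content: -log(share of words carrying the sememe)."""
    counts = Counter(s for sememes in lexicon.values() for s in sememes)
    total = len(lexicon)
    return {s: -math.log(c / total) for s, c in counts.items()}


def word_similarity(a, b, lexicon, info):
    """Information of shared sememes divided by information of all sememes."""
    sa, sb = lexicon.get(a, set()), lexicon.get(b, set())
    union = sum(info[s] for s in sa | sb)
    if union == 0:
        return 0.0
    return sum(info[s] for s in sa & sb) / union


def top_feature_words(texts, k):
    """The k most frequent words in one domain's training texts."""
    counts = Counter(w for t in texts for w in t.split())
    return [w for w, _ in counts.most_common(k)]


def classify(text, domain_features, lexicon, info):
    """Pick the domain whose feature words best match the words of the text."""
    scores = {}
    for domain, features in domain_features.items():
        # For each word, keep its best similarity to any feature word.
        scores[domain] = sum(
            max((word_similarity(w, f, lexicon, info) for f in features),
                default=0.0)
            for w in text.split()
        )
    return max(scores, key=scores.get)


# Made-up training texts, for illustration only.
training = {
    "sports": ["striker goal match", "match goal"],
    "finance": ["stock bond bank", "bank stock"],
}
info = sememe_information(SEMEMES)
features = {d: top_feature_words(t, 2) for d, t in training.items()}
print(classify("striker", features, SEMEMES, info))
```

In this toy setup "striker" is not among the extracted feature words, yet it can still be matched to a domain through the sememe it shares with them, which is the kind of feature expansion the abstract credits for helping with sparse short texts.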

Short text classification based on domain word ontology[J]. Computer Science, 2009, 36(3): 142-145. https://doi.org/
