Computer Science ›› 2011, Vol. 38 ›› Issue (9): 150-154.

• Database and Data Mining •

Semantic Annotation Method Based on Sparse Coding

CHEN Ye-wang, LI Hai-bo, YU Jin-shan, CHEN Wei-bin

  1. (College of Computer Science, Huaqiao University, Xiamen 361021)
  • Online: 2018-11-16 Published: 2018-11-16
  • Funding:
    This work was supported by the Fujian Major Agricultural Science and Technology Project (2010N5008) and the Natural Science Foundation of Fujian Province (A0810013).

Abstract: Semantic annotation is an important research topic in realizing the Semantic Web, and many existing annotation methods for unstructured documents achieve good results. However, almost none of them takes into account that the knowledge described by an ontology is often sparsely distributed across a document, and few make effective use of a document's organizational structure, so they annotate poor-quality documents badly. This paper proposes a Semantic Annotation Method based on Sparse Coding (SAMSC) for unstructured data. The method first identifies, according to the knowledge descriptions in the ontology, some of the semantics in a document and uses them as initial values. It then iteratively resolves the paragraph structure and topics of the document to compute the correlation coefficient matrix between ontology knowledge and document resources. Finally, it annotates the documents with the ontology over the global document space by minimizing a loss function. Experiments show that the method automatically and effectively annotates large numbers of Web documents of uneven quality, and that it annotates low-quality documents better than other methods.

Key words: Ontology, Semantic annotation, Text paragraph structure, SAMSC
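
The abstract does not state the loss function that SAMSC minimizes. As an illustration of the general idea only, the sketch below assumes the standard sparse coding (lasso) objective 0.5*||x - D a||^2 + lam*||a||_1 and solves it with ISTA (iterative soft-thresholding). The term vector `x`, the concept matrix `D`, the concept names, and the helper `annotate` are all hypothetical and are not taken from the paper.

```python
def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(x, D, lam=0.1, iters=500):
    """Minimize 0.5*||x - D a||^2 + lam*||a||_1 over a using ISTA.

    x: list of m floats (assumed document term weights)
    D: m rows of k floats; column j is the assumed term profile of concept j
    """
    m, k = len(D), len(D[0])
    # Step 1/||D||_F^2 never exceeds 1/||D||_2^2, so ISTA converges.
    step = 1.0 / sum(d * d for row in D for d in row)
    a = [0.0] * k
    for _ in range(iters):
        # Residual r = D a - x, then gradient g = D^T r.
        r = [sum(D[i][j] * a[j] for j in range(k)) - x[i] for i in range(m)]
        g = [sum(D[i][j] * r[i] for i in range(m)) for j in range(k)]
        a = [soft_threshold(a[j] - step * g[j], step * lam) for j in range(k)]
    return a


def annotate(code, concepts):
    """Return the concepts whose coefficient is nonzero (hypothetical rule)."""
    return [c for c, w in zip(concepts, code) if w != 0.0]


# Hypothetical toy data: 4 terms, 3 ontology concepts.
concepts = ["crop_disease", "irrigation", "fertilizer"]
D = [
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
x = [1.0, 1.0, 0.0, 0.05]  # weak evidence for the third concept

code = sparse_code(x, D, lam=0.1)
labels = annotate(code, concepts)
print(labels)
```

The L1 penalty drives the coefficients of weakly supported concepts to exactly zero, which is how a sparse code can reflect knowledge that appears only sparsely in a document.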
