Computer Science, 2011, Vol. 38, Issue (5): 169-174.

• Database and Data Mining •

Research on Indexing Methods for Korean Information Retrieval

JIN Guang-he, WANG Xing-wei, JIANG Ding-de

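  1. (College of Information Science and Engineering, Northeastern University, Shenyang 110819); (School of Application Programming, Kim Chaek University of Technology, Pyongyang)
  • Publication date: 2018-11-16  Release date: 2018-11-16
  • Funding:
    This work was supported by the National Natural Science Foundation of China (70671020, 70931001, 60802023), the National Key Technology R&D Program (2008BAH37B03, 2008BAH37B07), the Specialized Research Fund for the Doctoral Program of Higher Education (20070145017), and the Fundamental Research Funds for the Central Universities (N090504003, N090504006).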

Study on Indexing Method for Korean Information Search

JIN Guang-he, WANG Xing-wei, JIANG Ding-de

  • Online: 2018-11-16  Published: 2018-11-16

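Abstract: Based on an in-depth analysis of Korean information retrieval systems, this paper studies the indexing problem with the aim of improving Korean retrieval performance. It examines the strengths and weaknesses of classic indexing methods, including noun-unit indexing, morpheme-unit indexing, n-gram-unit indexing, and word-segment-unit indexing, and uses experimental analysis to identify the key factors that strongly affect indexing performance. It then describes 30 Korean stop words, the indexing schemes, and the relevant characteristics of the Korean language, and on that basis proposes a new indexing method for Korean information retrieval that combines the features of each indexing method. Simulation experiments show that the proposed method performs better.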

Keywords: Korean, morphological analysis, indexing method, n-gram method, stop words

Abstract: Based on a thorough analysis of Korean information search systems, this paper investigates indexing methods for improving search performance. The advantages and shortcomings of typical indexing methods, such as noun unit indexing, morphological analysis indexing, n-gram unit indexing, and word segmentation unit indexing, are analyzed in detail, and the key factors that significantly affect search performance are identified through experimental analysis. Thirty Korean stop words, the indexing schemes used for search, and the relevant characteristics of Korean are also described. Finally, a new indexing method for Korean information search is proposed that combines the advantages of each indexing method. Simulation results show that the proposed method improves performance significantly and is promising.

Key words: Korean, Morphological analysis, Indexing method, N-gram method, Stop word
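
As a rough illustration of one of the classic methods named in the abstract, the sketch below builds a character bigram (n-gram unit) inverted index with stop-word filtering. It is a minimal sketch under stated assumptions, not the method proposed in the paper: the function names, the sample documents, and the three stop words are all hypothetical, and the paper's own list of thirty stop words is not reproduced here.

```python
from collections import defaultdict

# Illustrative stop words only (assumption); not the paper's list of thirty.
STOP_WORDS = {"그리고", "그러나", "또는"}


def char_ngrams(token, n=2):
    """Return the overlapping character n-grams of a token."""
    if len(token) <= n:
        return [token]
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def build_index(docs, n=2):
    """Map each n-gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            if token in STOP_WORDS:
                continue  # stop words are never indexed
            for gram in char_ngrams(token, n):
                index[gram].add(doc_id)
    return index


def search(index, query, n=2):
    """Return ids of documents that contain every n-gram of the query."""
    grams = [
        gram
        for token in query.split()
        if token not in STOP_WORDS
        for gram in char_ngrams(token, n)
    ]
    if not grams:
        return set()
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result


# Hypothetical sample documents; document 3 writes the compound without a space.
docs = {
    1: "정보 검색 그리고 색인",
    2: "형태소 분석 색인",
    3: "정보검색 시스템",
}
index = build_index(docs)
print(search(index, "정보 검색"))
```

In this toy example the query matches both the spaced form in document 1 and the unspaced compound in document 3, which is one reason n-gram units are attractive for Korean, where spacing varies. The trade-off is a larger index and possible false matches, which noun-unit or morpheme-unit indexing avoid at the cost of requiring morphological analysis.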
