Computer Science (计算机科学) ›› 2012, Vol. 39 ›› Issue (Z6): 232-234.


Quantitative Analysis and Extraction Algorithm for Word Collocations Based on Typical Sentence Patterns

Wang Lu, Zhang Yangsen

  1. (Beijing Information Science and Technology University, Beijing 100192)
  • Published: 2018-11-16 Online: 2018-11-16

Abstract: After analyzing the shortcomings of existing algorithms for automatically extracting word collocations, this paper proposes a new collocation extraction algorithm that attempts to convert unstructured linguistic knowledge into structured linguistic knowledge. Drawing on linguistic knowledge of collocations, a collocation model based on typical sentence patterns is built: collocations are classified by their center word (verb, noun, or adjective) and extracted with content words as the backbone. Statistical measures such as co-occurrence frequency and mutual information (MI) are then used to filter the candidates over a large-scale corpus, and the resulting collocation knowledge is consolidated into a collocation knowledge base.

Key words: Collocation, Typical patterns, Mutual information (MI), Collocation database
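The statistical filtering step named in the abstract (co-occurrence frequency plus mutual information) can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, thresholds, or sentence-pattern templates, so everything here is an assumption: the function name `extract_collocations`, the `min_freq` and `min_pmi` cutoffs, the use of pointwise mutual information, and the toy verb-object pairs, which stand in for candidates that a typical sentence pattern might yield.

```python
import math
from collections import Counter


def extract_collocations(pairs, min_freq=2, min_pmi=0.5):
    """Keep candidate (center_word, collocate) pairs that pass both a
    co-occurrence frequency cutoff and a pointwise mutual information cutoff.

    The cutoffs are illustrative defaults, not values from the paper.
    """
    pair_counts = Counter(pairs)
    center_counts = Counter(center for center, _ in pairs)
    collocate_counts = Counter(collocate for _, collocate in pairs)
    total = len(pairs)

    kept = {}
    for (center, collocate), freq in pair_counts.items():
        if freq < min_freq:
            continue  # too rare to be a reliable collocation
        # PMI = log2( P(x, y) / (P(x) * P(y)) ), estimated from counts
        pmi = math.log2(
            freq * total / (center_counts[center] * collocate_counts[collocate])
        )
        if pmi >= min_pmi:
            kept[(center, collocate)] = (freq, pmi)
    return kept


# Hypothetical candidates, as if matched by a verb-object pattern.
candidates = (
    [("raise", "level")] * 4
    + [("raise", "problem")] * 1
    + [("solve", "problem")] * 4
    + [("solve", "level")] * 1
)
result = extract_collocations(candidates)
for (center, collocate), (freq, pmi) in sorted(result.items()):
    print(f"{center} + {collocate}: freq={freq}, PMI={pmi:.3f}")
```

In this sketch the surviving pairs, with their frequency and score, are what would be stored as entries in a collocation knowledge base; a real system would draw candidates from a large part-of-speech-tagged corpus and group them by the center word's category (verb, noun, or adjective).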
