Computer Science ›› 2010, Vol. 37 ›› Issue (3): 230-233.

• Artificial Intelligence •

Statistical Analysis for Chinese-English Verb Subcategorization

HAN Xi-wu, ZHAO Tie-jun

  1. (School of Computer Science and Technology, Heilongjiang University, Harbin 150080); (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001)
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (60773069, 60873169).

Abstract: Based on a large-scale, sentence-aligned Chinese-English parallel corpus, this paper describes a systematic experiment in the statistical analysis of correspondence types between Chinese and English verb subcategorization frames. First, with linguistic measures (lexical and grammatical compatibility) as heuristics, the probability distribution of 654 Chinese-English subcategorization correspondence types is preliminarily estimated by means of a two-fold MLE filtering method. Then, the correspondence types are classified linguistically according to the syntactic characteristics of Chinese and English. Finally, each correspondence type, together with its supporting corpus, is labeled with a linguistic class via a support vector machine (SVM) and analyzed for statistical reliability.

Key words: Chinese-English verb subcategorization, Statistical analysis, SVM
