Computer Science, 2015, Vol. 42, Issue (7): 291-294. doi: 10.11896/j.issn.1002-137X.2015.07.062

Automatic Identification and Rule Mining for Relation Words of Chinese Compound Sentences Based on Bayesian Model

YANG Jin-cai, GUO Kai-kai, SHEN Xian-jun, HU Jin-zhu

  • Online: 2018-11-14   Published: 2018-11-14

Abstract: The compound sentence is an important unit of Chinese text, and annotating compound sentences is important for research on Chinese text comprehension. Identifying relation words is the basis of compound sentence annotation. Based on a comprehensive analysis of a corpus of Chinese compound sentences, this paper extracts features of relation words from their context and collocations and describes those features with formulas. Mutual information is combined with information gain to select features and eliminate redundant ones. A Bayesian model is then trained and tested on the resulting feature sets. Rules are derived from the statistical results and collected into a rule base, which is used for the automatic identification of relation words. The experimental results show that the method achieves high identification accuracy, which supports its feasibility and effectiveness.

Key words: Relation words, Bayesian, Rules, Automatic identification
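
The pipeline summarized in the abstract (score features, discard weak ones, train a Bayesian classifier) can be illustrated with a minimal sketch. The abstract gives neither the feature formulas, the exact way mutual information and information gain are combined, nor the type of Bayesian model, so everything here is an assumption made for illustration: the three binary features, the toy data, the 0.5 threshold, the use of information gain alone for selection, and the choice of a Bernoulli naive Bayes classifier with Laplace smoothing.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """H(labels) - H(labels | feature) for one discrete feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional

def train_naive_bayes(rows, labels):
    """Bernoulli naive Bayes over binary features, with Laplace smoothing."""
    model = {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        prior = len(members) / len(rows)
        probs = [(sum(r[j] for r in members) + 1) / (len(members) + 2)
                 for j in range(len(rows[0]))]
        model[c] = (prior, probs)
    return model

def predict(model, row):
    """Return the class with the highest log posterior score."""
    def log_score(c):
        prior, probs = model[c]
        return math.log(prior) + sum(
            math.log(p if x else 1 - p) for p, x in zip(probs, row))
    return max(model, key=log_score)

# Hypothetical toy data: each row describes one candidate word with three
# invented binary features (has a paired collocate, follows a comma,
# clause-initial). Label 1 = relation word, 0 = not a relation word.
rows = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 1, 1], [0, 0, 0]]
labels = [1, 1, 1, 0, 0, 0]

gains = [information_gain([r[j] for r in rows], labels) for j in range(3)]
selected = [j for j, g in enumerate(gains) if g > 0.5]  # assumed threshold

reduced = [[r[j] for j in selected] for r in rows]
model = train_naive_bayes(reduced, labels)
prediction = predict(model, [1])  # candidate that has a paired collocate
```

On this toy data only the first feature separates the two classes, so it is the only one that survives selection. A real system would compute such scores over corpus-derived features and would turn the resulting statistics into rules, as the abstract describes.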

