Computer Science ›› 2015, Vol. 42 ›› Issue (10): 239-243.

• Artificial Intelligence •

Research on Methods for Identifying Biomedical Event Triggers

WEI Xiao-mei, HUANG Yu, CHEN Bo, JI Dong-hong

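  1. School of Computer Science, Wuhan University, Wuhan 430072; College of Informatics, Huazhong Agricultural University, Wuhan 430070, College of Informatics, Huazhong Agricultural University, Wuhan 430070, School of Computer Science, Wuhan University, Wuhan 430072, School of Computer Science, Wuhan University, Wuhan 430072
  • Publication date: 2018-11-14 Release date: 2018-11-14
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61202304,5,61202193) and a major bidding project of the National Philosophy and Social Sciences Program (11&ZD189)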

Research on Tagging Biomedical Event Trigger

WEI Xiao-mei, HUANG Yu, CHEN Bo and JI Dong-hong   

  • Online:2018-11-14 Published:2018-11-14

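Abstract: Extracting biomedical events from the biomedical literature plays an important role in knowledge mining for the biomedical domain, and identifying event triggers is a key step in biomedical event extraction. The system builds several word-level CRF models for tagging biomedical event triggers, using, respectively, lexical and contextual features, phrase label features, word cluster features, and statistically derived dictionary features. It then selects the best-performing tagging model for each trigger type and combines the selected models into a hybrid CRF model. Experimental evaluation on the BioNLP 2009 ST corpus shows that the proposed method performs well and lays a good foundation for biomedical event extraction.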

Key words: Biomedical event, Trigger, CRF model, Brown Cluster, Feature
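
The feature groups named in the abstract can be illustrated with a small word-level feature extractor. This is only a sketch under stated assumptions, not the authors' implementation: the function name, the feature names, the context window of 2, the Brown cluster prefix lengths, and the use of a lowercased surface form in place of a true lemma are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Sequence


def token_features(
    tokens: List[str],
    chunk_tags: List[str],
    i: int,
    clusters: Optional[Dict[str, str]] = None,
    trigger_dict: Optional[Dict[str, float]] = None,
    window: int = 2,
    prefix_lengths: Sequence[int] = (4, 6, 10),
) -> Dict[str, object]:
    """Build a feature dict for tokens[i]; all feature names are hypothetical."""
    clusters = clusters or {}
    trigger_dict = trigger_dict or {}
    lower = tokens[i].lower()  # assumption: lowercased form stands in for the lemma

    feats: Dict[str, object] = {
        "bias": 1.0,
        "word.lower": lower,
        "chunk": chunk_tags[i],  # phrase (chunk) label of the current token
    }

    # Context features: neighbouring words inside the window, padded at the edges.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        inside = 0 <= j < len(tokens)
        feats[f"word[{offset:+d}]"] = tokens[j].lower() if inside else "<PAD>"

    # Word cluster features: prefixes of the Brown cluster bit string, if known.
    bits = clusters.get(lower)
    if bits is not None:
        for n in prefix_lengths:
            feats[f"cluster[:{n}]"] = bits[:n]

    # Dictionary feature: an assumed score, e.g. how often the word was
    # annotated as a trigger in the training data.
    if lower in trigger_dict:
        feats["dict.score"] = trigger_dict[lower]

    return feats
```

A sequence of such dicts, one per token, is the usual input shape for CRF toolkits that accept sparse feature dictionaries; which toolkit the authors used is not stated on this page.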

Abstract: Event extraction from biomedical literature plays an important role in knowledge mining in the biomedical domain. Trigger identification is a key step in biomedical event extraction. We used rich features, including lemma, context, phrase label, word cluster and a learned trigger dictionary, to build several kinds of CRF models. We then chose the best model for each trigger type and combined the chosen models into a hybrid model. Evaluation on the BioNLP 2009 ST data set shows that our approach achieves good performance, which lays a foundation for biomedical event extraction.

Key words: Biomedical event, Trigger, CRF model, Brown Cluster, Feature
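
The per-type model selection described in the abstract might be sketched as follows. The abstract does not say how the best model is chosen or how the selected outputs are merged, so this sketch assumes selection by per-type F1 on a held-out development set and a simple merge in which the first selected label to claim a token wins. All function names and model names are hypothetical.

```python
from typing import Dict, List


def f1_for_label(gold: List[str], pred: List[str], label: str) -> float:
    """Token-level F1 of one trigger type."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_best_models(
    gold: List[str], predictions: Dict[str, List[str]], labels: List[str]
) -> Dict[str, str]:
    """Map each trigger type to the model name with the highest F1 for it."""
    best: Dict[str, str] = {}
    for label in labels:
        # sorted() makes tie-breaking deterministic (alphabetical by model name).
        best[label] = max(
            sorted(predictions),
            key=lambda name: f1_for_label(gold, predictions[name], label),
        )
    return best


def hybrid_predict(
    predictions: Dict[str, List[str]], best: Dict[str, str], outside: str = "O"
) -> List[str]:
    """Merge tags: each type is taken only from its selected model.

    Assumption: if two selected models claim the same token, the type that
    appears first in `best` is kept.
    """
    length = len(next(iter(predictions.values())))
    merged = [outside] * length
    for label, name in best.items():
        for i, tag in enumerate(predictions[name]):
            if tag == label and merged[i] == outside:
                merged[i] = label
    return merged
```

In use, `select_best_models` would be run once on development-set predictions, and the resulting mapping would then be passed to `hybrid_predict` together with each model's predictions on new text.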

