Computer Science, 2019, Vol. 46, Issue (2): 18-23. doi: 10.11896/j.issn.1002-137X.2019.02.003

Special topic: Bioinformatics

• Big Data & Data Science •

BioPW+: Biological Pathway Data Visualization System Based on Linked Data

LIU Yuan1,2, WANG Xin1,2, GAN Ying1, YANG Chao-zhou1, LI Wei-xi1

  1. School of Computer Science and Technology, Tianjin University, Tianjin 300050, China
  2. Tianjin International Engineering Institute, Tianjin University, Tianjin 300072, China
  • Received: 2017-12-04 Online: 2019-02-25 Published: 2019-02-25
  • Corresponding author: WANG Xin (born 1981), male, Ph.D., associate professor, CCF senior member. His main research interests are graph databases, knowledge graph data management, and big data management. E-mail: wangx@tju.edu.cn
  • About the authors: LIU Yuan (born 1992), male, master's student, CCF member; main research interest: biological pathway data integration. GAN Ying (born 1994), male, master's student; main research interest: graph databases. YANG Chao-zhou (born 1995), male; main research interest: knowledge graphs. LI Wei-xi (born 1995), male; main research interest: knowledge graphs.
  • Supported by: the National Natural Science Foundation of China (61572353) and the Natural Science Foundation of Tianjin (17JCYBJC15400).

Abstract: Since the Linked Data project was proposed, a large amount of open linked data has been published on the Semantic Web, including many biological pathway datasets. To help biologists use these open datasets effectively, this paper studies a biological pathway data visualization system based on Linked Data. It proposes a biological pathway visualization model and display layout schemes, uses dynamic identifier mapping to support browsing of multi-source biological pathway data, and develops BioPW+, a Linked Data-based query and visualization system for biological pathway data. BioPW+ applies Semantic Web technologies: it uses SPARQL queries to locate the basic information of a biological pathway, then retrieves detailed information about the pathway's elements through the Open PHACTS platform. The Web interface displays the pathway data with a force-directed graph layout and a Sankey diagram layout and provides a variety of interactive operations. Unlike existing biological pathway tools that rely on a single specific database, BioPW+ is based on Linked Data and can display, in one view, pathway data from multiple datasets together with related biochemical data, which saves time and improves data completeness.

Key words: Biological pathway, Linked Data, Semantic Web, Visualization
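
The abstract names a force-directed graph layout as one of the two display layouts. The sketch below illustrates that general technique in a Fruchterman-Reingold style; it is not taken from BioPW+, whose implementation is not described here. The function name `force_directed_layout`, its parameters, the cooling schedule, and the toy node names are all assumptions made for illustration.

```python
import math
import random


def force_directed_layout(nodes, edges, iterations=200, width=1.0, seed=0):
    """Return {node: (x, y)} from a basic Fruchterman-Reingold-style layout.

    Hypothetical helper for illustration only; not BioPW+'s code.
    `nodes` is a list of names, `edges` a list of (name, name) pairs.
    """
    if not nodes:
        return {}
    rng = random.Random(seed)  # fixed seed keeps the layout reproducible
    pos = {n: [rng.uniform(0.0, width), rng.uniform(0.0, width)] for n in nodes}
    k = width / math.sqrt(len(nodes))  # preferred distance between linked nodes
    temperature = width / 10.0         # cap on how far a node moves per step

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Every pair of nodes repels with magnitude k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                disp[u][0] += dx / dist * push
                disp[u][1] += dy / dist * push
                disp[v][0] -= dx / dist * push
                disp[v][1] -= dy / dist * push

        # Linked nodes attract with magnitude distance^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            disp[u][0] -= dx / dist * pull
            disp[u][1] -= dy / dist * pull
            disp[v][0] += dx / dist * pull
            disp[v][1] += dy / dist * pull

        # Move each node along its net force, limited by the temperature.
        for n in nodes:
            fx, fy = disp[n]
            length = math.hypot(fx, fy)
            if length > 0.0:
                step = min(length, temperature)
                pos[n][0] += fx / length * step
                pos[n][1] += fy / length * step
        temperature *= 0.95  # cool down so the layout settles

    return {n: (p[0], p[1]) for n, p in pos.items()}


# Toy pathway graph with made-up node names.
nodes = ["pathway", "geneA", "geneB", "compoundX"]
edges = [("pathway", "geneA"), ("pathway", "geneB"), ("geneA", "compoundX")]
layout = force_directed_layout(nodes, edges)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

In this sketch, two linked nodes settle at the distance where repulsion and attraction balance, which is `k`. A Web front end would more likely delegate this step to a visualization library than hand-roll it; the loop is spelled out only to make the mechanics visible.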

CLC number:

  • TP311.13