Computer Science (计算机科学), 2021, Vol. 48, Issue (1): 326-332. doi: 10.11896/jsjkx.191200030

• Interdiscipline & Frontier •

Highly Available Elastic Computing Platform for Metagenomics

HE Zhi-peng1,2, LI Rui-lin1, NIU Bei-fang1,2

  1 Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2 University of Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2019-12-03 Revised: 2020-03-09 Online: 2021-01-15 Published: 2021-01-15
  • Corresponding author: NIU Bei-fang (niubf@cnic.cn)
  • About author: HE Zhi-peng (hezhipeng@cnic.cn), born in 1995, postgraduate. His main research interests include high-performance computing and bioinformatics.
    NIU Bei-fang, born in 1978, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include high-performance computing and bioinformatics.
  • Supported by:
    National Key R&D Program of China (2016YFC0503607), National Natural Science Foundation of China (31771466) and CAS 100-Talents (Dr. Beifang Niu).

Abstract: Next generation sequencing (NGS) has significantly advanced metagenomics through its low cost and ultra-high throughput, but it has also brought great challenges to researchers in the field, because the resulting large-scale, high-complexity sequencing data are difficult to process. On the one hand, analyzing large-scale sequencing data consumes substantial resources, such as hardware and time. On the other hand, the analysis inevitably involves a large number of metagenomics computational analysis tools that are difficult for ordinary users to deploy, debug and maintain on their own. This paper compares the mainstream metagenomics computing platforms in the field and analyzes the main advantages and disadvantages of each. Combining currently effective computing service technologies, it then builds MWS-MGA (More than a Web Service for Metagenomic Analysis), a highly available elastic computing platform focused on metagenomics computational analysis. By providing multiple interactive access methods together with rich and flexible computing tools, MWS-MGA considerably lowers the threshold for researchers to analyze metagenomics NGS data.

Key words: Computing platform, Elastic, High availability, Metagenomics, Microservices

CLC Number:

  • TP399