计算机科学 ›› 2015, Vol. 42 ›› Issue (Z11): 390-395.

• 高性能与云计算 • 上一篇    下一篇

基于异构系统的生物序列比对并行处理研究进展

朱香元,李仁发,李肯立,胡忠望   

  1. 肇庆学院计算机学院 肇庆526061,湖南大学信息科学与工程学院 长沙410082,湖南大学信息科学与工程学院 长沙410082;国家超级计算长沙中心 长沙410082,肇庆学院计算机学院 肇庆526061
  • 出版日期:2018-11-14 发布日期:2018-11-14
  • 基金资助:
    本文受国家自然科学基金重点项目(61133005,5),国家自然科学基金青年基金(61402400),广东省自然科学基金(2014A030310288)资助

Advances in Biological Sequence Alignment Parallel Processing Based on Heterogeneous Systems

ZHU Xiang-yuan, LI Ren-fa, LI Ken-li and HU Zhong-wang   

  • Online:2018-11-14 Published:2018-11-14

摘要: 序列比对工作属于生物信息学的基础性研究领域。由于它具有应用广泛、计算复杂以及海量数据等特点,加之现在高性能计算的兴起,使得近年来序列比对并行处理技术快速发展。首先介绍了序列比对领域高性能计算的新进展,接着从体系结构特征入手对其研究进行分类,并对每类方法的实现细节和性能进行分析比较,从中不难看出访存控制、同步、数据交互以及算法可扩展性等问题均为目前基于异构系统的序列比对并行处理研究的关键点。最后,对该领域的未来研究方向进行了展望。

关键词: 序列比对,并行处理,异构,GPU,FPGA,Cell BE,MIC

Abstract: Sequence alignment is a fundamental operation in Bioinformatics.In recent years,there have been extensive studies and rapid progresses in biological sequence alignment parallel processing technologies due to its wide use,high computational complexity and large-scale data.This paper first analyzed the new development of high performance computing in sequence alignment,and then classified the up-to-date researches based on the applied architectures.Their implementation and performance were compared and analyzed in detail.It was pointed out that problems such as memory access control,synchronization,data transferring and scalability of algorithms are the key techniques to the study of biological sequence alignment parallel processing.Finally,some future directions of research were given.

Key words: Sequence alignment,Parallel processing,Heterogeneity,GPU,FPGA,Cell BE,MIC

[1] Needleman S B,Wunsch C D.A general method applicable to the search for similarities in the amino acid sequence of two proteins [J].Journal of Molecular Biology,1970,48(3):443-453
[2] Pearson WR,Lipman DJ.Rapid and sensitive protein similarity searches [J].Science,1985,227(4693):1435-1441
[3] Zhu Xiang-yuan,Li Ken-li,Salah A.A Data Parallel Strategy for Aligning Multiple Biological Sequences on Multi-Core Compu-ters[J].Computers in Biology and Medicine,2013,43(4):350-361
[4] Zhu Xiang-yuan,Li Ken-li,Salah A,et al.Cluster-Distribute-Align-Merge:A General Algorithm to Speed Up Multiple Sequence Alignment on Multi-Core Computers[J].Journal of Computational and Theoretical Nanoscience,2014,1(4):1000-1006
[5] 张法,乔香珍,刘志勇.基于Smith-Waterman算法的并行分而治之生物序列比对算法[J].中国科学E辑技术科学,2004,34(2):190-199
[6] Kim T,Joo H.Clustalxeed:a GUI-based grid computation version for high performance and terabyte size multiple sequence alignment [J].BMC Bioinformatics,2010,11(38):467
[7] Krishna M R,Benedict P,David H.Meta-Alignment with Crumble and Prune Partitioning very large alignment problems for performance and parallelization [J].BMC Bioinformatics,2011,12(5):1447-1448
[8] 刘立芳,霍红卫,王宝树.PHGA-COFFEE:多序列比对问题的并行混合遗传算法求解[J].计算机学报,2006,29(5):727-733
[9] Paolo D T,Miquel O,Fernando G,et al.Cloud-Coffee:imple-mentation of a parallel consistency-based multiple alignment algorithm in the T-Coffee package and its benchmarking on the amazon elastic-cloud [J].Bioinformatics,2010,26(15):1903-1904
[10] Zhu Xiang-yuan,Li Ken-li,Salah A,et al.Parallel Implementation of MAFFT on CUDA-Enabled Graphics Hardware[J].IEEE/ACM Transactions on Computational Biology and Bioinformatics,2015,12(1):205-218
[11] 夏飞.生物序列分析算法硬件加速器关键技术研究[D].长沙:国防科学技术大学,2011
[12] Yang S,Gregory M S,Ali A.Performance Analysis of IBM Cell Broadband Engine on Sequence Alignment [C]∥Proc of IEEE AHS’09.San Francisco,California:IEEE,2009:439-446
[13] David D,Francisco J E,Pilar H,et al.Parallelizing and optimizing a bioinformatics pairwise sequence alignment algorithm for many-core architecture [J].Parallel Computing,2011,37(45):244-259
[14] 叶笑春,林伟,范东睿,等.蛋白质序列比对算法在众核结构上的并行优化[J].软件学报,2010,21(12):3094-3105
[15] Liu Y,Huang W,Johnson J,et al.GPU accelerated Smith-Waterman [M]∥Computational Science-ICCS 2006.2006:188-195
[16] Liu W G,Bertil S,Gerrit V,et al.Streaming algorithms for biological sequence alignment on GPUs [J].IEEE Trans on Parallel and Distrib Systems,2007,18(9):1270-1281
[17] Manavski S A,Valle G.CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment [J].BMC Bioinformatics,2008,9(1):1-9
[18] Liu Y,Maskell D,Schmidt B.CUDASW++:Optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units [J].BMC Research Notes,2009,2(1):1-10
[19] Liu Y,Schmidt B,Maskell D.CUDASW++2.0:Enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions [J].BMC Research Notes,2010,3(11):93
[20] 林江,唐敏,童若锋.GPU加速的生物序列比对[J].计算机辅助设计与图形学学报,2010,22(3):420-427
[21] Saeed A K,Poole S,Perot J B.Acceleration of the Smith-Waterman algorithm using single and multiple graphics processors [J].Journal of Computational Physics,2010,229(11):4247-4258
[22] Laiq H,Marijn K,Zaid A.DOPA:GPU-based protein alignment using database and memory access optimizations [J].BMC Research Notes,2011,4(1):1-11
[23] Michael C S,Cole T,Arthur L D,et al.High-throughput sequence alignment using Graphics Processing Units [J].BMC Bioinformatics,2007,8(1):141-144
[24] Trapnell C,Schatz M C.Optimizing data intensive GPGPU computations for DNA sequence alignment [J].Parallel Computing,2009,35(8):429-440
[25] Panagiotis D V,Nikolaos V S.GPU-BLAST:using graphics processors to accelerate protein sequence alignment [J].Bioinformatics,2010,27(2):182-188
[26] Weiguo L,Bertil S,Wolfgang M W.CUDA-BLASTP:Accelerating BLASTP on CUDA-Enabled Graphics Hardware [J].IEEE/ACM Transations on Computational Biology And Bioinformatics,2011,8(6):1678-1684
[27] 裴颂文,王心怡,韦刚.等.基于多核流处理器的BLAST并行化算法研究[J].系统仿真学报,2011,23(10):2065-2069
[28] Lavenier D.Speeding up Genome Computations with a Systolic Accelerator [J].SIAM News,1998,31(8):6-7
[29] Puttegowda K,Worek W,Pappas N,et al.A Run-Time Reconfigurable System for Gene-Sequence Searching[C]∥Proc of the 16th Int Conf on VLSI Design.New Delhi,India:IEEE,2003:561-566
[30] Marongiu A,Palazzari P,Rosato V.A Specialized Hardware Device for the Protein Similarity Search [J].Concurrency and Computation:Practice and Experience,2004,16(9):917-931
[31] 汪冬,唐志敏.Smith-Waterman算法在脉动阵列上的实现及分析[J].计算机学报,2004,27(1):12-20
[32] Oliver T F,Schmidt B,Maskell D L.Hyper Customized Processors for Bio-Sequence Database Scanning on FPGAs[C]∥Proc of ACM/SIDA FPGA’05.Monterey,California:ACM,2005:229-237
[33] Bias A D,Dahle D M,Diekhans M,et al.The UCSC KestrelParallel Processor [J].IEEE Trans on Parallel and Distributed Systems,2005,16(1):80-92
[34] Zhang P,Tan G,Gao GR.Implementation of the Smith-Waterman Algorithm on a Reconfigurable Supercomputing Platform[C]∥Proc of ACM HPRCTA’07.Reno,Nevada:ACM,2007:39-48
[35] Jiang X,Liu X,Xu L,et al.A Reconfigurable Accelerator for Smith-Waterman Algorithm [J].IEEE Trans on Circuits and Systems II,2007,54(12):1077-1081
[36] Abouellail M,El-Araby E,Taher M,et al.DNA and Protein Sequence Alignment with High Performance Reconfigurable Systems[C]∥Proc of NASA/ESA Conf on Adaptive Hardware and Systems.Edinburgh,UK:IEEE,2007
[37] Harris B,Jacob A C,Lancaster JM,et al.A Banded Smith-Waterman FPGA Accelerator for Mercury BlastP[C]∥Proc of Int Conf on Field Programmable Logic and Applications.Amsterdam,Netherlands:IEEE,2007:765-769
[38] Tom V T,Martin C H.Families of FPGA-based accelerators for approximate string matching [J].Microprocessors and Micro-systems,2007,31(2):135-145
[39] Khaled B,Ying L,AbdSamad B.A highly Parameterized and efficient FPGA-based skeleton for pairwise biological sequence alignment [J].IEEE Trans on Very Large Scale Integration.2009,7(4):561-570
[40] Yoshiki Y,Hung K T,Wayne L.FPGA-Based Smith-Waterman Algorithm:Analysis and Novel Design [M]∥ Reconfigurable Computing:Architectures,Tools and Applications.2011:181-192
[41] Nuno S,Nuno R,Paulo F.Hardware accelerator architecture for simultaneous short-read DNA sequences alignment with enhanced traceback phase [J].Microprocessors and Microsystems,2012,36(2):96-109
[42] Azzedine B,Jan MC,Alba de M,et al.A Hardware Accelerator for the Fast Retrieval of DIALIGN Biological Sequence Alignments in Linear Space [J].IEEE Transactions on Computers,2010,59(6):808-821
[43] Steven D,Patrice Q.Hardware Acceleration of HMMER on FPGAs [J].J Sign Process Syst,2010,58(1):53-67
[44] Tim O,Bertil S,Darran N,et al.Using recon?gurable hardware to accelerate multiple sequence alignment with ClustalW [J].Bioinformatics,2005,21(6):3431-3432
[45] Scott L,Quinn O S.Accelerated large-scale multiple sequence alignment [J].BMC Bioinformatics,2011,12(29):466
[46] Vipin S,Michael K,Evan S,et al.Exploring the viability of the Cell Broadband Engine for bioinformatics applications[C]∥Proc of IEEE IPDPS’07.California:IEEE,2007
[47] Vipin S,Michael K,Evan S,et al.Exploring the viability of the Cell Broadband Engine for bioinformatics applications [J].Pa-rallel Computing,2008,34(11):616-626
[48] Adam S,Christian L,Philipp K,et al.SWPS3-fast multi-threaded vectorized Smith-Waterman for IBM Cell/B.E.and x86/SSE2 [J].BMC Research Notes,2008,1(1):107
[49] Adrianto W,Bertil S,Huiliang Zh,et al.High performance protein sequence database scanning on the Cell Broadband Engine [J].Scientific Programming,2009,17(12):97-111
[50] Abhinav S,Srinivas A.Parallel genomic alignments on the Cell Broadband Engine [J].IEEE Transactions on Parallel and Distributed Systems,2009,20(10):1600-1610
[51] Hans V,Sean R,Koen D B.Accelerating multiple sequencealignment with the Cell BE processor [J].The Computer Journal,2010,53(6):814-826
[52] Adrianto W,Chee K K,Bertil S.Multi-threaded vectorized distance matrix computation on the CELL/BE and x86/SSE2 architecture [J].Bioinformatics,2010,26(10):1368-1369
[53] Aarti S,Chen C,Liu W G,et al.A Hybrid Computational Grid Architecture for Comparative Genomics [J].IEEE Transactions on Information Technology in Biomedicine,2008,12(2):218-225
[54] Meng X D,Vipin C.A High-Performance Heterogeneous Computing Platform for Biological Sequence Analysis [J].IEEE Transactions on Parallel and Distributed Systems,2010,21(9):1267-1280
[55] 卢风顺,宋君强,银福康,等.CPU/GPU协同并行计算研究综述[J].计算机科学,2011,38(3):5-9,46

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!