Computer Science (计算机科学) ›› 2018, Vol. 45 ›› Issue (12): 308-312. doi: 10.11896/j.issn.1002-137X.2018.12.049
• Interdisciplinary and Frontier Topics •
武思文, 李静, 张少强
WU Si-wen, LI Jing, ZHANG Shao-qiang
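Abstract: Transcriptome assembly is an important component of genome sequencing and functional annotation. To improve the accuracy and efficiency of transcriptome assembly, this paper proposes StepLink, a new de novo transcriptome assembly algorithm. Its main innovation is the introduction of the concepts of the leftmost k-mer (a k-mer being a short sequence of length k) and the right k-mer, together with a double hash table that stores every pair of adjacent k-mers, which makes assembly faster and more accurate. The algorithm was applied separately to human, dog, and mouse sequencing data from the SRA database, and the results show that it is more efficient than other existing algorithms.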
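The abstract does not spell out how the double hash table is laid out, so the following is only a minimal sketch of one plausible reading, not the StepLink implementation. It assumes that "adjacent" k-mers overlap by k-1 bases within a read, that the table is a two-level map (left k-mer, then right k-mer, then count), and that a "leftmost" k-mer is one that never appears as the right member of any pair. The function names `build_pair_table`, `leftmost_kmers`, and `extend_right` are hypothetical.

```python
from collections import defaultdict


def adjacent_kmer_pairs(read, k):
    """Yield (left, right) k-mers that overlap by k-1 bases in one read."""
    for i in range(len(read) - k):
        yield read[i:i + k], read[i + 1:i + k + 1]


def build_pair_table(reads, k):
    """Two-level hash table (assumed layout): left k-mer -> right k-mer -> count."""
    table = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for left, right in adjacent_kmer_pairs(read, k):
            table[left][right] += 1
    return table


def leftmost_kmers(table):
    """Assumed definition: k-mers that have a successor but no predecessor."""
    rights = {r for successors in table.values() for r in successors}
    return sorted(left for left in table if left not in rights)


def extend_right(table, seed, max_steps=1000):
    """Greedily extend a seed k-mer by following its most frequent right k-mer."""
    contig = seed
    current = seed
    seen = {seed}
    for _ in range(max_steps):
        successors = table.get(current)  # .get avoids creating empty entries
        if not successors:
            break
        # Highest count wins; ties are broken lexicographically for determinism.
        nxt = max(successors, key=lambda km: (successors[km], km))
        if nxt in seen:  # stop on a cycle instead of looping forever
            break
        contig += nxt[-1]
        seen.add(nxt)
        current = nxt
    return contig


if __name__ == "__main__":
    table = build_pair_table(["ACGTTGCA"], k=4)
    for seed in leftmost_kmers(table):
        print(seed, extend_right(table, seed))
```

Keeping each pair in a nested map means a single lookup on the current k-mer returns all of its observed right neighbours with their counts, which is the kind of constant-time step the abstract credits for faster assembly. The greedy extension here is a stand-in for whatever path selection the paper actually uses.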