Computer Science, 2018, Vol. 45, Issue (12): 308-312. doi: 10.11896/j.issn.1002-137X.2018.12.049

• Interdiscipline & Frontier •

De Novo Transcriptome Assembly Algorithm Based on RNA-Seq Datasets

WU Si-wen, LI Jing, ZHANG Shao-qiang   

  1. (College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China)
  • Received: 2017-10-30 Online: 2018-12-15 Published: 2019-02-25

Abstract: Transcriptome assembly is an important part of genome sequencing and functional annotation. To improve the precision and efficiency of transcriptome assembly, this paper presents a new de novo transcriptome assembly algorithm called StepLink. Its main innovations are two concepts, the leftmost k-mer and the right k-mer (a k-mer being a short sequence of length k), and the use of a hash-of-hashes table to store k-mer pairs, which makes assembly faster and more accurate. The algorithm was used to assemble human, dog and mouse datasets from the SRA database. The experimental results suggest that the proposed algorithm is more efficient than other existing algorithms.
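
The abstract does not define the leftmost and right k-mers or the exact layout of the hash-of-hashes table, so the following is only a minimal illustrative sketch under stated assumptions, not the StepLink implementation. It assumes that the outer key is a k-mer, the inner key is the k-mer that follows it by a one-base shift in some read, and the stored value is how often that pair was observed. The function names `build_kmer_pair_table` and `extend_right`, and the greedy extension rule, are hypothetical.

```python
from collections import defaultdict


def build_kmer_pair_table(reads, k):
    """Build a hash of hashes: table[left][right] = number of times the
    k-mer `right` directly follows the k-mer `left` (one-base shift).
    This pairing rule is an assumption, not taken from the paper."""
    table = defaultdict(lambda: defaultdict(int))
    for read in reads:
        # Consecutive windows overlap by k - 1 bases.
        for i in range(len(read) - k):
            left = read[i:i + k]
            right = read[i + 1:i + k + 1]
            table[left][right] += 1
    return table


def extend_right(table, seed, max_steps=1000):
    """Greedily extend a seed k-mer to the right by following the most
    frequent successor; stop at a dead end or a revisited k-mer."""
    contig = seed
    current = seed
    seen = {seed}
    for _ in range(max_steps):
        successors = table.get(current)  # .get avoids creating empty entries
        if not successors:
            break
        nxt = max(successors, key=successors.get)
        if nxt in seen:
            break
        contig += nxt[-1]  # each step adds exactly one new base
        seen.add(nxt)
        current = nxt
    return contig


# Toy reads (made up for illustration only).
reads = ["ACGTAC", "CGTACG", "GTACGT"]
table = build_kmer_pair_table(reads, k=3)
contig = extend_right(table, "ACG")
```

Nesting one hash table inside another gives constant-time average lookup of all observed successors of a k-mer, which is the kind of access pattern a step-by-step extension needs. A real assembler would also have to handle sequencing errors, reverse complements and alternative splicing, none of which this sketch attempts.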

Key words: De novo assembly algorithm, K-mer, RNA-Seq, Transcriptome

CLC Number: TP301.6