计算机科学 ›› 2019, Vol. 46 ›› Issue (2): 301-305.doi: 10.11896/j.issn.1002-137X.2019.02.046

• 交叉与前沿 • 上一篇    下一篇

基于彩色编码技术的准种重建算法

黄丹1, 吴璟莉1,2,3   

  1. 广西师范大学计算机科学与信息工程学院 广西 桂林5410041
    广西师范大学广西多源信息挖掘与安全重点实验室 广西 桂林 5410042
    广西区域多源信息集成与智能处理协同创新中心 广西 桂林 5410043
  • 收稿日期:2017-11-23 出版日期:2019-02-25 发布日期:2019-02-25
  • 通讯作者: 吴璟莉(1978-),女,博士,教授,硕士生导师,CCF会员,主要研究方向为生物信息学、算法设计与分析,E-mail:wjlhappy@mailbox.gxnu.edu.cn
  • 作者简介:黄 丹(1993-),女,硕士生,主要研究方向为生物信息学
  • 基金资助:
    本文受国家自然科学基金项目(61363035,61762015,61502111,61662007),广西自然科学基金项目(2015GXNSFAA139288),“八桂学者”工程专项,广西多源信息挖掘与安全重点实验室系统性研究基金项目(14-A-03-02,15-A-03-02),广西科技基地和人才专项(AD16380008)资助。

Quasispecies Reconstruction Algorithm Based on Color Coding Technology

HUANG Dan1, WU Jing-li1,2,3   

  1. College of Computer Science and Information Technology,Guangxi Normal University,Guilin,Guangxi 541004,China1
    Guangxi Key Laboratory of Multi-source Information Mining & Security,Guangxi Normal University,Guilin,Guangxi 541004,China2
    Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing,Guilin,Guangxi 541004,China3
  • Received:2017-11-23 Online:2019-02-25 Published:2019-02-25

摘要: 求解病毒准种单体型有助于了解其基因结构特点,对疫苗的研制及抗病毒治疗具有重要意义。文中通过引入模糊距离,构造一种带权的片段冲突图,并提出了基于彩色编码技术的病毒准种单体型重建算法CWSS。CWSS算法先根据给定阈值对片段冲突图进行预处理;然后根据顶点的边权和及饱和度取值为图中顶点着色,着色遵循相邻顶点颜色相异的原则,直至所有顶点完成着色;最后将相同颜色的顶点片段进行组装,得到准种单体型。CWSS算法的时间复杂度为O(m2n+mn) 。采用模拟测序片段数据进行实验测试,对CWSS算法和Dsatur算法的重建性能和质量进行对比分析。实验结果显示,相比于Dsatur算法,CWSS算法能获得更准确的准种单体型,具有更高的重建性能。

关键词: 彩色编码, 带权图, 单体型, 模糊距离, 准种

Abstract: The reconstruction of viral quasispecies haplotypes contributes to know about the structure of viral gene-tic,and is of great significance for vaccine preparation and antiviral therapy.In this paper,a weighted fragment conflict graph was constructed by introducing fuzzy distance,and a viral quasispecies haplotypes reconstruction algorithm CWSS was proposed based on color coding technology.Firstly,the CWSS algorithm preprocesses thefragment conflict graph in accordance with a given threshold.Secondly,under the condition that adjacent vertices must have different colors,all vertices of the graph are colored according to their sum of edge weigh and saturation value.Finally,quasispecies haplotypes are obtained by assembling the fragment with the same color.The time complexity of the CWSS algorithm is O(m2n+mn).Simulated sequencing fragment were adopted to compare the reconstruction performance and quality of the CWSS algorithm and the Dsatur one.The experimental results show that CWSS algorithm can obtain more accurate quasispecies and higher reconstruction performance than Dsatur algorithm.

Key words: Color coding, Fuzzy distance, Haplotype, Quasispecies, Weighted graph

中图分类号: 

  • TP301
[1]EIGEN M,SCHUSTER P.Hypercycle-principle of natural self-organization.a.emergence of hypercycle[J].Naturwissenschaften,1977,64(11):541-565.
[2]ANDINO R,DOMINGO E.Viral quasispecies[J].Virology, 2015,479-480(1):46-51.
[3]DOMINGO E,SHELDON J,PERALES C.Viral quasispecies evolution[J].Microbiology & Molecular Biology Reviews Mmbr,2012,76(2):159-216.
[4]COX R J,BROKSTAD K A,OGRA P.Influ-enza virus:immunity and vaccination strategies.Comparison of the immune response to inactivated and live,attenuated influenza vaccines[J].Scandinavian Journal of Immunology,2004,59(1):1-15.
[5]LAURING A S,ANDINO R.Quasispecies theory and the behavior of RNA viruses[J].Plos Pathogens,2010,6(7):e1001005.
[6]BU D,TANG H.Quasispecies reconstruction based on vertex coloring algorithms[C]∥IEEE International Conference on Bioinformatics and Biomedicine.Washington D.C:IEEE Co-mputer Society,2014:63-66.
[7]ERIKSSON N,PACHTER L,MITSUYA Y,et al.Viral popula- tion estimation using pyrosequencing[J].Plos Computational Bio-logy,2008,4(5):e1000074.
[8]MANCUSO N,TORK B,SKUMS P,et al.Viral Quasispecies Reconstruction from Amplicon 454 Pyrosequencing Reads[C]∥IEEE International Conference on Bioinformatics and Biomedicine Workshops.Washington D.C:IEEE Computer Society,2011:94-101.
[9]ZAGORDI O,BHATTACHARYA A,ERIKSSON N,et al. ShoRAH:estimating the genetic diversity of a mixed sample from next-generation sequencing data[J].Bmc Bioinformatics,2011,12(1):1-5.
[10]HUANG A,KANTOR R,DELONG A,et al.QColors:An algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads[J].Silico Biology,2011,11(5-6):193-201.
[11]BRON C,KERBOSCH J.Algorithm 457:finding all cliques of an undirected graph[J].Communications of the Acm,1973,16(9):575-576.
[12]METZKER M L.Sequencing technologies-the next generation[J].Nature Review Genetic,2010,11(1):31-46.
[13]POH W T,XIA E,CHININMANU K,et al.Viral quasispecies inference from 454 pyrose-quencing[J].BMC Bioinformatics,2013,14(1):355.
[14]ALTSCHUL S F,GISH W,MILLER W,et al.Basic local alignment search tool[J].Journal of Molecular Biology,1990,215(3):403-410.
[1] 刘胜久, 李天瑞, 谢鹏, 刘佳.
带权图的多重分形度量
Measure for Multi-fractals of Weighted Graphs
计算机科学, 2021, 48(3): 136-143. https://doi.org/10.11896/jsjkx.200700159
[2] 张倩,吴璟莉.
基于枚举策略的三倍体个体单体型重建算法
Triploid Individual Haplotype Reconstruction Algorithm Based on Enumeration Strategy
计算机科学, 2017, 44(1): 75-79. https://doi.org/10.11896/j.issn.1002-137X.2017.01.014
[3] 姚华传,王丽珍,陈红梅,胡新.
CP_SDD+RDS:基于分行排序单向检测求解最近对
CP_SDD+RDS:An Algorithm Based on Row-divided Sorting and Single-direction Detecting for Finding Closest Pair of Points
计算机科学, 2013, 40(7): 201-205.
[4] 周伟 王建新 谢民主 陈建二.
单体型组装问题计算模型的比较与分析

计算机科学, 2008, 35(11): 166-169.
[5] 刘云龙 王建新 陈建二.
彩色编码技术的研究进展及应用

计算机科学, 2008, 35(1): 15-18.
[6] 王强.
最短路径问题的若干算法的编程

计算机科学, 2004, 31(B07): 94-95.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!