计算机科学 ›› 2015, Vol. 42 ›› Issue (7): 108-113.doi: 10.11896/j.issn.1002-137X.2015.07.023
胡庆生 雷秀娟
HU Qing-sheng LEI Xiu-juan
摘要: 蛋白质相互作用(PPI)网络是生物信息学的一个新的研究领域。近年来马尔科夫(MCL)聚类算法在未知蛋白质的功能模块预测方面发挥了重要作用,但是聚类质量不高,为此提出了一种基于突变因子和惩罚因子及重新定义解释聚类结果的MCL聚类算法。该算法采用惩罚因子,惩罚质量较大的吸引子;采用突变因子在算法后期断绝初始转移概率对转移概率的束缚。算法在PPI网络数据集上进行了测试,结果表明该算法不但可以抑制小类的产生,而且聚类结果的质量在Avg.F方面相对于基本MCL算法提高了13.1%。
[1] Von Mering C,Krause R,Snel B,et al.Comparative assessment of large-scale data sets of protein-protein interactions [J].Nature,2002,417:399-403 [2] Watts D J,Strogatz S H.Collective dynamics of ‘small-world’ networks [J].Nature,1998,339:440-442 [3] Barabasi A L,Oltvai Z N.Network biology:understanding the cell’s functional organization [J].Nature Review Genetics,2004,5(2):101-103 [4] MacQueen J.Some methods for classification and analysis ofmultivariate observations [C]∥Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability.Berkeley,University of California Press,1967:281-297 [5] Guo Hai-xiang,Zhu Ke-Jun,Gao Si-wei,et al.An improved genetic K-means algorithm for optimal clustering [C]∥Procee-dings of 6th IEEE International Conference on Data Mining Workshops.Washington DC:IEEE Computer Society,2006:793-797 [6] Zhang T,Ramakrishnan R,Livny M.An efficient data clustering method for very large databases [C]∥Proceeding ACM SIGMOD Conference on Management of Data.Montreal,Canada,1996:103-114 [7] Ester M,Kriegel H P,Sander J,et al.A density-based algorithm for discovering clusters in Large Spatial Databases with Noise [C]∥Proceedings of 2nd International Conference on Know-ledge Discovery and Data Mining (KDD’96).Portland,OR,1996:226-231 [8] Ben-Hur A,Horn D,Siegelmann T H,et al.Support vectorclustering [J].Journal of Machine Learning Research,2002,2:125-137 [9] Tipping M E.The Relevance Vector Machine [M]∥Advances in Neural Information Processing Systems 12.2000:652-658 [10] Dongen S V.Graph Clustering by Flow Simulation [D].Utrecht:University of Utrecht,2000 [11] Ruan Pei-ying,Hayashida M,Maruyama O,et al.Prediction of heterotrimeric protein complexes by two-phase learning using neighboring kernels [C]∥BMC Bioinformatics.2014,15,S2:S6 [12] Satuluri V,Parthasarathy S.Scalable graph clustering using stochastic flows:applications to community discovery [C]∥Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York,NY,USA,2009:737-746 [13] Satuluri V,Parthasararthy S,Ucar D.Markov clustering of protein interaction networks with improved balance and scalability [C]∥Proceeding BCB’10 Proceedings of the 1th ACM International Conference on Bioinformatics and Computational Biology.New York,NY,USA,2010:247-256 [14] Shih Y K,Parthasarathy S.Identifying functional modules in interaction networks through overlapping Markov clustering [J].Bioinformatics,2012,8(18):i473-i479 [15] Shih Y K,Parthasarathy S.Identifying protein complexes in AP-MS data with negative evidence via soft Markov clustering [C]∥Proceeding BCB’13 Proceedings of the International Conference on Bioinformatics,Computational Biology and Biomedical Informatics.New York,NY,USA,2013:440 [16] 岑涌,罗林开.一种改善非平衡分布数据SVM分类能力的新策略[J].计算机与数学工程,2006,4(11):103-105 Cen Yong,Lou Lin-kai.A novel strategy for improving the performance of SVM classification for unbalance distribution data[J].Computer & Digital Engineering,2006,4(11):103-105 [17] 方霞,席金菊.基于变异和启发式选择的蚁群优化算法[J].计算机工程与应用,2013,9(24):24-27 Fang Xia,Xi Jin-ju.Ant colony algorithm based on mutation features and selected heuristic[J].Computer Engineering and Applications,2013,9(24):24-27 [18] Güldener U,Munsterkotter M,Kastenmuller G,et al.CYGD:the Comprehensive Yeast Genome Database [J].Nucleic Acids Research,2005,33(1):364-368 [19] Zhang Ai-dong.Protein Interaction Networks [M].UK:Cambridge University Press,2009 [20] 尤梦丽,雷秀娟.PPI 网络聚类的评价方法的研究与应用[J].计算机科学,2013,0(12):254-258 You Meng-li,Lei Xiu-juan.Study and application of evaluating methods of PPI network clustering[J].Computer Science,2013,0(12):254-258 |
No related articles found! |
|