Computer Science ›› 2020, Vol. 47 ›› Issue (4): 60-66. doi: 10.11896/jsjkx.190300073

• Database & Big Data & Data Science •

Study on Multimodal Image Genetic Data Based on Deep Principal Correlated Auto-encoders

LI Gang, WANG Chao, HAN De-peng, LIU Qiang-wei, LI Ying   

  1. School of Electronic and Control Engineering, Chang’an University, Xi’an 710064, China
  • Received:2019-03-18 Online:2020-04-15 Published:2020-04-15
  • Contact: LI Gang, born in 1975, associate professor, postgraduate supervisor, is not a member of the China Computer Federation. His main research interests include image processing and pattern recognition, machine learning, and multimodal biomedical information fusion.
  • Supported by:
    This work was supported by the Science and Technology Innovation Guidance Project of Xi’an Science and Technology Bureau (201805045YD23CG29(5)), the Fundamental Research Funds for the Central Universities, Chang’an University (CHD) (300102329203), and the Postgraduate Research Innovation Practice Project of Chang’an University (300103002075).

Abstract: Brain imaging phenotypes and genetic mutations have become important factors in complex diseases such as schizophrenia. Building on earlier work on pathogenesis, researchers have proposed many models based on deep neural networks or regularization, typically involving either some form of norm or auto-encoders with a reconstruction objective. However, the multimodal data these models handle tend to have more feature dimensions than samples. To address the difficulties of high-dimensional data analysis and overcome the limitations of deep canonical correlation analysis (DCCA), an optimization algorithm is used to solve DCCA, with principal component analysis (PCA) for learning multimodal linear features and a multi-layer belief network based on restricted Boltzmann machines (RBM) for learning multimodal nonlinear features. The model, together with previous advanced models, was applied to real multimodal data for testing and analysis. Experiments show that the deep principal correlated auto-encoders model achieves higher correlation and better classification performance than the previous models. The classification accuracy on both modalities exceeds 90%, a significant improvement over the CCA-based model (average accuracy about 65%) and the DNN-based model (average accuracy about 80%). In the clustering performance evaluation, the model attains an average normalized mutual information of 93.75% and an average classification error rate of 3.8%, further confirming its classification performance. In the maximum correlation analysis, with the output dimensions of the top-level nodes held the same, the model outperforms the other advanced models with a maximum correlation of 0.926, showing excellent performance in high-dimensional data analysis.
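
The clustering evaluation above reports normalized mutual information (NMI). As a rough illustration of what that score measures, the following Python sketch computes NMI between reference labels and cluster assignments. The abstract does not state which normalization the authors used, so the geometric-mean form here is an assumption, and the function names and toy labels are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(true_labels, cluster_labels):
    """NMI = I(T; C) / sqrt(H(T) * H(C)).

    The geometric-mean normalization is an assumption; other variants
    divide by the maximum or the arithmetic mean of the two entropies.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    count_t = Counter(true_labels)
    count_c = Counter(cluster_labels)
    count_joint = Counter(zip(true_labels, cluster_labels))

    # Mutual information from the joint and marginal label frequencies.
    mi = 0.0
    for (t, c), n_tc in count_joint.items():
        mi += (n_tc / n) * math.log(n_tc * n / (count_t[t] * count_c[c]))

    h_t, h_c = entropy(true_labels), entropy(cluster_labels)
    if h_t == 0.0 or h_c == 0.0:
        # Degenerate case: a side with a single label carries no information.
        return 1.0 if h_t == h_c else 0.0
    return mi / math.sqrt(h_t * h_c)


# Toy labels (hypothetical, not data from the paper).
truth = [0, 0, 0, 1, 1, 1]
perfect = [1, 1, 1, 0, 0, 0]  # same partition, cluster ids swapped
mixed = [0, 1, 0, 1, 0, 1]    # clusters cut across the true classes

nmi_perfect = normalized_mutual_information(truth, perfect)
nmi_mixed = normalized_mutual_information(truth, mixed)
print(round(nmi_perfect, 3), round(nmi_mixed, 3))
```

NMI does not depend on how cluster ids are numbered, so the relabeled partition scores the same as an exact match, while assignments that cut across the true classes score close to zero.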

Key words: Belief networks, Correlation analysis, Deep principal correlated auto-encoders, Image genomics, Optimization algorithms

CLC Number: TP391