Computer Science ›› 2021, Vol. 48 ›› Issue (5): 124-129.doi: 10.11896/jsjkx.200500058

• Database & Big Data & Data Science • Previous Articles     Next Articles

Post-processing Network Embedding Algorithm with Random Projection and Principal Component Analysis

HU Xin-tong, SHA Chao-feng, LIU Yan-jun   

  1. School of Computer Science,Fudan University,Shanghai 200433,China
  • Received:2020-05-14 Revised:2020-07-06 Online:2021-05-15 Published:2021-05-09
  • About author:HU Xin-tong,born in 1996,postgra-duate.Her main research interests include natural language processing and software engineering.(18212010008@fudan.edu.cn)
    SHA Chao-feng,born in 1976,Ph.D,associate professor.His main research interests include machine learning and data mining,and natural language processing.
  • Supported by:
    National Key Research and Development Program of China (2018YFB0904503).

Abstract: Network embedding as network representation learning has received a lot of attention from researchers in recent years.A number of models based on low-dimensional vector representation of nodes in network structure learning networks,such as DeepWalk,have been developed with good results in tasks such as node classification and link prediction.However,with the network size increases,there are computational bottlenecks with multiple network embedding algorithms.To mitigate this problem,no-learning methods such as random projection can be used,but critical information about the network structure may be lost,resulting in degraded algorithm performance.In this paper,a post-processing algorithm for network embedding(PPNE) is proposed,which uses random projection as well as principal component analysis to effectively retain key information and maintain a higher order approximation of the network structure.Experiments are conducted on three public datasets for node classification and link prediction tasks,while the performance of the PPNE algorithm is verified against other network embedding algorithms.The experimental results show that the PPNE algorithm has a large improvement over other algorithms in terms of both perfor-mance and running time,and the algorithm has a speed improvement of at least two orders of magnitude over other learning-based algorithms while ensuring good task performance.

Key words: Link prediction, Network embedding, Node classification, Principal component analysis, Random projection

CLC Number: 

  • TP391
[1]CUI P,WANG X,PEI J,et al.A survey on network embedding[J].IEEE Transactions on Knowledge and Data Engineering,2018,31(5):833-852.
[2]SEN P,NAMATA G,BILGIC M,et al.Collective classification in network data[J].AI magazine,2008,29(3):93.
[3]LI BEN-NOWELL D,KLEINBERG J.The link-prediction pro-blem for social networks[J].Journal of the American society for information science and technology,2007,58(7):1019-1031.
[4]TU C,LIU Z,SUN M.Inferring correspondences from multiple sources for microblog user tags[C]//Chinese National Confe-rence on Social Media Processing.Springer,Berlin,Heidelberg,2014:1-12.
[5]HAN Z M,ZHENG C Y,DUAN D G,et al.Associated Users Mining Algorithm Based on Multi-information Fusion Representation Learning[J].Computer Science,2019,46(4):77-82.
[6]KOREN Y,BELL R,VOLINSKY C.Matrix factorization techniques for recommender systems[J].Computer,2009(8):30-37.
[7]PEROZZI B,AL-RFOU R,SKIENA S.Deepwalk:Online lear-ning of social representations[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Disco-very and Data Mining.ACM,2014:701-710.
[8]WANG D,CUI P,ZHU W.Structural deep network embedding[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2016:1225-1234.
[9]ZHANG Z,CUI P,LI H,et al.Billion-scale network embedding with iterative random projection[C]//2018 IEEE International Conference on Data Mining(ICDM).IEEE,2018:787-796.
[10]CAO S,LU W,XU Q.Grarep:Learning graph representations with global structural information[C]//Proceedings of the 24th ACM International on Conference on Information and Know-ledge Management.ACM,2015:891-900.
[11]TANG J,QU M,WANG M,et al.Line:Large-scale information network embedding[C]//Proceedings of the 24th International Conference on World Wide Web.International World Wide Web Conferences Steering Committee,2015:1067-1077.
[12]GROVER A,LESKOVEC J.node2vec:Scalable feature learning for networks[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mi-ning.ACM,2016:855-864.
[13]ZHANG D,YIN J,ZHU X,et al.Homophily,structure,andcontent augmented network representation learning[C]//2016 IEEE 16th International Conference on Data Mining(ICDM).IEEE,2016:609-618.
[14]MIKOLOV T,CHEN K,CORRADO G,et al.Efficient estimation of word representations in vector space[J].arXiv:1301.3781,2013.
[15]PEROZZI B,KULKARNI V,SKIENA S.Walklets:Multiscale graph embeddings for interpretable network classification[J].arXiv:1605.02115,2016.
[16]CAO S,LU W,XU Q.Deep neural networks for learning graph representations[C]//Thirtieth AAAI Conference on Artificial Intelligence.2016.
[17]YANG C,LIU Z,ZHAO D,et al.Network representation lear-ning with rich text information[C]//Twenty-Fourth Internatio-nal Joint Conference on Artificial Intelligence.2015.
[18]CHEN J,ZHANG Q,HUANG X.Incorporate group information to enhance network embedding[C]//Proceedings of the 25th ACM International on Conference on Information and Knowledge Management.ACM,2016:1901-1904.
[19]YE Z L,ZHAO H X,ZHANG K,et al.Network Representation Learning Based on Multi-view Ensemble Algorithm[J].Computer Science,2019,46(1):117-125.
[20]KIPF T N,WELLING M.Semi-supervised classification withgraph convolutional networks[J].arXiv:1609.02907,2016.
[21]MU J,BHAT S,VISWANATH P.All-but-the-top:Simple and effective postprocessing for word representations[J].arXiv:1702.01417,2017.
[22]WOLD S,ESBENSEN K,GELADI P.Principal component ana-lysis[J].Chemometrics and Intelligent Laboratory Systems,1987,2(1/2/3):37-52.
[23]FAN R E,CHANG K W,HSIEH C J,et al.LIBLINEAR:A library for large linear classification[J].Journal of Machine Learning Research,2008,9(8):1871-1874.
[24]HANLEY J A,MCNEIL B J.The meaning and use of the area under a receiver operating characteristic(ROC) curve[J].Radio-logy,1982,143(1):29-36.
[25]MENDITTO A,PATRIARCA M,MAGNUSSON B.Under-standing the meaning of accuracy,trueness and precision[J].Accreditation and Quality Assurance,2007,12(1):45-47.
[26]YAO L,WANG L,PAN L,et al.Link prediction based on common-neighbors for dynamic social network[J].Procedia Computer Science,2016,83:82-89.
[27]SHIBATA N,KAJIKAWA Y,SAKATA I.Link prediction in citation networks[J].Journal of the American Society for Information Science and Technology,2012,63(1):78-85.
[28]ACHLIOPTAS D.Database-friendly random projections:Johnson-Lindenstrauss with binary coins[J].Journal of Computer and System Sciences,2003,66(4):671-687.
[29]PAPADIMITRIOU C H,RAGHAVAN P,TAMAKI H,et al.Latent semantic indexing:A probabilistic analysis[J].Journal of Computer and System Sciences,2000,61(2):217-235.
[1] SONG Jie, LIANG Mei-yu, XUE Zhe, DU Jun-ping, KOU Fei-fei. Scientific Paper Heterogeneous Graph Node Representation Learning Method Based onUnsupervised Clustering Level [J]. Computer Science, 2022, 49(9): 64-69.
[2] HUANG Li, ZHU Yan, LI Chun-ping. Author’s Academic Behavior Prediction Based on Heterogeneous Network Representation Learning [J]. Computer Science, 2022, 49(9): 76-82.
[3] LI Qi-ye, XING Hong-jie. KPCA Based Novelty Detection Method Using Maximum Correntropy Criterion [J]. Computer Science, 2022, 49(8): 267-272.
[4] QUE Hua-kun, FENG Xiao-feng, LIU Pan-long, GUO Wen-chong, LI Jian, ZENG Wei-liang, FAN Jing-min. Application of Grassberger Entropy Random Forest to Power-stealing Behavior Detection [J]. Computer Science, 2022, 49(6A): 790-794.
[5] LI Yong, WU Jing-peng, ZHANG Zhong-ying, ZHANG Qiang. Link Prediction for Node Featureless Networks Based on Faster Attention Mechanism [J]. Computer Science, 2022, 49(4): 43-48.
[6] CHEN Shi-cong, YUAN De-yu, HUANG Shu-hua, YANG Ming. Node Label Classification Algorithm Based on Structural Depth Network Embedding Model [J]. Computer Science, 2022, 49(3): 105-112.
[7] GUO Lei, MA Ting-huai. Friend Closeness Based User Matching [J]. Computer Science, 2022, 49(3): 113-120.
[8] YANG Xu-hua, WANG Lei, YE Lei, ZHANG Duan, ZHOU Yan-bo, LONG Hai-xia. Complex Network Community Detection Algorithm Based on Node Similarity and Network Embedding [J]. Computer Science, 2022, 49(3): 121-128.
[9] ZHAO Xue-lei, JI Xin-sheng, LIU Shu-xin, LI Ying-le, LI Hai-tao. Link Prediction Method for Directed Networks Based on Path Connection Strength [J]. Computer Science, 2022, 49(2): 216-222.
[10] ZHENG Su-su, GUAN Dong-hai, YUAN Wei-wei. Heterogeneous Information Network Embedding with Incomplete Multi-view Fusion [J]. Computer Science, 2021, 48(9): 68-76.
[11] YANG Hong-xin, SONG Bao-yan, LIU Ting-ting, DU Yue-feng, LI Xiao-guang. Tensor Completion Method Based on Coupled Random Projection [J]. Computer Science, 2021, 48(8): 66-71.
[12] WU Shan-jie, WANG Xin. Prediction of Tectonic Coal Thickness Based on AGA-DBSCAN Optimized RBF Neural Networks [J]. Computer Science, 2021, 48(7): 308-315.
[13] CHEN Heng, WANG Wei-mei, LI Guan-yu, SHI Yi-ming. Knowledge Graph Completion Model Using Quaternion as Relational Rotation [J]. Computer Science, 2021, 48(5): 225-231.
[14] YANG Xu-hua, WANG Chen. Community Detection Algorithm in Complex Network Based on Network Embedding and Local Resultant Force [J]. Computer Science, 2021, 48(4): 229-236.
[15] GONG Zhui-fei, WEI Chuan-jia. Link Prediction of Complex Network Based on Improved AdaBoost Algorithm [J]. Computer Science, 2021, 48(3): 158-162.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!