Computer Science ›› 2022, Vol. 49 ›› Issue (11): 109-116. doi: 10.11896/jsjkx.210900101

• Database & Big Data & Data Science •

Semantic Information Enhanced Network Embedding with Completely Imbalanced Labels

FU Kun, GUO Yun-peng, ZHUO Jia-ming, LI Jia-ning, LIU Qi   

  1. College of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
    Key Laboratory of Big Data Computing, Tianjin 300401, China
  • Received: 2021-09-13 Revised: 2022-02-26 Online: 2022-11-15 Published: 2022-11-03
  • About author: FU Kun, born in 1979, Ph.D, associate professor. Her main research interests include social network analysis and network representation learning.
  • Supported by:
    National Natural Science Foundation of China (62072154).

Abstract: Data incompleteness has become an intractable problem for network representation learning (NRL) and prevents existing NRL algorithms from achieving the expected results. Although numerous efforts have been made to address it, most previous methods focus on the scarcity of label information and rarely consider data imbalance, especially the completely imbalanced case in which the labels of certain classes are entirely missing. Learning algorithms for this setting are still being explored. For example, some neighborhood feature aggregation processes focus mainly on network structure information and disregard the relationship between attribute features and semantic features, although exploiting it may improve the learned representations. To address these problems, this paper proposes SECT, a semantic information enhanced network embedding method for completely imbalanced labels that combines attribute features and structural features. First, SECT introduces an attention mechanism into the supervised learning stage to obtain semantic information vectors while taking into account the relationship between the attribute space and the semantic space. Second, a variational autoencoder extracts structural features in an unsupervised manner to enhance the robustness of the algorithm. Finally, the semantic and structural information are integrated in the embedding space. In node classification on the public datasets Cora and Citeseer, the network vectors obtained by SECT outperform those of two state-of-the-art algorithms, with Micro-F1 gains of 0.86% to 1.97%. Node visualization results further show that, compared with other algorithms, SECT yields larger distances between clusters of different classes, more compact clusters within the same class, and clearer class boundaries. These experimental results demonstrate the effectiveness of SECT, which mainly benefits from a better fusion of semantic information in the low-dimensional embedding space and thereby substantially improves node classification performance under completely imbalanced labels.
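
To make the setting and the metric in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows (a) how a completely imbalanced training set can be simulated by hiding every label of the chosen classes, and (b) how Micro-F1 is computed for single-label node classification. The function names `hide_unseen_classes` and `micro_f1` and the toy labels are hypothetical and chosen only for illustration.

```python
def hide_unseen_classes(labels, unseen_classes):
    """Simulate completely imbalanced labels (illustrative assumption):
    every node belonging to an unseen class becomes unlabeled (None)."""
    unseen = set(unseen_classes)
    return [None if y in unseen else y for y in labels]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true positives, false positives and
    false negatives over all classes, then compute a single F1 score."""
    classes = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in classes:
        for t, p in zip(y_true, y_pred):
            tp += int(t == c and p == c)
            fp += int(t != c and p == c)
            fn += int(t == c and p != c)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example: six nodes, three classes; class 2 has no labels in training.
true_labels = [0, 0, 1, 1, 2, 2]
train_labels = hide_unseen_classes(true_labels, unseen_classes=[2])

# Hypothetical predictions from some classifier trained on node embeddings.
predicted = [0, 1, 1, 1, 2, 0]
score = micro_f1(true_labels, predicted)
```

For single-label classification, Micro-F1 coincides with accuracy, so in this toy example four correct predictions out of six give a score of about 0.667.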

Key words: Network representation learning, Graph embedding, Graph attention network, Completely imbalanced label, Variational autoencoders

CLC Number: 

  • TP391