Computer Science, 2023, Vol. 50, Issue (8): 163-169. doi: 10.11896/jsjkx.220700216

• Artificial Intelligence •

Multimodal Knowledge Graph Embedding with Text-Image Enhancement

XIAO Guiyang, WANG Lisong, JIANG Guohua

  1. School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210000, China
  • Received: 2022-07-21 Revised: 2022-11-24 Online: 2023-08-15 Published: 2023-08-02
  • About author: XIAO Guiyang, born in 1998, master. Her main research interests include knowledge graph embedding and deep learning.
    WANG Lisong, born in 1969, Ph.D, professor, is a member of China Computer Federation. His main research interests include natural language processing and formal methods.
  • Supported by:
    Key Projects of Foundation Strengthening Plan (2019JCJQZD33800).

Abstract: Most traditional knowledge representation learning methods focus only on the structured information in triples. They either fail to exploit additional information such as entity images, relation paths and text descriptions, or fuse only one kind of additional information. To address this, a multimodal knowledge graph embedding method that combines entity descriptions and entity images is proposed. Text and images enhance each other, providing more comprehensive external information and compensating for the deficiencies in knowledge representation learning caused by the incompleteness of a single information source. Firstly, text and image representations of entities are obtained by modeling entity descriptions and images. Then, these representations are used to supplement the structural representation in TransE. Finally, the three entity representations are trained jointly, so that the knowledge graph, text and images share a unified representation space and the accuracy of entity and relation prediction improves. Experimental results show that the entity prediction hit rate of the model is 3.09% higher than that of the method without additional information, 0.97% higher than that of the method fusing only entity descriptions, and 1.32% higher than that of the method fusing only entity images.
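
The joint use of three entity representations can be illustrated with a minimal sketch. It assumes a TransE-style translation energy ||h + r - t|| under the L1 norm, an unweighted sum of that energy over the structural, text and image views, and a margin-based ranking loss. The abstract does not specify how the views are combined or which loss is used, so the names `l1_energy`, `joint_energy`, `margin_loss` and `MODALITIES`, as well as the toy vectors, are hypothetical and not the authors' implementation.

```python
from typing import Dict, Sequence

Vector = Sequence[float]

# Hypothetical labels for the three entity views named in the abstract.
MODALITIES = ("structure", "text", "image")


def l1_energy(head: Vector, relation: Vector, tail: Vector) -> float:
    """TransE-style energy ||h + r - t||_1; lower means more plausible."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))


def joint_energy(head: Dict[str, Vector], relation: Vector,
                 tail: Dict[str, Vector]) -> float:
    """Sum the translation energy over the three views of head and tail.

    ASSUMPTION: an unweighted sum with a shared relation vector; the
    abstract does not state how the representations are combined.
    """
    return sum(l1_energy(head[m], relation, tail[m]) for m in MODALITIES)


def margin_loss(positive: float, negative: float, margin: float = 1.0) -> float:
    """Hinge loss that pushes a true triple below a corrupted one by `margin`."""
    return max(0.0, margin + positive - negative)


# Toy 2-dimensional embeddings (illustrative values only).
head = {"structure": [0.0, 0.0], "text": [0.5, 0.0], "image": [0.0, 0.25]}
relation = [1.0, 0.0]
true_tail = {m: [1.0, 0.0] for m in MODALITIES}
corrupted_tail = {m: [0.0, 1.0] for m in MODALITIES}

pos = joint_energy(head, relation, true_tail)
neg = joint_energy(head, relation, corrupted_tail)
loss = margin_loss(pos, neg)
```

In this toy setting the true triple receives a lower joint energy than the corrupted one, so the hinge loss is zero. In a trained model the text and image vectors would come from encoders over entity descriptions and images rather than from fixed values.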

Key words: Knowledge representation learning, Entity descriptions, Entity images, Text-CNN, Joint training

CLC Number: 

  • TP183