Computer Science ›› 2021, Vol. 48 ›› Issue (10): 51-58. doi: 10.11896/jsjkx.200900194

• Artificial Intelligence •

Fusion Vectorized Representation Learning of Multi-source Heterogeneous User-generated Contents

JI Nan-xun, SUN Xiao-yan, LI Zhen-qi   

  1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221008, China
  • Received: 2020-09-27 Revised: 2021-01-08 Online: 2021-10-15 Published: 2021-10-18
  • About author: JI Nan-xun, born in 1994, postgraduate. His main research interests include natural language processing and machine learning.
    SUN Xiao-yan, born in 1978, Ph.D., professor. Her main research interests include interactive evolutionary computation, big data and intelligent optimization.
  • Supported by:
    National Natural Science Foundation of China (61876184).

Abstract: With the development of mobile networks and apps, user-generated content (UGC), which contains multi-source heterogeneous data such as evaluations, markings, scores, images and videos, has become highly valuable information for improving the quality of personalized services. Learning a fused, vectorized representation of multi-source heterogeneous UGC is the most critical issue for applying it successfully. Motivated by this, we propose a representation learning method that effectively fuses and vectorizes comment and image data. We use the Doc2vec and LDA models to fully extract the features of the multi-source comments, and represent the images associated with the comments with a deep convolutional network. A hybrid vectorized representation learning scheme for fusing comments and a convolution strategy for integrating images and comments are presented. The feasibility and effectiveness of the proposed method are demonstrated on typical public Amazon data sets with heterogeneous UGC, in which the vectorized multi-source heterogeneous UGC is taken as the representation of each product and the classification accuracy of the products is compared.
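The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion operators, so everything here is an assumption made for illustration: the text vectors are fused by concatenation, the image and text vectors are combined by a full one-dimensional linear convolution, and random vectors stand in for the Doc2vec embedding, the LDA topic distribution and the convolutional image features. The function names `fuse_text` and `fuse_image_text` are hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def fuse_text(doc_vec: np.ndarray, topic_vec: np.ndarray) -> np.ndarray:
    """Assumed text fusion: concatenate the normalized embedding and topic vectors."""
    return np.concatenate([l2_normalize(doc_vec), l2_normalize(topic_vec)])


def fuse_image_text(text_vec: np.ndarray, image_vec: np.ndarray) -> np.ndarray:
    """Assumed cross-modal fusion: full 1-D linear convolution of the two vectors."""
    return np.convolve(text_vec, l2_normalize(image_vec), mode="full")


rng = np.random.default_rng(0)
doc_vec = rng.normal(size=8)            # stand-in for a Doc2vec comment embedding
topic_vec = rng.dirichlet(np.ones(4))   # stand-in for an LDA topic distribution
image_vec = rng.normal(size=6)          # stand-in for deep convolutional image features

text_vec = fuse_text(doc_vec, topic_vec)              # length 8 + 4
product_vec = fuse_image_text(text_vec, image_vec)    # length 12 + 6 - 1
print(text_vec.shape, product_vec.shape)
```

A full linear convolution of vectors of lengths n and m yields n + m - 1 entries, so under these assumptions the fused product vector has 17 dimensions and could then be passed to any standard classifier.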

Key words: Fusion, Multi-source heterogeneous, Representation learning, Short text, User generated contents

CLC Number: TP391