Computer Science, 2024, Vol. 51, Issue (6A): 230800093-7. doi: 10.11896/jsjkx.230800093

• Big Data & Data Science •

Study on Three-level Short Video User Portrait Based on Improved Topic Model Method

HUANG Yumin, ZHAO Chanchan   

  1. College of Information Engineering, Inner Mongolia University of Technology, Huhhot 010051, China
  • Published: 2024-06-06
  • About author: HUANG Yumin, born in 1998, postgraduate. His main research interests include data mining and personalized recommendations.
    ZHAO Chanchan, born in 1982, Ph.D, associate professor. Her main research interests include computer network and software defined network.
  • Supported by:
    Basic Scientific Research Business Fee Project of Colleges and Universities Directly under the Inner Mongolia Autonomous Region (ZTY2023022, JY20230082), Inner Mongolia Autonomous Region Postgraduate Research Innovation Project (S20231129Z) and Inner Mongolia Autonomous Region Natural Science Foundation Project (2023LHMS06016).

Abstract: To address the problem of quickly extracting accurate user interests from massive short video data, user data, and interaction data, a three-level label user portrait construction method based on topic models is proposed. Based on the topic construction method, the video topic words obtained by fusing the LDA and GSDMM topic models are used as user interest expression vectors. First, an LDA filter is built that removes topic-independent text by comparing against a threshold, which reduces the scale of the text and limits the influence of the non-main corpus on the generated interest expression vector. Second, a method is proposed for constructing a feature word weight matrix that combines semantic information and context information: a Bi-GRU neural network computes the context feature of each word vector, and the word frequency weight calculated by the TF-IDF algorithm serves as the semantic feature. Combining the context and semantic features expands the meaning of the feature words. Finally, a GSDMM model with interest weight distribution learns the feature vector weight matrix, which generates user interest tags and corrects interest weights under the influence of different user preferences. Experiments show that the method represents user portraits more completely and accurately than a single topic construction method and performs well in clustering. A complete user portrait allows users' pain points to be grasped accurately, which supports subsequent personalized recommendation.
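
The first two steps of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that per-document topic distributions are already available from some LDA model, that the filter keeps a document when its dominant topic probability reaches the threshold, and that semantic and context weights are fused by a convex combination; the abstract states none of these details. The names `filter_by_topic_score`, `fuse_weights`, and `alpha` are hypothetical, and the context weights are plain numbers standing in for Bi-GRU outputs.

```python
import math
from collections import Counter


def filter_by_topic_score(docs, topic_scores, threshold):
    """Keep documents whose dominant-topic probability reaches the threshold.

    Assumption: topic_scores[i] is a topic distribution for docs[i] produced
    by a separately trained LDA model; the exact filter rule is hypothetical.
    """
    return [doc for doc, scores in zip(docs, topic_scores)
            if max(scores) >= threshold]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def fuse_weights(semantic, context, alpha=0.5):
    """Convex combination of semantic (TF-IDF) and context weights.

    Assumption: the fusion rule and alpha are hypothetical; context values
    stand in for features a Bi-GRU would produce.
    """
    return {term: alpha * weight + (1 - alpha) * context.get(term, 0.0)
            for term, weight in semantic.items()}


docs = [
    ["cat", "video", "funny"],
    ["cat", "cooking", "recipe"],
    ["travel", "vlog", "video"],
]
topic_scores = [[0.8, 0.1, 0.1], [0.4, 0.3, 0.3], [0.1, 0.2, 0.7]]

kept = filter_by_topic_score(docs, topic_scores, threshold=0.6)
semantic = tf_idf(kept)
fused = fuse_weights(semantic[0], {"cat": 1.0}, alpha=0.5)
print(kept)
print(fused)
```

In this toy input the second document has no dominant topic and is filtered out, and a term that occurs in every remaining document receives a TF-IDF weight of zero, so only the context weight can raise it.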

Key words: Short video, User portraits, Topic analysis model, Semantic weight, Context weight

CLC Number: TP391