Computer Science, 2022, Vol. 49, Issue (6A): 291-296. doi: 10.11896/jsjkx.210800011

• Big Data & Data Science •

Microblog Popular Information Detection Based on Hidden Semi-Markov Model

XIE Bai-lin, LI Qi, KUANG Jiang   

  1. School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, China
    School of Cyber Security, Guangdong University of Foreign Studies, Guangzhou 510006, China
  • Online: 2022-06-10 Published: 2022-06-08
  • About author: XIE Bai-lin, born in 1982, Ph.D., assistant professor, is a member of China Computer Federation. His main research interests include online social networks and network security.
  • Supported by:
    Guangdong Basic and Applied Basic Research Foundation (2018A0303130045) and Science and Technology Program of Guangzhou (201904010334).

Abstract: In recent years, microblogs have become important places for people to communicate and share knowledge. However, microblogs have also become a major channel for spreading rumors. If popular information can be identified at an early stage, rumors can be identified and quelled early, and hot topics can also be detected early. Research on popular information detection is therefore important. This paper presents a new method for identifying popular information based on the hidden semi-Markov model (HSMM), from the perspective of the transmission process of popular information in microblogs. In this method, the observation value is constructed from the influence level of the information forwarder and the time interval between two adjacent forwarders, and the influence level of each forwarder is obtained automatically with the random forest classification algorithm. The proposed method includes a training phase and an identification phase. In the identification phase, the average log-likelihood of every observation sequence is calculated and the popularity of the information is updated in real time, so the method can identify popular information at an early stage. An experiment on real datasets from Sina Weibo and Twitter is conducted to evaluate the method, and the results validate its effectiveness.
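
The identification phase described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the observation encoding, the number of hidden states, or the duration model, so the function names (`encode_observation`, `hsmm_likelihood`, `average_log_likelihood`), the interval bucket edges, and all model parameters below are hypothetical. The sketch assumes a discrete-observation, explicit-duration HSMM whose last state segment ends at the final observation, and it scores an observation sequence by its log-likelihood divided by its length.

```python
import math


def encode_observation(influence_level, interval_seconds,
                       interval_edges=(60, 600, 3600)):
    """Combine a forwarder's influence level and the time since the previous
    forward into one discrete symbol. The bucket edges are an assumption."""
    bucket = sum(interval_seconds >= edge for edge in interval_edges)
    return influence_level * (len(interval_edges) + 1) + bucket


def hsmm_likelihood(obs, pi, A, B, P):
    """Forward pass of an explicit-duration HSMM.

    pi[j]     initial probability of state j
    A[i][j]   transition probability between distinct states (A[i][i] = 0)
    B[j][o]   probability that state j emits symbol o
    P[j][d-1] probability that state j lasts exactly d steps
    Assumes the last state segment ends at the final observation.
    """
    n_states, T, max_d = len(pi), len(obs), len(P[0])
    # alpha[t][j]: probability of obs[:t] with a segment of state j ending at t
    alpha = [[0.0] * n_states for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(n_states):
            total, emit = 0.0, 1.0
            for d in range(1, min(max_d, t) + 1):
                emit *= B[j][obs[t - d]]  # extend the segment one step back
                start = t - d
                if start == 0:
                    enter = pi[j]
                else:
                    enter = sum(alpha[start][i] * A[i][j]
                                for i in range(n_states) if i != j)
                total += enter * P[j][d - 1] * emit
            alpha[t][j] = total
    return sum(alpha[T])


def average_log_likelihood(obs, pi, A, B, P):
    """Log-likelihood of the sequence divided by its length."""
    likelihood = hsmm_likelihood(obs, pi, A, B, P)
    if likelihood <= 0.0:
        return float("-inf")
    return math.log(likelihood) / len(obs)
```

To mimic the real-time update, the score can be recomputed on the growing prefix of observations each time a new forward arrives, for example `average_log_likelihood(obs[:k], pi, A, B, P)` for increasing `k`. How that score is mapped to a popularity value or compared against a threshold is not stated in the abstract and is left open here.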

Key words: Hidden semi-Markov model, Microblog, Popular information, Popularity, Transmission process

CLC Number: 

  • TP391
