%A HUANG Jian-yi, LI Jian-jiang, WANG Zheng, FANG Ming-zhe
%T Single-Pass Short Text Clustering Based on Context Similarity Matrix
%0 Journal Article
%D 2019
%J Computer Science
%R 10.11896/j.issn.1002-137X.2019.04.008
%P 50-56
%V 46
%N 4
%U {https://www.jsjkx.com/CN/abstract/article_18108.shtml}
%8 2019-04-15
%X Online social networks have become an important channel and carrier of information, forming a virtual society that interacts with the real world. Numerous events spread rapidly through social networks and can become hot spots within a short period of time. Negative events, however, can threaten national security and social stability and may cause a series of social problems. Mining the hotspot information contained in social networks is therefore of great significance for both public opinion supervision and public opinion early warning. Text clustering is an important method for mining hotspot information. However, when traditional long-text clustering algorithms process massive numbers of short texts, their accuracy drops and their complexity increases sharply, which leads to long running times. Existing short text clustering algorithms also suffer from low accuracy and long running times. Based on text keywords, this paper presents an association model that combines context with a similarity matrix to determine the relevance between the current text and the previous text. In addition, the keyword weights of each text are modified according to the association model to further reduce noise. Finally, a distributed short text clustering algorithm is implemented on the Hadoop platform. Experiments verify that the proposed algorithm outperforms the K-MEANS, SP-NN and SP-WC algorithms in terms of topic mining speed, accuracy and recall.
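
For readers unfamiliar with the single-pass scheme named in the title, the following is a minimal sketch of generic single-pass clustering over keyword lists. It is not the paper's algorithm: the context similarity matrix, the keyword re-weighting and the Hadoop distribution described in the abstract are not reproduced here. The function names `cosine` and `single_pass`, the plain cosine similarity measure, and the `threshold` value are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse keyword-count vectors."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(texts, threshold=0.3):
    """Assign each text to the most similar existing cluster, or open a new one.

    `texts` is an iterable of keyword lists. `threshold` is a hypothetical
    similarity cutoff chosen for this sketch, not a value from the paper.
    """
    centroids = []  # one Counter of summed keyword counts per cluster
    labels = []     # cluster index assigned to each text, in input order
    for keywords in texts:
        vec = Counter(keywords)
        best_idx, best_sim = -1, 0.0
        for idx, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx >= 0 and best_sim >= threshold:
            centroids[best_idx].update(vec)  # fold the text into the cluster
            labels.append(best_idx)
        else:
            centroids.append(vec)            # start a new cluster
            labels.append(len(centroids) - 1)
    return labels
```

Each text is read once and compared only against the cluster summaries built so far, which is what makes the approach attractive for streams of short posts. The result depends on input order and on the chosen threshold.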