Computer Science (计算机科学) ›› 2015, Vol. 42 ›› Issue (9): 208-213. doi: 10.11896/j.issn.1002-137X.2015.09.040
孙晓,孙重远,任福继
SUN Xiao, SUN Chong-yuan and REN Fu-ji
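Abstract: As social networks develop, new words keep emerging. A new word often signals a social hot topic and carries a certain public sentiment, so identifying new words and determining their sentiment polarity offers a new approach to predicting public sentiment. To identify new words, a deep conditional random field model is built for sequence labeling, using features such as part of speech, single-character position, and word-formation ability, together with third-party dictionaries such as crowdsourced web dictionaries. Traditional methods based on sentiment lexicons have difficulty judging the sentiment of new words. A neural network language model instead represents each word as a K-dimensional semantic vector; the sentiment polarity of a new word is judged by finding the words nearest to it in the semantic vector space and combining their sentiment polarities with their semantic distances to the new word. Experiments on new word detection and sentiment polarity determination on the Peking University corpus verify the effectiveness of the proposed model and method: the F-score for new word detection is 0.991, and the sentiment recognition accuracy is 70%.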
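The nearest-neighbor polarity step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the cosine-similarity weighting, the neighbor count `k`, the function name `predict_polarity`, and the toy two-dimensional vectors below are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_polarity(new_vec, lexicon_vecs, lexicon_polarity, k=3):
    """Return +1 (positive), -1 (negative) or 0 (undecided) for a new word.

    Assumed scheme: take the k lexicon words most similar to the new word
    and sum their polarities, each weighted by its (non-negative) similarity.
    """
    ranked = sorted(
        ((cosine(new_vec, vec), word) for word, vec in lexicon_vecs.items()),
        reverse=True,
    )[:k]
    score = sum(max(sim, 0.0) * lexicon_polarity[word] for sim, word in ranked)
    if score > 0:
        return 1
    if score < 0:
        return -1
    return 0


# Hypothetical toy data: in practice the vectors would come from a trained
# neural language model and the polarities from an existing sentiment lexicon.
lexicon_vecs = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [0.1, 0.9],
    "awful": [0.2, 0.8],
}
lexicon_polarity = {"good": 1, "great": 1, "bad": -1, "awful": -1}

positive_guess = predict_polarity([0.85, 0.15], lexicon_vecs, lexicon_polarity)
negative_guess = predict_polarity([0.15, 0.85], lexicon_vecs, lexicon_polarity)
```

Weighting by similarity lets closer neighbors count for more than distant ones, which matches the abstract's statement that both the neighbors' polarities and their semantic distances are used.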