Computer Science ›› 2016, Vol. 43 ›› Issue (7): 234-239. doi: 10.11896/j.issn.1002-137X.2016.07.042

Cross-domain Sentiment Classification Based on Optimizing Classification Model Progressively

ZHANG Jun and WANG Su-ge   

  • Online: 2018-12-01 Published: 2018-12-01

Abstract: Cross-domain sentiment classification has attracted increasing attention in natural language processing. Traditional active learning cannot exploit the information shared between domains, and the bag-of-words model cannot filter out words unrelated to sentiment classification. To address both problems, this paper proposes a cross-domain sentiment classification method that progressively optimizes the classification model. First, sentiment words common to both domains are selected as features to train a classification model on the labeled source domain; the model then predicts initial category labels for the target domain, and the texts with high confidence values are selected as the initial seed texts of the learning model. Second, at each iteration both high-confidence and low-confidence texts are added to the training set. Finally, a feature set is extracted on the basis of a sentiment dictionary, evaluation collocation rules and auxiliary feature words, and is used to transform the feature space. Experimental results indicate that the method not only improves the accuracy of cross-domain sentiment classification effectively but also reduces the cost of manual annotation to some extent.
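
The iterative procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the classifier, the confidence measure, or how low-confidence texts are labeled, so the sketch assumes a small Naive Bayes model, confidence measured as the distance of the predicted probability from 0.5, and a hypothetical `oracle` function standing in for manual annotation of low-confidence texts. All function names and the toy data are illustrative, and the final feature-space transformation step is omitted.

```python
import math
from collections import Counter

def shared_sentiment_vocab(src_docs, tgt_docs, lexicon):
    """Sentiment words (from an assumed lexicon) that occur in both domains."""
    src_words = {t for doc in src_docs for t in doc}
    tgt_words = {t for doc in tgt_docs for t in doc}
    return src_words & tgt_words & set(lexicon)

def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with add-one smoothing, restricted to vocab."""
    counts = {0: Counter(), 1: Counter()}
    priors = Counter(labels)
    for tokens, y in zip(docs, labels):
        counts[y].update(t for t in tokens if t in vocab)
    model = {}
    for y in (0, 1):
        total = sum(counts[y].values()) + len(vocab)
        model[y] = {
            "prior": math.log((priors[y] + 1) / (len(labels) + 2)),
            "word": {w: math.log((counts[y][w] + 1) / total) for w in vocab},
        }
    return model

def predict_proba(model, tokens):
    """Return P(label = 1 | tokens); tokens outside the vocabulary are ignored."""
    score = {}
    for y in (0, 1):
        word = model[y]["word"]
        score[y] = model[y]["prior"] + sum(word[t] for t in tokens if t in word)
    return 1.0 / (1.0 + math.exp(score[0] - score[1]))

def progressive_train(src_docs, src_labels, tgt_docs, oracle, vocab, rounds=2, k=1):
    """Grow the training set with target-domain texts, retraining each round."""
    train_docs, train_labels = list(src_docs), list(src_labels)
    pool = list(range(len(tgt_docs)))
    n_queries = 0
    model = train_nb(train_docs, train_labels, vocab)
    for _ in range(rounds):
        if not pool:
            break
        # Least confident first: probability closest to 0.5.
        ranked = sorted(pool, key=lambda i: abs(predict_proba(model, tgt_docs[i]) - 0.5))
        low = ranked[:k]
        high = [i for i in ranked[::-1][:k] if i not in low]
        for i in high:  # high confidence: trust the model's own label
            train_docs.append(tgt_docs[i])
            train_labels.append(int(predict_proba(model, tgt_docs[i]) >= 0.5))
        for i in low:   # low confidence: ask the (assumed) human annotator
            train_docs.append(tgt_docs[i])
            train_labels.append(oracle(i))
            n_queries += 1
        pool = [i for i in pool if i not in high and i not in low]
        model = train_nb(train_docs, train_labels, vocab)
    return model, n_queries

# Toy data, invented for illustration only (1 = positive, 0 = negative).
lexicon = {"good", "great", "bad", "poor", "excellent", "awful"}
src_docs = [["good", "book"], ["great", "plot"], ["bad", "book"], ["poor", "plot"]]
src_labels = [1, 1, 0, 0]
tgt_docs = [["good", "great", "battery"], ["bad", "poor", "screen"],
            ["battery", "screen"], ["great", "screen"]]
tgt_truth = [1, 0, 1, 1]

vocab = shared_sentiment_vocab(src_docs, tgt_docs, lexicon)
model, n_queries = progressive_train(
    src_docs, src_labels, tgt_docs, oracle=lambda i: tgt_truth[i], vocab=vocab)
print(sorted(vocab), n_queries)
```

In this sketch only the low-confidence texts are sent to the annotator, which is one plausible reading of how the method could reduce annotation cost; the selection sizes (`rounds`, `k`) are arbitrary.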

Key words: Sentiment classification, Cross domain, Classification model, Feature extraction, Confidence

