Computer Science ›› 2017, Vol. 44 ›› Issue (1): 60-64. doi: 10.11896/j.issn.1002-137X.2017.01.011

• 2016 6th China Conference on Data Mining •

Self-adaptation Multi-gram Weight Learning Strategy for Sentence Representation Based on Convolutional Neural Network

ZHANG Chun-yun, QIN Peng-da and YIN Yi-long

  1. School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014; School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876; School of Computer Science and Technology, Shandong University, Jinan 250101
  • Online: 2018-11-13; Published: 2018-11-13
  • Supported by:
    This work was supported by the Key Program of the National Natural Science Foundation of China: "Machine Learning Based Multimodal Medical Image Information Processing and Analysis" (U1201258), and the Shandong Provincial Natural Science Foundation for Distinguished Young Scholars: "Machine Learning Based Biometric Recognition" (JQ201316)


Abstract: With today's explosive growth of information, natural language processing (NLP) has attracted increasing attention. Traditional NLP systems rely heavily on expensive hand-annotated features and on the syntactic output of language-analysis tools, so syntactic errors introduced during preprocessing propagate into training and prediction. Deep learning has therefore drawn researchers' attention, since it enables end-to-end prediction with minimal dependence on external information. To capture richer sentence information, popular deep learning frameworks for NLP adopt a multi-gram strategy. However, the information distribution varies across tasks and datasets, and this strategy ignores the relative importance of different n-grams. To address this problem, this paper proposes a deep-learning-based strategy that adaptively learns multi-gram weights, assigning each n-gram feature a weight according to its contribution; it also proposes a new method for combining multi-gram feature vectors that greatly reduces system complexity. The model is applied to two classification tasks, positive/negative sentiment classification of movie reviews and relation classification, and experimental results show that the proposed self-adaptive multi-gram weighting strategy substantially improves classification performance.

Key words: Deep learning, Natural language processing, Self-adaptation, Multi-gram

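The adaptive multi-gram weighting described in the abstract can be illustrated as a forward pass: each n-gram branch of a convolutional sentence model produces a max-pooled feature vector, and the branches are combined with softmax-normalized learnable weights rather than concatenated, so the sentence vector's dimensionality does not grow with the number of n-gram sizes. The sketch below is illustrative only, not the authors' implementation; all names (`ngram_feature`, `alpha`), the filter shapes, and the tanh activation are assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ngram_feature(embeddings, W, b):
    """Slide one n-gram filter bank over the sentence and max-pool over time.
    embeddings: (seq_len, emb_dim); W: (n, emb_dim, num_filters); b: (num_filters,)."""
    n = W.shape[0]
    seq_len = embeddings.shape[0]
    feats = []
    for i in range(seq_len - n + 1):
        window = embeddings[i:i + n]                       # (n, emb_dim)
        feats.append(np.tanh(np.einsum('ne,nef->f', window, W) + b))
    return np.max(feats, axis=0)                           # max-over-time -> (num_filters,)

rng = np.random.default_rng(0)
seq_len, emb_dim, num_filters = 10, 8, 4
sentence = rng.normal(size=(seq_len, emb_dim))             # stand-in word embeddings

grams = [1, 2, 3]                                          # n-gram window sizes
banks = [(rng.normal(size=(n, emb_dim, num_filters)), np.zeros(num_filters))
         for n in grams]
alpha = np.zeros(len(grams))                               # learnable logits, one per branch

# Adaptive combination: a softmax over alpha gives each n-gram branch a weight
# reflecting its learned contribution, and the branches are summed instead of
# concatenated, keeping the sentence vector num_filters-dimensional.
weights = softmax(alpha)
sentence_vec = sum(w * ngram_feature(sentence, W, b)
                   for w, (W, b) in zip(weights, banks))
print(sentence_vec.shape)   # (4,)
```

In training, `alpha` would be updated by backpropagation along with the filter banks, so the softmax weights come to reflect each n-gram's importance for the task at hand; with `alpha` initialized to zeros, all branches start out equally weighted.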

