Computer Science ›› 2019, Vol. 46 ›› Issue (6): 206-211. doi: 10.11896/j.issn.1002-137X.2019.06.031

• Artificial Intelligence •


BLSTM_MLPCNN Model for Short Text Classification

ZHENG Cheng, HONG Tong-tong, XUE Man-yi   

  (School of Computer Science and Technology, Anhui University, Hefei 230601, China)
  • Received: 2018-05-18  Published: 2019-06-24
  • Corresponding author: ZHENG Cheng (born in 1964), male, associate professor and supervisor of master's students; his main research interests include natural language processing and data mining. E-mail: csahu@126.com
  • About the authors: HONG Tong-tong (born in 1994), female, master's student, whose main research interests include natural language processing and data mining; XUE Man-yi (born in 1995), male, master's student, whose main research interests include natural language processing and data mining.


Abstract: Text representation and text feature extraction are fundamental tasks in natural language processing and directly affect text classification performance. This paper proposes the BLSTM_MLPCNN neural network model, which takes character-level vectors combined with word vectors as its input. The model first applies a convolutional neural network (CNN) to the characters of each word to obtain character-level vectors, and concatenates them with pre-trained word vectors to form the word embeddings that are fed into a bidirectional long short-term memory network (BLSTM). It then combines the BLSTM's forward output, the word embedding, and the backward output into a document feature map, and finally uses a multi-layer perceptron convolutional neural network (MLPCNN) to extract features for classification. Experiments on the relevant datasets show that the BLSTM_MLPCNN model achieves better classification performance than CNN models, RNN models, and combined CNN-RNN models.
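To make the architecture described in the abstract concrete, the following PyTorch sketch wires the stated components together: a character-level CNN, concatenation with word vectors, a BLSTM, a document feature map built from the forward output, the embedding, and the backward output, and an MLPCNN (a convolution followed by 1x1 convolutions) with global max pooling for feature extraction. This is only an illustrative reconstruction; all dimensions, filter widths, vocabulary sizes, and the classifier head are assumptions of this sketch, not settings reported in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F


class BLSTM_MLPCNN(nn.Module):
    """Illustrative sketch of the BLSTM_MLPCNN architecture; all sizes are assumed."""

    def __init__(self, word_vocab, char_vocab, num_classes,
                 word_dim=100, char_dim=30, char_filters=30,
                 lstm_hidden=100, mlp_filters=100):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        # Character-level CNN: convolve over the characters of each word and
        # max-pool to one fixed-size character-level vector per word.
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        # BLSTM over the joint [word vector ; character-level vector] embedding.
        self.blstm = nn.LSTM(word_dim + char_filters, lstm_hidden,
                             bidirectional=True, batch_first=True)
        # MLPCNN: an ordinary convolution followed by 1x1 convolutions, which act
        # as a small multi-layer perceptron shared across all window positions.
        feat_dim = 2 * lstm_hidden + word_dim + char_filters
        self.conv = nn.Conv1d(feat_dim, mlp_filters, kernel_size=3, padding=1)
        self.mlp1 = nn.Conv1d(mlp_filters, mlp_filters, kernel_size=1)
        self.mlp2 = nn.Conv1d(mlp_filters, mlp_filters, kernel_size=1)
        self.fc = nn.Linear(mlp_filters, num_classes)

    def forward(self, words, chars):
        # words: (batch, seq_len) word ids; chars: (batch, seq_len, word_len) char ids.
        b, t, w = chars.shape
        c = self.char_emb(chars).view(b * t, w, -1).transpose(1, 2)
        c = F.relu(self.char_cnn(c)).max(dim=2).values.view(b, t, -1)
        e = torch.cat([self.word_emb(words), c], dim=2)          # joint embedding
        h, _ = self.blstm(e)                                     # (b, t, 2*hidden)
        half = h.size(2) // 2
        # Document feature map: forward output, embedding, backward output.
        feat = torch.cat([h[:, :, :half], e, h[:, :, half:]], dim=2)
        x = F.relu(self.conv(feat.transpose(1, 2)))
        x = F.relu(self.mlp1(x))
        x = F.relu(self.mlp2(x))
        x = F.max_pool1d(x, x.size(2)).squeeze(2)                # global max pooling
        return self.fc(x)


# Toy usage: 3 sentences, 8 words each, 6 characters per word, 4 classes.
model = BLSTM_MLPCNN(word_vocab=5000, char_vocab=100, num_classes=4)
logits = model(torch.randint(1, 5000, (3, 8)), torch.randint(1, 100, (3, 8, 6)))
print(logits.shape)  # torch.Size([3, 4])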

Key words: Bidirectional long short-term memory network (BLSTM), Character-level vector, Convolutional neural network (CNN), Multi-layer perceptron convolutional neural network (MLPCNN), Multi-layer perceptron (MLP), Word vector

CLC Number:

  • TP391.1