计算机科学 (Computer Science), 2022, Vol. 49, Issue 11A: 211200181-6. DOI: 10.11896/jsjkx.211200181
HUANG Yu-jiao1, ZHAN Li-chao1, FAN Xing-gang1, XIAO Jie2, LONG Hai-xia2
Abstract: Text sentiment analysis is widely used in word-of-mouth analysis, topic monitoring, and public-opinion analysis, and is one of the most active research areas in natural language processing. Pre-trained language models in deep learning can handle problems in text sentiment classification such as polysemy and the influence of part of speech and word position. However, such models are complex and have large numbers of parameters, so they consume substantial resources and are difficult to deploy. To address these problems, this paper adopts the idea of knowledge distillation, using the pre-trained ELECTRA model as the teacher and a BiLSTM as the student, and proposes a distillation model based on ELECTRA-base-BiLSTM. One-hot word-vector representations of the text serve as the input to the distilled model for Chinese text sentiment classification. Experiments compare the distillation results obtained with seven teacher models: ALBERT-tiny, ALBERT-base, BERT-base, BERT-wwm-ext, ERNIE-1.0, ERNIE-GRAM, and ELECTRA-base. The results show that the ELECTRA-base-BiLSTM distillation model achieves the highest accuracy, precision, and comprehensive evaluation metric, and the best sentiment-classification performance: it approaches the text sentiment classification results of the ELECTRA language model while exceeding the classification accuracy of the lightweight shallow BiLSTM model by 5.58%. This model reduces the complexity and resource consumption of ELECTRA while improving the Chinese text sentiment classification performance of the lightweight BiLSTM, providing a useful reference for further research on text sentiment classification.
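The teacher-student setup described in the abstract follows the general knowledge-distillation recipe: the student is trained both on the true sentiment labels and on the teacher's softened output distribution. The abstract does not give the paper's exact loss, so the following is a minimal NumPy sketch of the standard Hinton-style distillation objective; the function name `distillation_loss` and the hyperparameters `T` (temperature) and `alpha` (soft/hard weighting) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft-target term: cross-entropy between the softened teacher and
    # student distributions, scaled by T^2 so gradient magnitudes stay
    # comparable across temperatures (as in Hinton et al.'s formulation).
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    soft = -(p_teacher * log_p_student).sum(axis=-1).mean() * (T * T)
    # Hard-label term: ordinary cross-entropy of the student (T=1)
    # against the true sentiment labels.
    log_p = np.log(softmax(student_logits))
    hard = -log_p[np.arange(len(labels)), labels].mean()
    return alpha * soft + (1.0 - alpha) * hard
```

In the setting the abstract describes, `teacher_logits` would come from the fine-tuned ELECTRA-base classifier and `student_logits` from the BiLSTM over one-hot word-vector inputs; only the student is updated during distillation.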