Computer Science (计算机科学), 2022, Vol. 49, Issue 11A: 210800197-5. doi: 10.11896/jsjkx.210800197
LIU Da-wei (刘大为)1, CHE Chao (车超)1, WEI Xiao-peng (魏小鹏)1,2
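Abstract: In the text of customs import and export commodity declarations, different words are often used to describe the same commodity feature. Identifying these feature synonyms supports better opinion summarization and, in turn, helps prevent and control tax-related risk for commodities that share a class of features. Targeting the characteristics of customs declaration element phrases, this paper proposes a convolutional neural network model that fuses multi-level information. A Sentence-BERT based on siamese and triplet network structures is constructed and trained; it gives better semantic representations for similar element phrases and compensates for the discrete, sparse features of word2vec embeddings on short texts. Convolution kernels of multiple sizes extract different features of the element phrases. A BiLSTM network learns the word-order information of the element phrases, and an attention mechanism assigns weights to keywords. The resulting fully connected layer fuses the synonym semantic features with the keyword features, and a softmax layer makes the prediction. Experiments show that the convolutional model fusing multi-level information outperforms the other models it was compared with.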
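The data flow described in the abstract (multi-size convolution, attention weighting, feature fusion, softmax) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. It assumes random vectors in place of Sentence-BERT embeddings, uses one kernel per width, omits the BiLSTM so that attention runs directly over the token vectors, and assumes a two-class output. All function and variable names are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def conv_max_pool(tokens, kernel):
    """Slide one kernel (width x dim weights) over the token vectors,
    apply ReLU, and keep the largest activation (max-over-time pooling)."""
    width = len(kernel)
    best = 0.0  # ReLU floor; also the result if the phrase is shorter than the kernel
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        act = sum(w * x for row, tok in zip(kernel, window) for w, x in zip(row, tok))
        best = max(best, act)
    return best


def attention_pool(states, query):
    """Score each state against a query vector, normalize the scores with
    softmax, and return the weights plus the weighted sum of the states."""
    scores = [sum(s * q for s, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, states)) for d in range(dim)]
    return weights, pooled


def predict(tokens, kernels, query, out_weights):
    """Fuse convolution features with the attention-pooled vector, then
    apply a linear layer and softmax (assumption: BiLSTM step omitted)."""
    conv_feats = [conv_max_pool(tokens, k) for k in kernels]
    _, attn_feats = attention_pool(tokens, query)
    fused = conv_feats + attn_feats  # list concatenation
    logits = [sum(w * f for w, f in zip(row, fused)) for row in out_weights]
    return softmax(logits)


# Toy setup with random, untrained weights (illustrative only).
rng = random.Random(0)
dim = 4
tokens = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
kernels = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
           for width in (2, 3, 4)]  # one kernel per width
query = [rng.uniform(-1, 1) for _ in range(dim)]
out_weights = [[rng.uniform(-1, 1) for _ in range(len(kernels) + dim)]
               for _ in range(2)]  # two classes assumed
probs = predict(tokens, kernels, query, out_weights)
```

Here `probs` is a two-element probability vector. In a trained model the kernels, the attention query, and the output weights would be learned, and the attention would run over BiLSTM hidden states instead of raw token vectors.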