Computer Science (计算机科学) ›› 2022, Vol. 49 ›› Issue (5): 129-134. doi: 10.11896/jsjkx.210300180

• Database & Big Data & Data Science •

Feature Fusion Framework Combining Attention Mechanism and Geometric Information

DONG Qi-da1, WANG Zhe1, WU Song-yang2

  1 School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  2 The Third Research Institute of Ministry of Public Security, Shanghai 201204, China
  • Received: 2021-03-17 Revised: 2021-08-10 Online: 2022-05-15 Published: 2022-05-06
  • Corresponding author: WANG Zhe (wangzhe@ecust.edu.cn)
  • Author e-mail: 1986360994@qq.com
  • About author: DONG Qi-da, born in 1996, postgraduate, is a member of China Computer Federation. His main research interests include imbalance learning and deep learning.
    WANG Zhe, born in 1981, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include pattern recognition and image processing.
  • Supported by:
    Shanghai Science and Technology Program (20511100600, 21511100800), National Natural Science Foundation of China (62076094) and Open Project of the Key Lab of Information Network Security of Ministry of Public Security (C20603).

Abstract: Class imbalance is common in real-world data, and a highly skewed class distribution can seriously degrade model performance. Imbalanced data generally affects a model in two ways. First, because majority classes contain more samples, they contribute more parameter updates, which biases the model toward the majority classes. Second, minority classes contain so few samples that their diversity is insufficient, which limits the model's ability to represent them. To address these problems, this paper proposes a feature fusion framework that combines an attention mechanism with geometric information. In the first stage, the model learns the semantic and discriminative information of the data through pre-training, and uses the attention mechanism to identify what the model attends to for each class. In the second stage, the model uses geometric information to mine boundary features and fuses them using the attention weights obtained in the first stage, thereby supplementing the minority classes. Experimental results on the long-tailed CIFAR10 and CIFAR100 datasets and on the KDD Cup99 dataset show that the proposed framework improves classification performance on imbalanced data, for both image data and structured data.

Key words: Attention mechanism, Deep learning, Feature fusion, Geometric information, Imbalanced data
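The second stage described in the abstract (mining boundary features from geometric information, then fusing them with attention weights) can be illustrated with a minimal sketch. This is one possible reading, not the authors' algorithm: the abstract does not define "boundary feature" or the fusion rule. The sketch assumes that a boundary sample is a minority sample whose nearest neighbour belongs to another class, that attention is a single scalar score per sample, and that fusion is a softmax-weighted convex combination. The function names `boundary_indices` and `fuse` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def boundary_indices(features, labels, target):
    """Indices of `target`-class samples whose nearest neighbour has a
    different label (an assumed, simple geometric boundary criterion)."""
    found = []
    for i, (f, y) in enumerate(zip(features, labels)):
        if y != target:
            continue
        others = [j for j in range(len(features)) if j != i]
        nearest = min(others, key=lambda j: sq_dist(f, features[j]))
        if labels[nearest] != target:
            found.append(i)
    return found


def fuse(features, attention_scores):
    """Attention-weighted convex combination of feature vectors
    (assumed fusion rule; yields one synthetic minority feature)."""
    weights = softmax(attention_scores)
    dim = len(features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, features))
        for d in range(dim)
    ]


# Toy 2-D features: class 0 is the majority side, class 1 the minority.
features = [(2.0, 0.0), (3.0, 0.0), (3.0, 1.0),
            (0.0, 0.0), (0.2, 0.0), (1.6, 0.0), (1.9, 0.5)]
labels = [0, 0, 0, 1, 1, 1, 1]

idx = boundary_indices(features, labels, target=1)
boundary = [features[i] for i in idx]
# Equal scores stand in for attention values learned in the first stage.
synthetic = fuse(boundary, [0.0] * len(boundary))
```

Because the fused vector is a convex combination, it stays inside the region spanned by the selected boundary samples, which is why such a sample can plausibly be labelled as the minority class.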

CLC Number: TP183
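The abstract evaluates on long-tailed versions of CIFAR10 and CIFAR100 without stating how they were built. A common construction, assumed here rather than taken from the paper, shrinks the per-class sample count exponentially from the largest class to the smallest. The sketch below computes such counts; the helper name `long_tailed_counts` and the ratio 0.01 are illustrative assumptions, while 5000 is the number of training images per class in the standard CIFAR-10 training set.

```python
def long_tailed_counts(num_classes, max_per_class, imbalance_ratio):
    """Per-class sample counts that decay exponentially from head to tail.

    imbalance_ratio is (smallest class size) / (largest class size),
    and must lie in (0, 1].
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    if not 0 < imbalance_ratio <= 1:
        raise ValueError("imbalance_ratio must be in (0, 1]")
    if num_classes == 1:
        return [max_per_class]
    return [
        max(1, round(max_per_class * imbalance_ratio ** (i / (num_classes - 1))))
        for i in range(num_classes)
    ]


# Assumed setting: 10 classes, 5000 images in the largest class,
# and the smallest class 100 times smaller than the largest.
counts = long_tailed_counts(num_classes=10, max_per_class=5000, imbalance_ratio=0.01)
```

Subsampling each class down to its entry in `counts` produces a training set in which the first class keeps all of its images and the last class keeps only a small fraction of them.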