Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 240300041-8. doi: 10.11896/jsjkx.240300041

• Big Data & Data Science •


Domain Generalization and Long-tailed Learning Based on Causal Relationships

LYU Jiahao, LIU Jinfeng   

  1. School of Information Engineering, Ningxia University, Yinchuan 750021, China
  • Online: 2024-11-16  Published: 2024-11-13
  • Corresponding author: LIU Jinfeng (jfliu@nxu.edu.cn)
  • About author: LYU Jiahao, born in 1996, master (12022131956@stu.nxu.edu.cn). His research interests include image classification and domain generalization.
    LIU Jinfeng, born in 1971, Ph.D., professor, master supervisor. His main research interests include image processing and heterogeneous computing.
  • Supported by:
    Natural Science Foundation of Ningxia, China (2023AAC03126).


Abstract: Deep learning, as a representative machine learning approach, has been widely applied and has achieved many successes. However, dataset distribution shift and long-tailed distribution can significantly degrade the performance of conventional deep learning methods, and both problems are common in real-world datasets. Although domain generalization and long-tailed learning research have each addressed these problems well in isolation, a single domain generalization or long-tailed learning method performs poorly in the combined setting of distribution shift plus long-tailed distribution (LT-DS). To address the LT-DS problem, both issues can be tackled in a unified way from a causal perspective. For distribution shift, causal intervention and causal decomposition are performed via the Fourier transform, and a cross-domain invariant causal feature representation is obtained through decorrelation weighting. For long-tailed distribution, a causal-effect classifier is built through de-confounded training to remove the bias introduced by momentum, and the influence of the long-tailed distribution is further reduced with Balanced Softmax and logit adjustment. Experimental results show that the proposed method outperforms the best existing methods on the LT-DS problem by an average of 8% on the AWA2-LTS dataset and 5% on the ImageNet-LTS dataset, demonstrating competitive performance.
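
The abstract attributes the treatment of distribution shift to causal intervention and decomposition carried out via the Fourier transform. As an illustration only, the following minimal NumPy sketch shows one common form of such an intervention, amplitude mixing between two images, under the assumption that the phase spectrum carries the semantic (causal) content while the amplitude spectrum carries domain style; the paper's exact procedure is not specified on this page, and the function name, arguments, and mixing parameter lam are illustrative.

import numpy as np

def fourier_amplitude_mix(x, x_ref, lam=0.5):
    # Mix the amplitude spectrum of x with that of a reference image x_ref
    # while keeping x's phase. Phase is treated as the (approximately) causal,
    # semantic component; amplitude carries domain/style information.
    # x, x_ref: float arrays of shape (H, W, C) with values in [0, 1].
    fft_x = np.fft.fft2(x, axes=(0, 1))
    fft_ref = np.fft.fft2(x_ref, axes=(0, 1))

    amp_x, pha_x = np.abs(fft_x), np.angle(fft_x)
    amp_ref = np.abs(fft_ref)

    # Intervene on the non-causal factor: partially replace the amplitude.
    amp_mix = (1.0 - lam) * amp_x + lam * amp_ref

    # Recombine the mixed amplitude with the original phase and invert.
    fft_mix = amp_mix * np.exp(1j * pha_x)
    x_aug = np.real(np.fft.ifft2(fft_mix, axes=(0, 1)))
    return np.clip(x_aug, 0.0, 1.0)

Training a classifier to behave consistently on an image and its amplitude-perturbed counterpart encourages features that do not depend on the domain-specific amplitude factor, which is the kind of cross-domain invariant representation the abstract describes.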

Key words: Deep learning, Domain generalization, Long-tailed learning, Causal inference
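
The abstract also names Balanced Softmax and logit adjustment as the corrections applied against the long-tailed label distribution. The sketch below gives the standard forms of these two adjustments in PyTorch, assuming the per-class training counts are available as a tensor class_counts; it is a generic illustration rather than the authors' implementation, and tau is an illustrative scaling parameter.

import torch
import torch.nn.functional as F

def balanced_softmax_loss(logits, labels, class_counts):
    # Balanced Softmax: add the log class-prior to the logits so that the
    # cross-entropy is computed under the skewed training label prior.
    log_prior = torch.log(class_counts.float() / class_counts.sum())
    return F.cross_entropy(logits + log_prior, labels)

def logit_adjusted_predict(logits, class_counts, tau=1.0):
    # Post-hoc logit adjustment: subtract tau * log-prior at prediction time
    # to approximate the decision rule that is optimal for balanced error.
    log_prior = torch.log(class_counts.float() / class_counts.sum())
    return torch.argmax(logits - tau * log_prior, dim=-1)

Both corrections compensate for the skewed label prior from opposite directions: Balanced Softmax injects the log-prior into the training loss, while logit adjustment removes it (scaled by tau) from the logits when predicting.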

CLC Number: TP391.41