Computer Science, 2024, Vol. 51, Issue (11A): 240300041-8. doi: 10.11896/jsjkx.240300041
LYU Jiahao (吕佳豪), LIU Jinfeng (刘进锋)
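Abstract: Machine learning methods, with deep learning as their representative, have been widely applied and have achieved a great deal. However, dataset distribution shift and long-tailed class distributions, both of which are common in real-world datasets, significantly degrade the performance of conventional deep learning methods. Although research on domain generalization and on long-tailed learning has addressed each problem well in isolation, neither kind of method performs well on its own in the complex setting where distribution shift and a long-tailed distribution occur together (LT-DS). For the LT-DS problem, both issues can be addressed in a unified way from the perspective of causality. For distribution shift, causal intervention and causal factorization are performed via the Fourier transform, and decorrelation weighting is used to obtain a causal feature representation that is invariant across domains. For the long-tailed distribution, a causal-effect classifier is built through de-confounded training to remove the bias introduced by momentum, and Balanced Softmax and logit adjustment are used to further remove the influence of the long-tailed distribution. Experimental results show that, on the LT-DS problem, the method outperforms the best existing method by an average of 8% on the AWA2-LTS dataset and 5% on the ImageNet-LTS dataset, which is a competitive result.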
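The two long-tail corrections named in the abstract, Balanced Softmax and logit adjustment, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy class counts, and the temperature `tau` are assumptions made for illustration, and the abstract does not say how the two corrections are combined with the causal-effect classifier. The sketch only shows the standard forms: Balanced Softmax adds the log class count to each logit before the cross-entropy loss during training, and post-hoc logit adjustment subtracts a scaled log class prior from each logit at prediction time.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def balanced_softmax_loss(logits, label, class_counts):
    """Cross-entropy on logits shifted by log(class count) (training-time)."""
    shifted = [z + math.log(n) for z, n in zip(logits, class_counts)]
    return -log_softmax(shifted)[label]


def logit_adjusted_prediction(logits, class_counts, tau=1.0):
    """Post-hoc adjustment: subtract tau * log(prior) before the argmax."""
    total = sum(class_counts)
    adjusted = [z - tau * math.log(n / total)
                for z, n in zip(logits, class_counts)]
    return max(range(len(adjusted)), key=adjusted.__getitem__)


# Hypothetical toy setting: three classes with a long-tailed label count.
counts = [900, 90, 10]
logits = [2.0, 1.5, 1.0]

# Without adjustment the head class (index 0) has the largest logit.
plain_pred = max(range(len(logits)), key=logits.__getitem__)

# With adjustment, rare classes receive a larger boost.
adjusted_pred = logit_adjusted_prediction(logits, counts)

# Loss for a tail-class sample, with and without the count-based shift.
tail_loss_balanced = balanced_softmax_loss(logits, 2, counts)
tail_loss_plain = balanced_softmax_loss(logits, 2, [1, 1, 1])
```

In this toy setting the unadjusted argmax picks the head class while the adjusted argmax picks the tail class, and the Balanced Softmax loss on a tail-class sample is larger than the plain cross-entropy loss, which pushes the network to separate tail classes by a larger margin during training.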