计算机科学 ›› 2022, Vol. 49 ›› Issue (2): 198-203.doi: 10.11896/jsjkx.210100053

• 数据库&大数据&数据科学 • 上一篇    下一篇

一种基于信息瓶颈的因果关系挖掘方法

乔杰1, 蔡瑞初1, 郝志峰2   

  1. 1 广东工业大学计算机学院 广州510006
    2 佛山科学技术学院数学与大数据学院 广东 佛山528000
  • 收稿日期:2021-01-07 修回日期:2021-06-01 出版日期:2022-02-15 发布日期:2022-02-23
  • 通讯作者: 蔡瑞初(cairuichu@gmail.com)
  • 作者简介:qiaojie.chn@gmail.com
  • 基金资助:
    国家自然科学基金(61876043,61976052)

Mining Causality via Information Bottleneck

QIAO Jie1, CAI Rui-chu1, HAO Zhi-feng2   

  1. 1 School of Computer,Guangdong University of Technology,Guangzhou 510006,China
    2 School of Mathematics and Big Data,Foshan University,Foshan,Guangdong 528000,China
  • Received:2021-01-07 Revised:2021-06-01 Online:2022-02-15 Published:2022-02-23
  • About author:QIAO Jie,born in 1993,Ph.D student.His main research interests include machine learning and causality.
    CAI Rui-chu,born in 1983,Ph.D,professor,Ph.D supervisor.His main research interests include artificial intellectual and causality.
  • Supported by:
    National Natural Science Foundation of China(61876043,61976052).

摘要: 观测数据因果关系挖掘是很多学科的基础问题。然而基于约束与因果函数等的现有方法对数据的因果机制具有较强的假设,一般适用于低维数据,并不能很好地适用于存在隐变量的场景。为此,提出了一种基于信息瓶颈的因果关系挖掘方法,称为因果信息瓶颈方法。该方法将因果机制划分为压缩与提取两阶段,在压缩阶段,假设存在一个经过压缩的中间隐变量,在提取阶段,可能保留与结果变量相关的信息。在上述建模的基础上,通过推导其变分上界,设计了一种的基于变分自编码机的因果关系挖掘方法。实验结果表明,基于信息瓶颈的方法在合成数据中准确率提升了10%,在真实数据中准确率提升了4%。

关键词: 变分自编码机, 信息瓶颈, 因果发现, 因果关系挖掘, 因果信息瓶颈

Abstract: Causal discovery from observational data is a fundamental problem in many disciplines.However,existing methods such as constraint-based methods and causal function-based methods have strong assumptions on the causal mechanism of data,and are only applicable to low-dimensional data,and cannot be applied to scenarios with hidden variables.To this end,we propose a causality discovery method using information bottlenecks,called causal information bottleneck.This method divides the causal mechanism into two stages:compression and extraction.In the compression stage,we assume that there is a compressed hidden variable in the middle,while in the extraction stage,we extract the correlated information from effect variable as much as possible.Based on the causal information bottleneck,by deriving its variational upper bound,a causality discovery method based on the variational autoencoder is designed.The experimental results shows that the information bottleneck based method improves the accuracy by 10% in synthetic data and 4% in real world data.

Key words: Causal discovery, Causal information bottleneck, Information bottleneck, Mining causality, Variational autoencoder

中图分类号: 

  • TP301.6
[1]MCINERNEY J,BROST B,CHANDAR P,et al.Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions[C]//The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.New York:ACM,2020:1779-1788.
[2]RUNGE J,BATHIANY S,BOLLT E,et al.Inferring causationfrom time series in Earth system sciences[J].Nature Communications,2019,10(1):1-13.
[3]CAI R,ZHANG Z,HAO Z,et al.Understanding social causali-ties behind human action sequences[J].IEEE Transactions on Neural Networks and Learning Systems,2016,28(8):1801-1813.
[4]WANG W J,DU X H,REN Z Y,et al.Reconstruction of Cloud Platform Attack Scenario Based on Causal Knowledge and Temporal-Spatial Correlation[J].Computer Science,48(2):317-323.
[5]CAI R C,CHEN W,ZHANG K,et al.A Survey on Non-Temporal Series Observational Data based Causal Discovery[J].Chinese Journal of Computers,2017,40(6):1470-1490.
[6]XIE F,CAI R,HUANG B,et al.Generalized Independent NoiseCondition for Estimating Latent Variable Causal Graphs[C]//Advances in Neural Information Processing Systems.New York:Curran Associates,Inc.,2020:14891-14902.
[7]GLYMOUR C,ZHANG K,SPIRTES P.Review of causal discovery methods based on graphical models[J].Frontiers in Genetics,2019,10:524.
[8]SHIMIZU S,HOYER P O,HYVÄRINEN A,et al.A linear non-Gaussian acyclic model for causal discovery[J].Journal of Machine Learning Research,2006,7(Oct):2003-2030.
[9]SHIMIZU S,INAZUMI T,SOGAWA Y,et al.DirectLiNGAM:A direct method for learning a linear non-Gaussian structural equation model[J].Journal of Machine Learning Research,2011,12(Apr):1225-1248.
[10]HOYER P,JANZING D,MOOIJ J M,et al.Nonlinear causal discovery with additive noise models[C]//Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems.New York:NIPS,2008:689-696.
[11]ZHANG K,HYVÄRINEN A.On the Identifiability of the Post-Nonlinear Causal Model[C]//UAI 2009,Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence.Corvallis,USA:AUAI Press,2009:647-655.
[12]CAI R,QIAO J,ZHANG K,et al.Causal discovery from discrete data using hidden compact representation[C]//Advances in Neural Information Processing Systems.Calfornia,USA:NIPS,2018:2666-2674.
[13]CAI R,QIAO J,ZHANG K,et al.Causal discovery with cascade nonlinear additive noise models[C]//Proceedings of the 28th International Joint Conference on Artificial Intelligence.Palo Alto,CA:AAAI Press,2019:1609-1615.
[14]HUANG YL,LI P F,ZHU Q M.Joint Model of Events' Causal and Temporal Relations Identification[J].Computer Science,2018,45(6):204-207,234.
[15]SPIRTES P,GLYMOUR C N,SCHEINES R.Causation,prediction,and search[M].USA:MIT press,2000.
[16]TSAMARDINOS I,BROWN L E,ALIFERIS C F.The max-min hill-climbing Bayesian network structure learning algorithm[J].Machine Learning,2006,65(1):31-78.
[17]ANDERSSON S A,MADIGAN D,PERLMAN M D,et al.A characterization of Markov equivalence classes for acyclic digraphs[J].The Annals of Statistics,Institute of Mathematical Statistics,1997,25(2):505-541.
[18]SLONIM N,FRIEDMAN N,TISHBY N.Multivariate information bottleneck[J].Neural Computation,2006,18(8):1739-1789.
[19]ALEMI A A,FISCHER I,DILLON J V,et al.Deep Variational Information Bottleneck[C]//the 5th International Conference on Learning Representations.2017.
[20]KINGMA D P,WELLING M.Auto-Encoding Variational Bayes[C]//the 2nd International Conference on Learning Representations.2014.
[21]HORNIK K,STINCHCOMBE M B,WHITE H.Multilayerfeedforward networks are universal approximators[J].Neural Networks,1989,2(5):359-366.
[22]KINGMA D P,BA J.Adam:A Method for Stochastic Optimization[C]//the 3rd International Conference on Learning Representations.2014.
[23]BÜHLMANN P,PETERS J,ERNEST J,et al.CAM:Causaladditive models,high-dimensional order search and penalized regression[J].The Annals of Statistics,Institute of Mathematical Statistics,2014,42(6):2526-2556.
[24]GRETTON A,BOUSQUET O,SMOLA A J,et al.Measuring Statistical Dependence with Hilbert-Schmidt Norms[C]//Algorithmic Learning Theory,16th International Conference.Berlin,German:Springer,2005:63-77.
[25]MOOIJ J M,PETERS J,JANZING D,et al.Distinguishingcause from effect using observational data:methods and benchmarks[J].The Journal of Machine Learning Research,2016,17(1):1103-1204.
[1] 邵子灏, 杨世宇, 马国杰.
室内信息服务的基础——低成本定位技术研究综述
Foundation of Indoor Information Services:A Survey of Low-cost Localization Techniques
计算机科学, 2022, 49(9): 228-235. https://doi.org/10.11896/jsjkx.210900260
[2] 孙刚, 伍江江, 陈浩, 李军, 徐仕远.
一种基于切比雪夫距离的隐式偏好多目标进化算法
Hidden Preference-based Multi-objective Evolutionary Algorithm Based on Chebyshev Distance
计算机科学, 2022, 49(6): 297-304. https://doi.org/10.11896/jsjkx.210500095
[3] 李丹丹, 吴宇翔, 朱聪聪, 李仲康.
基于多种改进策略的改进麻雀搜索算法
Improved Sparrow Search Algorithm Based on A Variety of Improved Strategies
计算机科学, 2022, 49(6A): 217-222. https://doi.org/10.11896/jsjkx.210700032
[4] 胡聪, 何晓晖, 邵发明, 张艳武, 卢冠林, 王金康.
基于极大极稳定区域及SVM的交通标志检测
Traffic Sign Detection Based on MSERs and SVM
计算机科学, 2022, 49(6A): 325-330. https://doi.org/10.11896/jsjkx.210300117
[5] 杨健楠, 张帆.
一种结合双注意力机制和层次网络结构的细碎农作物分类方法
Classification Method for Small Crops Combining Dual Attention Mechanisms and Hierarchical Network Structure
计算机科学, 2022, 49(6A): 353-357. https://doi.org/10.11896/jsjkx.210200169
[6] 张嘉淏, 刘峰, 齐佳音.
一种基于Bottleneck Transformer的轻量级微表情识别架构
Lightweight Micro-expression Recognition Architecture Based on Bottleneck Transformer
计算机科学, 2022, 49(6A): 370-377. https://doi.org/10.11896/jsjkx.210500023
[7] 王方红, 范兴刚, 杨静静, 周杰, 王德恩.
一种基于有向感知区域调整的强栅栏构建算法
Strong Barrier Construction Algorithm Based on Adjustment of Directional Sensing Area
计算机科学, 2022, 49(6A): 612-618. https://doi.org/10.11896/jsjkx.210300291
[8] 张志龙, 史贤俊, 秦玉峰.
基于改进准深度算法的诊断策略优化方法
Diagnosis Strategy Optimization Method Based on Improved Quasi Depth Algorithm
计算机科学, 2022, 49(6A): 729-732. https://doi.org/10.11896/jsjkx.210700076
[9] 高元浩, 罗晓清, 张战成.
基于特征分离的红外与可见光图像融合算法
Infrared and Visible Image Fusion Based on Feature Separation
计算机科学, 2022, 49(5): 58-63. https://doi.org/10.11896/jsjkx.210200148
[10] 刘洋, 李凡长.
基于变分贝叶斯的纤维丛元学习算法
Fiber Bundle Meta-learning Algorithm Based on Variational Bayes
计算机科学, 2022, 49(3): 225-231. https://doi.org/10.11896/jsjkx.201100111
[11] 赵学磊, 季新生, 刘树新, 李英乐, 李海涛.
基于路径连接强度的有向网络链路预测方法
Link Prediction Method for Directed Networks Based on Path Connection Strength
计算机科学, 2022, 49(2): 216-222. https://doi.org/10.11896/jsjkx.210100107
[12] 魏昕, 冯锋.
基于高斯-柯西变异的帝国竞争算法优化
Optimization of Empire Competition Algorithm Based on Gauss-Cauchy Mutation
计算机科学, 2021, 48(11A): 142-146. https://doi.org/10.11896/jsjkx.201200071
[13] 王友卫, 朱晨, 朱建明, 李洋, 凤丽洲, 刘江淳.
基于用户兴趣词典和LSTM的个性化情感分类方法
User Interest Dictionary and LSTM Based Method for Personalized Emotion Classification
计算机科学, 2021, 48(11A): 251-257. https://doi.org/10.11896/jsjkx.201200202
[14] 肖满, 李伟东.
两点混合环上的半在线算法
Semi-online Algorithms for Mixed Ring with Two Nodes
计算机科学, 2021, 48(11A): 441-445. https://doi.org/10.11896/jsjkx.201100153
[15] 罗文聪, 郑嘉利, 全艺璇, 谢孝德, 林子涵.
基于改进型多目标樽海鞘群算法的RFID阅读器天线优化部署
Optimized Deployment of RFID Reader Antenna Based on Improved Multi-objective Salp Swarm Algorithm
计算机科学, 2021, 48(9): 292-297. https://doi.org/10.11896/jsjkx.200700167
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!