Computer Science ›› 2023, Vol. 50 ›› Issue (5): 52-63. doi: 10.11896/jsjkx.221000044

• Explainable Artificial Intelligence •


Review on Interpretability of Deep Learning

CHEN Chong, CHEN Jie, ZHANG Hui, CAI Lei, XUE Yaru   

  1. College of Information Science and Engineering, China University of Petroleum (Beijing), Beijing 102249, China
  • Received: 2022-10-09 Revised: 2023-02-27 Online: 2023-05-15 Published: 2023-05-06
  • Corresponding author: CHEN Chong (chenchong@cup.edu.cn)
  • About author: CHEN Chong, born in 1987, Ph.D, associate professor, master supervisor, is a member of China Computer Federation. His main research interests include machine learning, information fusion and machine learning interpretability.
  • Supported by:
    National Natural Science Foundation of China (62006247) and National Key R&D Program of China (2019YFC1510501, 2022YFC2803704).



Abstract: With the explosive growth of data volume and breakthroughs in deep learning theory and technology, deep learning models have performed well in many classification and prediction tasks (on image, text, speech and video data, etc.), which has promoted the large-scale, industrialized application of deep learning. However, the high nonlinearity of deep learning models leaves their internal logic opaque, so they are often regarded as “black box” models, which restricts their application in critical fields (such as medical treatment, finance and autonomous driving). It is therefore necessary to study the interpretability of deep learning. This paper first gives a brief overview of the current state of deep learning and describes the definition of, and need for, interpretability. It then analyzes existing research on interpretation methods and summarizes them from three perspectives: intrinsically interpretable models, attribution-based interpretation and non-attribution-based interpretation. Next, it introduces qualitative and quantitative metrics for evaluating interpretability. Finally, it discusses applications of deep learning interpretability and future research directions.
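Of the method families surveyed here, gradient-based attribution is the simplest to state concretely: the gradient of the predicted class score with respect to the input yields a per-feature importance estimate, i.e., a saliency map. The following is a minimal PyTorch sketch of that idea; the untrained ResNet-18 and the random input are stand-ins of our own, not models or data from the paper.

```python
import torch
import torchvision.models as models

# Minimal vanilla-gradient saliency sketch (in the style of Simonyan et al.).
# The untrained ResNet-18 and random input are placeholders: substitute any
# trained classifier and a preprocessed image in practice.
model = models.resnet18(weights=None)
model.eval()

x = torch.rand(1, 3, 224, 224, requires_grad=True)  # placeholder "image"

scores = model(x)                    # forward pass: one score per class
top_class = scores.argmax().item()   # the prediction to be explained
scores[0, top_class].backward()      # d(score) / d(input pixels)

# Per-pixel importance: max absolute gradient across the color channels.
saliency = x.grad.abs().max(dim=1).values.squeeze(0)  # shape (224, 224)
```

The resulting `saliency` tensor can be rendered as a heat map over the input; methods such as SmoothGrad, Integrated Gradients and Grad-CAM refine this same gradient signal.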

Key words: Deep learning, Interpretability, Attribution-based interpretation, Non-attribution-based interpretation, Evaluation method
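On the quantitative side, a common fidelity check is the deletion metric: remove input features in decreasing order of attributed importance and track how quickly the model's confidence in its original prediction drops, with a steeper drop indicating a more faithful explanation. The sketch below continues the PyTorch assumptions above; `deletion_curve` is a hypothetical helper for illustration, not an API from any cited work.

```python
import torch

def deletion_curve(model, x, saliency, steps=10):
    """Hypothetical deletion-metric sketch: occlude pixels in decreasing
    order of attributed importance and record how the model's confidence
    in its original prediction decays."""
    with torch.no_grad():
        target = model(x).argmax().item()      # originally predicted class

    order = saliency.flatten().argsort(descending=True)  # most important first
    per_step = order.numel() // steps

    x_del = x.clone().detach()
    flat = x_del.view(x_del.size(0), x_del.size(1), -1)  # shares storage
    confidences = []
    for step in range(steps):
        idx = order[step * per_step:(step + 1) * per_step]
        flat[:, :, idx] = 0.0                  # occlude the next pixel block
        with torch.no_grad():
            probs = torch.softmax(model(x_del), dim=1)
        confidences.append(probs[0, target].item())
    return confidences  # a steeper decline suggests a more faithful attribution
```

For example, `deletion_curve(model, x.detach(), saliency)` returns ten confidence values whose decay can be plotted or summarized as an area under the curve.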

CLC Number:

  • TP181