Computer Science ›› 2023, Vol. 50 ›› Issue (5): 52-63. doi: 10.11896/jsjkx.221000044

Review on Interpretability of Deep Learning

CHEN Chong, CHEN Jie, ZHANG Hui, CAI Lei, XUE Yaru   

  1. College of Information Science and Engineering, China University of Petroleum (Beijing), Beijing 102249, China
  • Received: 2022-10-09 Revised: 2023-02-27 Online: 2023-05-15 Published: 2023-05-06
  • About author: CHEN Chong, born in 1987, Ph.D., associate professor and master's supervisor, is a member of the China Computer Federation. His main research interests include machine learning, information fusion, and machine learning interpretability.
  • Supported by:
    National Natural Science Foundation of China (62006247) and National Key R&D Program of China (2019YFC1510501, 2022YFC2803704).

Abstract: With the explosive growth of data volume and breakthroughs in deep learning theory and technology, deep learning models now perform well on many classification and prediction tasks (on image, text, voice, and video data, among others), which has promoted the large-scale, industrialized application of deep learning. However, because deep learning models are highly nonlinear and their internal logic is opaque, they are often regarded as “black box” models, which restricts their further application in critical fields such as medical treatment, finance, and autonomous driving. It is therefore necessary to study the interpretability of deep learning. First, recent studies on deep learning are reviewed, and the definition and necessity of explaining deep learning models are described. Second, recent studies on interpretation methods for deep learning are analyzed and summarized, with the methods classified into intrinsically interpretable models, attribution-based interpretation, and non-attribution-based interpretation. Then, qualitative and quantitative performance criteria for the interpretability of deep learning are introduced. Finally, applications of deep learning interpretability are discussed and future research directions are recommended.

Key words: Deep learning, Interpretability, Attribution-based interpretation, Non-attribution-based interpretation, Evaluation method

