Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230600121-8. DOI: 10.11896/jsjkx.230600121

• Computer Software & Architecture •

Test Input Prioritization Approach Based on DNN Model Output Differences

ZHU Jin1, TAO Chuanqi1,2,3,4, GUO Hongjing1   

  1 College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
    2 Ministry Key Laboratory for Safety-Critical Software Development and Verification, Nanjing 210016, China
    3 State Key Laboratory for Novel Software Technology, Nanjing 210023, China
    4 Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210016, China
  • Published: 2024-06-06
  • About author: ZHU Jin, born in 1996, postgraduate. His main research interests include deep learning testing.
    TAO Chuanqi, Ph.D, associate professor. His main research interests include intelligent software testing, regression testing, cloud-based mobile testing as a service, and quality assurance for big data applications.

Abstract: Deep neural network (DNN) testing requires a large amount of test data to ensure DNN quality. However, most test inputs lack labels, and annotating them is costly. To reduce annotation cost, researchers have proposed test input prioritization approaches that select high-priority test inputs for annotation. However, most existing prioritization approaches perform poorly in certain scenarios; for example, they struggle to identify misclassified inputs that the model predicts with high confidence. To address these challenges, this paper applies differential testing to test input prioritization and proposes a test input prioritization approach based on DNN model output differences (DeepDiff). DeepDiff first constructs a contrast model with the same functionality as the original model, then computes the difference between the two models' outputs for each test input, and finally assigns higher priority to the inputs with larger output differences. An empirical study is conducted on four widely used datasets and eight corresponding DNN models. Experimental results demonstrate that DeepDiff is, on average, 13.06% more effective than the baseline approaches on the original test sets and 39.69% more effective on the mixed test sets.
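The ranking step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes softmax output vectors from the original and contrast models are already available as arrays, and it uses the L1 distance as the output-difference measure, whereas the paper's exact metric may differ. All function names are illustrative.

```python
import numpy as np

def output_difference_scores(probs_orig, probs_contrast):
    """Score each test input by the distance between the original model's
    and the contrast model's softmax outputs (larger = more disagreement).
    L1 distance is an assumption for illustration."""
    return np.abs(probs_orig - probs_contrast).sum(axis=1)

def prioritize(probs_orig, probs_contrast):
    """Return test-input indices sorted by descending output difference,
    so the inputs the two models disagree on most come first."""
    scores = output_difference_scores(probs_orig, probs_contrast)
    return np.argsort(-scores)

# Toy example: 3 test inputs, 2 classes.
p_orig = np.array([[0.9, 0.1], [0.6, 0.4], [0.5, 0.5]])
p_contrast = np.array([[0.8, 0.2], [0.1, 0.9], [0.5, 0.5]])
order = prioritize(p_orig, p_contrast)
# Input 1 (largest disagreement) is ranked first; input 2 (identical outputs) last.
```

Under this sketch, inputs on which the contrast model flips the predicted class, as with input 1 above, receive the highest priority for annotation, which matches the intuition that disagreement between functionally equivalent models signals likely misclassification.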

Key words: Deep neural network testing, Test input prioritization, Differential testing, Model output differences

CLC Number: TP311

References
[1] WANG Z, YAN M, LIU S, et al. A Review of Deep Neural Network Testing Research[J]. Journal of Software, 2020, 31(5): 1255-1275.
[2] MA L, JUEFEI-XU F, ZHANG F, et al. DeepGauge: Multi-granularity testing criteria for deep learning systems[C]//Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 2018: 120-131.
[3] PEI K, CAO Y, YANG J, et al. DeepXplore: Automated whitebox testing of deep learning systems[C]//Proceedings of the 26th Symposium on Operating Systems Principles. 2017: 1-18.
[4] SUN Y, HUANG X, KROENING D, et al. Testing deep neural networks[J]. arXiv:1803.04792, 2018.
[5] ZHANG J J, ZHANG X H. Multi-branch Convolutional Neural Network Pulmonary Nodule Classification Method and Its Interpretability[J]. Computer Science, 2020, 47(9): 129-134.
[6] LI X Y, HE Z, LOU Y J, et al. Bilinear graph convolutional network for wetland remote sensing classification in the South China Sea region[J]. Surveying and Mapping Bulletin, 2023(5): 44.
[7] FENG Y, SHI Q, GAO X, et al. DeepGini: prioritizing massive tests to enhance the robustness of deep neural networks[C]//Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2020: 177-188.
[8] SHEN W, LI Y, CHEN L, et al. Multiple-boundary clustering and prioritization to promote neural network retraining[C]//Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. 2020: 410-422.
[9] AL-QADASI H, WU C, FALCONE Y, et al. DeepAbstraction: 2-Level Prioritization for Unlabeled Test Inputs in Deep Neural Networks[C]//2022 IEEE International Conference on Artificial Intelligence Testing (AITest). IEEE, 2022: 64-71.
[10] LI Y, LI M, LAI Q, et al. TestRank: Bringing order into unlabeled test instances for deep learning tasks[J]. Advances in Neural Information Processing Systems, 2021, 34: 20874-20886.
[11] KIM J, FELDT R, YOO S. Guiding deep learning system testing using surprise adequacy[C]//2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 2019: 1039-1049.
[12] BYUN T, SHARMA V, VIJAYAKUMAR A, et al. Input prioritization for testing neural networks[C]//2019 IEEE International Conference on Artificial Intelligence Testing (AITest). IEEE, 2019: 63-70.
[13] TAO Y, TAO C, GUO H, et al. TPFL: Test input prioritization for deep neural networks based on fault localization[C]//International Conference on Advanced Data Mining and Applications. Cham: Springer Nature Switzerland, 2022: 368-383.
[14] MCKEEMAN W M. Differential testing for software[J]. Digital Technical Journal, 1998, 10(1): 100-107.
[15] MA L, ZHANG F, SUN J, et al. DeepMutation: Mutation testing of deep learning systems[C]//2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 2018: 100-111.
[16] HUMBATOVA N, JAHANGIROVA G, TONELLA P. DeepCrime: mutation testing of deep learning systems based on real faults[C]//Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. 2021: 67-78.
[17] TAN P N, STEINBACH M, KUMAR V. Introduction to Data Mining[M]. Pearson Education India, 2016.
[18] GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and harnessing adversarial examples[J]. arXiv:1412.6572, 2014.
[19] KURAKIN A, GOODFELLOW I J, BENGIO S. Adversarial examples in the physical world[M]//Artificial Intelligence Safety and Security. Chapman and Hall/CRC, 2018: 99-112.
[20] PAPERNOT N, MCDANIEL P, JHA S, et al. The limitations of deep learning in adversarial settings[C]//2016 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2016: 372-387.
[21] CARLINI N, WAGNER D. Towards evaluating the robustness of neural networks[C]//2017 IEEE Symposium on Security and Privacy (SP). IEEE, 2017: 39-57.
[22] WANG Z, YOU H, CHEN J, et al. Prioritizing test inputs for deep neural networks via mutation analysis[C]//2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 2021: 397-409.