Computer Science, 2023, Vol. 50, Issue (7): 27-37. doi: 10.11896/jsjkx.221100244

• Computer Software •

Rule-based Technique for Detecting Risky Dynamic Typing Code

CHEN Zhifei¹, HAO Yang¹, CHEN Lin², XIAO Liang¹

  ¹ School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  ² State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
  • Received: 2022-11-29; Revised: 2023-03-09; Online: 2023-07-15; Published: 2023-07-05
  • About author: CHEN Zhifei, born in 1990, Ph.D., associate professor. Her main research interests include program analysis, software testing, and software maintenance.
  • Supported by: National Key Research and Development Program of China (2022YFF0712100), Natural Science Foundation of Jiangsu Province, China (BK20220937), and Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_0465).

Abstract: Python usage has grown rapidly in recent years, especially in scientific computing and machine learning. Python's dynamic typing gives developers powerful programming abstractions, but the same dynamic type system allows type-related defects to accumulate in code bases. To address this problem, this paper investigates six types of risky dynamic typing code that can lead to type-related defects. It formally describes the rule for each type and then proposes a rule-based technique for detecting them. We conduct a study on 25 real-world open-source Python projects (more than 945 kLOC in total). The results show that risky dynamic typing code is widespread in open-source projects, that a single Python method can contain multiple instances of inconsistent variable types, and that the rule-based detection technique achieves high detection accuracy and good performance. The detection technique and the findings of this work provide a reference and support for the healthy evolution of dynamic typing features and for the quality assurance of software projects.

Key words: Python, Dynamic typing, Type defects, Open-source software, Risky dynamic typing code, Detection technique

CLC Number: 

  • TP311
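
To make the idea of a detection rule for "inconsistent variable types" concrete, the following is a minimal sketch, not the technique proposed in the paper. It assumes a deliberately simplified rule: a variable is flagged when, within one function, it is directly assigned literals of different types. The names `SOURCE`, `literal_type`, and `find_inconsistent_variables` are hypothetical and chosen only for illustration; the paper's six rule definitions are not reproduced here.

```python
import ast
from collections import defaultdict

# Hypothetical input: one function that rebinds `port` from str to int.
SOURCE = '''
def parse_port(raw):
    port = "8080"        # str
    if raw:
        port = 8080      # int: same name, different type
    return port
'''


def literal_type(node):
    """Return the type name of a literal expression, or None if unknown."""
    # None is skipped on purpose: "x = None" followed by a real value is a
    # common idiom and would add noise to this simplified rule.
    if isinstance(node, ast.Constant) and node.value is not None:
        return type(node.value).__name__
    if isinstance(node, ast.List):
        return "list"
    if isinstance(node, ast.Dict):
        return "dict"
    if isinstance(node, ast.Tuple):
        return "tuple"
    if isinstance(node, ast.Set):
        return "set"
    return None


def find_inconsistent_variables(source):
    """Map 'function.variable' to the sorted literal types assigned to it."""
    findings = {}
    tree = ast.parse(source)
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        types_by_name = defaultdict(set)
        # Note: ast.walk also descends into nested functions, so their
        # assignments are attributed to the enclosing function as well.
        for node in ast.walk(func):
            if not isinstance(node, ast.Assign):
                continue
            inferred = literal_type(node.value)
            if inferred is None:
                continue
            for target in node.targets:
                if isinstance(target, ast.Name):
                    types_by_name[target.id].add(inferred)
        for name, types in types_by_name.items():
            if len(types) > 1:
                findings[f"{func.name}.{name}"] = sorted(types)
    return findings


result = find_inconsistent_variables(SOURCE)
print(result)  # {'parse_port.port': ['int', 'str']}
```

This sketch only sees literal assignments; values produced by calls, attribute accesses, or arithmetic are ignored, so a practical detector would need type inference well beyond what is shown here.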