计算机科学 ›› 2024, Vol. 51 ›› Issue (6): 78-84.doi: 10.11896/jsjkx.230300224
李孜1, 周宇1,2
LI Zi1, ZHOU Yu1,2
摘要: 在软件开发中,违反语义规则的源码可以正常地编译或运行,但却存在性能或功能上的缺陷。因此,如何准确地检测此类缺陷成为了一项挑战。已有的研究通常采用基于项集的规则挖掘与检测方法,但由于未能良好地结合源码的顺序信息与控制流信息,此类方法在检测能力以及准确率上都存在较大的提升空间。针对该问题,提出了一种基于序列的程序语义规则提取与违规检测方法 SPUME。该方法将程序源码转化为中间表示序列,使用序列规则挖掘算法从中提取语义规则,并基于语义规则对源码中的缺陷进行检测。为验证SPUME 的有效性,文中将其与3种基线方法进行了对比,包括PR-Miner,Tikanga以及Bugram。实验结果表明,相较于基于无序项集进行规则挖掘的PR-Miner,以及结合了图模型的Tikanga,SPUME在检测效果、检测速度以及准确率上都有显著提升。相比基于Ngram语言模型的Bugram方法,SPUME在准确率与其相当的情况下,高效地检测出了更多程序缺陷。
中图分类号:
[1]LI Z,ZHOU Y.PR-Miner:automatically extracting implicit programming rules and detecting violations in large software code[J].ACM SIGSOFT Software Engineering Notes,2005,30(5):306-315. [2]LIVSHITS B,ZIMMERMANN T.Dynamine:finding commonerror patterns by mining software revision histories[J].ACM SIGSOFT Software Engineering Notes,2005,30(5):296-305. [3]CHANG R Y,PODGURSKI A,YANG J.Finding what’s not there:a new approach to revealing neglected conditions in software[C]//Proceedings of the 2007 International Symposium on Software Testing and Analysis.2007:163-173. [4]ANDRZEJ W,ANDREAS Z,CHRISTIAN L.Detecting object usage anomalies[C]//Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Enginee-ring(ESEC-FSE’07).Association for Computing Machinery,New York,NY,USA,2007:35-44. [5]WASYLKOWSKI A,ZELLER A.Mining temporal specifica-tions from object usage[J].Automated Software Engineering,2011,18:263-292. [6]WANG S,CHOLLAK D,MOVSHOVITZ-ATTIAS D,et al.Bugram:bug detection with n-gram language models[C]//Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering.2016:708-719. [7]ZHOU Y,ZHAN W,LI Z,et al.DRIVE:Dockerfile Rule Mi-ning and Violation Detection[J].arXiv:2212.05648,2022. [8]LUNA J M,FOURNIER-VIGER P,VENTURA S.Frequent itemset mining:A 25 years review[J].Wiley Interdisciplinary Reviews:Data Mining and Knowledge Discovery,2019,9(6):e1329. [9]CHEN S F,GOODMAN J.An empirical study of smoothingtechniques for language modeling[J].Computer Speech & Language,1999:13(4):359-394. [10]KAUR J,MADAN N.Association rule mining:A survey[J].International Journal of Hybrid Information Technology,2015,8(7):239-242. [11]FOURNIER-VIGER P,NKAMBOU R,TSENG V S M.RuleGrowth:mining sequential rules common to several sequences by pattern-growth[C]//Proceedings of the 2011 ACM Sympo-sium on Applied Computing.2011:956-961. [12]MCNICHOLAS P D,MURPHY T B,O’REGAN M.Standardi-sing the lift of an association rule[J].Computational Statistics &Data Analysis,2008,52(10):4712-4721. [13]FOURNIER-VIGER P,GUENICHE T,TSENG V S.Usingpartially-ordered sequential rules to generate more accurate sequence prediction[C]//Advanced Data Mining and Applications:8th International Conference(ADMA 2012).Nanjing,China,Springer Berlin Heidelberg,2012:431-442. [14]PEI J,HAN J,MORTAZAVI-ASL B,et al.Mining sequential patterns by pattern-growth:The prefixspan approach[J].IEEE Transactions on knowledge and data engineering,2004,16(11):1424-1440. [15]NAYROLLES M,MOHA N,VALTCHEV P.Improving SOA antipatterns detection in Service Based Systems by mining execution traces[C]//2013 20th Working Conference on Reverse Engineering(WCRE).IEEE,2013:321-330. [16]KAMSU-FOGUEM B,RIGAL F,MAUGET F.Mining association rules for the quality improvement of the production process[J].Expert Systems with Applications,2013,40(4):1034-1045. [17]ELIBEN.PyCParser[EB/OL].https://github.com-/eliben/pycparser. [18]BAADEL S,THABTAH F,LU J.Overlapping clustering:A review[C]//2016 SAI Computing Conference(SAI).IEEE,2016:233-237. [19]IENCO D,BORDOGNA G.Fuzzy extensions of the DBScanclustering algorithm[J].Soft Computing,2018,22(5):1719-1730. [20]BAHDANAU D,CHO K,BENGIO Y.Neural machine translation by jointly learning to align and translate[J].arXiv:1409.0473,2014. [21]FENG Z,GUO D,TANG D,et al.Codebert:A pre-trained mo-del for programming and natural languages[J].arXiv:2002.08155,2020. [22]LAI W,ZHOU M,HU F,et al.A new DBSCAN parameters determination method based on improved MVO[J].IEEE Access,2019,7:104085-104095. [23]KIM I Y,DE WECK O L.Adaptive weighted-sum method for bi-objective optimization:Pareto front generation[J].Structural and Multidisciplinary Optimization,2005,29:149-158. |
|