Computer Science (计算机科学), 2021, Vol. 48, Issue 12: 94-99. doi: 10.11896/jsjkx.200800193
米庆, 郭黎敏, 陈军成
MI Qing, GUO Li-min, CHEN Jun-cheng
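Abstract: Quantitative and accurate assessment of code readability is an important way to assure software quality, reduce communication and maintenance costs, and improve the efficiency of software development and evolution. However, most existing approaches to code readability assessment rely on feature engineering. Limited by factors such as how source code is represented and which techniques are used, their assessment accuracy is not high. To address this, the paper adopts deep learning as its main technique and proposes a code readability assessment method based on multi-dimensional features and a hybrid neural network. By integrating the strengths of individual neural networks, the method mines the structural and semantic information contained in source code from different dimensions, such as the character level and the token level, and ultimately produces a quantitative assessment of code readability. Experiments show that the method achieves an assessment accuracy of up to 84.6%, an improvement of 9.2% over using a convolutional neural network alone and of 6.5% over using a recurrent neural network model alone. It also outperforms five existing readability models, which validates the effectiveness of the proposed multi-dimensional features and hybrid neural network.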
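The abstract does not specify how the character-level and token-level inputs are built, so the following is only a minimal sketch of what two such views of the same snippet might look like before they are fed to separate network branches. The function names, the regular expression, the fixed row width, and the padding value are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical tokenizer (an assumption, not the paper's lexer):
# identifiers/keywords, integer literals, or any single non-space symbol.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def char_level_view(snippet, width=12, pad=0):
    """Encode each source line as a fixed-width row of character codes.

    The result is a 2-D grid that preserves layout (indentation, line
    length), the kind of input a convolutional branch could consume.
    """
    rows = []
    for line in snippet.splitlines():
        codes = [ord(ch) for ch in line[:width]]       # truncate long lines
        rows.append(codes + [pad] * (width - len(codes)))  # pad short lines
    return rows


def token_level_view(snippet):
    """Split the snippet into a flat sequence of lexical tokens.

    A sequence like this is the kind of input a recurrent branch
    could consume after an embedding lookup.
    """
    return TOKEN_PATTERN.findall(snippet)


snippet = "int x = 1;\nreturn x;"
grid = char_level_view(snippet)     # 2 rows, 12 columns each
tokens = token_level_view(snippet)  # 8 tokens
```

In a hybrid model of the kind the abstract describes, the outputs of the branches that read these views would be merged before the final readability prediction. How the paper actually merges them is not stated here.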