Computer Science (计算机科学), 2021, Vol. 48, Issue 12: 117-124. doi: 10.11896/jsjkx.201100090
彭斌, 李征, 刘勇, 吴永豪
PENG Bin, LI Zheng, LIU Yong, WU Yong-hao
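Abstract: Automated code comment generation analyzes the semantic information of source code and produces a corresponding natural-language description, which helps developers understand programs and reduces the time cost of software maintenance. Most existing techniques are built on encoder-decoder neural networks based on the recurrent neural network (RNN). This approach suffers from the long-term dependency problem: when the relevant code blocks are far apart, the generated comments are less accurate. To mitigate this problem and generate more accurate comments, this paper proposes an automated code comment generation method based on the convolutional neural network (CNN). Specifically, a source-code-based CNN and an AST-based CNN are constructed to capture the semantic information of the source code. Experimental results show that, compared with DeepCom and Hybrid-DeepCom, two state-of-the-art methods, the proposed method generates better code comments under the commonly used BLEU and METEOR metrics and takes less execution time.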
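To make the BLEU metric mentioned in the abstract concrete, the following is a minimal sketch of its core ingredient, the clipped (modified) n-gram precision between a generated comment and a reference comment. The function names and the two example comments are hypothetical and are not taken from the paper. Full BLEU additionally combines the precisions for several n-gram orders with a geometric mean and applies a brevity penalty, which this sketch omits.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Modified n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


# Hypothetical generated comment and human-written reference comment.
candidate = "returns the the index of the element".split()
reference = "returns the index of the given element".split()

p1 = clipped_precision(candidate, reference, 1)  # unigram precision
p2 = clipped_precision(candidate, reference, 2)  # bigram precision
print(f"unigram precision: {p1:.3f}, bigram precision: {p2:.3f}")
```

Clipping is what stops a generator from scoring well by repeating a frequent word: the candidate above contains "the" three times, but only two occurrences are credited because the reference contains it twice.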