Computer Science, 2022, Vol. 49, Issue (7): 212-219. doi: 10.11896/jsjkx.210500075
熊罗庚, 郑尚, 邹海涛, 于化龙, 高尚
XIONG Luo-geng, ZHENG Shang, ZOU Hai-tao, YU Hua-long, GAO Shang
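Abstract: Self-admitted technical debt (SATD) is written by developers into a project's source-code comments. It documents defects that developers deliberately leave in the software in pursuit of short-term gains, and a large amount of SATD makes software harder to maintain. In recent years, a growing number of researchers have studied SATD identification and proposed various approaches, such as detection methods based on natural language processing or text mining. However, most of this work depends on existing lexicons or manually extracted features, which is time-consuming, increases computational complexity, and yields unsatisfactory identification results. To address this, this paper proposes an SATD identification method based on a bidirectional gated recurrent unit (GRU) network and an attention mechanism. Word vectors are obtained with the Skip-gram model of Word2vec, a bidirectional GRU network is built to extract high-level features, and the attention mechanism automatically finds the words that play a key role in SATD classification, thereby capturing the most important semantic information. Experimental results show that the proposed method performs well in precision, recall, and F1-score, identifies SATD effectively, and avoids the complex feature engineering of traditional approaches.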
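The attention step described in the abstract can be illustrated with a small sketch. The abstract does not give the scoring formula, so the code below assumes a common choice for attention over recurrent outputs: score each token's hidden state as `w · tanh(h_t)`, normalize the scores with a softmax, and take the weighted sum. The function name `attention_pool`, the scoring vector `w`, and the toy hidden states are hypothetical and are not taken from the paper.

```python
import math


def attention_pool(hidden_states, w):
    """Collapse per-token hidden vectors into one comment-level vector.

    hidden_states: list of T vectors (each a list of d floats), standing in
                   for the bidirectional GRU outputs of one code comment.
    w:             list of d floats, a scoring vector that would be learned
                   during training (assumed here, not from the paper).
    Returns (weights, sentence_vector).
    """
    # Score each token: s_t = w . tanh(h_t)
    scores = [
        sum(wi * math.tanh(hi) for wi, hi in zip(w, h))
        for h in hidden_states
    ]
    # Numerically stable softmax over the token positions
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of hidden states gives the comment representation
    dim = len(hidden_states[0])
    sentence = [
        sum(a * h[j] for a, h in zip(weights, hidden_states))
        for j in range(dim)
    ]
    return weights, sentence


# Toy example: three tokens with 2-dimensional hidden states (made-up values).
H = [[0.1, -0.2], [2.0, 1.5], [0.0, 0.3]]
w = [1.0, 1.0]
alpha, r = attention_pool(H, w)
print("attention weights:", alpha)
print("comment vector:", r)
```

Tokens with larger weights contribute more to the comment vector, which is how an attention layer can highlight the words that matter most for the SATD / non-SATD decision. In a full model the vector `r` would feed a classifier layer, and `w` would be trained jointly with the GRU.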