Computer Science ›› 2020, Vol. 47 ›› Issue (11A): 544-548. doi: 10.11896/jsjkx.191200010

• Software Engineering & Database •

API Recommendation Model with Fusion Domain Knowledge

LI Hao, ZHONG Sheng, KANG Yan, LI Tao, ZHANG Ya-chuan, BU Rong-jing

  1. College of Software, Yunnan University, Kunming 650504, China
  • Online: 2020-11-15 Published: 2020-11-17
  • Corresponding author: KANG Yan (kangyan@ynu.edu.cn)
  • Author e-mail: 12018002170@mail.ynu.edu.cn
  • About author: LI Hao, born in 1970, Ph.D., professor. His main research interests include hybrid cloud computing, computer vision and robotics.
    KANG Yan, born in 1972, Ph.D., associate professor. Her main research interests include software engineering, system optimization, big data processing and mining.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61762092, 61762089), the Yunnan Provincial Key Laboratory of Software Engineering Open Fund Project (2017SE204) and the Development of MatCloud-based High-throughput Computational Module for MGI Platform (2019CLJY06).

Abstract: Application programming interfaces (APIs) play an important role in modern software development, and developers often need to search for a suitable API for their programming tasks. However, as the information industry has grown, API reference documentation has become increasingly large, and traditional search methods are hampered by redundant and erroneous information on the Internet. In addition, the vocabulary and knowledge gap between the natural-language description of a programming task and the descriptions in API documentation makes it difficult to find a suitable API. To address these problems, this paper proposes ARDSQ (Recommendation based on Documentation and Solved Question), an API recommendation algorithm that integrates domain knowledge. Given an engineer's natural-language description of a desired function, ARDSQ retrieves the closest-matching API from a knowledge base. Experiments show that, compared with two advanced API recommendation algorithms (BIKER and DeepAPILearning), ARDSQ has a considerable advantage on the key recommendation-system evaluation metrics (Hit-n, MRR, MAP).

Key words: API, Code recommendation, Deep learning, Information retrieval, Program analysis
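
The three evaluation metrics named in the abstract (Hit-n, MRR, MAP) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the example queries, and the choice to normalize average precision by the total number of relevant APIs are all assumptions made for illustration, and the scores it produces are not the paper's results.

```python
from typing import Dict, List, Sequence, Set, Tuple


def hit_at_n(ranked: Sequence[str], relevant: Set[str], n: int) -> float:
    """Return 1.0 if any relevant API appears in the top n results, else 0.0."""
    return 1.0 if any(api in relevant for api in ranked[:n]) else 0.0


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1 / rank of the first relevant API, or 0.0 if none is returned."""
    for rank, api in enumerate(ranked, start=1):
        if api in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision@k over the ranks k that hold a relevant API.

    Assumption: the sum is divided by the total number of relevant APIs.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, api in enumerate(ranked, start=1):
        if api in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def evaluate(results: List[Tuple[Sequence[str], Set[str]]], n: int) -> Dict[str, float]:
    """Average the per-query scores into Hit-n, MRR and MAP."""
    count = len(results)
    return {
        "hit": sum(hit_at_n(r, g, n) for r, g in results) / count,
        "mrr": sum(reciprocal_rank(r, g) for r, g in results) / count,
        "map": sum(average_precision(r, g) for r, g in results) / count,
    }


# Hypothetical queries: (ranked recommendations, ground-truth APIs).
queries = [
    (["String.split", "String.trim", "Scanner.next"], {"String.split"}),
    (["List.add", "Collections.sort", "List.sort"], {"Collections.sort", "List.sort"}),
]
scores = evaluate(queries, n=1)
print(scores)
```

In this made-up example the first query places a correct API at rank 1 and the second at rank 2, so Hit-1 is 0.5 and MRR is 0.75; MAP additionally rewards the second query for returning both of its correct APIs near the top.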

CLC Number:

  • TP311.5