Computer Science, 2020, Vol. 47, Issue (3): 34-40. doi: 10.11896/jsjkx.190300053

Special topic: Intelligent Software Engineering




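  1. (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016)1;
    (Ministry Key Laboratory for Safety-Critical Software Development and Verification, Nanjing University of Aeronautics and Astronautics, Nanjing 211100)2
  • Received: 2019-03-15 Online: 2020-03-15 Published: 2020-03-30
  • Corresponding author: ZHOU Yu (zhouyu@nuaa.edu.cn)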

Semantic Similarity Based API Usage Pattern Recommendation

ZHANG Yun-fan1, ZHOU Yu1,2, HUANG Zhi-qiu1,2

  1. (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)1;
    (Ministry Key Laboratory for Safety-Critical Software Development and Verification, Nanjing University of Aeronautics and Astronautics, Nanjing 211100, China)2
  • Received: 2019-03-15 Online: 2020-03-15 Published: 2020-03-30
  • About author: ZHANG Yun-fan, postgraduate. His research interests include software evolution analysis, artificial intelligence, and mining software repositories. ZHOU Yu, postdoctor, professor. His research interests mainly include software evolution analysis, mining software repositories, software architecture, and reliability analysis.
  • Supported by:
    This work was supported by the National Key R&D Program of China (2018YFB1003902), Fundamental Research Funds for the Central Universities (NS2019055) and Qing Lan Project.

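Abstract: In software development, reusing application programming interfaces (APIs) can improve development efficiency, but using an unfamiliar API is time-consuming and difficult. Existing studies usually take an API as the user's query and make recommendations by searching a corpus for that API's usage patterns, which does not match how developers actually query. This paper proposes an API usage pattern recommendation method based on natural language semantic similarity (Semantic Similarity Based API Recommendation, SSAPIR). The method uses a hierarchical clustering algorithm to extract API usage patterns, and then computes the semantic similarity between the query and the description information of each API usage pattern, recommending to developers the API usage patterns that are highly relevant and widely used. To verify the effectiveness of SSAPIR, the API usage patterns of 9 popular third-party API libraries, together with the description information of those patterns, are extracted from high-quality Java projects on GitHub, and API usage patterns are recommended for natural language queries about these 9 libraries. The effectiveness of SSAPIR is assessed by the Hit@K accuracy of the recommendation results. The experimental results show that hierarchical clustering effectively improves recommendation accuracy and that SSAPIR reaches an average Hit@10 accuracy of 85.02%, outperforming existing work. SSAPIR completes the API usage pattern recommendation task well and provides accurate API usage patterns for the natural language queries entered by developers.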

Keywords: API usage pattern recommendation, Hierarchical clustering, Semantic similarity

Abstract: In software development, reusing application programming interfaces (APIs) can improve development efficiency. However, using unfamiliar APIs is difficult and time-consuming for developers. Previous research tends to take an API as the input and search a corpus for its usage patterns, which does not match how developers actually search for API usage patterns. This paper proposes a Semantic Similarity based API Usage Pattern Recommendation approach (SSAPIR). The approach first adopts a hierarchical clustering algorithm to extract API usage patterns, and then calculates the semantic similarity between a query and the description information of each API usage pattern, in order to recommend highly relevant and widely used API usage patterns to developers. To verify the effectiveness of SSAPIR, Java projects are collected from GitHub, from which the API usage patterns related to 9 popular third-party API libraries and their description information are extracted. API usage patterns are then recommended for natural language queries related to these 9 libraries, and the Hit@K of the recommendation results is measured. The experimental results show that hierarchical clustering effectively improves recommendation accuracy and that SSAPIR achieves an average accuracy of 85.02% in terms of Hit@10, which outperforms state-of-the-art work. SSAPIR handles the API usage pattern recommendation task well and provides accurate API usage pattern recommendations for developers who take natural language queries as inputs.

Key words: API usage pattern recommendation, Hierarchical clustering, Semantic similarity
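
The Hit@K measure named in the abstract can be illustrated with a small sketch. It assumes the common definition (the fraction of queries for which at least one relevant API usage pattern appears among the top K recommendations). The abstract does not spell out the exact definition used in the paper, so this reading is an assumption. The function name `hit_at_k`, the pattern identifiers, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Hashable, Sequence, Set


def hit_at_k(ranked: Sequence[Sequence[Hashable]],
             relevant: Sequence[Set[Hashable]],
             k: int) -> float:
    """Fraction of queries with at least one relevant item in the top k.

    ranked[i]   -- recommendations for query i, best first
    relevant[i] -- set of items judged relevant for query i
    """
    if len(ranked) != len(relevant):
        raise ValueError("ranked and relevant must have the same length")
    if k <= 0:
        raise ValueError("k must be positive")
    if not ranked:
        return 0.0
    hits = sum(
        1
        for results, gold in zip(ranked, relevant)
        if any(item in gold for item in results[:k])
    )
    return hits / len(ranked)


# Hypothetical toy data: three queries with made-up pattern identifiers.
ranked = [["p3", "p1", "p7"], ["p2", "p9", "p4"], ["p5", "p6", "p8"]]
relevant = [{"p1"}, {"p4"}, {"p0"}]

for k in (1, 2, 3):
    print(f"Hit@{k} = {hit_at_k(ranked, relevant, k):.2f}")
```

Under this definition the score can only stay the same or rise as K grows, since a larger cutoff includes every result a smaller one does.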


  • CLC number: TP391
