Computer Science, 2020, Vol. 47, Issue (3): 34-40. doi: 10.11896/jsjkx.190300053

Special Issue: Intelligent Software Engineering


Semantic Similarity Based API Usage Pattern Recommendation

ZHANG Yun-fan¹, ZHOU Yu¹,², HUANG Zhi-qiu¹,²

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  2. Ministry Key Laboratory for Safety-Critical Software Development and Verification, Nanjing University of Aeronautics and Astronautics, Nanjing 211100, China
  • Received: 2019-03-15 Online: 2020-03-15 Published: 2020-03-30
  • About author: ZHANG Yun-fan, postgraduate. His research interests include software evolution analysis, artificial intelligence, and mining software repositories. ZHOU Yu, postdoctor, professor. His research interests mainly include software evolution analysis, mining software repositories, software architecture, and reliability analysis.
  • Supported by:
    This work was supported by the National Key R&D Program of China (2018YFB1003902), Fundamental Research Funds for the Central Universities (NS2019055) and Qing Lan Project.

Abstract: In software development, reusing application programming interfaces (APIs) can improve development efficiency. However, it is difficult and time-consuming for developers to use unfamiliar APIs. Previous research tends to take APIs as input when searching a corpus and recommending API usage patterns, which does not match how developers actually search for API usage patterns. This paper proposes a novel Semantic Similarity based API Usage Pattern Recommendation approach (SSAPIR). The approach first uses a hierarchical clustering algorithm to extract API usage patterns, and then calculates the semantic similarity between queries and the description information of the API usage patterns, aiming to recommend highly relevant and widely used API usage patterns to developers. To evaluate SSAPIR, Java projects were collected from GitHub, from which the API usage patterns related to 9 popular third-party API libraries, together with their description information, were extracted. API usage patterns were then recommended for natural language queries related to these 9 libraries, and the Hit@K of the recommendation results was measured. The experimental results show that SSAPIR improves the accuracy of the recommendation results, achieving an average accuracy of 85.02% in terms of Hit@10 and outperforming the state-of-the-art work. By taking natural language queries as input, SSAPIR can provide developers with accurate API usage pattern recommendations.

Key words: API usage pattern recommendation, Hierarchical clustering, Semantic similarity

CLC Number: TP391
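
The abstract names two computations: ranking API usage patterns by the similarity between a query and each pattern's description, and scoring the ranked results with Hit@K. The following is a minimal sketch of both, not the SSAPIR implementation. It assumes a plain bag-of-words cosine similarity as a stand-in, because the abstract does not specify the semantic similarity measure used; the function names (`bag_of_words`, `cosine`, `rank_patterns`, `hit_at_k`) and the toy data are hypothetical. Hit@K is taken here as the fraction of queries with at least one relevant pattern among the top K results.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_patterns(query, descriptions):
    """Return pattern ids ordered by descending similarity to the query."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(text)), pid)
              for pid, text in descriptions.items()]
    # Sort by score (highest first), then by id for a deterministic order.
    return [pid for _, pid in sorted(scored, key=lambda s: (-s[0], s[1]))]


def hit_at_k(rankings, relevant, k):
    """Fraction of queries with at least one relevant pattern in the top k."""
    if not rankings:
        return 0.0
    hits = sum(1 for q, ranked in rankings.items()
               if set(ranked[:k]) & relevant[q])
    return hits / len(rankings)


# Hypothetical toy data, not taken from the paper.
descriptions = {
    "p1": "read a json file into an object",
    "p2": "send an http get request",
    "p3": "parse json string to map",
}
# Each query maps to the set of pattern ids judged relevant for it.
queries = {
    "how to parse json string": {"p3"},
    "open http connection": {"p2"},
}

rankings = {q: rank_patterns(q, descriptions) for q in queries}
print(hit_at_k(rankings, queries, k=1))
```

A word-embedding or other semantic measure could replace `cosine` without changing `rank_patterns` or `hit_at_k`, since both only depend on a score per query and pattern pair.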