Computer Science ›› 2019, Vol. 46 ›› Issue (11): 137-144.doi: 10.11896/jsjkx.191100501C

• Software & Database Technology •

Empirical Study of Code Query Technique Based on Constraint Solving on StackOverflow

CHEN Zheng-zhao, JIANG Ren-he, PAN Min-xue, ZHANG Tian, LI Xuan-dong   

  1. (State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China)
  • Received: 2018-10-07 Online: 2019-11-15 Published: 2019-11-14

Abstract: Code query plays an important role in code reuse, and the code-related Q&A on StackOverflow, a professional question-and-answer site for programmers, is a typical code-reuse scenario. In practice, questions are answered manually, which tends to suffer from slow response, inaccurate problem descriptions, and low availability of answers. If code query and search could be automated to replace manual answering, considerable manpower and time would be saved. Many code query techniques already exist, but most have seen little application to real cases. Based on the ideas of Satsy, this paper implements a constraint-solving-based code query technique for the Java language and designs an empirical study around it. Taking StackOverflow as the research object, the study examines how to apply constraint-solving-based code query to the code-related Q&A on the site. First, the questions on the site are analyzed, and 35 high-traffic Java questions are extracted as query problems. Then, about 30000 lines of code are collected from GitHub, converted into constraints, and assembled into a large code base to support code query. Finally, the query results for these 35 questions are analyzed to evaluate the practical effect of the technique on StackOverflow. The results show that, for the specific questions and code scale studied, the technique works well in practice and can replace manual answering to a considerable extent.
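
The idea summarized in the abstract (code stored as constraints, queries answered by checking satisfiability) can be illustrated with a minimal sketch. It is not the paper's implementation: the class `ConstraintQueryDemo`, the `Constraint` interface, the four-snippet repository, and the bounded enumeration used in place of a real constraint solver are all assumptions made for illustration. A query is a set of input/output examples, and a snippet matches when its constraint is satisfiable for every example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: bounded brute-force search stands in for a constraint solver.
class ConstraintQueryDemo {

    // Search range for the hidden (intermediate) variable; an arbitrary assumption.
    static final int BOUND = 1000;

    // A snippet's semantics as a constraint over (input, output, hidden).
    // 'hidden' models an intermediate value that the solver is free to choose.
    interface Constraint {
        boolean holds(int input, int output, int hidden);
    }

    // Hypothetical repository: snippet source text mapped to its constraint encoding.
    static Map<String, Constraint> buildRepository() {
        Map<String, Constraint> repo = new LinkedHashMap<>();
        repo.put("return x + 1;", (in, out, h) -> out == in + 1);
        repo.put("return x * 2;", (in, out, h) -> out == in * 2);
        repo.put("return Math.abs(x);",
                (in, out, h) -> out >= 0 && (out == in || out == -in));
        // Here h plays the role of the intermediate variable y.
        repo.put("int y = x * x; return y - x;",
                (in, out, h) -> h == in * in && out == h - in);
        return repo;
    }

    // The constraint is satisfiable for one example if some hidden value
    // within the bounded domain makes it hold.
    static boolean satisfiable(Constraint c, int input, int output) {
        for (int hidden = -BOUND; hidden <= BOUND; hidden++) {
            if (c.holds(input, output, hidden)) {
                return true;
            }
        }
        return false;
    }

    // A snippet matches the query when every {input, output} example is satisfiable.
    static List<String> query(Map<String, Constraint> repo, int[][] examples) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Constraint> entry : repo.entrySet()) {
            boolean allSatisfied = true;
            for (int[] example : examples) {
                if (!satisfiable(entry.getValue(), example[0], example[1])) {
                    allSatisfied = false;
                    break;
                }
            }
            if (allSatisfied) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Constraint> repo = buildRepository();
        // One example is ambiguous: two snippets map 2 to 2.
        System.out.println(query(repo, new int[][]{{2, 2}}));
        // A second example narrows the result to a single snippet.
        System.out.println(query(repo, new int[][]{{2, 2}, {3, 6}}));
    }
}
```

In this sketch, adding a second input/output example is what disambiguates the query, which mirrors how an example-based specification is refined. A realistic system would replace the enumeration loop with a call to a constraint solver and would derive the constraints from the code automatically rather than writing them by hand.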

Key words: Code query, Constraint solving, Open-source code database, Empirical study

CLC Number: 

  • TP311.5