Computer Science, 2014, Vol. 41, Issue (9): 91-95. doi: 10.11896/j.issn.1002-137X.2014.09.017

• 2013 Service-Oriented Software

Design and Implementation of Natural Language Based Software Information Retrieval Tool

YE Ting, CHEN Xiu-zhao, ZOU Yan-zhen, ZHAO Jun-feng and XIE Bing

  1. Institute of Software, School of Electronics Engineering and Computer Science, Peking University, Beijing 100871; Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    This work was supported by the National 863 Program of China, project "Internetware Production, Construction and Reuse Technologies and Tools" (2012AA011202), and the National Natural Science Foundation of China, project "Research on Automatic Tagging of Software Components and Its Application Technologies" (61103024).

Abstract: Open source software projects have become an important source of reusable software, and as they grow in scale, quickly learning and understanding a project has become an important part of reuse-based software development. The source code and related documents of such projects are large, so developers spend considerable time and effort finding and reading the right code or document segments. We propose a natural language based software information retrieval approach that helps developers retrieve and understand the software information they need more quickly and conveniently. Based on this approach, we designed and implemented a natural language based software information search engine, NaLSiSe. NaLSiSe received an Excellence Award in the first Software Research Prototype Competition hosted by the China Computer Federation. Taking Lucene as an example, we illustrate how NaLSiSe automatically analyzes and organizes the related software information and constructs queries in the natural language based search engine so as to improve precision. The case study shows that the tool effectively reduces the effort developers spend reading source code and documents, while offering a concise user interface and a friendly user experience.

Key words: Software reuse, Software project, Information retrieval, Natural language question

