Computer Science, 2017, Vol. 44, Issue (11): 69-79. doi: 10.11896/j.issn.1002-137X.2017.11.011

• 2016 National Software and Application Conference •

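Method of Java Code Repository Construction Based on UML Relationship

JIANG Ren-he, ZHENG Xiao-mei, ZHU Xiao-qian, PAN Min-xue and ZHANG Tian

  1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023; School of Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023
  • Online: 2018-12-01  Published: 2018-12-01
  • Supported by:
    This work was supported by the projects Research on MDE-based Heterogeneous Data Modeling and Transformation (61472180), Research on Modeling and Verification Techniques for Interrupt-driven Systems Based on Scenario Specifications (61502228), Research on Heterogeneous Data Model Transformation Methods Based on SysML and MARTE (BK20141322), and Modeling and Analysis of Interrupt-driven Systems (BK20150589).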

Abstract: Relationship information, such as inheritance, aggregation, composition, dependency, call, and instance creation, is among the most important information for representing code structure and semantics. To better support the comprehension and reuse of open source code, this paper proposes a method for constructing code repositories based on the relationships defined in UML2. The method uses a graph database as its implementation platform and takes the abstract syntax tree, a classic structure in language engineering, as the basis of the graph schema. A semantically rich property-graph data model is designed around the features and mechanisms of the Java language, so that the graph structure of Java code can be persisted. To remove the differences in how programming-language communities interpret relationship information in code, the relationship types and their semantics are taken strictly from the UML 2.4 specification, which is an ISO version. For each category of relationship, a corresponding extraction algorithm is designed, and the extracted relationships are added as different kinds of edges between graph nodes. Nine common open source projects were evaluated experimentally to measure how much the code expands when converted to a graph and how much storage space the repository consumes. Finally, two simple case studies of queries on the repository are given.

Key words: Code repository, UML, Graph database, Code query
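To make the idea of adding relationship edges to code nodes more concrete, the following is a minimal sketch and not the paper's extraction algorithm. It assumes an in-memory edge list instead of a graph database and uses Java reflection instead of an abstract syntax tree. The class `CodeGraphSketch`, the edge labels `GENERALIZATION` and `INTERFACE_REALIZATION`, and the sample types `Shape`, `Base`, and `Circle` are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative in-memory stand-in for a code graph: typed edges between class nodes. */
public class CodeGraphSketch {

    /** A directed, labelled edge between two type nodes, identified by simple name. */
    static final class Edge {
        final String from;
        final String type;
        final String to;

        Edge(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }

        @Override
        public String toString() {
            return "(" + from + ")-[:" + type + "]->(" + to + ")";
        }
    }

    // Hypothetical sample types to analyse.
    interface Shape {}
    static class Base {}
    static class Circle extends Base implements Shape {}

    /** Derives relationship edges for one type via reflection (assumption: not an AST walk). */
    static List<Edge> extractEdges(Class<?> type) {
        List<Edge> edges = new ArrayList<>();
        Class<?> parent = type.getSuperclass();
        // A class `extends` clause becomes a generalization edge; the implicit Object parent is skipped.
        if (parent != null && parent != Object.class) {
            edges.add(new Edge(type.getSimpleName(), "GENERALIZATION", parent.getSimpleName()));
        }
        // For a class, listed interfaces are realized; for an interface, they are its generalizations.
        String label = type.isInterface() ? "GENERALIZATION" : "INTERFACE_REALIZATION";
        for (Class<?> iface : type.getInterfaces()) {
            edges.add(new Edge(type.getSimpleName(), label, iface.getSimpleName()));
        }
        return edges;
    }

    public static void main(String[] args) {
        // Prints one line per extracted edge for the sample class.
        for (Edge e : extractEdges(Circle.class)) {
            System.out.println(e);
        }
    }
}
```

In a persistent repository each `Edge` would presumably be stored as a relationship between two graph nodes, so that queries can traverse relationships directly. Other relationship kinds named in the abstract, such as call or instance creation, depend on method bodies and are outside what this reflection-based sketch can see.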

