Computer Science (计算机科学) ›› 2017, Vol. 44 ›› Issue (11): 69-79. doi: 10.11896/j.issn.1002-137X.2017.11.011

• 2016 National Conference on Software and Applications •

A Method for Constructing a Java Code Repository Based on UML Relationships

姜人和,郑晓梅,朱晓倩,潘敏学,张天   

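  1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023; School of Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023
  • Publication date: 2018-12-01 Release date: 2018-12-01
  • Funding:
    This work was supported by the projects "Research on MDE-based Heterogeneous Data Modeling and Transformation" (61472180), "Research on Modeling and Verification Techniques for Interrupt-driven Systems Based on Scenario Specifications" (61502228), "Research on Heterogeneous Data Model Transformation Methods Based on SysML and MARTE" (BK20141322), and "Modeling and Analysis of Interrupt-driven Systems" (BK20150589).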

Method of Java Code Repository Construction Based on UML Relationship

JIANG Ren-he, ZHENG Xiao-mei, ZHU Xiao-qian, PAN Min-xue and ZHANG Tian   

  • Online: 2018-12-01 Published: 2018-12-01

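Abstract: Relationship information, such as inheritance, aggregation, composition, dependency, invocation, and instance creation, is the most important kind of information reflecting the structure and semantics of code. To better support the comprehension and reuse of open source code, this paper proposes a method for constructing a code repository based on UML2 relationships. The method uses a graph database as its implementation platform and takes the abstract syntax tree, a classic structure in language engineering, as its foundation. Targeting the features and mechanisms of the Java language, it defines a semantically rich property graph data model for Java code, on top of which the graph structure of Java code is persisted. In addition, to mask the differences in how various programming language communities understand relationship information in code, the method adopts the relationship types and semantics defined in the international standard version of UML 2.4, designs corresponding relationship extraction algorithms, and adds the resulting relationship edges to the graph nodes. Nine common open source projects were selected for an experimental evaluation of how much the code expands once it is converted to a graph and how much storage space the repository consumes. Finally, example queries based on the repository are presented.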

Keywords: code repository, UML, graph database, code query

Abstract: Relationship information, such as inheritance, aggregation, composition, dependency, call, and creation, is the most important information representing code structure and semantics. This paper presents a method for constructing source code repositories based on the relationships defined in UML2, for better comprehension and reuse of open source code. A graph database is used as the implementation platform of the approach, and the abstract syntax tree is adopted as the basis of the graph schema. In addition, the schema is designed specifically for the Java language so that semantics can be well represented. The key point of the approach is that the relationship definitions strictly follow the UML 2.4 specification, which is an ISO version, so that differences between language communities in their understanding of relationships can be eliminated. Each category of relationship was studied and a corresponding recognition algorithm was designed. During construction of the repository, the relationships are added as different kinds of edges. Evaluation experiments on 9 open source projects were conducted to illustrate the expansion of the code graph and the space consumed. Finally, two simple case studies of querying the repository are given.

Key words: code repository, UML, graph database, code query
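
The idea of adding relationships to a code graph as typed edges and then querying them can be illustrated with a minimal Java sketch. This is not the authors' implementation: it assumes an in-memory edge list in place of a graph database, identifies type nodes by simple name, and adds edges by hand instead of extracting them from an abstract syntax tree. The class name CodeGraphSketch, the Rel labels, and the example classes (Shape, Circle, Square, RoundedSquare, Canvas) are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory stand-in for a graph-database-backed code repository. */
public class CodeGraphSketch {

    /** Illustrative edge labels named after the relationship kinds listed in the abstract. */
    enum Rel { INHERITANCE, AGGREGATION, COMPOSITION, DEPENDENCY, CALL, CREATE }

    /** A directed, labeled edge between two type nodes, identified here by simple name. */
    static final class Edge {
        final String from;
        final Rel rel;
        final String to;

        Edge(String from, Rel rel, String to) {
            this.from = from;
            this.rel = rel;
            this.to = to;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    /** Records one relationship edge; a real extractor would derive these from the AST. */
    void addEdge(String from, Rel rel, String to) {
        edges.add(new Edge(from, rel, to));
    }

    /** Direct targets of edges of kind rel that start at the given type. */
    Set<String> targets(String from, Rel rel) {
        Set<String> result = new LinkedHashSet<>();
        for (Edge e : edges) {
            if (e.rel == rel && e.from.equals(from)) {
                result.add(e.to);
            }
        }
        return result;
    }

    /** All types that inherit from base, directly or transitively. */
    Set<String> subtypesOf(String base) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(base);
        while (!work.isEmpty()) {
            String current = work.pop();
            for (Edge e : edges) {
                boolean inheritsFromCurrent = e.rel == Rel.INHERITANCE && e.to.equals(current);
                if (inheritsFromCurrent && result.add(e.from)) {
                    work.push(e.from); // newly found subtype: explore its subtypes too
                }
            }
        }
        return result;
    }

    /** Builds the graph for a small made-up program. */
    static CodeGraphSketch example() {
        CodeGraphSketch g = new CodeGraphSketch();
        g.addEdge("Circle", Rel.INHERITANCE, "Shape");         // class Circle extends Shape
        g.addEdge("Square", Rel.INHERITANCE, "Shape");         // class Square extends Shape
        g.addEdge("RoundedSquare", Rel.INHERITANCE, "Square"); // class RoundedSquare extends Square
        g.addEdge("Canvas", Rel.CREATE, "Circle");             // Canvas.draw() runs new Circle()
        g.addEdge("Canvas", Rel.DEPENDENCY, "Shape");          // Canvas.draw() has a Shape parameter
        return g;
    }

    public static void main(String[] args) {
        CodeGraphSketch g = example();
        System.out.println("Subtypes of Shape: " + g.subtypesOf("Shape"));
        System.out.println("Types created by Canvas: " + g.targets("Canvas", Rel.CREATE));
    }
}
```

In a graph database the two lookups would be expressed as graph queries rather than loops over an edge list; the sketch only shows why storing relationships as labeled edges makes questions such as "which types transitively inherit from Shape" straightforward to answer.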

