Intelligent Software Engineering

Web Service Crowdtesting Task Assignment Approach Based on Reinforcement Learning

TANG Wen-jun, ZHANG Jia-li, CHEN Rong, GUO Shi-kai

Computer Science 2020, 47 (3): 54-60. DOI: 10.11896/jsjkx.191100085

Assigning tasks to suitable workers so as to obtain better testing results at a lower cost is an important problem. This paper modeled CWS testing task assignment as a Markov decision process and used a Deep Q Network to learn and perform real-time online assignment of testing tasks. The proposed reinforcement learning approach is named WTA-C. In addition, based on the time each worker took to execute historical tasks, the paper used statistical conditional probability to calculate the probability that a testing worker completes a task within its duration, and took this probability as the worker’s reputation value to reflect the worker’s quality. The reputation is updated after each assignment. The experimental results show that WTA-C outperforms real-time assignment methods based on heuristic strategies both in controlling the “quality-cost” trade-off of testing tasks and in ensuring worker quality: its assignment effect is more than 18% higher than that of each heuristic strategy. This demonstrates that WTA-C adapts better to the structure of the CWS and to the characteristics of the crowdsourcing environment.
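
The reputation value described above can be illustrated with a small Python sketch. The abstract does not give the exact formula, so this is only one plausible reading: the empirical probability that a worker finishes within the task duration, conditioned on the task still being unfinished after some elapsed time. The function name `reputation`, its parameters, and the fallback value for workers without usable history are all assumptions made for illustration, not the authors' definition.

```python
def reputation(history, duration, elapsed=0.0, default=0.5):
    """Empirical P(T <= duration | T > elapsed) from past completion times.

    history  -- past task completion times of one worker (hypothetical input)
    duration -- time allowed for the task being assigned
    elapsed  -- time already spent on the task (0.0 for a fresh assignment)
    default  -- assumed fallback when no historical task lasted past `elapsed`
    """
    # Keep only past tasks that were still running after `elapsed`.
    still_running = [t for t in history if t > elapsed]
    if not still_running:
        return default
    # Count those that nevertheless finished within the allowed duration.
    on_time = sum(1 for t in still_running if t <= duration)
    return on_time / len(still_running)


history = [2.0, 3.0, 5.0, 8.0]
print(reputation(history, duration=5.0))               # fresh assignment
print(reputation(history, duration=5.0, elapsed=2.5))  # task already running
```

Recomputing the value after every assignment, with the newest completion time appended to `history`, mirrors the per-assignment update mentioned in the abstract.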

Taxonomy of Uncertainty Factors in Intelligence-oriented Cyber-physical Systems

YANG Wen-hua, XU Chang, YE Hai-bo, ZHOU Yu, HUANG Zhi-qiu

Computer Science 2020, 47 (3): 11-18. DOI: 10.11896/jsjkx.191100052

Cyber-physical systems are becoming increasingly intelligent, yet uncertainty is pervasive and intrinsic in them; for example, the sensors through which a system senses its environment inevitably contain errors. If uncertainty is not handled properly, it affects the correct running of the system and causes a series of problems. It is therefore critical to study how to deal with uncertainty in cyber-physical systems. Handling uncertainty first requires understanding and recognizing it comprehensively, but existing work on uncertainty in cyber-physical systems is still in its infancy. To address this issue, this paper studied the taxonomy of uncertainty in cyber-physical systems. Specifically, it classified uncertainty according to the widely recognized 5C technology architecture of cyber-physical systems and introduced the uncertainties that may arise at each level of the architecture, with illustrative examples from typical cyber-physical systems. To clarify the current state of research on uncertainty handling in this field, the paper also summarized existing work and outlined future research directions for intelligence-oriented cyber-physical systems.

Approach of Automatic Fork Summary Generation in Open Source Community Based on Feature Extraction

ZHANG Chao, MAO Xin-jun, LU Yao

Computer Science 2020, 47 (3): 25-33. DOI: 10.11896/jsjkx.191000087

At present, distributed collaborative development based on pull requests (P/R) has become the dominant software development method in the open source community. Because software development in the P/R model is open, transparent and parallel, it is difficult for developers to obtain a complete Fork profile of a whole project and to know whether other developers have already accomplished the same or similar development tasks, which easily leads to duplicate contributions and redundant development. To solve this problem, this paper proposed a method for automatically generating Fork summaries, which helps project managers strengthen project management, avoid redundant contributions, and enhance cooperation and communication among developers. The method first crawls Issue data carrying feature and Bug labels from the open source community and trains a random forest classifier to classify Fork features. It then collects data on the software development activities of each Fork branch and uses the TextRank algorithm to generate detailed Fork information that explains the main purpose of the Fork activity. Finally, a set of combination rules and a corresponding algorithm are designed to integrate the Fork’s categories, features and other information into a complete Fork summary. To validate the effectiveness of the proposed method, 30 groups of manual tests and 60 groups of live studies were conducted on GitHub. The results show that the accuracy of the Fork summaries generated by this method is 67.2%. In the experiment, 76% of project managers believed that Fork summaries can help them manage projects better and strengthen communication and cooperation.

Study on Optimization of Design Pattern Combination Operation

JI Cheng-yu, ZHU Xue-feng

Computer Science 2020, 47 (3): 19-24. DOI: 10.11896/jsjkx.190100046

As a summary of software design experience, design patterns, when used properly, can effectively improve the reusability of software systems and help ensure the quality of the final software product. In practice, however, a single design pattern is rarely used alone; software designers usually rely on experience to combine multiple patterns according to the application scenario, which may lead to uncertain results and seriously affect product quality. Although the existing formal method of pattern combination can effectively express the result of a combination, its logic is complex and it contains a large number of redundant operations, which makes it difficult for designers to learn and adopt. To address these problems, this paper examined in depth the combination relationships among multiple patterns. Starting from the formal representation of design patterns and drawing on the characteristics of the Z language, it studied the existing formal methods of pattern combination and optimized the existing pattern combination operators. Based on the existing set of operators, the constraint, superposition and extension operators are proposed, and the exact semantics of pattern composition are defined in terms of them. Algebraic reasoning is used to verify that the optimized method can effectively replace the existing formal method of pattern combination and overcomes the operator redundancy and low efficiency caused by the excessive number of operators in that method. Finally, the effectiveness of the proposed method is verified through a case study of pattern combination.

Software Requirements Clustering Algorithm Based on Self-attention Mechanism and Multi-channel Pyramid Convolution

KANG Yan, CUI Guo-rong, LI Hao, YANG Qi-yue, LI Jin-yuan, WANG Pei-yao

Computer Science 2020, 47 (3): 48-53. DOI: 10.11896/jsjkx.190700146

With the rapid growth in the number and variety of software products, mining the textual characteristics of software requirements and clustering them has become a major challenge in software engineering. Clustering software requirements texts provides a reliable guarantee for the software development process while reducing the potential risks and negative impacts of the requirements analysis phase. However, software requirements texts are highly dispersed, noisy and sparse. Existing clustering work is limited to a single type of text and rarely considers the functional semantics of software requirements. In view of these characteristics of requirements texts and the limitations of traditional clustering methods, this paper proposed a software requirements clustering algorithm (SA-MPCN&SOM) that combines the self-attention mechanism with multi-channel pyramid convolution. The method captures global features through the self-attention mechanism and then extracts requirements text features at different window depths using multi-channel pyramid convolution. In this way the perceived text fragments are multiplied, and the multiplexed text features are finally clustered using SOM. Experimental results on software requirements data show that the proposed method mines and clusters requirements features better and outperforms other feature extraction methods and clustering algorithms.

Semantic Similarity Based API Usage Pattern Recommendation

ZHANG Yun-fan, ZHOU Yu, HUANG Zhi-qiu

Computer Science 2020, 47 (3): 34-40. DOI: 10.11896/jsjkx.190300053

In software development, reusing application programming interfaces (APIs) can improve development efficiency. However, using unfamiliar APIs is difficult and time-consuming for developers. Previous research tends to take APIs as inputs for searching a corpus and recommending API usage patterns, which does not match how developers actually search for API usage patterns. This paper proposed a novel Semantic Similarity based API Usage Pattern Recommendation approach (SSAPIR). The approach first adopts a hierarchical clustering algorithm to extract API usage patterns, and then calculates the semantic similarity between queries and the description information of the API usage patterns, aiming to recommend highly relevant and widely used API usage patterns to developers. To verify the effectiveness of SSAPIR, Java projects were collected from GitHub, and the API usage patterns related to 9 popular third-party API libraries, together with their description information, were extracted from them. API usage patterns were then recommended for natural language queries related to these 9 libraries, and the Hit@K of the recommendation results was measured. The experimental results demonstrate that SSAPIR effectively improves the accuracy of recommendation results and achieves an average accuracy of 85.02% in terms of Hit@10, which outperforms the state-of-the-art work. By taking natural language queries as inputs, SSAPIR can complete the API usage pattern recommendation task effectively and provide accurate recommendations for developers.
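
The Hit@K measure mentioned above can be sketched in a few lines of Python. The abstract does not spell out its exact definition, so the code below assumes the common one: a query counts as a hit if at least one relevant item appears among its top K recommendations, and Hit@K is the fraction of queries that are hits. The function name `hit_at_k` and the toy queries and pattern names are hypothetical.

```python
def hit_at_k(ranked_results, relevant, k):
    """Fraction of queries with at least one relevant item in the top k.

    ranked_results -- dict: query -> list of recommended items, best first
    relevant       -- dict: query -> set of items judged relevant
    """
    if not ranked_results:
        return 0.0
    hits = 0
    for query, ranking in ranked_results.items():
        # A query is a hit if any of its top-k items is relevant.
        if any(item in relevant.get(query, set()) for item in ranking[:k]):
            hits += 1
    return hits / len(ranked_results)


# Toy data: two natural language queries and made-up usage pattern ids.
ranked = {
    "read json file": ["pattern_a", "pattern_b", "pattern_c"],
    "send http request": ["pattern_d", "pattern_e", "pattern_f"],
}
truth = {
    "read json file": {"pattern_c"},
    "send http request": {"pattern_x"},
}
print(hit_at_k(ranked, truth, k=1))
print(hit_at_k(ranked, truth, k=3))
```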

Code Quality Recognition and Analysis Based on User’s Comments

XU Hai-yan, JIANG Ying

Computer Science 2020, 47 (3): 41-47. DOI: 10.11896/jsjkx.191100132

With the development of IT communities and code hosting platforms, the number of user comments about code is increasing dramatically. The comments that users write after using code contain plenty of static and dynamic code quality information. Extracting and analyzing this information helps developers understand the code quality issues that users care about and improve the quality of their code; it also helps users choose code that meets their requirements. To this end, this paper proposed a code quality model covering static and dynamic characteristics, together with a method for identifying and analyzing the code quality information in user comments. First, user comments that contain code quality information are identified according to the evaluation objects and the evaluation sentence pattern rules. Second, the representations of code quality attributes are extracted using the evaluation objects and opinions. Finally, results on static and dynamic code quality are obtained by analyzing the quality attribute representations and the emotional tendency toward the code in user comments. The experimental results show that the proposed method can effectively analyze the code quality information in user comments.

Survey of Code Similarity Detection Methods and Tools

ZHANG Dan, LUO Ping

Computer Science 2020, 47 (3): 5-10. DOI: 10.11896/jsjkx.190500148

Open-sourcing code has become a new trend in the information technology field. While code cloning improves code quality and reduces software development cost to some extent, it also affects the stability, robustness and maintainability of a software system. Code similarity detection therefore plays an important role in the development of computer and information security. To counter the various hazards brought by code cloning, academia and industry have developed many code similarity detection methods and corresponding tools. According to how they process source code, these detection methods can be roughly divided into five categories: text-analysis based, lexical-analysis based, grammar-analysis based, semantic-analysis based and metrics based. The detection tools perform well in many application scenarios, but they also face a series of challenges brought by ever-increasing data in the big data era. This paper first introduced the code cloning problem and made a detailed comparison of the five categories of code similarity detection methods. It then classified and organized the currently available code similarity detection tools. Finally, it comprehensively evaluated the detection performance of the tools against various evaluation criteria and discussed future research directions for code similarity detection.
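
To make the lexical-analysis based category more concrete, the following Python sketch compares two code fragments at the token level. It is not taken from the survey or from any tool it covers: the tokenizer, the small `KEYWORDS` set, the replacement of every identifier by the placeholder `ID`, and the use of Jaccard similarity over token trigrams are all simplifying assumptions chosen for illustration.

```python
import re

# Very rough tokenizer: identifiers/keywords, integer literals, single symbols.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Assumed keyword subset; a real tool would use the full language grammar.
KEYWORDS = {"def", "return", "if", "else", "for", "while", "in"}


def tokens(code):
    """Tokenize code and replace every non-keyword identifier with 'ID'."""
    out = []
    for tok in TOKEN.findall(code):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            out.append("ID")  # renaming a variable no longer changes the stream
        else:
            out.append(tok)
    return out


def shingles(seq, n=3):
    """Set of all contiguous n-token windows of the sequence."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard similarity between the token n-gram sets of two fragments."""
    sa, sb = shingles(tokens(a), n), shingles(tokens(b), n)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


original = "def add(x, y): return x + y"
renamed = "def plus(a, b): return a + b"
unrelated = "while n: n = n - 1"
print(similarity(original, renamed))
print(similarity(original, unrelated))
```

Because identifiers are normalized before comparison, the renamed copy yields the same token stream as the original, which a purely text-based comparison would not treat as identical.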

Research Status and Development Trend of Identifier Normalization

ZHANG Jing-xuan, JIANG He

Computer Science 2020, 47 (3): 1-4. DOI: 10.11896/jsjkx.200200009

As an important topic in source code analysis and comprehension, identifier normalization is at the forefront of current software engineering research. Identifier normalization aims to parse identifiers into natural language terms so as to improve the understandability and maintainability of source code. It generally involves two challenging steps: identifier splitting and identifier expansion. This paper introduced the research status of identifier normalization in detail, analyzed it in depth, and summarized the difficulties and deficiencies of existing work. To address the difficulties and challenges in identifier normalization, the paper also summarized feasible solutions and outlined future development trends in this field, hoping to guide more researchers into this important research area.
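
The two steps named above, identifier splitting and identifier expansion, can be sketched in Python as follows. This is only a naive illustration under stated assumptions: splitting relies on underscores and camelCase boundaries, and expansion uses a tiny hand-written `ABBREVIATIONS` table. The function names and the table are hypothetical; the approaches covered by the survey may work quite differently, for instance on identifiers that carry no case or separator hints.

```python
import re

# Hypothetical abbreviation dictionary, for illustration only.
ABBREVIATIONS = {"msg": "message", "cnt": "count", "btn": "button", "idx": "index"}

# Word pieces: an acronym followed by a capitalized word, a (capitalized)
# lowercase word, a trailing acronym, or a run of digits.
PIECE = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_identifier(name):
    """Split an identifier on underscores and camelCase boundaries."""
    parts = []
    for chunk in re.split(r"[_\W]+", name):
        parts.extend(PIECE.findall(chunk))
    return [p.lower() for p in parts]


def normalize_identifier(name):
    """Split the identifier, then expand known abbreviations."""
    return [ABBREVIATIONS.get(word, word) for word in split_identifier(name)]


print(split_identifier("parseHTTPResponse"))
print(normalize_identifier("msg_cnt2"))
print(normalize_identifier("btnIdx"))
```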

Computer Science 2020, 47 (3): 2-2.
