Computer Science

Contents

Computer Science. 2020, 47 (3): 0-0.

Research Status and Development Trend of Identifier Normalization

ZHANG Jing-xuan, JIANG He

Computer Science. 2020, 47 (3): 1-4. doi:10.11896/jsjkx.200200009

As an important topic in source code analysis and comprehension, identifier normalization is at the forefront of current software engineering research. Identifier normalization aims to parse identifiers into natural language terms so as to improve the understandability and maintainability of source code. It generally involves two challenging steps: identifier splitting and identifier expansion. This paper introduced the research status of identifier normalization in detail, analyzed it in depth, and summarized the difficulties and deficiencies of existing work. To address these difficulties and challenges, this paper also summarized feasible solutions and discussed future development trends in the field, hoping to guide more researchers into this important research area.
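
The two steps named in the abstract, identifier splitting and identifier expansion, can be illustrated with a minimal sketch. It is not the method of any surveyed work: the regular expressions, the function names and the tiny `ABBREVIATIONS` dictionary are assumptions made for illustration only.

```python
import re

# Hypothetical abbreviation dictionary used only for this illustration.
ABBREVIATIONS = {"str": "string", "len": "length", "msg": "message"}


def split_identifier(name):
    """Split an identifier on underscores, case changes and digit runs."""
    parts = []
    for chunk in re.split(r"[_\W]+", name):
        # Acronym runs, capitalized or lowercase words, then digit runs.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return [p.lower() for p in parts]


def normalize_identifier(name):
    """Split an identifier, then expand known abbreviations."""
    return [ABBREVIATIONS.get(token, token) for token in split_identifier(name)]


print(normalize_identifier("max_str_len"))
print(split_identifier("parseHTTPResponse"))
```

A dictionary lookup is the simplest possible expansion strategy; it cannot resolve ambiguous abbreviations, which is one reason the expansion step is described as challenging.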

Survey of Code Similarity Detection Methods and Tools

ZHANG Dan, LUO Ping

Computer Science. 2020, 47 (3): 5-10. doi:10.11896/jsjkx.190500148

Open-sourcing of code has become a new trend in the information technology field. While code cloning improves code quality and reduces software development cost to some extent, it also affects the stability, robustness and maintainability of a software system. Therefore, code similarity detection plays an important role in the development of computer and information security. To overcome the various hazards brought by code cloning, many code similarity detection methods and corresponding tools have been developed in academia and industry. According to the manner in which they process source code, these detection methods can be roughly divided into five categories: text-analysis-based, lexical-analysis-based, grammar-analysis-based, semantic-analysis-based and metrics-based. These detection tools provide good detection performance in many application scenarios, but they also face a series of challenges brought by ever-increasing data in the big data era. This paper first introduced the code cloning problem and made a detailed comparison of the five categories of code similarity detection methods. Then, it classified and organized the currently available code similarity detection tools. Finally, it comprehensively evaluated the detection performance of these tools based on various evaluation criteria, and discussed future research directions for code similarity detection.
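
As a rough illustration of the lexical-analysis-based category, the sketch below abstracts identifiers and literals into placeholder tokens and compares two fragments by the Jaccard similarity of their token trigrams. It does not reproduce any tool covered by the survey; the `KEYWORDS` set, the tokenizer and the choice of trigrams are assumptions.

```python
import re

# Illustrative keyword set; a real tool would use the full language grammar.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "def"}


def tokenize(code):
    """Turn source text into tokens, abstracting identifiers and numbers."""
    tokens = []
    for tok in re.findall(r"[A-Za-z_]\w*|\d+|\S", code):
        if tok in KEYWORDS:
            tokens.append(tok)
        elif re.match(r"[A-Za-z_]", tok):
            tokens.append("ID")    # any identifier
        elif tok.isdigit():
            tokens.append("NUM")   # any integer literal
        else:
            tokens.append(tok)     # operators and punctuation
    return tokens


def shingles(tokens, k=3):
    """Return the set of all k-token windows."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(code_a, code_b, k=3):
    """Jaccard similarity of the two fragments' token k-grams."""
    a, b = shingles(tokenize(code_a), k), shingles(tokenize(code_b), k)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


print(similarity("int a = b + 1;", "int x = y + 2;"))
```

Because identifiers and literals are abstracted away, fragments that differ only by renaming score as identical, which is the kind of clone that a purely text-based comparison would miss.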

Taxonomy of Uncertainty Factors in Intelligence-oriented Cyber-physical Systems

YANG Wen-hua, XU Chang, YE Hai-bo, ZHOU Yu, HUANG Zhi-qiu

Computer Science. 2020, 47 (3): 11-18. doi:10.11896/jsjkx.191100052

Cyber-physical systems are increasingly exhibiting intelligence, while uncertainty is pervasive and intrinsic in them; for example, the sensors through which the systems sense the environment contain inevitable errors. If the uncertainty is not properly handled, it will affect the correct running of the systems and cause a series of problems. Therefore, it is critical to study how to deal with uncertainty in cyber-physical systems. Handling uncertainty first requires understanding and recognizing it comprehensively. However, existing work on the uncertainty of cyber-physical systems is still in its infancy. To address this issue, this paper studied the taxonomy of uncertainty in cyber-physical systems. Specifically, this paper classified the uncertainty based on the widely recognized 5C technology architecture of cyber-physical systems and introduced the possible uncertainties at each level of the architecture, with illustrative examples from typical cyber-physical systems. Meanwhile, to help understand the current research status of uncertainty handling in the field of cyber-physical systems, this paper summarized the current research work and presented an outlook on future research directions for intelligence-oriented cyber-physical systems.

Study on Optimization of Design Pattern Combination Operation

JI Cheng-yu, ZHU Xue-feng

Computer Science. 2020, 47 (3): 19-24. doi:10.11896/jsjkx.190100046

As a summary of software design experience, design patterns, when used properly, can effectively improve the reusability of software systems and ensure the quality of the final software products. However, in practical applications a single design pattern is rarely used alone; software designers usually rely on experience to combine multiple patterns according to the actual application scenario, which may lead to uncertain results and seriously affect the quality of software products. Although the existing formal method of pattern combination can effectively express the result of pattern combination, it has complex logic and contains a large number of redundant operations, which makes it difficult for designers to learn and adopt. To address these problems in the pattern combination process, this paper discussed in depth the combination relationships between multiple patterns. Starting from the formal representation of design patterns and drawing on the characteristics of the Z language, this paper studied the existing formal methods of pattern combination in depth and optimized the existing pattern combination operators. Based on the existing set of operators, the constraint, superposition and extension operators are proposed, and the exact semantics of pattern composition are defined by these operators. An algebraic reasoning process is used to verify that the optimized method can effectively replace the existing formal method of pattern combination and overcome the problems of redundant operators and low efficiency caused by too many operators in the existing formal methods. Finally, the effectiveness of the proposed method is verified by a case study of pattern combination.

Approach of Automatic Fork Summary Generation in Open Source Community Based on Feature Extraction

ZHANG Chao, MAO Xin-jun, LU Yao

Computer Science. 2020, 47 (3): 25-33. doi:10.11896/jsjkx.191000087

At present, distributed collaborative development based on P/R has become the dominant software development method in the open source community. Because of the openness, transparency and parallelism of software development in the P/R model, it is difficult for developers to obtain a complete Fork profile of the whole project and to know whether other developers have accomplished the same or similar development tasks, which easily leads to duplicate contributions and redundant development. To solve this problem, this paper proposed an automatic Fork summary generation method to help project managers strengthen project management, avoid redundant contributions, and enhance cooperation and communication among developers. The proposed method first crawls Issue data with feature and Bug label information in the open source community, and trains a classifier with the random forest method to classify Fork features. Then, it collects data on the software development activities of each Fork branch and uses the TextRank algorithm to generate detailed Fork information that explains the main purpose of the Fork activity. Finally, a set of combination rules and a corresponding algorithm are designed to integrate the Fork's categories, features and other information into a complete Fork summary. To validate the effectiveness of the proposed method, 30 groups of manual tests and 60 groups of live studies were conducted on GitHub. The results show that the accuracy of the Fork summaries generated by this method is 67.2%. In the experiment, 76% of project managers believed that Fork summaries can help them manage projects better and strengthen communication and cooperation.

Semantic Similarity Based API Usage Pattern Recommendation

ZHANG Yun-fan, ZHOU Yu, HUANG Zhi-qiu

Computer Science. 2020, 47 (3): 34-40. doi:10.11896/jsjkx.190300053

In the process of software development, reusing application programming interfaces (APIs) can improve development efficiency. However, it is difficult and time-consuming for developers to use unfamiliar APIs. Previous studies tend to take APIs as inputs to search a corpus and recommend API usage patterns, which does not match the way developers actually search for API usage patterns. This paper proposed a novel Semantic Similarity based API Usage Pattern Recommendation approach (SSAPIR). This approach first adopts a hierarchical clustering algorithm to extract API usage patterns, and then calculates the semantic similarity between queries and the description information of the API usage patterns, aiming to recommend highly relevant and widely used API usage patterns to developers. To verify the effectiveness of SSAPIR, Java projects were collected from GitHub, from which the API usage patterns related to 9 popular third-party API libraries and their description information were extracted. API usage patterns were then recommended based on natural language queries related to these 9 third-party API libraries, and the Hit@K of the recommendation results was measured. The experimental results demonstrate that SSAPIR can effectively improve the accuracy of recommendation results and achieves an average accuracy of 85.02% in terms of Hit@10, which outperforms the state-of-the-art work. SSAPIR can complete the API usage pattern recommendation task well and provide accurate API usage pattern recommendations for developers by taking natural language queries as inputs.
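
The Hit@K metric mentioned in the abstract can be sketched as follows, assuming the common definition: the fraction of queries for which at least one relevant item appears among the top K recommendations. The paper's exact definition may differ, and the function and variable names here are illustrative.

```python
def hit_at_k(ranked_lists, relevant_sets, k):
    """Fraction of queries with at least one relevant item in the top k."""
    if not ranked_lists:
        return 0.0
    hits = 0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        # A query counts as a hit if any of its top-k results is relevant.
        if any(item in relevant for item in ranked[:k]):
            hits += 1
    return hits / len(ranked_lists)


# Two hypothetical queries with their ranked recommendations.
ranked = [["p1", "p2", "p3"], ["p4", "p5", "p6"]]
relevant = [{"p3"}, {"p9"}]
print(hit_at_k(ranked, relevant, 3))
```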

Code Quality Recognition and Analysis Based on User’s Comments

XU Hai-yan, JIANG Ying

Computer Science. 2020, 47 (3): 41-47. doi:10.11896/jsjkx.191100132

With the development of IT communities and code hosting platforms, the number of user comments about code is increasing dramatically. The comments given by users after using the code contain plenty of static and dynamic code quality information. Extracting and analyzing this information helps developers understand the code quality concerns of users and improve the quality of their code. It also helps users choose code that meets their requirements. To this end, this paper proposed a code quality model that includes static and dynamic characteristics, and a method to identify and analyze the code quality information in user comments. Firstly, the user comments that concern code quality are identified according to the evaluation objects and the evaluation sentence pattern rules. Secondly, the representations of the code quality attributes are extracted by using the evaluation objects and opinions. Finally, results on static and dynamic code quality are obtained by analyzing the quality attribute representations and the emotional tendency toward the code in user comments. The experimental results show that the proposed method can effectively analyze the code quality information in user comments.

Software Requirements Clustering Algorithm Based on Self-attention Mechanism and Multi-channel Pyramid Convolution

KANG Yan, CUI Guo-rong, LI Hao, YANG Qi-yue, LI Jin-yuan, WANG Pei-yao

Computer Science. 2020, 47 (3): 48-53. doi:10.11896/jsjkx.190700146

With the rapid increase in the amount of software and the growing variety of software types, how to mine the text characteristics of software requirements and cluster them has become a major challenge in the field of software engineering. Clustering software requirements texts provides a reliable guarantee for the software development process while reducing the potential risks and negative impacts of the requirements analysis phase. However, software requirements text is characterized by high dispersion, high noise and sparse data. At present, work related to clustering is limited to a single type of text, and the functional semantics of software requirements are rarely considered. In view of the characteristics of requirements text and the limitations of traditional clustering methods, this paper proposed a software requirements clustering algorithm (SA-MPCN&SOM) that combines the self-attention mechanism and multi-channel pyramid convolution. The method captures global features through the self-attention mechanism, and then extracts deep features of the requirements text from different windows based on multi-channel pyramid convolution. In this way the perceived text fragments are multiplied, and finally the multiplexed text features are clustered using SOM. The experimental results on software requirements data show that the proposed method can better mine and cluster requirement features, and that it outperforms other feature extraction methods and clustering algorithms.

Web Service Crowdtesting Task Assignment Approach Based on Reinforcement Learning

TANG Wen-jun, ZHANG Jia-li, CHEN Rong, GUO Shi-kai

Computer Science. 2020, 47 (3): 54-60. doi:10.11896/jsjkx.191100085

How to assign tasks to appropriate workers so as to obtain better testing results at a lower cost is an important problem. This paper modeled CWS testing task assignment as a Markov decision process and used a Deep Q Network to learn and perform real-time online testing task assignment. The proposed reinforcement learning based approach is named WTA-C. In addition, based on the time each worker took to execute tasks in the past, this paper used statistical conditional probability to calculate the probability that a testing worker completes a task within its duration, and used this probability as the worker's reputation value to reflect the worker's quality. The worker's reputation is updated after each assignment. The experimental results show that WTA-C is superior to other real-time assignment methods based on heuristic strategies in controlling the "quality-cost" trade-off of testing tasks and ensuring worker quality, and that its assignment effect is more than 18% higher than that of each heuristic strategy, which demonstrates that WTA-C can better adapt to the structure of the CWS and the characteristics of the crowdsourcing environment.

Neighborhood Knowledge Distance Measure Model Based on Boundary Regions

YANG Jie, WANG Guo-yin, LI Shuai

Computer Science. 2020, 47 (3): 61-66. doi:10.11896/jsjkx.190500174

Uncertainty measures of rough sets play an important role in knowledge acquisition. For neighborhood rough sets, current research on uncertainty measures mainly focuses on measuring the uncertainty of a single knowledge space and its monotonicity as the granularity changes. However, there are still some shortcomings. Firstly, the uncertainty of a neighborhood rough set comes from both the elements that belong to the target concept and the elements that do not belong to it in the neighborhood granules, but current research does not consider these two parts of each neighborhood information granule at the same time. Secondly, the difference between different knowledge spaces in describing the target concept is hard to reflect. Thirdly, the current knowledge distance measures are too fine: they contain granularity information and are inaccurate in some applications, e.g., heuristic search in attribute reduction. Therefore, based on the granularity measure of neighborhood information granules, this paper constructed a neighborhood entropy that is monotonic as the granularity becomes finer. To reflect the difference between different neighborhood information granules in describing the target concept, this paper proposed a neighborhood granule distance with approximate description ability, called the relative neighborhood granule distance (RNGD). Then, several important properties were presented. The neighborhood knowledge distance based on boundary regions was established on the basis of the RNGD, and it can reflect the difference between different neighborhood knowledge spaces in describing the target concept. Finally, the validity of the neighborhood knowledge distance based on decision regions was verified by experiments.

Attribute Reduction of Fuzzy Rough Set Based on Distance Ratio Scale

CHEN Yi-ning, CHEN Hong-mei

Computer Science. 2020, 47 (3): 67-72. doi:10.11896/jsjkx.190100196

Attribute reduction can effectively remove unnecessary attributes in order to improve the performance of classifiers. Fuzzy rough set theory is an important formalism for processing uncertain information. In the fuzzy rough set model, the approximations of an object may be affected by the uncertain distribution of samples, which in turn may hinder the acquisition of an effective attribute reduction. To define approximations effectively, this paper proposed a novel fuzzy rough set model named the distance ratio scale based fuzzy rough set. The definition of samples based on the distance ratio scale is introduced. The influence of the uncertain distribution of samples on approximations is avoided by controlling the distance ratio scale. The basic properties of this fuzzy rough set model are presented and a new dependency function is defined. Furthermore, an algorithm for attribute reduction is designed. SVM, NaiveBayes and J48 were used as test classifiers on UCI data sets to verify the performance of the proposed algorithm. The experimental results show that the proposed algorithm obtains effective attribute reductions and that the classification precision of the classifiers is improved.

Attribute Reduction Algorithm Based on Optimized Discernibility Matrix and Improving Discernibility Information Tree

XU Yi, TANG Jing-xin

Computer Science. 2020, 47 (3): 73-78. doi:10.11896/jsjkx.190500125

The discernibility matrix expresses the distinguishing information of all objects in an information system with matrix elements, which provides a new idea for attribute reduction. However, the traditional discernibility matrix uses the core attributes to eliminate redundant element items only after construction is finished, ignoring the role of the core attributes in the matrix construction process. In response to this problem, the following research is done. Firstly, the definition of the discernibility matrix is optimized. Before calculating the distinguishing information of any two objects, it is first determined whether their values on the core attributes are equal. If not, the corresponding element item is directly recorded as ∅, and the other attributes are not examined. Secondly, the concept of attribute weighted importance is proposed. The ratio of each condition attribute to the non-empty element items in the discernibility matrix (called macro importance) and the contribution of each attribute to distinguishing objects (called micro importance) are considered together, and the rationality of the measurement method is illustrated by an example. Thirdly, to address the large number of redundant elements and empty sets in the optimized discernibility matrix, a discernibility information tree based on the optimized discernibility matrix and attribute weighted importance is proposed by drawing on the concept of the discernibility information tree. All non-empty element items in the optimized discernibility matrix are sorted according to attribute weighted importance, so that attributes with high importance are shared by more nodes. Element items that do not contain core attributes are mapped to a path in the tree during the build process, while element items that contain core attributes are ignored. Finally, a reduction algorithm, HSDI-tree, based on the optimized discernibility matrix and the improved discernibility information tree is proposed. This paper compared the reduction results and the number of nodes of the HSDI-tree, CDI-tree, DI-tree and IDI-tree algorithms on five UCI data sets. The experimental results show that the HSDI-tree algorithm can effectively find the minimum attribute reduction and has better space compression ability.
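
The first optimization described above can be sketched on a toy decision table. The code builds the classical discernibility matrix and, when a set of core attributes is supplied, records an empty entry for any object pair that already differs on a core attribute. This is one reading of the abstract, not the paper's HSDI-tree algorithm; the data layout and function names are assumptions.

```python
def discernibility_matrix(rows, decisions, core=frozenset()):
    """Map each object pair (i, j) with different decisions to the set of
    condition attributes that distinguish the two objects."""
    matrix = {}
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if decisions[i] == decisions[j]:
                continue  # pairs in the same decision class are not compared
            if any(rows[i][a] != rows[j][a] for a in core):
                # Already separated by a core attribute: record the empty set
                # and skip the remaining attributes.
                matrix[(i, j)] = frozenset()
                continue
            matrix[(i, j)] = frozenset(
                a for a in rows[i] if rows[i][a] != rows[j][a]
            )
    return matrix


def core_attributes(matrix):
    """Core attributes are those that appear as singleton entries."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


# Hypothetical decision table: three objects, condition attributes a, b, c.
rows = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 0},
]
decisions = [0, 1, 1]

classic = discernibility_matrix(rows, decisions)
core = core_attributes(classic)
optimized = discernibility_matrix(rows, decisions, core=frozenset(core))
print(core, optimized)
```

In this sketch the core is taken from a first classical pass purely for demonstration; how the core is obtained before construction in the paper is not stated in the abstract.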

Clustering Algorithm by Fast Search and Find of Density Peaks for Complex High-dimensional Data

CHEN Jun-fen, ZHANG Ming, ZHAO Jia-cheng

Computer Science. 2020, 47 (3): 79-86. doi:10.11896/jsjkx.190400123

Unsupervised clustering in machine learning is widely applied in various object recognition tasks. A novel clustering algorithm based on density peaks (DPC) can quickly find the cluster center points and the number of clusters from a decision graph. However, when dealing with data of complex distribution shape and high-dimensional image data, the DPC algorithm still has some problems, such as difficulty in determining the cluster center points and finding too few clusters. To improve its robustness in dealing with complex high-dimensional data, an improved DPC clustering algorithm (AE-MDPC) was presented. It employs an autoencoder, a kind of unsupervised learning method, to obtain the optimal feature representation from the input data, and uses the manifold similarity of pairwise data to describe global consistency. The autoencoder can reduce feature noise by reducing the dimension of the high-dimensional image data, while the manifold distance makes the densities of the potential cluster centers become global peaks. The AE-MDPC algorithm was compared with K-means, DBSCAN, DPC and DPC combined with PCA on four artificial datasets and four real face image datasets. The experimental results demonstrate that AE-MDPC outperforms the other clustering algorithms on clustering accuracy, adjusted mutual information and adjusted Rand index, and that it provides better clustering visualization. Overall, the proposed AE-MDPC algorithm can effectively handle complex manifold data and high-dimensional image data.
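
For readers unfamiliar with the baseline, the two quantities plotted in a DPC decision graph can be sketched as below: the local density rho of each point (here a simple count of neighbours within a cutoff distance) and delta, the distance to the nearest point of higher density. This is a plain Euclidean sketch of basic DPC only; it includes neither the autoencoder nor the manifold distance of AE-MDPC, and the cutoff value and sample points are assumptions.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta) for each point, as used in a DPC decision graph."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Points with no denser neighbour get their largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical 2-D points: a group near the origin and a pair far away.
points = [(0, 0), (1, 0), (-1, 0), (10, 0), (11, 0)]
rho, delta = density_peaks(points, d_c=1.5)
print(rho, delta)
```

Cluster centers are the points where both rho and delta are large. In the toy data the distant pair has tied, low densities, so neither stands out as a center, which hints at the difficulty in choosing centers that the abstract mentions.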

Attribute Reduction Based on Local Adjustable Multi-granulation Rough Set

HOU Cheng-jun, MI Ju-sheng, LIANG Mei-she

Computer Science. 2020, 47 (3): 87-91. doi:10.11896/jsjkx.190500162

In classical multi-granulation rough set models, multiple equivalence relations (multiple granular structures) are used to approximate a target set. According to the optimistic and pessimistic strategies, there are two common types of multi-granulation, called optimistic multi-granulation and pessimistic multi-granulation respectively. These two combination rules lack practicability, since one is too restrictive and the other too relaxed. In addition, the multi-granulation rough set model is highly time-consuming because all objects must be scanned when approximating a concept. To overcome this disadvantage and broaden the applicability of the multi-granulation rough set model, this paper first introduced the adjustable multi-granulation rough set model in incomplete information systems and defined the local adjustable multi-granulation rough set model. Secondly, this paper proved that the local adjustable multi-granulation rough set and the adjustable multi-granulation rough set have the same upper and lower approximations. By defining the concepts of lower approximation consistent set, lower approximation reduction, lower approximation quality, lower approximation quality reduction, and internal and external importance, a local adjustable multi-granulation rough set model for attribute reduction was proposed. Furthermore, a heuristic attribute reduction algorithm was constructed based on granular significance. Finally, the effectiveness of the method was illustrated through examples. The experimental results show that the local adjustable multi-granulation rough set model can accurately process the data of incomplete information systems and can reduce the complexity of the algorithm.

Class-specific Distribution Preservation Reduction in Interval-valued Decision Systems

YANG Wen-jing, ZHANG Nan, TONG Xiang-rong, DU Zhen-bin

Computer Science. 2020, 47 (3): 92-97. doi:10.11896/jsjkx.190500180

Attribute reduction is one of the important areas in rough set theory. Through attribute reduction, a minimal set of attributes that preserves a certain classification ability in decision tables is found; the process removes redundant feature attributes and selects a useful feature subset. A distribution reduct can preserve the distribution of all decision classes in decision tables, but reducts for all decision classes may not be necessary in practice. To solve this problem, this paper proposed the concept of class-specific distribution preservation reduction based on α-tolerance relations in interval-valued decision systems. Some theorems of class-specific distribution preservation reduction were proved and the relevant discernibility matrix was constructed. This paper then proposed a class-specific distribution preservation reduction algorithm based on discernibility matrices (CDRDM), and analyzed the relationship between the sets of non-empty elements in the discernibility matrices constructed by the class-specific distribution preservation reduction algorithm and by the distribution preservation reduction algorithm (DRDM). In the experiment, six UCI data sets were selected and the interval parameter was introduced. With the interval parameter set to 1.2 and the threshold set to 0.5, the results and average reduct lengths of the DRDM and CDRDM algorithms were compared. With the interval parameter set to 1.2 and 1.6 and the threshold set to 0.4 and 0.5 respectively, the changes in reduction time of the DRDM and CDRDM algorithms with the number of objects and attributes were given. Moreover, the experiment indicates that the CDRDM algorithm yields different results for different decision classes. When there is more than one decision class in a decision table, the average reduct length of the CDRDM algorithm is less than or equal to that of the DRDM algorithm, and the reduction efficiency of the CDRDM algorithm for different decision classes is improved to varying degrees.

Judgment Methods of Interval-set Consistent Sets of Dual Interval-set Concept Lattices

GUO Qing-chun, MA Jian-min

Computer Science. 2020, 47 (3): 98-102. doi:10.11896/jsjkx.190500098

The dual interval-set concept lattice is generated by introducing the interval set into the dual concept lattice. It extends the extension and intension of the dual concept from classical sets to interval sets, which makes it a mathematical tool for describing uncertain concepts. As one of the core topics of data mining, attribute reduction is a method for studying the essential characteristics of concept lattices. It simplifies the representation of concepts by removing redundant attributes. This paper mainly discussed approaches for judging the interval-set consistent sets of dual interval-set concept lattices. Firstly, based on the isomorphism of the structure of dual interval-set concept lattices, interval-set consistent sets were defined, and a series of judgment theorems were then investigated for dual interval-set concept lattices. Then, a method for obtaining the attribute reduction interval set by using the interval-set consistent set was described.

Session-based Recommendation Algorithm Based on Recurrent Temporal Convolutional Network

LI Tai-song, HE Ze-yu, WANG Bing, YAN Yong-hong, TANG Xiang-hong

Computer Science. 2020, 47 (3): 103-109. doi:10.11896/jsjkx.190500183

Since the Recurrent Neural Network (RNN) generally models transition patterns while ignoring the inner connections of items, and cannot model the long-term evolving patterns of sequential data in session-based recommendation, a Recurrent Temporal Convolutional Network (RTCN) was proposed. Firstly, each item in the sequence is embedded as a vector, and multi-layer causal convolutions and dilated convolutions are applied so that the receptive field is enlarged and long-term connections are established. A residual network is stacked to extract features from different layers, so that the problem of gradients decaying or even vanishing in back propagation can be solved. With the above operations, a well-designed Temporal Convolutional Network (TCN) is established. It extracts local features from sequence items, maps item information into a latent space and generates fine-grained feature vectors as results. To further explore the connections between items in a macroscopic way, the feature vectors are fed into a Gated Recurrent Unit (GRU). After multiple iterations and updates to the hidden states, the model can predict the next item. RTCN can extract long-term, multi-dimensional, fine-grained local features from inputs by adapting the temporal convolutional network. It also models the long-distance connections between items, captures the transition patterns and infers the next items by using GRU networks. The experimental results demonstrate that the RTCN model outperforms the RNN-based model by 6% to 13% and other traditional recommendation methods by 9% to 59% under the metrics of Recall and Mean Reciprocal Rank (MRR). Among the loss definitions compared, RTCN performs best under the cross entropy loss function. Meanwhile, owing to the multi-channel structure of the TCN, the proposed model has a high potential capacity to embed context features of items and users when the dataset information is rich.
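
How causal and dilated convolutions enlarge the receptive field can be shown with a small single-channel sketch. It is not the RTCN architecture: there are no embeddings, residual connections or GRU, and the function names and the example dilation schedule are assumptions.

```python
def causal_dilated_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], zero-padded on the left.

    The output at step t depends only on x[t] and earlier inputs (causal).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # positions before the sequence start count as zero
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack with one causal convolution per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


print(causal_dilated_conv([1, 2, 3, 4], [1, 1], dilation=2))
print(receptive_field(kernel_size=2, dilations=[1, 2, 4]))
```

With kernel size 2 and dilations 1, 2 and 4, three layers already cover 8 time steps, whereas three undilated layers would cover only 4.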

Keywords Extraction Method Based on Semantic Feature Fusion

GAO Nan, LI Li-juan, Wei-william LEE, ZHU Jian-ming

Computer Science. 2020, 47 (3): 110-115. doi:10.11896/jsjkx.190700041

Keyword extraction is widely used in the field of text mining and is a prerequisite technology for automatic text summarization, classification and clustering. Therefore, it is very important to extract high-quality keywords. At present, most research on keyword extraction methods considers only some statistical features and ignores the implicit semantic features of words, which leads to low accuracy of the extraction results and a lack of semantic information in the keywords. To solve this problem, this paper designed a method for quantifying the features between words and text themes. First, the word vector method is used to mine the contextual semantic relations of words. Then the main semantic features of the text are extracted by clustering. Finally, the distance between each word and the topic is calculated with a similarity distance method and is regarded as the semantic feature of the word. In addition, by combining the semantic features of words with word frequency, length, location, language and various other descriptive features, a keyword extraction method for short text with semantic features, namely the SFKE method, was proposed. This method analyzes the importance of words from both statistical and semantic aspects, and can thus extract the most relevant keyword set by integrating many factors. Experimental results show that the keyword extraction method integrating multiple features achieves a significant improvement over the TFIDF, TextRank, Yake, KEA and AE methods. The F-Score of this method is 9.3% higher than that of AE. In addition, this paper used information gain to evaluate the importance of features. The experimental results show that the F-Score of the model increases by 7.2% after the semantic feature is added.

Review of Maritime Target Detection in Visible Bands of Optical Remote Sensing Images

LIU Jun-qi,LI Zhi,ZHANG Xue-yang

Computer Science. 2020, 47 (3): 116-123. doi:10.11896/jsjkx.190300102

Maritime target detection based on the visible bands of optical remote sensing images is a research hotspot in the field of remote sensing. To promote its development, this paper summarized the current major methods. Firstly, this paper introduced the target characteristics of the visible bands of optical remote sensing images and the basic process of image target detection, and analyzed the research status of remote sensing image target detection. Secondly, aiming at the problem of rapid detection of maritime targets, this paper introduced the research status of visual saliency methods in remote sensing image target detection. Thirdly, aiming at the problem of remote sensing image classification and recognition, this paper introduced the research status of convolutional neural networks in remote sensing image target detection. Finally, this paper summarized the existing problems and future research directions of current methods for maritime target detection.

Fusion of Infrared and Color Visible Images Based on Improved BEMD

ZHU Ying,XIA Yi-li,PEI Wen-jiang

Computer Science. 2020, 47 (3): 124-129. doi:10.11896/jsjkx.190100038

Fusing infrared and color visible images can enhance vision and improve situation awareness. Directly using the bidimensional empirical mode decomposition (BEMD) method for image fusion suffers from a high computation cost. Therefore, this paper proposed an improved BEMD for fast and adaptive fusion of infrared and color visible images. The improvement uses an order statistics filter and a modified Gaussian filter to calculate the mean envelope directly, which accelerates the sifting process of the original BEMD. Firstly, the color visible image is transformed into IHS components. Secondly, the intensity component and the infrared image are decomposed into high-frequency and low-frequency components by the improved BEMD. Then, the adaptive local weighted fusion rule and the arithmetic mean rule are applied to fuse the high-frequency components and the low-frequency components, respectively. Finally, the new intensity is transformed back into RGB. The proposed image fusion scheme is not only fast but also able to achieve the best fusion result, merging the edge details of the infrared image and the spectral information of the color visible image well.

Grid-driven Bi-directional Image Stitching Algorithm

PANG Rong,LAI Lin-jing,ZHANG Lei

Computer Science. 2020, 47 (3): 130-136. doi:10.11896/jsjkx.190100239

Image stitching merges multiple images taken from different views into one image with a wider view. This requires minimizing both ghosting in the overlapping region and distortion in the non-overlapping region. This paper proposed a grid-driven bi-directional image stitching algorithm based on Moving DLT. For the overlapping region, it uses bi-directional Moving DLT to align feature points and decides how the images overlap by quantitative evaluation, which gives accurate stitching with less ghosting. For the non-overlapping region, interpolation of the mesh vertices after homography transformation and similarity transformation is used for correction, thus reducing distortion. The experimental results show that the proposed bi-directional image stitching method is more accurate than the one-directional image stitching method: the mean absolute error (MAE) of the corresponding points declines by about 0.2, and the stitching result is more natural and smooth.

Adaptive Levenberg-Marquardt Cloud Registration Method for 3D Reconstruction

ZENG Jun-fei,YANG Hai-qing,WU Hao

Computer Science. 2020, 47 (3): 137-142. doi:10.11896/jsjkx.190200261

In three-dimensional (3D) reconstruction, point cloud registration is susceptible to environmental noise, point cloud exposure, illumination, object occlusion and other factors, and the traditional ICP registration algorithm has low accuracy and is time-consuming. To address these problems, this paper proposed a point cloud registration algorithm based on an adaptive Levenberg-Marquardt method. Firstly, the initial point cloud data is preprocessed by statistical filtering and voxel grid filtering, and the filtered point cloud is then stratified to eliminate outliers, so as to improve the accuracy of subsequent point cloud registration. Furthermore, because traditional point cloud feature description methods are computation-intensive, a smoothness parameter is adopted to extract point cloud features and improve the efficiency of point cloud registration. Finally, point-to-line and point-to-surface constraints between frames are established on the basis of the point cloud features, and the modified Levenberg-Marquardt method is used to perform point cloud registration, so as to construct a satisfying 3D reconstruction model. The experimental results show that the proposed point cloud registration method is suitable for 3D reconstruction of indoor and outdoor scenes and adapts well to different environments. Meanwhile, the accuracy and efficiency of point cloud registration are greatly improved compared with traditional methods.
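For orientation, the textbook Levenberg-Marquardt step that such a registration method builds on can be written as below. This is the standard damped least-squares form, not the paper's adaptive variant, whose details the abstract does not give. The symbols are assumptions introduced here: $r$ is the vector of point-to-line and point-to-surface residuals, $J$ its Jacobian with respect to the pose parameters $x$, and $\lambda > 0$ the damping factor.

```latex
\left(J^{\top} J + \lambda I\right)\delta = -J^{\top} r,
\qquad
x_{k+1} = x_k + \delta
```

A small $\lambda$ makes the step close to Gauss-Newton, while a large $\lambda$ makes it behave like a short gradient-descent step; adaptive schemes adjust $\lambda$ between iterations depending on whether the residual decreased.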

Optimization of Compressed Sensing Reconstruction Algorithms Based on Convolutional Neural Network

LIU Yu-hong,LIU Shu-ying,FU Fu-xiang

Computer Science. 2020, 47 (3): 143-148. doi:10.11896/jsjkx.190100199

Compressed sensing theory is widely used in image and video signal processing because of its low coding complexity, resource savings and strong anti-jamming ability. However, traditional compressed sensing technology also faces problems such as long reconstruction time, high algorithm complexity, many iterations and a large amount of computation. Aiming at the time and quality of image reconstruction, a new convolutional neural network structure named Combine Network (CombNet) was proposed. It takes the measured values of compressed sensing as the input of the convolutional neural network, connects a fully connected layer, and then obtains the final output through CombNet. Experimental results show that CombNet has lower complexity and better recovery performance. At the same sampling rate, the peak signal-to-noise ratio (PSNR) of CombNet is 7.2%~13.95% higher than that of TVAL3, and 7.72%~174.84% higher than that of D-AMP. The reconstruction speed of CombNet is three orders of magnitude higher than that of traditional reconstruction algorithms. When the sampling rate is very low (0.01), the average PSNR of CombNet is 11.982dB higher than that of D-AMP, so the proposed algorithm gives better visual quality.
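The PSNR figures quoted in this abstract follow the usual definition, 10·log10(MAX² / MSE). The sketch below computes it for two images given as flat lists of pixel values. It assumes 8-bit pixels (peak value 255), which the abstract does not state, and the function name is hypothetical.

```python
import math

def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    max_value is the peak pixel value; 255 assumes 8-bit images.
    """
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Tiny made-up example: one pixel is off by the full 8-bit range.
example_db = psnr([255, 0], [0, 0])
```

Because PSNR is logarithmic, a gain reported in dB (such as the 11.982dB above) and a gain reported as a percentage of the baseline PSNR are not directly comparable.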

Musical Note Recognition of Musical Instruments Based on MFCC and Constant Q Transform

CHEN Yan-wen,LI Kun,HAN Yan,WANG Yan-ping

Computer Science. 2020, 47 (3): 149-155. doi:10.11896/jsjkx.190100224

Musical note recognition is an important research topic in music signal analysis and processing. It provides a technical basis for automatic music transcription, musical instrument tuning, music database retrieval and electronic music synthesis. Conventional note recognition methods identify a note by estimating its fundamental frequency and matching it one-to-one against the standard frequencies. However, such one-to-one matching is difficult, the error increases as the fundamental frequency of the note increases, and the identifiable frequency range is not wide. To this end, this paper applied the idea of classification to musical note recognition and established the required musical note library. Given the importance of the low-frequency information of the music signal, the Mel Frequency Cepstrum Coefficient (MFCC) and the Constant Q Transform (CQT) are selected as the features extracted from the note signal. MFCC and CQT are each used as a single input feature for note recognition, and their fusion is also used as input. Combining the advantages of the Softmax regression model in multi-classification problems with the good nonlinear mapping and self-learning abilities of the BP neural network, a BP neural network multi-class recognizer is constructed based on the Softmax regression model. In the MATLAB R2016a simulation environment, the feature parameters were input into the multi-class recognizer for learning and training, and the optimal solution was found by adjusting the network parameters. Comparative experiments were performed by changing the number of training samples. The experimental results show that when the fusion feature (MFCC+CQT) is used as input, 25 types of notes from the great octave to the small octave can be identified with an average recognition rate of 95.6%, and that the CQT feature contributes more than the MFCC feature in the recognition process. The experimental data demonstrate that using classification for musical note recognition achieves good recognition results and is not limited by the range of the musical note’s fundamental frequency.
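The Softmax output stage mentioned in this abstract can be sketched as follows: raw class scores are turned into probabilities and the most probable note is selected. This is a minimal illustration only. The scores, the three note names and the function names are made up, and the BP network that would produce the scores from MFCC and CQT features is not modeled.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_note(scores, note_names):
    """Return the most probable note name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return note_names[best], probs[best]

# Hypothetical scores from the last network layer for three candidate notes.
note, prob = predict_note([0.1, 2.0, -1.0], ["C3", "D3", "E3"])
```

With 25 note classes, as in the reported experiment, the same code applies unchanged with a 25-element score vector.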

Road Extraction Algorithm of Multi-feature High-resolution SAR Image Based on Multi-Path RefineNet

CHEN Li-fu,LIU Yan-zhi,ZHANG Peng,YUAN Zhi-hui,XING Xue-min

Computer Science. 2020, 47 (3): 156-161. doi:10.11896/jsjkx.190100124

To address the poor automation and poor generality of existing SAR image road extraction algorithms, a multi-feature road extraction algorithm based on a multi-path refinement network was proposed. Firstly, the Gabor transformation and the gray level-gradient co-occurrence matrix transformation are performed on SAR images to obtain rich road feature information. A multi-path refinement network is formed by coupling the cascade refinement network and the residual network. Then, the original SAR image, the acquired low-level feature images and the label image are input into the new network for training, and the road features extracted from each layer of the network are fully utilized to obtain the initial road segmentation results. Finally, mathematical morphology operations are used to connect breaks in the initial roads and remove false alarms. The algorithm is applied to road extraction from SAR images with different resolutions. The experimental results show that it is widely applicable to SAR image road extraction and extracts roads more effectively.

Survey of Natural Language Processing Pre-training Techniques

LI Zhou-jun,FAN Yu,WU Xian-jie

Computer Science. 2020, 47 (3): 162-173. doi:10.11896/jsjkx.191000167

In recent years, with the rapid development of deep learning, pre-training technology for natural language processing has made great progress. In the early days of natural language processing, word embedding methods such as Word2Vec were used to encode text. These word embedding methods can also be regarded as static pre-training techniques. However, such context-independent text representations are limited and cannot solve the polysemy problem. The ELMo pre-training language model gives a context-dependent method that can effectively handle polysemy. Later, GPT, BERT and other pre-training language models were proposed. The BERT model in particular significantly improved results on many typical downstream tasks, greatly promoted technical development in the field of natural language processing, and thus initiated the age of dynamic pre-training. Since then, a number of pre-training language models such as BERT-based improved models and XLNet have emerged, and pre-training has become an indispensable mainstream technology in natural language processing. This paper first briefly introduces pre-training technology and its development history, and then reviews the classic pre-training techniques in natural language processing, including the early static pre-training techniques and the classic dynamic pre-training techniques. It then briefly reviews a series of inspiring pre-training techniques, including BERT-based models and XLNet. On this basis, the paper analyzes the problems faced by current pre-training technology. Finally, the future development trend of pre-training technologies is discussed.

Survey of Construction and Application of Reading Eye-tracking Corpus

WANG Xiao-ming,ZHAO Xin-bo

Computer Science. 2020, 47 (3): 174-181. doi:10.11896/jsjkx.190800040

Eye movements in reading reflect the human cognitive process. Reading eye movement data is important basic data in fields such as cognitive psychology, applied linguistics and computer science, but China lacks basic research data in this field. In view of this situation, this paper first introduced the background of reading eye-tracking corpora and the related literature at home and abroad. Then, it presented the contents and indexes of eye movement in reading eye-tracking corpora, including single fixation duration, first fixation duration, gaze duration, total fixation duration, regression-in count and regression-out count, from the perspectives of low-level and high-level visual factors, and analyzed three advantages of the corpus research method for reading eye movement over traditional reading eye movement experiments. Next, some influential and complete reading eye-tracking corpora were described in terms of index variables, corpus size, corpus content, corpus language, number of participants, characteristics of participants and data acquisition equipment, which is expected to provide a reference for researchers working on reading eye movement. On the application side, this paper reviewed the major studies based on eye-tracking corpora in cognitive psychology, applied linguistics, computer science and related fields, with emphasis on representative computer science studies in eye movement computational models, natural language processing and pattern recognition. Besides, studies on the construction and application of eye-tracking corpora in China were covered. Finally, this paper reviewed the current situation of the relevant studies, analyzed the reasons for the lack of basic data, and proposed solutions and suggestions from the points of view of the state, scientific research institutions and scientific workers respectively.

Survey on Sparse Reward in Deep Reinforcement Learning

ANG Wei-yi,BAI Chen-jia,CAI Chao,ZHAO Ying-nan,LIU Peng

Computer Science. 2020, 47 (3): 182-191. doi:10.11896/jsjkx.190200352

As an important research direction of machine learning, reinforcement learning is a class of methods that find the optimal policy by interacting with the environment. In recent years, deep learning has been widely used in reinforcement learning algorithms, forming a new research field named deep reinforcement learning. As a new machine learning method, deep reinforcement learning is able to perceive complex inputs and solve for optimal policies, and it is applied to robot control and complex decision-making problems. The sparse reward problem is the core problem of reinforcement learning in solving practical tasks, and it exists widely in practical applications. Solving the sparse reward problem helps improve sample efficiency and the quality of the optimal policy, and promotes the application of deep reinforcement learning to practical tasks. Firstly, an overview of the core algorithms of deep reinforcement learning was given. Then, five kinds of solutions to the sparse reward problem were introduced: reward design and learning, experience replay, exploration and exploitation, multi-goal learning, and auxiliary tasks. Finally, the related research was summarized and future directions were discussed.

Research and Development of Multi-label Generation Based on Deep Learning

LIU Xiao-ling,LIU Bai-song,WANG Yang-yang,TANG Hao

Computer Science. 2020, 47 (3): 192-199. doi:10.11896/jsjkx.190300137

In the era of big data, data are high-dimensional, large in volume and rapidly growing. Efficiently discovering knowledge from these data is a research focus. Multi-label learning has been proposed for ambiguous objects in reality and is widely used in intelligent data processing. In recent years, multi-label generation has received widespread attention due to the excellent performance of deep learning. The latest research results were summarized in five categories and were further compared and analyzed in terms of data, relationship types, application scenarios, adaptability and experimental performance. Finally, the challenges of multi-label generation were discussed, followed by the prospects for future work.

End-to-end Track Association Based on Deep Learning Network Model

HUANG Hong-wei,LIU Yu-jiao,SHEN Zhuo-kai,ZHANG Shao-wei,CHEN Zhi-min,GAO Yang

Computer Science. 2020, 47 (3): 200-205. doi:10.11896/jsjkx.190400037

In order to improve the intelligence of track association in radar data processing, make full use of the characteristic information of the target and simplify the processing flow, an end-to-end track association algorithm based on a deep learning network model was proposed. Firstly, this paper analyzed the problems that track association based on neural networks uses few sample details and has a complex processing flow. Then, it proposed an end-to-end deep learning model that takes all the track information features as input. According to the processing characteristics of track association data, the convolutional neural network structure is improved for feature extraction, and the ability of the long short-term memory neural network to process historical and future information is fully utilized to analyze the association between earlier and later track segments. After the original data is processed with Kalman filtering, the final track association results are output directly by the long short-term memory deep neural network model built on the features extracted by the convolutional neural network. Precision, recall and accuracy were used to verify the performance of the track association model. The simulation results show that the proposed model can fully learn multiple kinds of feature information of the target and has high track association accuracy, which provides a reference for the intelligent analysis of track association.

Document-level Event Factuality Identification Method with Gated Convolution Networks

ZHANG Yun,LI Pei-feng,ZHU Qiao-ming

Computer Science. 2020, 47 (3): 206-210. doi:10.11896/jsjkx.190200265

Event factuality represents the factual nature of events in texts; it describes whether an event is a fact, a possibility, or an impossible situation. Event factuality identification is the basis of many related tasks, such as question answering and discourse understanding. However, most current research on event factuality identification focuses on the sentence level, and only a few studies address the document level. Therefore, this paper proposed an approach to document-level event factuality identification (DEFI) with a gated convolution network. It first uses the gated convolution network to capture both semantic and syntactic information from event sentences and syntactic paths, and then uses a self-attention layer to capture the feature representation of the overall information that is most important for each sequence. Finally, it uses this information to identify the document-level event factuality. Experimental results on both Chinese and English corpora show that the proposed DEFI outperforms the baselines on both macro-F1 and micro-F1. On the Chinese and English corpora, the macro-average F1 value increased by 2.3% and 4.4%, and the micro-average F1 value increased by 2.0% and 2.8%, respectively. The training speed of this method is also increased by three times.
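Since the results above are reported in both macro-F1 and micro-F1, the following sketch shows how the two averages differ for a single-label classification task. It is an illustration under stated assumptions: the label names (`CT+`, `CT-`, `PS+`) and the toy predictions are made up and do not come from the paper's corpora, and the function name is hypothetical.

```python
def f1_scores(gold, pred):
    """Return (macro_f1, micro_f1) for two equal-length label sequences."""
    labels = sorted(set(gold) | set(pred))
    tp = {label: 0 for label in labels}
    fp = {label: 0 for label in labels}
    fn = {label: 0 for label in labels}
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p wrongly
            fn[g] += 1  # missed the true label g

    def f1(t, false_pos, false_neg):
        denom = 2 * t + false_pos + false_neg
        return 2 * t / denom if denom else 0.0

    # Macro: average the per-class F1 values, so rare classes weigh as much as common ones.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro

# Made-up document-level labels: certain fact, certain non-fact, possible fact.
gold = ["CT+", "CT+", "CT+", "CT-", "PS+"]
pred = ["CT+", "CT+", "CT-", "CT-", "CT+"]
macro_f1, micro_f1 = f1_scores(gold, pred)
```

In this toy case the rare `PS+` class is never predicted, which pulls macro-F1 well below micro-F1; this is why papers on imbalanced factuality labels usually report both.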

Clinical Electronic Medical Record Named Entity Recognition Incorporating Language Model and Attention Mechanism

TANG Guo-qiang,GAO Da-qi,RUAN Tong,YE Qi,WANG Qi

Computer Science. 2020, 47 (3): 211-216. doi:10.11896/jsjkx.190200259

Clinical Named Entity Recognition (CNER) aims to identify and classify named entities such as diseases, symptoms and exams in electronic health records, and is a fundamental and crucial task for clinical and translational research. The task is regarded as a sequence labeling problem. In recent years, deep neural network methods have achieved significant success in named entity recognition. However, most of these algorithms do not take full advantage of the large amount of unlabeled data and ignore further features of the text. This paper proposed a model that combines a language model with multi-head attention. First, character embeddings and a language model are trained on unlabeled clinical texts. Then, the labeling model is trained on labeled clinical texts. Specifically, the vector representation of the sentence is sent to a BiGRU and to the pre-trained language model, and the output of the BiGRU is concatenated with the features of the language model. Afterwards, the result is fed to another BiGRU and a multi-head attention module. Finally, a CRF layer is employed to predict the label sequence. Experimental results show that the proposed method, which takes advantage of the language model and the multi-head attention mechanism, achieves an F1-score of 91.34% on the CCKS-2017 Task2 benchmark dataset.

First-order Logic Clause Set Preprocessing Method Based on Goal Deduction Distance

CAO Feng,XU Yang,ZHONG Jian,NING Xin-ran

Computer Science. 2020, 47 (3): 217-221. doi:10.11896/jsjkx.190100004

First-order logic theorem proving is a core foundation of artificial intelligence, and studying the theory and efficient algorithm implementation of first-order logic automated theorem provers has great academic significance. Current provers adopt clause set preprocessing approaches to reduce the clause set scale and then apply inference rules to prove theorems. The existing clause set preprocessing methods used in provers generally consider only the semantic relevance to the axioms and the conjecture, and cannot reflect the deduction between clauses well from the point of view of complementary pairs of literals. In order to describe the relationship between clauses from the perspective of deduction, the definition of goal deduction distance and its calculation method are proposed, and a first-order logic clause set preprocessing method based on the goal deduction distance is presented. Firstly, redundant clause simplification and the pure literal deletion rule are applied to the original clause set. Then, taking the goal clauses into consideration, the literal goal deduction distance and the clause goal deduction distance are calculated. Finally, further clause set preprocessing is realized by setting a threshold on the clause deduction distance. The proposed clause set preprocessing method is implemented in Vampire, the winner of the FOF division of the 2017 CADE ATP System Competition (CASC-26), and applied to the CASC-26 problems. Within the standard 300 seconds, the top prover Vampire4.1 with the proposed clause set preprocessing method outperforms the original Vampire4.1 by solving 4 more theorems, and it solves 10 of the 74 problems unsolved by Vampire4.1, accounting for 13.5% of the total. The proposed clause set preprocessing method affects 77.2% of the solved theorems, and the largest reduction proportion is 51.7%. Experimental results show that the proposed first-order logic clause set preprocessing method can effectively reduce the clause set scale and improve the ability of first-order logic automated theorem provers.
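The pure literal deletion rule named in this abstract can be illustrated on a propositional simplification: a literal is pure if its complement occurs in no clause, and every clause containing a pure literal can be dropped without affecting satisfiability. The sketch below is an assumption-laden toy, not the paper's first-order procedure. Literals are encoded as non-zero integers (`-n` is the negation of `n`), unification is ignored, and the function name is hypothetical.

```python
def delete_pure_literal_clauses(clauses):
    """Repeatedly drop clauses that contain a pure literal.

    clauses: iterable of iterables of non-zero ints, where -n negates n.
    Returns the remaining clauses as a list of frozensets, in original order.
    """
    remaining = [frozenset(clause) for clause in clauses]
    while True:
        literals = set().union(*remaining)
        # A literal is pure when its complement appears nowhere in the set.
        pure = {lit for lit in literals if -lit not in literals}
        if not pure:
            return remaining
        # Removing clauses can make further literals pure, so loop again.
        remaining = [clause for clause in remaining if not (clause & pure)]

# Made-up clause set: 4 is pure at first; once its clause is gone, 2 becomes pure.
reduced = delete_pure_literal_clauses([[1, 2], [-1, 3], [-3, 1], [-2, 4]])
```

Here the four input clauses shrink to two, which mirrors on a very small scale the clause set reduction the abstract reports.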

Emotional Sentence Classification Method Based on OCC Model and Bayesian Network

XU Yuan-yin,CHAI Yu-mei,WANG Li-ming,LIU Zhen

Computer Science. 2020, 47 (3): 222-230. doi:10.11896/jsjkx.190200331

Emotional sentence classification is one of the core problems in the field of emotion analysis. It aims to automatically judge the category of emotional sentences. Traditional emotional sentence classification methods based on OCC sentiment recognition models mostly rely on dictionaries and rules, and their classification accuracy is relatively low when textual information is missing. This paper proposed an emotional sentence classification method based on the OCC model and a Bayesian network. By analyzing the emotion generation rules of the OCC model, it extracts emotional assessment variables and combines them with the emotion features contained in the emotional sentence to construct a Bayesian network for emotion classification. Through probabilistic reasoning, it can identify the various emotion categories that the text may express and reduce the impact of missing text information. A comparison with the evaluation results of the emotional sentence classification sub-task of the NLPCC2014 Chinese Weibo emotion analysis evaluation shows that the proposed method is effective.

Coreference Resolution Incorporating Structural Information

FU Jian,KONG Fang

Computer Science. 2020, 47 (3): 231-236. doi:10.11896/jsjkx.190100108

With the rise and development of deep learning, more and more researchers have begun to apply deep learning to coreference resolution. However, existing neural coreference resolution models focus only on the sequential information of text and ignore structural information, which has been proved to be very useful in traditional methods. Based on the neural coreference model proposed by Lee et al., which has the best performance at present, two measures that use the constituency parse tree to solve this problem were proposed. Firstly, node enumeration is used to replace the original span extraction strategy. It avoids the restriction on span length and reduces the number of spans that do not satisfy syntactic rules. Secondly, node sequences are obtained through tree traversal, and features such as height and path are combined to generate the context representation of the constituency parse trees directly. It avoids the loss of structural information caused by using word and character sequences only. Extensive experiments were conducted on the dataset of the CoNLL 2012 Shared Task, and the proposed model achieves an average F1 of 62.35 for Chinese and 67.24 for English, which shows that the proposed structural information integration strategy can improve the performance of coreference resolution significantly.

Anti-disturbance Control Algorithm of UAV Based on Pneumatic Parameter Regulation

ZHAO Min,DAI Feng-zhi

Computer Science. 2020, 47 (3): 237-241. doi:10.11896/jsjkx.190200371

Aerodynamic damping disturbance degrades the control stability of UAV flight. At present, the aerodynamic parameter adjustment method for the airfoil section is used for UAV anti-disturbance control, with parameters such as torsion angle and vibration direction taken as constraint indexes. The ambiguity of this parameter adjustment is large, and the stability of the aerodynamic attitude parameter adjustment is not good. An anti-disturbance control algorithm for UAV based on aerodynamic parameter adjustment was therefore proposed. According to the flight condition of the UAV, the aeroelastic coupling equations corresponding to each mode were constructed in the velocity coordinate system and the body coordinate system. The flight dynamics and kinematics model of the UAV was constructed in the three-dimensional ballistic coordinate system. The Kalman filtering method is used to realize the fusion adjustment of flight parameters and the suppression of small disturbances. The terminal position reference model is used to design the flight trajectory of the UAV. The dynamic model is linearized in the Kalman filter prediction model, and the aeroelastic modal parameter identification method is adopted. The attitude control is used as the inner loop to obtain the state feedback adjustment parameters of the position loop. The lift coefficient and torque coefficient of the UAV are used as the aerodynamic inertia parameters to adjust the stability of the flight attitude. In this way, the optimized design of the anti-disturbance control law for the UAV is realized. The pitch angle, roll angle and heading angle of the aircraft are collected as the original data and analyzed in Matlab. The simulation results show that the proposed method has good stability in the anti-disturbance control of UAV. The accuracy of on-line estimation of aerodynamic parameters is high, the heading angle error is reduced by 12.4%, the anti-disturbance ability is improved by 8 dB, the convergence time is shortened by 0.14s, and the flight immunity and flight stability of the UAV are improved. The method has good application value in UAV flight control.

Study on Dynamic Priority Admission Control Algorithm in Heterogeneous Wireless Networks

TAO Yang,JI Rui-juan,YANG Li,WANG Jin

Computer Science. 2020, 47 (3): 242-247. doi:10.11896/jsjkx.190100089

In emergency situations, different types of services in a heterogeneous network environment suffer from network congestion, and existing research seldom considers the importance and urgency of the services performed by different types of users, so the limited network resources cannot be reasonably allocated. To address this, this paper proposed a dynamic priority admission control algorithm for heterogeneous wireless networks. Firstly, the initial priority is set according to the user type and the service type. Then, the priority of a service is dynamically adjusted according to its urgency and residual value density, and a preemption scheduling algorithm based on service priority is proposed. In order to avoid the bumpy scheduling phenomenon during the dynamic adjustment of service priority in the network, the conditions for avoiding bumpy scheduling are given. The proposed algorithm takes into account the importance of the services performed by different user types in the actual situation, and prioritizes the services so that the services with the highest initial priority are served first. On this basis, in order to meet the needs of users overall, the priority of each service is dynamically adjusted. To verify the effectiveness of the proposed method, a priority queue scheduling algorithm and a group handover algorithm based on a blocking rate constraint are used as references in MATLAB. The simulation results show that, compared with the other two algorithms, this algorithm increases the overall service completion rate by about 10% while reducing the blocking rate of service switching. This shows that the proposed algorithm enables services with high initial priority to be served first and provides the conditions for network switching for services with low initial priority, which improves the rationality and fairness of network resource allocation.
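The idea of priority-based preemption in admission control can be sketched with a small capacity-limited controller. This is a deliberately simplified illustration, not the paper's algorithm: priorities here are fixed numbers rather than values adjusted from urgency and residual value density, and the class name, method names and service names are all hypothetical.

```python
import heapq

class AdmissionController:
    """Admit services up to a fixed capacity, preempting the lowest priority."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._admitted = []  # min-heap of (priority, service_id)

    def request(self, service_id, priority):
        """Try to admit a service.

        Returns None if it is admitted without preemption, otherwise the id
        of the service that lost out (a preempted one, or the newcomer itself).
        """
        if len(self._admitted) < self.capacity:
            heapq.heappush(self._admitted, (priority, service_id))
            return None
        lowest_priority, lowest_id = self._admitted[0]
        if priority > lowest_priority:
            # Swap the newcomer in for the lowest-priority admitted service.
            heapq.heapreplace(self._admitted, (priority, service_id))
            return lowest_id
        return service_id  # blocked: not more important than anything admitted

    def admitted(self):
        """Return the ids of currently admitted services, sorted by name."""
        return sorted(service_id for _, service_id in self._admitted)

# Made-up scenario with room for two concurrent services.
controller = AdmissionController(capacity=2)
first = controller.request("video", priority=1)
second = controller.request("voice", priority=2)
preempted = controller.request("emergency", priority=5)
blocked = controller.request("chat", priority=1)
```

A dynamic scheme such as the one described above would additionally recompute priorities over time, which in this sketch would mean rebuilding the heap after each adjustment.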

Virtual Network Function Fast Mapping Algorithm over Satellite Network

WEI De-bin,YANG Peng,YANG Li,SHI Huai-feng

Computer Science. 2020, 47 (3): 248-254. doi:10.11896/jsjkx.190300383


A new satellite network architecture based on the co-deployment of SDN (Software Defined Networking) and NFV (Network Function Virtualization) was proposed to solve the problem that a satellite network has limited load capacity and does not allow the large-scale deployment of physical hardware, which leads to a lack of network functions and of flexible network management and configuration. The architecture controls the network dynamically by separating the SDN data layer from the control layer, and uses NFV technology to create network functions in the SDN data plane, so that network functions are decoupled from hardware devices. In this way, network flexibility can be effectively improved. Within this framework, mapping a VNF (virtual network function) to the underlying physical network incurs excessive delay and cannot meet the real-time performance requirements of a highly dynamic satellite network. To solve this problem, a dynamic mapping method named VG-DPA (Viterbi and GPM Dynamic Placement Approach), based on the Viterbi algorithm and GPM (graph pattern matching), was further proposed. The algorithm first models the mapping process as a hidden Markov service chain through an estimator, and then uses the Viterbi algorithm to obtain the mapping path that meets all hardware and software constraints. Then, based on the estimator results, a VNF scheduling strategy is developed by means of GPM. Simulation results show that VG-DPA can greatly reduce the time delay and resource consumption compared with the traditional RAND and OMD algorithms.
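
The Viterbi step mentioned in the abstract above can be illustrated with a minimal, generic sketch. It is not the VG-DPA implementation: the state names, observation labels, and probabilities below are hypothetical, and the abstract does not specify how the estimator builds the hidden Markov model or how constraints are encoded. The function only shows how the most likely state path is recovered by dynamic programming with backtracking.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s] = (probability of the best path ending in s at step t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            # Pick the predecessor r that maximizes the path probability into s.
            layer[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(layer)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical example: two candidate physical nodes as hidden states,
# observed delay classes as observations. All numbers are made up.
nodes = ["node_a", "node_b"]
start = {"node_a": 0.5, "node_b": 0.5}
trans = {"node_a": {"node_a": 0.8, "node_b": 0.2},
         "node_b": {"node_a": 0.3, "node_b": 0.7}}
emit = {"node_a": {"low_delay": 0.9, "high_delay": 0.1},
        "node_b": {"low_delay": 0.4, "high_delay": 0.6}}
prob, path = viterbi(["low_delay", "high_delay", "high_delay"], nodes, start, trans, emit)
```

In a placement setting, hard constraints could be expressed by assigning zero transition or emission probability to infeasible choices; that design choice is an assumption of this sketch, not something stated in the abstract.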

Maximum Likelihood Blind Detection Algorithm Based on Piecewise Gaussian Approximation for Massive MIMO Outdoor Wireless Optical Communication Systems

LI Hao,CUI Xin-kai,GAO Xiang-chuan

Computer Science. 2020, 47 (3): 255-260. doi:10.11896/jsjkx.190200310


In outdoor visible light communication scenarios, existing blind detection algorithms approximate the channel model in a way that often fits the probability density function of the real channel model poorly at the truncation point, which introduces errors in finding the optimal decision threshold and degrades the average symbol error rate of the system. Therefore, a maximum likelihood blind detection algorithm based on piecewise Gaussian approximation was proposed for massive MIMO outdoor wireless optical communication systems. Under strong atmospheric turbulence, the algorithm obtains the equivalent channel model, which is the superposition of the sub-channels and obeys the gamma distribution. The left and right segmentation intervals are determined from the unique extreme point of the probability density function of the equivalent channel, and the first-order and second-order statistics of each sub-channel in the two intervals are obtained. Then, using the central limit theorem and the law of large numbers, the equivalent channel is approximated by a Gaussian distribution in each of the two intervals. The algorithm compensates for the poor fit between the probability density functions of the equivalent channel model and the real channel model at the truncation point, and obtains the optimal decision threshold, thus improving the average symbol error rate of the system. To verify the algorithm, MATLAB simulations were used to compare the average symbol error rate of the proposed algorithm with that of the existing blind detection algorithm. The experimental results show that the average symbol error rate performance of the proposed algorithm is nearly 10 times better than that of the existing blind detection algorithm when the numbers of transmitting and receiving antennas are both 4 and the signal-to-noise ratio is small. Moreover, the average symbol error rate performance of the proposed algorithm with 8 receiving antennas is close to that of the existing blind detection algorithm with 16 receiving antennas, which saves 50% of the receiving antennas. The experimental data demonstrate that, compared with the existing blind detection algorithm, the proposed algorithm can significantly improve the average symbol error rate performance of the system as the number of transmitting and receiving antennas increases, while using only the mathematical model and statistical information of the channel.

Traffic Balance Scheme of Aeronautical Information Network Based on System Optimal Strategy

GAO Hang-hang,ZHAO Shang-hong,WANG Xiang,ZHANG Xiao-yan

Computer Science. 2020, 47 (3): 261-266. doi:10.11896/jsjkx.190200296


Given the demands of future air combat, the current aeronautical information network is gradually exposing various shortcomings. For example, the network needs strong differentiated service capabilities for different combat missions, yet information between nodes in the network cannot be shared in time. In addition, the growth of the network scale leads to traffic congestion, and the network architecture becomes more bloated. The emergence of SDN offers a better solution to these problems, and combining SDN with the aeronautical information network yields a software-defined aeronautical information network. This paper addressed the problem of traffic transmission in aeronautical information networks, and proposed a traffic load balancing scheme based on the system optimal (SO) strategy for the unbalanced traffic distribution in the network. A hybrid SDN/IP aeronautical information network model was constructed, in which the centralized control of the SDN controller enables SDN nodes to forward service traffic over multiple paths to optimize its scheduling, and the link congestion coefficient and SDN data flow were defined. Taking the minimum link utilization as the goal, the Wardrop equilibrium theory was used to analyze the solution, and the SO-based flow balance distribution algorithm (SOA) was proposed. To assess the SOA algorithm, the SMR algorithm and the MSR algorithm were used for comparison in the simulation, and the results show that the SOA algorithm significantly improves the service completion rate and service throughput. For example, in large-scale networks the service completion rates of the MSR and SMR algorithms were 58.4% and 52.2% respectively, while that of the SOA algorithm was about 70.5%, an improvement of 20.7% and 35.1% respectively. Therefore, the proposed algorithm handles traffic forwarding in the network better, and provides a new idea for solving the problem of traffic transmission in future aeronautical information networks.
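
The improvement figures quoted in the abstract above appear to be relative gains over each baseline rather than differences in percentage points. This reading is an assumption, but it reproduces the reported numbers, as the following short check shows. The function name is illustrative only.

```python
def relative_improvement(new_rate, baseline_rate):
    """Relative gain of new_rate over baseline_rate, in percent."""
    return (new_rate - baseline_rate) / baseline_rate * 100.0


soa, msr, smr = 70.5, 58.4, 52.2  # service completion rates (%) from the abstract
print(round(relative_improvement(soa, msr), 1))  # 20.7
print(round(relative_improvement(soa, smr), 1))  # 35.1
```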

Spectrum Allocation Strategy for Neighborhood Network Based Cognitive Smart Grid

WANG Yi-rou,ZHANG Da-min,XU Hang,SONG Ting-ting,FAN Ying

Computer Science. 2020, 47 (3): 267-272. doi:10.11896/jsjkx.190600027


A reliable and efficient communication network is the prerequisite for realizing the full potential of the smart grid. In view of the problems in the wireless communication environment of the smart grid, such as spectrum shortage and low resource utilization, this paper applied cognitive radio technology to the neighborhood network communication of the smart grid. The concept of the cognitive smart grid was introduced to ensure the fairness and effectiveness of service transmission. After considering the SNR and path loss in the communication process, the network throughput was selected as the channel benefit, and modeling and simulation were carried out for urban residential areas with fixed topology. On this basis, a spectrum allocation algorithm based on improved binary cat swarm optimization (WBCSO) was proposed. Firstly, a nonlinear dynamic inertia weight, which decreases linearly as the number of iterations increases, is added to the velocity update formula of the binary cat swarm optimization algorithm (BCSO) to prevent premature convergence. Secondly, a breeding operator is introduced to generate offspring, which increases the diversity of the population and yields a better global optimal solution. Then, four common benchmark functions are selected to test the performance of the improved algorithm. The test results show that the optimized mean and standard deviation of the WBCSO algorithm are better than those of the BCSO algorithm. With the overall benefit of the system and the fairness of users as the optimization objectives, the proposed algorithm was compared experimentally with the binary genetic algorithm (BGA) and binary particle swarm optimization (BPSO). The simulation experiments show that the total system benefit and the user fairness index of the WBCSO algorithm are 13.7% and 14.6% higher, respectively, than those of the BCSO algorithm, and that its performance is better than that of BPSO and BGA. Therefore, the improved binary cat swarm algorithm offers fast convergence and strong search ability in the spectrum allocation of the neighborhood network of the cognitive smart grid.

Blockchain Dynamic Sharding Model Based on Jump Hash and Asynchronous Consensus Group

PAN Ji-fei,HUANG De-cai

Computer Science. 2020, 47 (3): 273-280. doi:10.11896/jsjkx.190100238


Current implementations of blockchain systems generally suffer from performance and capacity deficiencies, which prevent deeper popularity and wider application. Sharding is considered the most likely technology to solve the blockchain bottleneck. However, the mainstream sharding schemes at present generally sacrifice decentralization or security to improve performance. Based on existing sharding technology, this paper proposed the jump hash weight asynchronous consensus group scheme, which builds shards based on jump hash and dynamic weights to improve the efficiency and rationality of shard creation. The algorithm is efficient, fair, and adaptable. The network sharding efficiency is improved by 8% compared with Ethereum, and the workload of node migration is reduced by 25% compared with Ethereum. The asynchronous consensus group mechanism is introduced to improve the transaction security of sharding and to handle cross-shard transactions effectively. Theoretical analysis and experiments show that the maximum transaction performance of the blockchain dynamic sharding model based on jump hash and asynchronous consensus groups can reach 5000 transactions per second.
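
As an illustration of the "jump hash" idea named in the abstract above, the sketch below implements the widely known jump consistent hash of Lamping and Veach. Whether the paper uses exactly this function, and how it combines the result with dynamic weights, is not stated in the abstract, so this is an assumption and the weighting step is omitted. The function name and the use of integer keys are illustrative.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket index in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, truncated to emulate unsigned overflow.
        key = (key * 2862933555777941757 + 1) & MASK64
        # Jump forward to the next candidate bucket.
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


# Hypothetical usage: assign node identifiers to 8 shards, then grow to 9.
before = {node_id: jump_consistent_hash(node_id, 8) for node_id in range(1000)}
after = {node_id: jump_consistent_hash(node_id, 9) for node_id in range(1000)}
moved = [n for n in before if before[n] != after[n]]
```

The property that makes this attractive for shard creation is that, when the bucket count grows from n to n + 1, a key either keeps its bucket or moves to the new bucket n, so every entry in `moved` lands in shard 8 and no key moves between existing shards.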

Data Privacy Protection Method of Block Chain Transaction

XU Chong-jian,LI Xian-feng

Computer Science. 2020, 47 (3): 281-286. doi:10.11896/jsjkx.190300086


Blockchain has the advantages of openness, tamper resistance, and a globally shared distributed ledger, but these characteristics also expose the privacy of transaction data, which seriously affects its application in many business areas, especially in enterprise consortium chains. With the continuous development of blockchain, how to protect the privacy of transaction data on blockchain platforms is a problem well worth studying. To this end, firstly, the existing methods of data privacy protection in blockchain transactions were studied and their shortcomings were pointed out. Secondly, the requirements of data privacy protection in blockchain transactions were analyzed qualitatively. Each item of transaction data was divided into sensitive data and basic data, and a demand analysis matrix was established to obtain the essential and implicit needs of transaction privacy protection and the possible application scenarios. Then, combining the characteristics of symmetric and asymmetric encryption with the consensus of smart contracts, a privacy protection method for blockchain transaction data based on double encryption was designed. The method mainly includes three modules: encryption and storage of transaction data by the private data provider, decryption and use of private data to read transaction data, and sharing of transaction data by the party with access to the private data. The workflow of each module was discussed in detail. Finally, the method was validated on the Mychain platform with an actual multi-party international trade business. The evaluation results show that the proposed method achieves fine-grained, field-level privacy protection of transaction data, shares private data on the chain efficiently and stably, and completes the full-link operation of private data. More than 1 million transaction tests have been completed on a blockchain platform built from four nodes, and the TPS has reached 800. Compared with the original transaction performance without privacy protection, there is no significant reduction in performance. Compared with Bitcoin, Ethereum, and other blockchain platforms, the performance of the proposed method is improved dozens of times.

Improved TLS Fingerprint Enhance User Behavior Security Analysis Ability

HU Jian-wei,XU Ming-yang,CUI Yan-peng

Computer Science. 2020, 47 (3): 287-291. doi:10.11896/jsjkx.190200332


With the escalation of offensive and defensive confrontation, the combination of user behavior analysis and network security has gradually attracted researchers' attention. User behavior analysis technology can achieve active defense by identifying untrusted users and preventing intrusions before an attack succeeds. Currently, the datasets used for user behavior analysis in Web security consist mainly of application-layer HTTP data, which is insufficient to identify users and is likely to cause false negatives. This paper proposed improved TLS fingerprint data based on n-gram and Simhash, which enhances the fault tolerance of the existing TLS fingerprint. Applying the improved fingerprint to user behavior analysis can improve the accuracy of user identification. The comparative experiment used a convolutional neural network to model and analyze the fingerprint data and log-type user behavior data obtained from a real environment. The results show that the improved TLS fingerprint data can distinguish normal users from hackers more effectively, and the accuracy is improved by 4.2%. Further analysis shows that the improved TLS fingerprint can trace hackers to a certain extent by correlating user behaviors and backtracking along the timeline, thus providing intelligence context for security incident investigation.
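
The combination of n-gram and Simhash mentioned in the abstract above can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's fingerprinting method: the token list standing in for TLS handshake fields is hypothetical, SHA-256 is used as the per-feature hash only for convenience, and all features are weighted equally.

```python
import hashlib


def ngrams(tokens, n=2):
    """Return the list of consecutive n-token windows."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def simhash(features, bits=64):
    """Compute a Simhash fingerprint from an iterable of hashable features."""
    counts = [0] * bits
    for feature in features:
        digest = hashlib.sha256(repr(feature).encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            # Each feature votes +1 or -1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming_distance(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical token sequences standing in for ordered TLS handshake fields.
client_a = ["771", "4865", "4866", "4867", "49195", "0", "23", "65281"]
client_b = ["771", "4865", "4866", "4867", "49199", "0", "23", "65281"]
fp_a = simhash(ngrams(client_a))
fp_b = simhash(ngrams(client_b))
distance = hamming_distance(fp_a, fp_b)
```

Unlike an exact hash of the whole field list, a Simhash fingerprint tends to change in only a few bits when a few features change, so a small Hamming distance can be used as a tolerance threshold. The threshold itself would have to be chosen empirically.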

User Attributes Profiling Method and Application in Insider Threat Detection

ZHONG Ya,GUO Yuan-bo,LIU Chun-hui,LI Tao

Computer Science. 2020, 47 (3): 292-297. doi:10.11896/jsjkx.190200379


With the wide use of information technology and Internet technology in enterprises and organizations, enterprise information security faces unprecedented challenges. Most companies face both external and internal attacks. Because timely and effective detection methods are lacking, the damage caused by internal attacks is more serious. As the perpetrators of malicious behavior in organizations and enterprises, people are the research object of insider threat detection. To address the low correlation and low detection efficiency of existing insider threat detection methods for similar threats, a user attributes profiling method was proposed. In this paper, users in the organization were taken as the research subject, and the clustering and supervision of similar users were mainly studied. Firstly, the method of calculating the similarity of portraits is defined. Then, ontology theory and the tabular portrait method were used to integrate multiple factors, such as user personality, past experience, working status, and setbacks. Similar users are clustered and managed in groups by an improved K-Means method, which enables joint supervision of potentially malicious users and reduces the possibility of similar damage occurring. Experimental results show that the proposed method is feasible and offers a way to combat insider threats.

Log Security Storage and Retrieval Based on Combination of On-chain and Off-chain

LV Jian-fu,LAI Ying-xu,LIU Jing

Computer Science. 2020, 47 (3): 298-303. doi:10.11896/jsjkx.190200298


Information systems contain a large number of security device logs. These logs are very important for system monitoring, querying, security auditing, and fault diagnosis. Therefore, it is important to store and process the security device logs in an information system securely. This paper proposed a log security storage and retrieval model based on the combination of on-chain and off-chain storage. The model combines blockchain and distributed storage technology, achieves security log storage that is decentralized, trustless, and hard to tamper with, and provides a ciphertext retrieval interface to security administrators. At the same time, it uses blockchain technology to check data integrity. The security analysis demonstrates that the model can ensure the secure and reliable storage of security device logs, and the performance analysis shows that the model has good retrieval efficiency.
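
The integrity check described in the abstract above can be illustrated with a minimal sketch. It assumes, as one common design, that the full log record is kept off-chain and only its SHA-256 digest is recorded on-chain. The abstract does not say exactly what the model stores where, so the class, its two dictionaries (stand-ins for distributed storage and a ledger), and the record contents are all hypothetical. Encryption and ciphertext retrieval are omitted.

```python
import hashlib


def digest(record: bytes) -> str:
    """SHA-256 digest of a log record, as a hex string."""
    return hashlib.sha256(record).hexdigest()


class LogStore:
    """Toy model: full records off-chain, digests on-chain."""

    def __init__(self):
        self.off_chain = {}  # stand-in for distributed storage
        self.on_chain = {}   # stand-in for an append-only blockchain ledger

    def put(self, log_id: str, record: bytes) -> None:
        self.off_chain[log_id] = record
        self.on_chain[log_id] = digest(record)

    def verify(self, log_id: str) -> bool:
        # The record is intact if its current digest matches the on-chain one.
        return digest(self.off_chain[log_id]) == self.on_chain[log_id]


store = LogStore()
store.put("fw-0001", b"2020-03-01T10:00:00Z firewall DROP src=10.0.0.5")
intact = store.verify("fw-0001")            # True while the record is unchanged
store.off_chain["fw-0001"] = b"tampered"    # simulate off-chain tampering
tampered_detected = not store.verify("fw-0001")
```

Keeping only digests on the ledger keeps on-chain data small, while any modification of the off-chain record changes its digest and is detected by `verify`.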

Performance Study on Constellation Obfuscation Design Method for Physical Layer Security

XI Chen-jing,GAO Yuan-yuan,SHA Nan

Computer Science. 2020, 47 (3): 304-311. doi:10.11896/jsjkx.190200369


Physical layer security encryption technology is an effective physical layer security method for ensuring the safe transmission of information. The signal constellation is designed by means of phase rotation, modulation constellation diversity, symbol obfuscation, amplitude adjustment, and symbol sequence change to protect the modulation mode and the modulation symbol information. Existing physical layer security encryption technology has some disadvantages, such as non-confidential key sharing and insufficient constellation ambiguity. The MIO scheme adopts an encryption method that superimposes an artificial noise symbol key on the modulated symbol vector to solve the problem of insufficient constellation ambiguity. Inspired by the MIO scheme, this paper proposed a new physical layer security encryption scheme based on constellation obfuscation design (COD), which superimposes the channel coefficient on the modulated symbol vector. Under TDD mode and channel reciprocity, the channel coefficient of the legitimate channel is used as the key, which solves the problem of non-confidential key pre-sharing. This paper introduced the complete transmission process of encryption at the sender and decryption at the legitimate receiver, and analyzed the reception process of the high-order cumulant modulation recognition eavesdropper and the intelligent attack eavesdropper. The theoretical SER (symbol error rate) equation of the legitimate receiver in a Rayleigh fading channel was derived. The SER performance of the legitimate receiver, the high-order cumulant modulation recognition eavesdropper, and the intelligent attack eavesdropper was simulated. The performance of the legitimate receiver and the eavesdropper under the MIO scheme and the COD scheme was compared. Simulation results show that when the SER of the legitimate receiver is 1×10^-4, the SNR (signal-to-noise ratio) of the COD scheme is 6 dB lower than that of the MIO scheme. After encryption with the COD scheme, the success rate of modulation identification is 11.8% when the SNR is 0, and the highest success rate of modulation identification is 25%, which remains stable once the SNR exceeds 40 dB. In the first three data packets, the SER of the intelligent attack eavesdropper under the COD scheme is always 0.284, while that of the eavesdropper under the MIO scheme who knows the starting key is lower. For SNR between 0 and 54 dB, the SER performance of the legitimate receiver is always better than that of the modulation recognition eavesdropper and the intelligent attack eavesdropper. Therefore, the proposed COD scheme can guarantee secure communication and resist attacks by modulation recognition and intelligent attack eavesdroppers, and its effectiveness and reliability are better than those of the MIO scheme.

Authenticated Privacy Protection Scheme Based on Certificateless Ring Signcryption in VANET

ZHAO Nan,ZHANG Guo-an

Computer Science. 2020, 47 (3): 312-319. doi:10.11896/jsjkx.190100115


To protect the privacy of vehicle users and secure the transmission of communication messages in vehicular ad-hoc networks, an authenticated certificateless ring signcryption scheme was proposed. Vehicles communicate with one another through pseudo-identities generated by the trusted authority, and the real identity of a vehicle can only be determined by the trusted authority from the original registration information of the vehicle node and the tracing keys, which ensures the anonymity of communication and the traceability of malicious vehicles. The signcryption and decryption algorithms are executed by the sending vehicle and the receiving vehicle respectively, based on the proposed authenticated certificateless ring signcryption model, which achieves identity authentication of the signcrypting vehicle and authentication of the messages sent. The confidentiality and unforgeability of the proposed scheme can be proved under the random oracle model. Compared with the existing privacy protection schemes for VANET, the proposed scheme offers more complete security in terms of confidentiality, authentication, and traceability. The numbers of operations in the ring signcryption and decryption algorithms of the schemes are tabulated and compared. The total overhead of bilinear operations and scalar multiplications is treated as the computational overhead of the scheme, and is listed and analyzed numerically. The simulations are based on an Intel i7 3.07 GHz hardware platform and MATLAB software. The results show that the computational overhead of the proposed scheme is far less than that of the other three schemes. When the number of vehicles increases to 100, the upper limit of the applicable range, the computational overhead of the proposed scheme is still less than 150 ms. Therefore, the proposed privacy protection scheme satisfies the requirements of security and instant messaging, and is especially suitable for urban transportation systems.