Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 48 Issue 12, 15 December 2021
  
Computer Architecture
Review of Visualization Drawing Methods of Flow Field Based on Streamlines
ZHANG Qian, XIAO Li
Computer Science. 2021, 48 (12): 1-7.  doi:10.11896/jsjkx.201200108
Flow visualization is an important branch of scientific visualization. It mainly visualizes the simulation results of computational fluid dynamics and provides researchers with intuitive graphical images to support their analysis. Existing flow visualization techniques include geometry-based methods, such as streamlines and particle tracking, and texture-based methods, such as LIC, spot noise and IBFV. Streamline visualization is an important and widely used geometric method for flow field visualization. In streamline visualization, streamline placement is the core problem, and the number and positions of streamlines determine the overall visualization effect. Too many streamlines cause visual clutter, while too few leave the flow field information incompletely expressed and fail to convey it to domain experts. To display scientific data accurately, streamline visualization has developed two important research directions: seed point placement and streamline reduction. This article surveys the related research on seed point placement and streamline reduction methods, summarizes the problems encountered and the solutions adopted in 2D and 3D flow fields, and discusses the requirements of streamline visualization for the ever-growing scientific data of the future.
Anomaly Propagation Based Fault Diagnosis for Microservices
WANG Tao, ZHANG Shu-dong, LI An, SHAO Ya-ru, ZHANG Wen-bo
Computer Science. 2021, 48 (12): 8-16.  doi:10.11896/jsjkx.210100149
Microservice architectures separate a large-scale complex application into multiple independent microservices. These microservices, built on various technology stacks, communicate with lightweight protocols to enable agile development and continuous delivery. Since an application using a microservice architecture contains a large number of microservices communicating with each other, a faulty microservice can cause the microservices interacting with it to exhibit anomalies. How to detect anomalous microservices and locate the root-cause microservice has therefore become one of the keys to ensuring the reliability of a microservice-based application. To address this issue, this paper proposes an anomaly propagation-based fault diagnosis approach for microservices that explicitly considers the propagation of faults. First, we monitor the interactions between microservices to construct a service dependency graph that characterizes anomaly propagation. Second, we construct a regression model between metrics and API calls to detect anomalous services. Third, we obtain the fault propagation subgraph by combining the service dependency graph with the detected anomalous services. Finally, we calculate the anomaly degree of microservices with a PageRank algorithm to locate the most likely root cause of the fault. The experimental results show that our approach can locate faulty microservices with low overhead.
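As a rough illustration of the PageRank-based ranking step described above, the following minimal Python sketch scores candidate root causes with personalized PageRank over a small service dependency graph. It is not the paper's implementation: the networkx library, the toy call graph and the anomaly scores are all assumptions made for illustration.

```python
# Minimal sketch: personalized PageRank over a service dependency graph,
# where the personalization vector is a per-service anomaly score.
# Assumes the `networkx` package; services and scores are invented.
import networkx as nx

calls = [("frontend", "orders"), ("orders", "inventory"), ("orders", "payment")]
anomaly_score = {"frontend": 0.7, "orders": 0.8, "inventory": 0.9, "payment": 0.05}

g = nx.DiGraph(calls)  # edges point from caller to callee
rank = nx.pagerank(g, alpha=0.85, personalization=anomaly_score)
for service, score in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(f"{service}: {score:.3f}")   # highest-ranked service = most likely root cause
```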
Method of Service Decomposition Based on Microservice Architecture
JIANG Zheng, WANG Jun-li, CAO Rui-hao, YAN Chun-gang
Computer Science. 2021, 48 (12): 17-23.  doi:10.11896/jsjkx.210500078
Decomposing a monolithic system into microservices can effectively alleviate the redundancy and maintenance difficulties of the monolithic architecture. However, existing microservice decomposition methods fail to make full use of the attribute information of the microservice architecture, which limits the rationality of the resulting decompositions. This paper proposes a service decomposition method based on the microservice architecture. The method constructs an entity-attribute relationship graph from the association information between system services and attributes. Service decomposition rules are then formulated by combining the characteristic information of the microservice architecture with the requirement information of the target system, the associations between the two types of vertices are quantified, and a weighted entity-attribute graph is generated. Finally, a weighted GN algorithm is applied to decompose the system into microservices automatically. The experimental results show that the method greatly improves the timeliness of service decomposition, and the generated decomposition schemes perform better on various evaluation metrics.
Parallel WMD Algorithm Based on GPU Acceleration
HU Rong, YANG Wang-dong, WANG Hao-tian, LUO Hui-zhang, LI Ken-li
Computer Science. 2021, 48 (12): 24-28.  doi:10.11896/jsjkx.210600213
Word Mover's Distance (WMD) is a method for measuring text similarity. It defines the difference between two texts as the minimum distance between their word embedding vectors. WMD uses the vocabulary to represent each text as a normalized bag-of-words vector. Since the words of a single text occupy only a small proportion of the corpus, the document vectors generated by the bag-of-words model are very sparse. Multiple documents thus form a high-dimensional sparse matrix, which introduces a large number of unnecessary operations. By computing the WMD from a single source document to multiple target documents at once, the calculation can be highly parallelized. Targeting the sparsity of text vectors, this paper proposes a GPU-based parallel Sinkhorn-WMD algorithm, which stores the target texts in a compressed format to improve memory utilization and exploits the sparse structure to reduce intermediate calculations. Pre-trained word embedding vectors are used to compute the word distance matrix, the WMD algorithm is improved accordingly, and the optimized algorithm is verified on two public news datasets. The experimental results show that the parallel algorithm achieves a speedup of up to 67.43× on an NVIDIA TITAN RTX compared with the serial CPU algorithm.
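For readers unfamiliar with the Sinkhorn relaxation of WMD, the following minimal NumPy sketch shows the dense, serial form of the iteration; the dimensions, regularization strength and random histograms are illustrative, not the paper's sparse GPU implementation.

```python
# Minimal sketch of the Sinkhorn iteration that approximates WMD between two
# normalized bag-of-words histograms a and b, given a word-distance matrix M.
import numpy as np

def sinkhorn_wmd(a, b, M, lam=10.0, n_iter=50):
    K = np.exp(-lam * M)                 # Gibbs kernel of the distance matrix
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # match target marginal b
        u = a / (K @ v)                  # match source marginal a
    P = np.diag(u) @ K @ np.diag(v)      # approximate transport plan
    return float(np.sum(P * M))          # entropic approximation of WMD

rng = np.random.default_rng(0)
M = rng.random((5, 5)); np.fill_diagonal(M, 0.0)
a = np.full(5, 0.2)
b = rng.random(5); b /= b.sum()
print(sinkhorn_wmd(a, b, M))
```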
High Performance Implementation and Optimization of Trigonometric Functions Based on SIMD
YAO Jian-yu, ZHANG Yi-wei, ZHANG Guang-ting, JIA Hai-peng
Computer Science. 2021, 48 (12): 29-35.  doi:10.11896/jsjkx.201200135
As basic mathematical operations, high-performance implementations of trigonometric functions are of great significance to the basic software ecosystem of a processor. In particular, current processors have adopted SIMD architectures, so implementing high-performance trigonometric functions based on SIMD has important research significance and application value. In this regard, this paper uses numerical analysis methods to implement and optimize five commonly used trigonometric functions, sin, cos, tan, atan and atan2, with high performance. Based on an analysis of the IEEE 754 floating-point standard, an efficient trigonometric function algorithm is designed. The accuracy of the algorithm is then further improved by applying the Taylor formula, Padé approximation and the Remez algorithm for polynomial approximation. Finally, the performance of the algorithm is further improved by using instruction pipelining and SIMD optimization. The experimental results show that, while meeting the accuracy requirements, the implemented trigonometric functions achieve a large performance improvement on the ARM V8 computing platform compared with the libm and ARM_M algorithm libraries: their time performance is 1.77~6.26 times that of libm and 1.34~1.5 times that of ARM_M.
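The general idea of range reduction plus polynomial evaluation can be sketched as follows. This is only a plain-Taylor illustration in Python (accuracy around 1e-6), not the paper's Remez-refined SIMD kernels; the coefficients and vectorized evaluation stand in for the SIMD lanes.

```python
# Sketch of a vectorized sin(): reduce the argument to [-pi/2, pi/2], then
# evaluate an odd polynomial with Horner's scheme. Coefficients are Taylor
# terms; a production kernel would use Remez/Pade-refined coefficients.
import numpy as np

def fast_sin(x):
    x = np.asarray(x, dtype=np.float64)
    k = np.round(x / np.pi)             # number of half-periods removed
    r = x - k * np.pi                   # reduced argument in [-pi/2, pi/2]
    r2 = r * r
    p = 1.0 / 362880.0                  # x^9 term
    for c in (-1.0 / 5040.0, 1.0 / 120.0, -1.0 / 6.0, 1.0):
        p = p * r2 + c                  # Horner evaluation in r^2
    s = r * p                           # odd polynomial: r * P(r^2)
    return np.where(k % 2 == 0, s, -s)  # sin(x) = (-1)^k * sin(r)

x = np.linspace(-10, 10, 1000)
print(np.max(np.abs(fast_sin(x) - np.sin(x))))   # worst-case error of the sketch
```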
Quantum Fourier Transform Simulation Based on “Songshan” Supercomputer System
XIE Jing-ming, HU Wei-fang, HAN Lin, ZHAO Rong-cai, JING Li-na
Computer Science. 2021, 48 (12): 36-42.  doi:10.11896/jsjkx.201200023
The “Songshan” supercomputer system is a new generation of heterogeneous supercomputer cluster independently developed by China, and the CPUs and DCU accelerators it carries are also domestically developed. In order to expand the scientific computing ecosystem of the platform and verify the feasibility of quantum computing research on it, this paper uses a heterogeneous programming model to implement a heterogeneous version of quantum Fourier transform simulation on the “Songshan” supercomputer system. The computational hotspots of the program are offloaded to the DCU; MPI is then used to run multiple processes on a single computing node so that data transmission and computation on the DCU accelerator proceed concurrently; finally, computation is overlapped with communication so that the DCU does not sit idle during data transmission. The experiment implements a 44-qubit quantum Fourier transform simulation on the supercomputing system for the first time. The results show that the heterogeneous version of the quantum Fourier transform module makes full use of the computing resources of the DCU accelerator, achieves a speedup of 11.594 over the traditional CPU version, and scales well on the cluster. This implementation provides a reference for the simulation and optimization of other quantum algorithms on the “Songshan” supercomputer system.
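For orientation, a quantum Fourier transform simulation amounts to applying Hadamard and controlled-phase gates to a state vector of 2^n complex amplitudes; the gate applications are the computational hotspots that the paper offloads to the DCU. The small NumPy sketch below shows the standard QFT circuit (without the final qubit swaps) on a 3-qubit state vector; it is illustrative only and unrelated to the paper's heterogeneous code.

```python
# Minimal state-vector QFT sketch. Qubit 0 is the most significant axis;
# the output qubits are left in bit-reversed order (final swaps omitted).
import numpy as np

def apply_1q(state, gate, q, n):
    psi = np.moveaxis(state.reshape((2,) * n), q, 0)
    psi = np.tensordot(gate, psi, axes=([1], [0]))
    return np.moveaxis(psi, 0, q).reshape(-1)

def apply_cphase(state, control, target, theta, n):
    psi = state.reshape((2,) * n).copy()
    idx = [slice(None)] * n
    idx[control], idx[target] = 1, 1
    psi[tuple(idx)] *= np.exp(1j * theta)   # phase only when both qubits are |1>
    return psi.reshape(-1)

def qft(state, n):
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    for q in range(n):
        state = apply_1q(state, H, q, n)
        for t in range(q + 1, n):
            state = apply_cphase(state, t, q, np.pi / 2 ** (t - q), n)
    return state

n = 3
state = np.zeros(2 ** n, dtype=complex); state[5] = 1.0   # basis state |101>
print(np.round(qft(state, n), 3))
```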
DGX-2 Based Optimization of Application for Turbulent Combustion
WEN Min-hua, WANG Shen-peng, WEI Jian-wen, LI Lin-ying, ZHANG Bin, LIN Xin-hua
Computer Science. 2021, 48 (12): 43-48.  doi:10.11896/jsjkx.201200129
Numerical simulation of turbulent combustion is a key tool for aeroengine design. Because high-precision models of the Navier-Stokes equations are required, numerical simulation of turbulent combustion demands a huge amount of computation, and the physicochemical models make the flow field extremely complicated, so load balancing becomes a bottleneck for large-scale parallelization. We port and optimize a numerical simulation method for turbulent combustion on a powerful computing server, the DGX-2. We design the threading scheme for flux calculation and use the Roofline model to guide the optimization. In addition, we design an efficient communication method and propose a multi-GPU parallel method for turbulent combustion based on the high-speed interconnect of the DGX-2. The results show that the performance of a single V100 GPU is 8.1x higher than that of a dual-socket Intel Xeon 6248 CPU node with 40 cores, and the multi-GPU version on the DGX-2 with 16 V100 GPUs achieves a 66.1x speedup, exceeding the best performance on the CPU cluster.
Loop Fusion Strategy Based on Data Reuse Analysis in Polyhedral Compilation
HU Wei-fang, CHEN Yun, LI Ying-ying, SHANG Jian-dong
Computer Science. 2021, 48 (12): 49-58.  doi:10.11896/jsjkx.210200071
Existing polyhedral compilation tools often use simple heuristic strategies to find loop fusion decisions, and the fusion strategy has to be adjusted manually to obtain the best performance for different programs. To solve this problem, a fusion strategy based on data reuse analysis is proposed for multi-core CPU platforms. This strategy avoids unnecessary fusion constraints that hinder the exploitation of data locality. For different scheduling stages, parallelism constraints for different parallel levels are proposed, and a tiling constraint for CPU cache optimization is proposed for statements with complex array accesses. Compared with previous loop fusion strategies, this strategy takes changes in spatial locality into account when computing fusion profits. The strategy is implemented on top of Polly, the polyhedral compilation module of the LLVM compilation framework, and test cases from suites such as Polybench are selected for evaluation. In single-core tests, the average performance improves by 14.9%~62.5% compared with existing fusion strategies. In multi-core tests, the average performance improves by 19.7%~94.9%, with speedups of up to 1.49x~3.07x.
Computer Software
Approach of God Class Detection Based on Evolutionary and Semantic Features
WANG Ji-wen, WU Yi-jian, PENG Xin
Computer Science. 2021, 48 (12): 59-66.  doi:10.11896/jsjkx.210100077
With the acceleration of software development iterations, developers often violate basic software design principles for reasons such as delivery pressure, resulting in code smells that affect software quality. The god class is one of the most common code smells, referring to a class that has taken on too many responsibilities. A god class violates the design principle of “high cohesion and low coupling”, damages the quality of the software system, and hurts the understandability and maintainability of the code. Therefore, a new god class detection method is proposed. It extracts evolutionary and semantic features from the actual project and then merges them. Based on the merged features, it re-clusters all the methods of the project. By analyzing how the member methods of each class in the project are distributed over the new clustering result, it calculates the cohesion of each class and reports classes with low cohesion as god class detection results. Experiments show that this method is superior to current mainstream god class detection methods. Compared with traditional measurement-based detection methods, the recall and precision of the proposed method increase by more than 20%. Compared with detection methods based on machine learning, although the recall of the proposed method is slightly lower, its precision and F1 value are significantly improved.
Cooperative Modeling Model Combination and Update Method Based on Meta-model
ZHANG Zi-liang, ZHUANG Yi, YE Tong
Computer Science. 2021, 48 (12): 67-74.  doi:10.11896/jsjkx.201100024
With the increasing scale and complexity of software, the design and development of large systems such as aircraft and ships are often completed by teams from different professional fields with different functions. Aiming at the problems of incomplete models caused by missing information between local models and model inconsistency caused by conflicting update operations, this paper proposes a meta-model based method of model combination and update (MCAU). The method defines the collaborative relationships and update operations on the meta-model, which ensures the integrity and consistency of the model during collaborative modeling, and an example is given to illustrate its application and analysis. Secondly, this paper proposes a model-driven software collaborative modeling framework (SCMF), which can effectively support the extension of multiple modeling languages. Finally, a software collaborative modeling prototype system (CorModel) is developed based on the Eclipse framework, and related experiments further verify the effectiveness of MCAU.
Test Suite Reduction via Submodular Function Maximization
WEN Jin, ZHANG Xing-yu, SHA Chao-feng, LIU Yan-jun
Computer Science. 2021, 48 (12): 75-84.  doi:10.11896/jsjkx.210300086
As the size and cost of regression testing increase, test suite reduction becomes more important for improving its efficiency. When selecting a subset of the test suite, both the representativeness and the diversity of the subset should be considered, and an effective algorithm is needed to solve the selection problem. For test suite reduction, this paper proposes SubTSR, an algorithm based on submodular function maximization. Although the resulting discrete optimization problem is NP-hard, this paper exploits the submodularity of the objective function and uses a heuristic greedy search to find a suboptimal solution with an approximation guarantee. To validate its effectiveness, the SubTSR algorithm is evaluated on fifteen datasets while varying the number of LDA topics and the distance used for similarity measurement, and is compared with other test suite reduction algorithms on metrics such as the average percentage of fault detection, the fault detection loss rate, and the index of the first failing test. The experimental results show that SubTSR significantly improves fault detection performance compared with other algorithms and remains relatively stable across different datasets. Under changes of the text representation caused by varying the topic count, SubTSR combined with Manhattan distance stays relatively stable compared with other algorithms.
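The greedy search with an approximation guarantee mentioned above is the standard (1 - 1/e) greedy for monotone submodular maximization. The sketch below shows it for a simple coverage objective; the test identifiers, covered items and budget are invented for illustration and are not SubTSR's actual objective.

```python
# Minimal greedy sketch for monotone submodular maximization:
# repeatedly pick the test with the largest marginal coverage gain.
def greedy_reduce(coverage, budget):
    """coverage: dict test_id -> set of covered items; budget: tests to keep."""
    selected, covered = [], set()
    while len(selected) < budget:
        best, best_gain = None, 0
        for t, items in coverage.items():
            if t in selected:
                continue
            gain = len(items - covered)      # marginal gain of adding t
            if gain > best_gain:
                best, best_gain = t, gain
        if best is None:                     # nothing left adds coverage
            break
        selected.append(best)
        covered |= coverage[best]
    return selected

tests = {"t1": {1, 2, 3}, "t2": {3, 4}, "t3": {4, 5, 6, 7}, "t4": {1, 7}}
print(greedy_reduce(tests, budget=2))        # -> ['t3', 't1']
```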
Fuzzing Test Case Generation Method Based on Depth-first Search
LI Yi-hao, HONG Zheng, LIN Pei-hong
Computer Science. 2021, 48 (12): 85-93.  doi:10.11896/jsjkx.200800178
Fuzzing is an important method for discovering network protocol vulnerabilities. Existing fuzzing methods suffer from incomplete path coverage and low efficiency. To solve these problems, this paper proposes a depth-first search based fuzzing test case generation method. The method transforms the protocol state machine into a directed acyclic graph to obtain the state transition paths, and increases the proportion of test cases among the testing messages to improve fuzzing efficiency. The method consists of five stages: merging state transitions, eliminating loops, searching state transition paths, marking identical state transitions, and test-case-guided fuzzing. First, state transitions with the same start and end states are merged. Second, a depth-first search finds the loops in the graph, and the state machine is converted into a directed acyclic graph by deleting the loop edges. Third, the directed acyclic graph is analyzed for all paths from the initial state to the end state, and the removed edges are added back to the original state machine graph using the Floyd algorithm to construct complete test paths, ensuring that every state transition in the state machine can be fully tested. Fourth, repeated state transitions are marked to avoid testing similar transitions repeatedly and to reduce redundancy. Finally, test cases for the state transitions are generated, and test cases that may guide state transitions are distributed evenly over the repeated transitions to fuzz the target. Experimental results show that the proposed method achieves a higher proportion of valid test cases.
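The loop-elimination step described above boils down to finding back edges with a depth-first search and removing them, leaving a DAG whose root-to-leaf paths can be enumerated. The sketch below illustrates that step on a made-up protocol state machine; it is not the paper's tool.

```python
# Minimal sketch: DFS finds back edges (edges that close a loop) and removes
# them, turning the state-machine graph into a directed acyclic graph.
def remove_back_edges(graph, start):
    removed, visited, on_stack = [], set(), set()
    def dfs(u):
        visited.add(u); on_stack.add(u)
        for v in list(graph.get(u, [])):
            if v in on_stack:              # back edge -> closes a loop
                graph[u].remove(v)
                removed.append((u, v))
            elif v not in visited:
                dfs(v)
        on_stack.discard(u)
    dfs(start)
    return removed

fsm = {"INIT": ["AUTH"], "AUTH": ["SESSION", "INIT"],
       "SESSION": ["AUTH", "CLOSE"], "CLOSE": []}
print(remove_back_edges(fsm, "INIT"))   # [('SESSION', 'AUTH'), ('AUTH', 'INIT')]
print(fsm)                              # remaining edges form a DAG
```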
Code Readability Assessment Method Based on Multidimensional Features and Hybrid Neural Networks
MI Qing, GUO Li-min, CHEN Jun-cheng
Computer Science. 2021, 48 (12): 94-99.  doi:10.11896/jsjkx.200800193
Quantitative and accurate assessment of code readability is an important way to ensure software quality, reduce communication and maintenance costs, and improve the efficiency of software development and evolution. However, existing code readability studies rely mainly on manual feature engineering, whose model performance is likely to be limited by factors such as code representation strategies and technical means. Unlike prior studies, we propose a novel code readability assessment method based on multidimensional features and hybrid neural networks using deep learning. Specifically, we first propose a representation strategy with different granularity levels to transform source code into matrices and vectors as input to deep neural networks. We then build a CNN-BiGRU hybrid neural network that automatically learns structural and semantic features from the source code. The experimental results show that our method achieves an accuracy of 84.6%, which is 9.2% higher than CNN alone and 6.5% higher than BiGRU alone. Moreover, our method outperforms five state-of-the-art code readability models, which confirms the feasibility and effectiveness of the proposed multidimensional features and hybrid neural networks.
Context-aware Based API Personalized Recommendation
CHEN Chen, ZHOU Yu, WANG Yong-chao, HUANG Zhi-qiu
Computer Science. 2021, 48 (12): 100-106.  doi:10.11896/jsjkx.201000127
In the process of software development, developers often search for appropriate APIs to complete programming tasks when they encounter difficulties. Contextual information and developer portraits play a critical role in effective API recommendation, but they are largely overlooked. This paper proposes a novel context-aware API personalized recommendation approach. The approach leverages program static analysis (abstract syntax trees) to parse code files, extract information to build a code base, and model developers' API usage preferences. It then calculates the semantic similarity between the developer's current query and the queries in the historical code base and retrieves the top-k most similar historical queries. Finally, it uses the query, the method names, the context and the developer's API usage preferences to re-rank the candidate APIs and recommend them to the developer. MRR, MAP, Hit and NDCG are used to verify the effectiveness of the method at different stages of simulated programming. The experimental results show that the proposed approach outperforms the baseline method and is more likely to recommend the APIs that developers actually want.
Program Complexity Analysis Method Combining Evolutionary Algorithm with Symbolic Execution
ZHOU Sheng-yi, ZENG Hong-wei
Computer Science. 2021, 48 (12): 107-116.  doi:10.11896/jsjkx.210200052
The worst-case execution path is an important indicator of program complexity and helps to discover possible complexity vulnerabilities in a system. In recent years, the application of symbolic execution to program complexity analysis has made great progress; however, existing methods suffer from poor generality and long analysis times. This paper proposes EvoWca, an evolutionary algorithm for worst-case path detection. Its core idea is to use the known worst-case path characteristics of a program at smaller input sizes to guide the construction of the initial path set at larger input sizes; the evolutionary algorithm then iteratively combines, mutates and selects paths, so that the worst path detected within the search range approximates the path corresponding to the worst-case time complexity. Based on this algorithm, a prototype tool for program complexity analysis, EvoWca2j, is implemented. The tool and existing techniques are used to explore the worst-case paths of a group of Java programs and evaluate their execution efficiency. The experimental results show that, compared with existing methods, EvoWca2j's generality and exploration efficiency are significantly improved.
Automatic Code Comments Generation Method Based on Convolutional Neural Network
PENG Bin, LI Zheng, LIU Yong, WU Yong-hao
Computer Science. 2021, 48 (12): 117-124.  doi:10.11896/jsjkx.201100090
Automatic code comment generation technology analyzes the semantic information of source code and generates corresponding natural language descriptions, helping developers understand programs and reducing the time cost of software maintenance. Most existing techniques are based on encoder-decoder models built on recurrent neural networks (RNN). However, this approach suffers from the long-term dependency problem, which means it cannot generate high-quality comments when analyzing distant code blocks. To solve this problem, this paper proposes an automatic code comment generation method that uses convolutional neural networks (CNN) to alleviate the inaccurate comment information caused by the long-term dependency problem. More specifically, this paper uses two CNNs, one based on the source code and one based on the AST, to capture the semantic information of the source code. The experimental results indicate that, compared with the two most recent methods, DeepCom and Hybrid-DeepCom, the proposed method generates more useful code comments and takes less time to execute.
SCADE Model Checking Based on Program Transformation
RAN Dan, CHEN Zhe, SUN Yi, YANG Zhi-bin
Computer Science. 2021, 48 (12): 125-130.  doi:10.11896/jsjkx.201100080
The SCADE synchronous language is a common programming language for embedded systems. It is often used to implement real-time embedded automatic control systems in the research of equipment for aviation, aerospace, transportation and other safety-critical fields. The SCADE synchronous language is derived from the Lustre language, adding more language constructs to simplify source code. However, compared with Lustre, the development of model checking technology for the SCADE synchronous language lags behind. In this paper, we introduce a method and a tool suite for model checking SCADE programs. The core idea of the method is program transformation: the SCADE program is transformed into an equivalent Lustre program through lexical analysis, syntax analysis, abstract syntax tree generation and simplification, and JKind is then used to complete the model checking. Moreover, we demonstrate the correctness of the suite's model checking results through theoretical deduction and a large number of experiments. Experimental results show that SCADE and Lustre programs with the same functionality produce the same model checking results, but the checking efficiency for SCADE is lower.
Noise Tolerable Feature Selection Method for Software Defect Prediction
TENG Jun-yuan, GAO Meng, ZHENG Xiao-meng, JIANG Yun-song
Computer Science. 2021, 48 (12): 131-139.  doi:10.11896/jsjkx.201000168
Software defect prediction identifies defective modules in advance by mining defect datasets, helping testers perform more targeted testing. However, the label noise ubiquitous in these datasets degrades the performance of prediction models, and few feature selection methods have been specifically designed for noise tolerance. In addition, the strategy selection in mainstream noise-tolerant feature selection frameworks can only be performed manually based on human experience, which is difficult to apply in software engineering. In view of this, this paper proposes a novel method, NTFES (noise tolerable feature selection). NTFES first generates multiple Bootstrap samples by Bootstrap sampling. It then divides the original features into different groups on the Bootstrap samples using approximate Markov blankets and selects candidate features from each group based on two heuristic feature selection strategies. Subsequently, it uses a genetic algorithm (GA) to search for the optimal feature subset in the candidate feature space. To verify the effectiveness of the proposed method, this paper uses the NASA MDP datasets and injects label noise to simulate noisy datasets, then compares NTFES with classical baseline methods such as FULL, FCBF and CFS while controlling the label noise ratio. The experimental results show that the proposed method achieves higher classification performance and better noise tolerance when the label noise ratio is within an acceptable range.
Code Search Engine for Bug Localization
CHANG Jian-ming, BO Li-li, SUN Xiao-bing
Computer Science. 2021, 48 (12): 140-148.  doi:10.11896/jsjkx.201100209
With the evolution and increasing complexity of software projects, bug fixing is becoming more difficult, and developers need to spend a lot of time on bug localization and fixing. To address this problem, this paper builds a bug-code database by integrating bug reports with the corresponding evolution information and analyzing the relationships between code blocks and bug reports. A code search engine for bug localization is then constructed on top of the bug-code database, which recommends more comprehensive information about similar bug reports, bug-related code files, and code blocks. The experimental results show that the proposed approach localizes buggy files more accurately, and the localization can effectively reach the code block level.
Identification of Key Classes in Software Systems Based on Graph Neural Networks
ZHANG Jian-xiong, SONG Kun, HE Peng, LI Bing
Computer Science. 2021, 48 (12): 149-158.  doi:10.11896/jsjkx.210100200
In the topology of a software system there are usually some key classes occupying core positions, and defects in these classes bring great security risks to the system. Identifying these key classes is therefore very important for engineers to understand or maintain an unfamiliar software system. To this end, the paper proposes a novel method for identifying key classes based on graph neural networks. Specifically, the software system is abstracted as a software network using complex network theory; then, combining unsupervised network embedding learning with a neighborhood aggregation scheme, an encoder-decoder framework is constructed to extract representation vectors of the class nodes in the software system. Finally, based on the obtained node representations, a pairwise learning-to-rank algorithm is adopted to rank the nodes by importance, thereby identifying the key classes of the software system. To verify the effectiveness of the method, an empirical analysis of four object-oriented Java open-source projects is conducted, comparing it with five commonly used node importance measures and two existing approaches. The experimental results show that, compared with node centrality, K-core and PageRank, the proposed method is more effective in identifying key classes from the perspective of network robustness. In addition, on the existing public labeled dataset, the recall and precision of the proposed method are better for the top 15% of nodes, with improvements of more than 10%.
Model-based Fault Tree Automatic Generation Method
ZHAN Wan-li, HU Jun, GU Qing-fan, RONG Hao, QI Jian, DONG Yan-hong
Computer Science. 2021, 48 (12): 159-169.  doi:10.11896/jsjkx.200800177
Model-based safety analysis methods can improve the modeling and analysis capabilities for today's complex safety-critical systems. At present, fault trees are widely used in system safety analysis and reliability analysis. Fault tree analysis (FTA) is a top-down deductive failure analysis method that analyzes undesired states of the system according to the fault tree, so that potential problems of the current system model can be identified as early as possible in systems engineering and avoided in time. The work in this paper targets AltaRica, a system safety modeling language used in the aerospace field. Based on its semantic model GTS (guarded transition systems), a method is designed for automatically constructing a system fault tree from the flattened GTS model, which saves the time of manual fault tree construction and speeds up system analysis. Following the semantic rules of the AltaRica 3.0 language, the method extracts data from the flattened GTS model to construct instance objects, designs a GTS model partitioning algorithm to obtain a set of independent GTS models and an independent assertion, builds the reachability graph of each independent GTS through an adjacency matrix and obtains the key event sequences, then combines the processed independent GTSs with the independent assertion, derives the state of the whole system and the key event sequences through an assertion propagation algorithm, and generates the system fault tree. Finally, an example system shows that the algorithm can effectively complete the automatic generation of fault trees from the flattened GTS model.
Process Variants Merging Method Based on Group-fair Control Flow Structure
WANG Wu-song, FANG Huan, ZHENG Xue-wen
Computer Science. 2021, 48 (12): 170-180.  doi:10.11896/jsjkx.201100157
Merging process variant models makes it possible to quickly construct a single process model that meets a new demand, so how to merge process variant models is of great practical value. Therefore, a process variant merging method based on group-fair control flow structures is proposed. Firstly, the process variants are segmented into individual variant fragments using the group fairness of Petri nets. Then, the control flow paths of the variant fragments are extracted, their corresponding matrix representations are constructed, and the variants are merged into a single process model. Finally, it is proved that the merged process model captures all the behaviors of the input process models, and that unexpected behaviors of the merged model, compared with the original input models, can be detected.
Database & Big Data & Data Science
Multi-space Interactive Collaborative Filtering Recommendation
LI Kang-lin, GU Tian-long, BIN Chen-zhong
Computer Science. 2021, 48 (12): 181-187.  doi:10.11896/jsjkx.201100031
In the era of big data, information overload makes it difficult for users to find interesting content in massive data, and the birth of personalized recommendation systems has greatly alleviated this problem. Collaborative filtering has been widely used in personalized recommendation, but due to the limitations of the model, its recommendation effect has not been further improved. Most existing collaborative filtering approaches introduce representation learning methods to obtain better user and item representation vectors, or improve the user-item matching function to boost recommendation performance, but such work extracts user-item interaction information from only a single interaction space. This paper proposes a multi-space interactive collaborative filtering recommendation algorithm, which maps user vectors and item vectors into multiple spaces, performs user-item interaction from multiple angles, and then uses a two-layer attention mechanism to aggregate the final user and item representation vectors for score prediction. The multi-space interactive collaborative filtering (MSICF) algorithm is compared with baselines on publicly available real-world datasets, and its evaluation results are better than the baselines.
Attribute Network Representation Learning Based on Global Attention
XU Ying-kun, MA Fang-nan, YANG Xu-hua, YE Lei
Computer Science. 2021, 48 (12): 188-194.  doi:10.11896/jsjkx.210100203
An attributed network not only has a complex topology, but its nodes also contain rich attribute information. Attributed network representation learning methods extract both the network topology and the node attribute information to learn low-dimensional vector embeddings of large attributed networks, and have very important and extensive applications in network analysis tasks such as node classification, link prediction and community detection. In this paper, we first obtain the structure embedding vectors according to the topology of the attributed network. We then learn the attribute information of adjacent nodes through a global attention mechanism: a convolutional neural network convolves the attribute information of each node to obtain hidden vectors, the weight vector and correlation matrix of the global attention are generated from these hidden vectors, and the attribute embedding vectors of the nodes are obtained. Finally, the structure embedding vector and the attribute embedding vector are concatenated to obtain a joint embedding vector that reflects both the network structure and the attributes. On three real datasets, the new algorithm is compared with eight current network embedding models on tasks such as link prediction and node classification. Experimental results show that the new algorithm achieves good attributed network embedding performance.
Travel Demand Forecasting Based on 3D Convolution and LSTM Encoder-Decoder
TENG Jian, TENG Fei, LI Tian-rui
Computer Science. 2021, 48 (12): 195-203.  doi:10.11896/jsjkx.210400022
Reliable regional travel demand forecasting can provide reasonable and effective suggestions for the scheduling and planning of traffic resources. However, travel forecasting is a very challenging problem that involves modeling massive spatio-temporal data, and how to effectively extract the spatial and temporal features of the data has become a research hotspot in urban computing. This paper proposes a demand forecasting model based on 3D convolution and an encoder-decoder attention mechanism (3D-EDADF for short), which simultaneously predicts the inflow and outflow travel demand of urban areas. The 3D-EDADF model first uses 3D convolution to extract the spatio-temporal correlations of the data, then uses an LSTM encoder-decoder to capture temporal dependence, and combines an attention mechanism to describe the difference between inflows and outflows. The model performs hybrid modeling of three time-dependent features, namely closeness, daily dependency and periodic dependency, and then weights and fuses these multi-dimensional features to obtain the final prediction. Experiments are carried out on real travel demand datasets. The results show that, compared with baseline models, the 3D-EDADF model has the lowest overall prediction error and better prediction performance.
Attributed Network Embedding Based on Matrix Factorization and Community Detection
XU Xin-li, XIAO Yun-yue, LONG Hai-xia, YANG Xu-hua, MAO Jian-fei
Computer Science. 2021, 48 (12): 204-211.  doi:10.11896/jsjkx.210300060
An attributed network contains not only a complex topological structure but also nodes with rich attribute information, and can model modern information systems more effectively than traditional networks. Community detection in attributed networks has important research value for the hierarchical analysis of complex systems, the control of information propagation in networks, and the prediction of the group behavior of network users. In order to make better use of topology information and attribute information for community discovery, an attributed network embedding based on matrix factorization and community detection (CDEMF) method is proposed. First, a matrix-factorization-based attributed network embedding method jointly models the attribute proximity and the similarity of adjacent nodes computed from the local link information of the network; the low-dimensional embedding vector of each node is obtained by a distributed matrix factorization algorithm, so that the network nodes are mapped to a collection of data points represented by low-dimensional vectors. Then, a community detection method based on curvature and modularity clusters the data points to obtain the community division of the attributed network, automatically determining the number of communities contained in the point set. CDEMF is compared with eight other well-known approaches on public real-world network datasets, and the experimental results demonstrate its effectiveness and superiority.
Deep Network Representation Learning Method on Incomplete Information Networks
FU Kun, ZHAO Xiao-meng, FU Zi-tong, GAO Jin-hui, MA Hao-ran
Computer Science. 2021, 48 (12): 212-218.  doi:10.11896/jsjkx.201000015
The goal of network representation learning (NRL) is to embed network nodes into a low-dimensional vector space to provide effective feature representations for downstream tasks. Due to the difficulty of information collection in real-world scenarios, large-scale networks often suffer from missing links between nodes. However, most existing NRL models are designed on the assumption of complete information networks, which leads to poor robustness on incomplete networks. To solve this problem, a deep network representation learning (DNRL) method for incomplete information networks is proposed. Firstly, a transfer probability matrix is used to dynamically mix structural information and attribute information, compensating for the loss caused by incomplete structural information. Then, a variational autoencoder, a deep generative model with powerful feature extraction capability, is used to learn the low-dimensional representations of nodes and capture their potential highly nonlinear features. Compared with commonly used network representation learning methods, experimental results on three real attributed networks show that the proposed model obviously improves node classification under different degrees of missing links, and visualization results clearly demonstrate the cluster relationships among nodes.
Microblog User Interest Recognition Based on Multi-granularity Text Feature Representation
YU You-qin, LI Bi-cheng
Computer Science. 2021, 48 (12): 219-225.  doi:10.11896/jsjkx.201100128
Discovering microblog users' interests is of great significance for personalized recommendation in social networks and for guiding correct information dissemination. We propose a microblog user interest recognition method based on multi-granularity text feature representation. First, text vectors are constructed for microblog users from three aspects: the topic layer, the word-order layer and the vocabulary layer. LDA is used to extract topic features of the content, an LSTM learns the semantic features of sentences, and the open-source word vectors of Tencent AI Lab are introduced to obtain the semantic features of words. The multi-granularity text feature representation matrix formed from these three feature vectors is then fed into a CNN for text classification training, and the interest recognition of microblog users is finally completed through a multi-terminal output layer. Experimental results show that the precision, recall and F1 value of the multi-granularity feature representation model improve by 8%, 12% and 13%, respectively. By carefully considering both coarse- and fine-grained semantics of the text as well as word granularity, and combining them with a neural network classification algorithm, the multi-granularity feature representation model achieves better evaluation metrics than single-granularity feature representation models.
Complex Network Link Prediction Method Based on Topology Similarity and XGBoost
GONG Zhui-fei, WEI Chuan-jia
Computer Science. 2021, 48 (12): 226-230.  doi:10.11896/jsjkx.200800026
In order to improve the performance of link prediction in complex networks, topological similarity and the XGBoost algorithm are used to perform link prediction. According to the topological structure of the complex network, the adjacency matrix is built and the common neighbor sets are obtained, and the similarity score functions of the complex network are then calculated according to topological similarity theory. The score functions and weight parameters of each time window are taken as input, and the XGBoost algorithm is used to perform link prediction for the complex network. By varying the two regularization coefficients of the XGBoost algorithm, their influence on link prediction accuracy is tested and the optimal regularization coefficients are obtained, yielding a stable XGBoost link prediction model. The experimental results show that, compared with common network link prediction algorithms, the prediction accuracy based on topological similarity and XGBoost has obvious advantages, and its prediction time is shorter than that of other algorithms, making it especially suitable for link prediction in large-scale complex networks.
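As a rough sketch of this pipeline, the Python snippet below feeds topological similarity scores into an XGBoost classifier and exposes the two regularization coefficients discussed above. It assumes the xgboost package is available; the feature columns and labels are synthetic placeholders, not the paper's data.

```python
# Minimal sketch: similarity scores as features, XGBoost as the link classifier.
# Assumes the `xgboost` package; the toy data below are invented.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(42)
# Each row: [common-neighbour count, Jaccard score, preferential-attachment score]
X = rng.random((200, 3))
y = (0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.2 * X[:, 2]
     + 0.1 * rng.standard_normal(200) > 0.5).astype(int)

model = xgb.XGBClassifier(
    n_estimators=100, max_depth=3,
    reg_alpha=0.1, reg_lambda=1.0,   # the two regularization coefficients to tune
)
model.fit(X[:150], y[:150])
print("link probability of first held-out pair:", model.predict_proba(X[150:])[0, 1])
```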
Computer Graphics & Multimedia
Survey of Intelligent Rain Removal Algorithms for Cloud-IoT Systems
ZHANG Yu-long, WANG Qiang, CHEN Ming-kang, SUN Jing-tao
Computer Science. 2021, 48 (12): 231-242.  doi:10.11896/jsjkx.201000055
According to the “White Paper on China's Intelligent Internet of Things (AIoT) 2020”, with the rapid development of China's 5G networks, the fast popularization of high-capacity, low-cost IoT sensor devices and the explosive growth of data, image processing is widely used in various IoT fields such as smart cities, smart transportation and smart healthcare. In these research areas, researchers usually ignore practical problems in the data collection process, for instance data degradation caused by temporal changes (seasonal shifts, diurnal variation, weather changes) and noise caused by spatial changes (object superposition, blur, and partial occlusion). Among these problems, the weather problems typified by rainy conditions are the most challenging and common. Therefore, this paper systematically surveys the practical problems in the data collection process mentioned above, and classifies and summarizes image rain removal algorithms for complex weather conditions. At the same time, since such algorithms are compute-intensive, we use Amazon EC2 G4 and P3 cloud instances to quantitatively evaluate the processing time and effect of the reviewed rain removal algorithms. Finally, we discuss the characteristics of the various rain removal algorithms and the latest trends in Cloud-IoT applications.
Natural Scene Text Detection Algorithm Combining Multi-granularity Feature Fusion
CHEN Zhuo, WANG Guo-yin, LIU Qun
Computer Science. 2021, 48 (12): 243-248.  doi:10.11896/jsjkx.201000154
In natural scenes, text usually exhibits diverse and complex characteristics. Traditional natural scene text detection methods rely on manually designed features and therefore lack robustness, while existing deep learning based text detection methods lose important feature information when extracting features in each layer of the network. This paper proposes a natural scene text detection method combined with multi-granularity feature fusion. The main contribution of this method is that, by combining features of different granularities in the general feature extraction network and adding a residual channel attention mechanism, the model can fully learn the feature information of different granularities in the image while paying more attention to the target feature information and suppressing useless information, which improves the robustness and accuracy of the model. The experimental results show that, compared with other recent methods, the model achieves 85.3% accuracy and an F-value of 82.53% on public datasets, demonstrating better performance.
Distortion Correction Algorithm for Complex Document Image Based on Multi-level Text Detection
KOU Xi-chao, ZHANG Hong-rui, FENG Jie, ZHENG Ya-yu
Computer Science. 2021, 48 (12): 249-255.  doi:10.11896/jsjkx.200700072
Document distortion correction is a basic step of document OCR (optical character recognition) and plays an important role in improving OCR accuracy. Document image distortion correction often depends on text extraction; however, most current document image correction algorithms cannot accurately locate and analyze the text in complex documents, so the correction effects are unsatisfactory. To address this problem, a text detection framework based on a fully convolutional network is proposed, and synthetic documents are used to train the network to accurately obtain three levels of text information: characters, words, and text lines. Adaptive sampling of the text and three-dimensional modeling of the page using a cubic function transform the correction problem into a model parameter optimization problem, so as to correct complex document images. Correction experiments on synthetic distorted documents and real test data show that the proposed method can accurately extract text from complex documents and significantly improve the visual quality of corrected complex document images. Compared with other algorithms, the OCR accuracy after correction increases significantly.
Detection Method of High Beam in Night Driving Vehicle
GONG Hang, LIU Pei-shun
Computer Science. 2021, 48 (12): 256-263.  doi:10.11896/jsjkx.200700026
Curbing the illegal use of high beams can reduce night-time traffic accidents. However, at present there is no efficient method for detecting high beams at night, so the relevant traffic regulations cannot be effectively enforced. To solve this problem, this paper proposes an algorithm for detecting high beams at night. Based on YOLOv3, the algorithm optimizes the YOLOv3 network structure to accelerate its operation, uses standard residual components and dilated convolution to enhance the feature expression ability of the network, and improves the YOLOv3 loss function to optimize the contribution of small-scale targets to the coordinate loss, which enhances the detection of small-scale targets. Finally, the YOLOv3 prior-box clustering algorithm and the number of prior boxes are optimized to improve the expressiveness and detection speed of the model. The experimental results show that the mean average precision (mAP) of the proposed algorithm is 99.09%, which is 30% higher than that of YOLOv3. The algorithm satisfies practical requirements and can detect violations effectively.
Object Detection Based on Neighbour Feature Fusion
LI Ya-ze, LIU Hong-zhe
Computer Science. 2021, 48 (12): 264-268.  doi:10.11896/jsjkx.201200196
With the development of intelligent driving, the precision requirements for object detection are getting higher and higher, especially for small, distant targets. In the neck of a two-stage object detection network, bottom-up fusion of semantic and location information is effective for large targets, but it causes a large information loss for small targets. To address this problem, we propose neighbor feature pyramid networks (NFPN), a feature fusion method over neighboring layers, a Double RoI (Region of Interest) method that fuses FPN and NFPN features, and a recursive feature pyramid (RFP) method. Using Faster RCNN 50 as the baseline, the mean average precision (mAP) of our model on the Lisa dataset increases by 2.6% when NFPN, Double RoI and RFP are used. On the VOC2007 dataset, training with the VOC07+12 train set and testing on the VOC2007 test set, with Faster RCNN 101 as the baseline, the mAP of our model with NFPN, Double RoI and RFP increases by 6%, and the detection accuracy for large, medium and small targets improves at the same time.
Person Re-identification by Region Correlated Deep Feature Learning with Multiple Granularities
DONG Hu-sheng, ZHONG Shan, YANG Yuan-feng, SUN Xun, GONG Sheng-rong
Computer Science. 2021, 48 (12): 269-277.  doi:10.11896/jsjkx.210400121
Extracting both global and local features from pedestrian images has become the mainstream in person re-identification. However, in most current deep learning based person re-identification models, the relations between adjacent body parts are seldom taken into consideration when extracting local features, which may weaken the ability to distinguish different persons who share similar local-region attributes. To address this problem, a novel method is proposed to learn region-correlated deep features for person re-identification. In our model, the output feature map of the backbone network is first partitioned with multiple granularities, and structure-preserving local features are then learned via a newly designed Region Correlated Network (RCNet) module. RCNet makes full use of the structure preservation of average pooling and the performance advantage of max pooling, endowing local features with rich structural information. By jointly processing the current feature and the local features from other regions, the features become strongly related to each other through spatial correlation, and their discriminability is significantly enhanced. For better optimization of the whole network, the shortcut connection of deep residual networks is also employed in the architecture of RCNet. Finally, re-identification is conducted with both the global features and the local features incorporating structural information. Experimental results on the public Market-1501, CUHK03 and DukeMTMC-reID datasets show that the proposed method achieves higher matching accuracy than existing approaches, demonstrating favorable re-identification performance.
Artificial Intelligence
Survey on Retrieval-based Chatbots
WU Yu, LI Zhou-jun
Computer Science. 2021, 48 (12): 278-285.  doi:10.11896/jsjkx.210900250
With the rapid progress of natural language processing techniques and the massive conversational data accessible on the Internet, non-task-oriented dialogue systems, also referred to as chatbots, have achieved great success and drawn attention from both academia and industry. Currently, there are two lines of research on chatbots: retrieval-based chatbots and generation-based chatbots. Owing to their fluent responses and low latency, retrieval-based chatbots are the common choice in practice. This paper first briefly introduces the research background, basic structure and component modules of retrieval-based chatbots, and then describes the task setting of the response selection module and the related datasets in detail. Subsequently, we summarize recent popular techniques for the response selection problem, including statistical methods, representation-based neural network methods, interaction-based neural network methods, and pre-training-based methods. Finally, we present the remaining challenges of chatbots and outline promising directions for future work.
Review on Interactive Question Answering Techniques Based on Deep Learning
HUANG Xin, LEI Gang, CAO Yuan-long, LU Ming-ming
Computer Science. 2021, 48 (12): 286-296.  doi:10.11896/jsjkx.210100209
Compared with traditional question answering (QA), interactive question answering (IQA) considers the dialogue context and background information, which brings new challenges to understanding user input and reasoning about answers. First of all, user input is not limited to questions; it can also be utterances that supply details of the question or give feedback on whether an answer is feasible, so the intent of each utterance in the dialogue must be understood. Secondly, IQA allows multiple characters to discuss a question at the same time and generates personalized answers, so it is necessary to understand the different characters and distinguish them from each other. Thirdly, when IQA revolves around a background document, that document must be understood and answers extracted from it. This paper reviews recent developments in three subareas, IQA without background, IQA with background, and the application of transfer learning to IQA, and finally discusses future perspectives of interactive question answering.
Proximal Policy Optimization Based on Self-directed Action Selection
SHEN Yi, LIU Quan
Computer Science. 2021, 48 (12): 297-303.  doi:10.11896/jsjkx.201000163
Abstract PDF(2499KB) ( 1199 )   
References | Related Articles | Metrics
Monotonic policy improvement algorithms are a current research hotspot in reinforcement learning and have achieved good performance in both discrete and continuous control tasks. Proximal policy optimization (PPO) is a classic monotonic policy improvement algorithm, but it is an on-policy algorithm with low sample utilization. To solve this problem, an algorithm named proximal policy optimization based on self-directed action selection (SDAS-PPO) is proposed. SDAS-PPO not only reuses sample experience according to importance sampling weights, but also adds a synchronously updated experience pool to store its own excellent sample experience, and uses a self-directed network learned from the experience pool to guide the choice of actions. SDAS-PPO greatly improves the sample utilization rate and ensures that the agent can learn quickly and effectively when training the network model. To verify the effectiveness of SDAS-PPO, it is compared with the TRPO, PPO and PPO-AMBER algorithms on the continuous control tasks of the MuJoCo simulation platform. Experimental results show that the proposed method performs better in most environments.
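As background for the clipped surrogate objective that PPO-style methods build on, the following is a minimal PyTorch sketch; it is not the paper's SDAS-PPO implementation, and the function name, tensor arguments and clip value are assumptions.

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimized)."""
    # Importance sampling ratio between the current policy and the behavior policy.
    ratio = torch.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Pessimistic bound: element-wise minimum, negated for gradient descent.
    return -torch.min(unclipped, clipped).mean()
```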
Three-dimensional Path Planning of UAV Based on Improved Whale Optimization Algorithm
GUO Qi-cheng, DU Xiao-yu, ZHANG Yan-yu, ZHOU Yi
Computer Science. 2021, 48 (12): 304-311.  doi:10.11896/jsjkx.201000021
Abstract PDF(2732KB) ( 922 )   
References | Related Articles | Metrics
The three-dimensional path planning of UAVs is a relatively complex global optimization problem whose goal is to obtain the optimal or near-optimal flight path while considering threats and constraints. In three-dimensional UAV trajectory planning, the whale optimization algorithm easily falls into local optima, converges slowly and has limited convergence accuracy. To address these problems, a whale optimization algorithm based on Lévy flight (LWOA) is proposed for the UAV three-dimensional path planning problem. In the iterative process of the algorithm, Lévy flight is added to randomly perturb the optimal solution, and an information exchange mechanism is introduced to update each individual's position through the current global optimal solution, the individual's memorized optimal solution and the neighborhood optimal solution, better balancing local exploitation and global exploration. The simulation results show that the proposed path planning algorithm can effectively avoid threat zones, converges faster and more accurately, and is less likely to fall into local optima. With 300 iterations and a population size of 50, the cost function value obtained by the LWOA algorithm is 91.1% of that of the PSO algorithm, 92.1% of that of the GWO algorithm and 95.9% of that of the WOA algorithm, giving a smaller path cost.
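The Lévy-flight perturbation mentioned above is commonly generated with Mantegna's algorithm; the sketch below shows one way to do this in Python. It is not the paper's LWOA code, and the step-size constant and the number of waypoints are assumptions.

```python
import numpy as np
from math import gamma

def levy_step(dim, beta=1.5, rng=None):
    """One Lévy-flight step (Mantegna's algorithm) for perturbing a candidate solution."""
    rng = rng or np.random.default_rng()
    sigma = (gamma(1 + beta) * np.sin(np.pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / beta)

# Example: perturb the current best path (flattened 3-D waypoint coordinates).
best = np.zeros(30)                      # hypothetical 10 waypoints x 3 coordinates
candidate = best + 0.01 * levy_step(30)  # 0.01 is an assumed step-size constant
```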
EEG Emotion Recognition Based on Frequency and Channel Convolutional Attention
CHAI Bing, LI Dong-dong, WANG Zhe, GAO Da-qi
Computer Science. 2021, 48 (12): 312-318.  doi:10.11896/jsjkx.201000141
Abstract PDF(2099KB) ( 921 )   
References | Related Articles | Metrics
Existing emotion recognition research generally uses neural networks and attention mechanisms to learn emotional features, which yields relatively limited feature representations. Moreover, neuroscience studies have shown that EEG signals of different frequencies and channels respond differently to emotion. Therefore, this paper proposes a method that fuses frequency and electrode-channel convolutional attention for EEG emotion recognition. Specifically, the EEG signals are first decomposed into different frequency bands and the corresponding frame-level features are extracted. Then a pre-activated residual network is employed to learn deep emotion-relevant features. At the same time, a frequency and electrode-channel convolutional attention module is integrated into each pre-activated residual unit of the residual network to model the frequency and channel information of the EEG signals, thus generating the final representation of the EEG features. Experiments on the DEAP and DREAMER datasets show that, compared with a single-layer attention mechanism, the proposed method helps to emphasize emotion-salient information in EEG signals and achieves better recognition performance.
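To make the idea of re-weighting electrode channels concrete, the following is a minimal squeeze-and-excitation style channel attention sketch in PyTorch; it is not the paper's frequency/channel convolutional attention module, and the reduction ratio, channel count and frame length are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style attention over the electrode-channel axis."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (batch, channels, time)
        weights = self.fc(x.mean(dim=-1))       # squeeze the temporal axis
        return x * weights.unsqueeze(-1)        # re-weight each channel

# Example: 32 EEG channels, 128 time steps per frame.
att = ChannelAttention(channels=32)
out = att(torch.randn(8, 32, 128))
```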
Aspect Extraction of Case Microblog Based on Double Embedded Convolutional Neural Network
WANG Xiao-han, TAN Chen-chen, XIANG Yan, YU Zheng-tao
Computer Science. 2021, 48 (12): 319-323.  doi:10.11896/jsjkx.201100105
Abstract PDF(1416KB) ( 575 )   
References | Related Articles | Metrics
Aspect extraction from microblogs involved in legal cases is a domain-specific task: the expressions of aspect words are diverse and their meanings differ from those in the general domain, so these aspect words cannot be well represented by general-domain word embeddings alone. This paper proposes a method for extracting aspect words from case-related microblogs using both domain word embeddings and generic word embeddings. Firstly, all the microblogs involved in the case are used to pre-train an embedding layer that captures the characteristics of the involved domain. Secondly, the microblog comments are fed into the two embedding layers to obtain the representations of the aspect words in the different domains, which are then concatenated. Then, case-related features are extracted through the convolution layer. Finally, a classifier performs sequence labeling to extract the aspect words involved in the case. The experimental results show that the F1 value of the proposed method reaches 72.36% and 71.02% respectively on the #Chongqing bus falling into the river# and #Mercedes Benz female driver rights protection# datasets, which is better than the existing benchmark models and verifies the influence of word embeddings from different domains on aspect extraction from microblogs.
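A minimal sketch of the double-embedding idea (concatenate domain and generic embeddings, extract features with a convolution layer, then label each token) is given below in PyTorch. It is not the paper's exact architecture: the embedding dimensions, kernel size and the plain softmax tagger (many systems use a CRF instead) are assumptions.

```python
import torch
import torch.nn as nn

class DoubleEmbeddingTagger(nn.Module):
    """Concatenate domain and generic embeddings, then a Conv1d sequence tagger."""
    def __init__(self, vocab_size, dim=100, num_tags=3):    # e.g. BIO tags
        super().__init__()
        self.generic = nn.Embedding(vocab_size, dim)          # general-domain embeddings
        self.domain = nn.Embedding(vocab_size, dim)           # case-domain pre-trained embeddings
        self.conv = nn.Conv1d(2 * dim, 128, kernel_size=3, padding=1)
        self.classifier = nn.Linear(128, num_tags)

    def forward(self, token_ids):                             # (batch, seq_len)
        x = torch.cat([self.generic(token_ids), self.domain(token_ids)], dim=-1)
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        return self.classifier(h)                             # per-token tag scores

tagger = DoubleEmbeddingTagger(vocab_size=20000)
scores = tagger(torch.randint(0, 20000, (4, 50)))
```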
Multi-objective Optimization Method Based on Reinforcement Learning in Multi-domain SFC Deployment
WANG Ke, QU Hua, ZHAO Ji-hong
Computer Science. 2021, 48 (12): 324-330.  doi:10.11896/jsjkx.201100159
Abstract PDF(2046KB) ( 688 )   
References | Related Articles | Metrics
With the development of network virtualization technology, deploying service function chains (SFCs) in multi-domain networks brings new challenges to SFC optimization. Traditional deployment methods usually optimize a single objective; they are not suitable for multi-objective optimization and cannot measure and balance the weights among the optimization objectives. Therefore, in order to simultaneously optimize the delay, network load balancing and acceptance rate of large-scale SFC deployment requests, a data normalization scheme is proposed and a two-step SFC deployment algorithm based on reinforcement learning is designed. The algorithm takes transmission delay and load balancing as feedback parameters and balances the weight relationship between them, while the SFC acceptance rate is optimized within the reinforcement learning framework. The experimental results show that, under large-scale requests, the delay of the algorithm is reduced by 71.8% compared with the LASP method, the acceptance rate is increased by 4.6% compared with the MDSP method, and the average load balancing is improved by 39.1% compared with the GREEDY method, so the multi-objective optimization effect is guaranteed.
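One common way to combine heterogeneous feedback metrics such as delay and load balancing into a single reinforcement learning reward is to min-max normalize each metric first; the sketch below illustrates this. It is not the paper's reward definition, and the weights, metric ranges and function names are assumptions.

```python
import numpy as np

def normalize(x, lo, hi):
    """Min-max normalize a raw metric into [0, 1]."""
    return (x - lo) / (hi - lo + 1e-9)

def reward(delay, load_imbalance, bounds, w_delay=0.5, w_load=0.5):
    """Combine delay and load-balancing feedback into one scalar reward.
    Lower delay and lower imbalance are better, so the weighted sum is negated."""
    d = normalize(delay, *bounds["delay"])
    l = normalize(load_imbalance, *bounds["load"])
    return -(w_delay * d + w_load * l)

bounds = {"delay": (1.0, 50.0), "load": (0.0, 1.0)}   # assumed metric ranges
print(reward(12.0, 0.3, bounds))
```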
Abstractive Automatic Summarizing Model for Legal Judgment Documents
ZHOU Wei, WANG Zhao-yu, WEI Bin
Computer Science. 2021, 48 (12): 331-336.  doi:10.11896/jsjkx.210500028
Abstract PDF(1940KB) ( 1463 )   
References | Related Articles | Metrics
At present, automatic summarization models for Chinese legal judgment documents mainly adopt extractive methods. However, because legal texts are lengthy and weakly structured, the accuracy and reliability of extractive methods are insufficient for practical application. In order to obtain high-quality summaries of legal judgment documents, this paper proposes an abstractive automatic summarization model based on multi-model fusion. On top of a Seq2Seq model, we apply an attention mechanism and selective gates to better process the input data. Specifically, we combine BERT pre-training and a reinforcement learning policy to optimize our model. The corpus we built consists of 50 000 legal judgment documents concerning small claims procedure and summary procedure. Evaluations on this corpus demonstrate that the proposed model outperforms all of the baseline models, and its mean ROUGE score is 5.81% higher than that of the conventional Seq2Seq+Attention model.
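For readers unfamiliar with the Seq2Seq+Attention baseline mentioned above, the following is a minimal additive (Bahdanau-style) attention sketch in PyTorch; it is only a generic baseline component, not the paper's fused model, and the dimensions are assumptions.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Bahdanau-style attention as used in Seq2Seq+Attention baselines."""
    def __init__(self, enc_dim, dec_dim, att_dim=128):
        super().__init__()
        self.w_enc = nn.Linear(enc_dim, att_dim)
        self.w_dec = nn.Linear(dec_dim, att_dim)
        self.v = nn.Linear(att_dim, 1)

    def forward(self, enc_states, dec_state):
        # enc_states: (batch, src_len, enc_dim), dec_state: (batch, dec_dim)
        scores = self.v(torch.tanh(self.w_enc(enc_states) +
                                   self.w_dec(dec_state).unsqueeze(1))).squeeze(-1)
        weights = torch.softmax(scores, dim=-1)                 # attention distribution
        context = torch.bmm(weights.unsqueeze(1), enc_states).squeeze(1)
        return context, weights

att = AdditiveAttention(enc_dim=256, dec_dim=256)
ctx, w = att(torch.randn(2, 40, 256), torch.randn(2, 256))
```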
Hierarchical Learning on Unbalanced Data for Predicting Cause of Action
QU Hao, CUI Chao-ran, WANG Xiao-xiao, SU Ya-xi, HAN Xiao-hui, YIN Yi-long
Computer Science. 2021, 48 (12): 337-342.  doi:10.11896/jsjkx.201100212
Abstract PDF(1780KB) ( 649 )   
References | Related Articles | Metrics
The cause of action represents the nature of the legal relationships involved in a case. A scientific and rational choice of the cause of action facilitates the correct application of laws and enables the courts to manage cases by category. Cause-of-action prediction aims to endow computers with the ability to automatically predict the cause category from the textual case description. Because low-frequency categories have few samples and effective features are difficult to learn from them, previous methods usually filter out the samples of low-frequency categories during data preprocessing. However, in cause-of-action prediction, the key challenge is precisely how to make accurate predictions for cases of low-frequency cause categories. To solve this problem, this paper proposes a novel hierarchical learning method on unbalanced samples for predicting the cause of action. Firstly, all causes are divided into first-level and second-level causes according to their inherent hierarchical structure. Then, the tail second-level causes are merged into a new first-level category with sufficient samples, and hierarchical learning is applied to realize the prediction of the cause of action. Finally, we refine the loss function to alleviate the problem of data imbalance. Experimental results show that the proposed method is significantly superior to the baseline methods, leading to an improvement of 4.81% in terms of accuracy. We also verify the benefits of introducing hierarchical learning and of refining the loss function for unbalanced data.
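One common way to refine a classification loss for unbalanced data is to re-weight classes by their effective sample numbers; the PyTorch sketch below shows this approach. It is not necessarily the refinement used in the paper, and the beta value and class counts are assumptions.

```python
import torch
import torch.nn as nn

def class_balanced_weights(class_counts, beta=0.999):
    """Effective-number re-weighting: rarer causes receive larger loss weights."""
    counts = torch.tensor(class_counts, dtype=torch.float)
    effective = 1.0 - torch.pow(beta, counts)
    weights = (1.0 - beta) / effective
    return weights / weights.sum() * len(class_counts)      # normalize around 1

# Example: three cause categories with very unbalanced sample counts.
weights = class_balanced_weights([5000, 300, 20])
criterion = nn.CrossEntropyLoss(weight=weights)
loss = criterion(torch.randn(8, 3), torch.randint(0, 3, (8,)))
```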
Information Security
Identification Method of Voiceprint Identity Based on MFCC Features
WANG Xue-guang, ZHU Jun-wen, ZHANG Ai-xin
Computer Science. 2021, 48 (12): 343-348.  doi:10.11896/jsjkx.210100038
Abstract PDF(2614KB) ( 856 )   
References | Related Articles | Metrics
As a product of the development of modern forensic technology, the voiceprint plays an important role in modern audio-visual identification. Traditional voiceprint analysis relies on manual analysis with sound processing tools; given its shortcomings of strict text dependence and subjective comparison, its evidential power as an appraisal opinion needs to be strengthened. In this paper, an identification method based on Mel-frequency cepstral coefficients (MFCC) is proposed, which extracts and quantifies the envelope containing the original formants and their time-axis information as voiceprint features for identity comparison. This method mitigates a shortcoming of traditional MFCC extraction, which captures abrupt formant changes, and adds the transition characteristics of vowels and consonants to the voiceprint features to improve recognition accuracy. Experiments show that when the test material is independent of the sample text, the identification accuracy is 85% with a variance of about 9%. Therefore, the method provides good identifiability for same-person voiceprint identification; for non-same-person voiceprint identification, it proves far more accurate when combined with traditional manual analysis.
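As an illustration of extracting MFCC-based voiceprint features and comparing two recordings, the following is a minimal Python sketch using librosa. It is not the paper's improved method (which also models formant envelopes and vowel/consonant transitions); the file names, sampling rate and the simple mean-pooled cosine comparison are assumptions.

```python
import numpy as np
import librosa

def mfcc_profile(path, n_mfcc=20):
    """Load a recording and average its MFCC frames into one feature vector."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    return mfcc.mean(axis=1)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# Hypothetical file names; a higher score suggests the same speaker.
score = cosine_similarity(mfcc_profile("sample.wav"), mfcc_profile("evidence.wav"))
```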
Network Security Situation Based on Time Factor and Composite CNN Structure
ZHAO Dong-mei, SONG Hui-qian, ZHANG Hong-bin
Computer Science. 2021, 48 (12): 349-356.  doi:10.11896/jsjkx.210400227
Abstract PDF(2288KB) ( 600 )   
References | Related Articles | Metrics
In order to solve the problem that traditional network security situation awareness methods have low accuracy when network information is complex, this paper combines deep learning and proposes a network security situation assessment model based on a time factor and a composite CNN structure, which combines convolution decomposition and depthwise separable convolution techniques to form a four-layer serial composite optimal unit structure. The one-dimensional network data are transformed into a two-dimensional matrix and loaded into the neural network model as gray values, so as to give full play to the advantages of convolutional neural networks. In order to make full use of the temporal relationships among the data, a time factor is introduced to form fused data, so that the network learns the original data and the time-ordered fused data at the same time; this increases the feature extraction ability of the model, establishes a spatial mapping of the time-series data using the time factor and pointwise convolution, and improves the integrity of the model structure. Experimental results show that the accuracy of the proposed model on the two datasets is 92.89% and 92.60% respectively, which is 2%~6% higher than the random forest and LSTM algorithms.
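To clarify the depthwise separable convolution mentioned above (a per-channel spatial convolution followed by a 1x1 pointwise convolution), the following is a minimal PyTorch sketch; it is a generic building block rather than the paper's composite unit, and the channel counts and input size are assumptions.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: per-channel spatial conv + 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Example: a 28x28 gray-value matrix built from one-dimensional network records.
block = DepthwiseSeparableConv(in_ch=1, out_ch=16)
out = block(torch.randn(4, 1, 28, 28))
```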
Anomaly Detection Algorithm Based on SSC-BP Neural Network
SHI Lin-shan, MA Chuang, YANG Yun, JIN Min
Computer Science. 2021, 48 (12): 357-363.  doi:10.11896/jsjkx.201000086
Abstract PDF(2469KB) ( 748 )   
References | Related Articles | Metrics
In the Internet of Things environment, new network attacks are increasing in number and complexity, while traditional anomaly detection algorithms suffer from high false alarm rates, low detection rates and the computational difficulties caused by large amounts of data. To address these problems, this paper proposes an anomaly detection algorithm that combines subspace clustering (SSC) with a BP neural network. Firstly, different subspaces are obtained with the CLIQUE algorithm, the most commonly used subspace clustering algorithm. Secondly, BP neural network anomaly detection is carried out on the data in each subspace and the prediction error is calculated; by comparing it with a preset accuracy, the threshold is continuously updated and corrected, so as to improve the ability to identify network attacks. The simulation experiments use the NSL-KDD public dataset and a network attack dataset from an Internet of Things environment, with the NSL-KDD dataset divided into four single-attack subsets and one mixed-attack subset, and compare the proposed model with the K-means, DBSCAN, SSC-EA and K-KNN anomaly detection models. On the mixed-attack subset, the detection rate of the SSC-BP neural network model is 6% higher than that of the traditional K-means model and the false detection rate is reduced by 0.2%; on the four single-attack subsets, the SSC-BP neural network model detects the most attacks with the lowest false detection rate. In the Internet of Things environment, the SSC-BP neural network model is superior to the other models.
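To illustrate the general idea of detecting anomalies from a BP network's prediction error against a threshold, the following is a minimal scikit-learn sketch that trains a reconstruction network on normal traffic within one subspace and flags records with large errors. It is a simplified stand-in, not the paper's SSC-BP algorithm with its adaptive threshold update; the layer sizes, quantile and random data are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_detector(X_normal, threshold_quantile=0.99):
    """Train a BP network to reconstruct normal traffic and derive an error threshold."""
    model = MLPRegressor(hidden_layer_sizes=(32, 8, 32), max_iter=500, random_state=0)
    model.fit(X_normal, X_normal)                      # learn to reproduce normal records
    errors = np.mean((model.predict(X_normal) - X_normal) ** 2, axis=1)
    return model, np.quantile(errors, threshold_quantile)

def is_anomaly(model, threshold, X):
    errors = np.mean((model.predict(X) - X) ** 2, axis=1)
    return errors > threshold                          # large prediction error -> attack

X_normal = np.random.rand(500, 10)                     # stand-in for one subspace's records
model, thr = fit_detector(X_normal)
flags = is_anomaly(model, thr, np.random.rand(20, 10))
```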