Computer Science

Course Design and Redesign for Introduction to Data Science

CHAO Le-men

Computer Science. 2020, 47 (7): 1-7. doi:10.11896/jsjkx.200500088

Abstract

PDF(1296KB) ( 2269 )

References | Related Articles | Metrics

Introduction to Data Science is an intrinsic course for not only the development of emerging majors (Data Science and Big Data Technology,Big Data Management and Application,and so on),but also only the innovation of traditional ones (Compu-ter Science and Technology,Statistics,and Information Resource Management,and so on).Course design issues for this novel course,including its objectives,contents,experiments,assessment methods,reference books,personalized design are discussed based upon conducting an in-depth for typical courses offered by Columbia University,New York University,Harvard University and Renmin University of China as well as the author’s teaching experience.The redesign of exiting courses on introduction to Data Science should focus on improving the abilities of target students on full-stack data science,data product development,co-ding for Data Science,and communicating with non-professional users,as well as leveraging alternative course construction mo-dels,reflecting social needs,highlighting its roadmap roles for the curriculums.

Polynomial Time Algorithm for Hamilton Circuit Problem

JIANG Xin-wen

Computer Science. 2020, 47 (7): 8-20. doi:10.11896/jsjkx.191200176

Abstract

PDF(1760KB) ( 12529 )

References | Related Articles | Metrics

The ‘P vs.NP problem’ is the most famous and difficult problem in math.and computer science.Among the seven most important and difficult problems that the American Clay Mathematics Institute declared in 2000,this problem ranks the first one.Collecting 25 difficult problems that human being urgently want to solve,a list given by Science in 2005 contains the ‘P vs.NP problem’,ranking 19th.In the latest list of the most important 125 questions given by Science,the ‘P vs.NP problem’ ranks 19th too.Modern cryptography is based on the assumption that NP≠P.Whether NP=P or not depends on the complexity to solve a NPC problem.A new NP problem which is called MSP is proposed and a polynomial time algorithm to solve MSP is designed in this paper.To prove that the MSP problem is a NP complete problem,several papers,that reduced more than 10 NP complete problems to the MSP problem and reversely reduced the MSP problem to the SAT problem,were published in the last several years.Hence the result maybe helpful for proving NP=P.

Uncertain XML Model Based on Fuzzy Sets and Probability Distribution and Its Algebraic Operations

HU Lei, YAN Li

Computer Science. 2020, 47 (7): 21-30. doi:10.11896/jsjkx.190700164

Abstract

PDF(1635KB) ( 1442 )

References | Related Articles | Metrics

As a de-facto standard of information representation and exchange,XML has been widely used as a unified data exchange format between different applications,which has played an important role in real-world applications.However,the real world is filled with uncertain information and classical XML is not able to represent and deal with uncertain data.So it is necessary to extend classical XML model.The real world is complex,which often contains both random and fuzzy uncertainties.Conside-ring that probability theory and fuzzy set theory are powerful tools for dealing with uncertainty,this paper uses both probability theory and fuzzy set theory to build a new uncertain XML model,which is different from the existing fuzzy XML models and probabilistic XML models.The new uncertain XML model is compatible with existing XML models and can represent more complex uncertain information.

Study on Subnetwork Reliability of k-ary n-cubes

FENG Kai, LI Jing

Computer Science. 2020, 47 (7): 31-36. doi:10.11896/jsjkx.190700170

Abstract

PDF(1909KB) ( 1291 )

References | Related Articles | Metrics

The k-ary n-cube is one of the most attractive interconnection network topologies for parallel computing systems.In order to accurately measure the fault tolerance on subnetworks in a k-ary n-cube,the k-ary (n－1)-cube reliability in a k-ary n-cube under the probabilistic fault model is studied.When k is an odd integer and k≥3,a lower bound on the k-ary (n－1)-cube reliability in a k-ary n-cube under the probability fault model is derived by clarifying the intersections of k-ary (n－1)-cube subnetworks in a k-ary n-cube,and an approximate k-ary (n－1)-cube reliability result is obtained.The approximation result is shown to be close to the simulation result,and these two results are getting overlapped as the node reliability decreases.Moreover,an algorithm is given for searching fault-free k-ary (n－1)-cubes in a k-ary n-cube in the presence of node failures,and the experimental results demonstrate the effectiveness of the proposed algorithm.

Blind Quantum Computation over Noise Channels

LUO Wen-jun, LEI Shuang

Computer Science. 2020, 47 (7): 37-41. doi:10.11896/jsjkx.190600020

Abstract

PDF(1468KB) ( 1228 )

References | Related Articles | Metrics

Blind Quantum Computation (BQC),is a kind of protocol that remarkably distinguishes from traditional quantum computation,delegates computing tasks from clients to the servers through the quantum channels which eventually alleviates the computing pressure generated by the clients.Consequently,BQC requires that the quantum is teleported in an accurate manner of transmission via the channels.Due to the problem of noise of quantum channel,a purely noiseless transmission channel under ideal circumstance cannot be realized without quantum error correction codes that are implemented to rectify the flip errors in terms of quantum bit and phase resulted from noise channels.By the basis of BQC protocol,two anti-noise BQC protocols are proposed from the perspectives of noise bit flip channels and noise phase flip channels,respectively.Explicitly,the client encodes the qubits via various ways,then the encoded qubits are used to transmit the quantum information to the server by which the quantum error correction codes are exploited to recover the correct quantum information for the purpose of completion of BQC with the client.A protocol analysis indicates that via correction computation,the requirement of accurate transmission by BQC protocol can be met during the computation of BQC over the quantum bit flip and quantum phase flip noise channels with neither the sacrifice of correctness and blindness of BQC,nor the reduction in unconditional security of quantum computing.Finally,this paper hopes that the new BQC protocols can be applied to other quantum error correction codes as well.

Reward Mechanism Based Branching Strategy for SAT Solver

SHEN Xue, CHEN Shu-wei, AI Sen-yang

Computer Science. 2020, 47 (7): 42-46. doi:10.11896/jsjkx.190700191

Abstract

PDF(1450KB) ( 1390 )

References | Related Articles | Metrics

Branching decision is one of the most critical parts of CDCL solver,and an excellent branching strategy can reduce the number of branching decisions and improve the SAT solver’s efficiency.Most of the state-of-art branching strategies have incorporated the conflict analysis process.But the branching strategies have different reward methods on variables participating in conflict analysis,so the selected branching variables are different.In this paper,considering the important fact that decision variables are always selected in the unassigned variable set,a new branching strategy based on the EVSIDS (Exponential Variable State Independent Decaying Sum) branching strategy is proposed,which is called the branching strategy based on the reward mechanism (referred to as the RACT branching strategy).The RACT branching strategy is to reward the variables that are de-assigned in the conflict analysis again to increase the probability that the variables that are frequently involved in the conflict analysis in the unassigned variables are selected as branch variables.Finally,the proposed branching strategy is embedded into the Glucose4.1 solver to form a new solver Glucose4.1+RACT,and the effectiveness of the RACT branching strategy is tested by using 350 instances of the 2017 SAT competition as experimental data sets.The experimental comparison shows that the solver Glucose4.1+RACT can solve more instances than the original solver,especially with an increase of 13.5% in the number of satisfied instances,and the total time spent on solving 350 competition examples is also 3.9% lower than that of Glucose 4.1.The above experimental data shows that the proposed branching strategy can effectively reduce the number of branch decisions in the search tree and give the correct search space,thus improving the solving ability of the SAT solver.

Techniques for Recommendation System:A Survey

LIU Jun-liang, LI Xiao-guang

Computer Science. 2020, 47 (7): 47-55. doi:10.11896/jsjkx.200200114

Abstract

PDF(1473KB) ( 3227 )

References | Related Articles | Metrics

The recommendation system obtains users’ historical behavior data to predict their preferences,such as web browsing data,purchase records,social network information,users’ geographical location and so on.With the development of computer technology,the recommendation technology is mainly based on user-item data matrix decomposition technology in the early stage.Afterwards,it is gradually integrated with data mining,machine learning,artificial intelligence and other technologies,so as to deeply mine the potential preferences of user behavior and build a more accurate user preference model.The recommendation process also moves from static prediction to real-time recommendation,enriching the recommendation results through real-time interaction with users.This paper mainly reviews the key technologies adopted by the recommendation system in different periods,including content-based filtering technology,collaborative filtering technology,recommendation technology based on deep learning,recommendation technology based on reinforcement learning,recommendation technology based on heterogeneous information network.Finally,this paper analyzes the advantages and disadvantages of key technologies,and then looks forward to the future development of the recommendation system.

Research Progress on Risk Access Control

WANG Jing-yu, LIU Si-rui

Computer Science. 2020, 47 (7): 56-65. doi:10.11896/jsjkx.190700157

Abstract

PDF(1848KB) ( 1681 )

References | Related Articles | Metrics

Big data access control is one of the important technologies to ensure the security and information sharing of big data.However,because the traditional access control strategy can not meet the real-time and dynamic access information in the dynamic environment,the risk assessment method is introduced in the access control to coordinate access control policies,improve the application of access control in dynamic environments.In view of this,this paper systematically reviews and summarizes the main work of risk access control research at home and abroad,and analyzes the latest research results in recent years.Firstly,the risk access control extended to the traditional access control model and its XACML framework-based access control model is analyzed and summarized,and the application in different environments is summarized.Secondly,the techniques and methods of risk access control are summarized and analyzed,the risk is self-contained,and Risk-Adaptive Access Control (RAdAC) is analyzed and researched.Finally,the future research on risk access control in big data environment is prospected,and some problems with research value are proposed.This paper argues that risk-based access control is still an important research content of access control in future big data access control research technology.

Subspace Clustering Method Based on Block Diagonal Representation and Neighbor Constraint

GAO Fang-yuan, WANG Xiu-mei

Computer Science. 2020, 47 (7): 66-70. doi:10.11896/jsjkx.190600155

Abstract

PDF(1709KB) ( 1731 )

References | Related Articles | Metrics

Clustering is an important tool for machine learning and data mining,and subspace clustering is a popular method in high-dimensional data analysis.Spectral clustering based subspace clustering method learns the self-representation coefficient matrix of data in subspace,and then the spectral clustering is carried out on the coefficient matrix.It is found that the subspace-based clustering cannot deal with nonlinear problem and neglect the local geometric structure of the data.To this end,this paper proposes a new subspace clustering method which first projects the data to a high-dimensional linear space by a nonlinear mapping function and applies a Laplacian-based manifold regularization constraint on the subspace clustering model to preserve the local structure of the data at the same time.Three kinds of Laplacian matrix are used to establish the different nonlinear subspace clustering models based on manifold regularization and block diagonal constraints.Experimental results on different data sets show the effectiveness of the proposed methods.

Nonnegative Matrix Factorization Algorithm with Hypergraph Based on Per-treatments

LI Xiang-li, JIA Meng-xue

Computer Science. 2020, 47 (7): 71-77. doi:10.11896/jsjkx.200200106

Abstract

PDF(2446KB) ( 1301 )

References | Related Articles | Metrics

With the development of the media technology,more information is stored as the pictures.It is a topic problem in the machine learning field that how to distribute the right label to lots of unsigned pictures.And the image clustering has wide application on the face recognition and the handwriting number recognition field.Because the pictures are always stored as nonnegative matrices,the nonnegative matrix factorization algorithm (NMF) plays an important role in the image clustering.But the disadvantage in NMF algorithm is that the algorithm processes the data in the original data space which may produce a terrible result when the data have errors.To address this problem,the proposed algorithm is the nonnegative matrix factorization algorithm with a hypergraph based on per-treatments (PHGNMF).The PHGNMF algorithm introduces the per-treatments and the hypergraph into the NMF algorithm.In the per-treatments,the algorithm uses the grayscale normalization to eliminate the influence of the different illuminations firstly and then the algorithm can extract the low-frequency information of the pictures by the wavelet analysis.The wavelet procession could also reduce the dimensions of the data.The algorithm constructs a hypergraph for the data to save the neighboring information which has an important influence in the clustering procession.At last the results in five fundamental data sets confirm the effectiveness of the algorithm compared with fundamental algorithms.The results show the increase of accuracy is 2%～7% and the increase of normalized mutual information on some data sets is 2%～5%.

Mining Method of Business Process Change Based on Cost Alignment

LIU Jing, FANG Xian-wen

Computer Science. 2020, 47 (7): 78-83. doi:10.11896/jsjkx.190600140

Abstract

PDF(1408KB) ( 977 )

References | Related Articles | Metrics

Change mining is the core of business process management,and it is particularly important to mine the changes of business processes from the event log.Most of the existing analysis methods of change mining focus on the source model or target model.From the point of view of system log,this paper proposes a business process change mining method based on cost optimal alignment.Firstly,according to the event log,the effective high frequency morphological occurrence segment is extracted,the highest cost function value of each trace alignment is calculated,and on this basis,the optimal trace alignment is found,and then the similarity between the change log and the source log is measured to mine the change set quickly and efficiently.Finally,an example is given to show the effectiveness of the method.

Survey of Visual Image Saliency Detection

YUAN Ye, HE Xiao-ge, ZHU Ding-kun, WANG Fu-lee, XIE Hao-ran, WANG Jun, WEI Ming-qiang, GUO Yan-wen

Computer Science. 2020, 47 (7): 84-91. doi:10.11896/jsjkx.190900006

Abstract

PDF(2690KB) ( 2462 )

References | Related Articles | Metrics

In today’s society where image data are exploding,how to use computer to efficiently acquire and process image information has become an important research topic.Under the inspiration of human visual attention mechanism,researchers have found that when this mechanism is introduced into machine image processing tasks,the efficiency of information extraction can be greatly improved,thus saving more limited computing resources.Visual image saliency detection is to use computers to simulate the human visual attention mechanism to calculate the importance of the information of each part in the image,which has been widely used in image segmentation,video compression,target detection,image indexing and other aspects,and has important research values.This paper summarizes and introduces the research situation of image saliency detection algorithms.Firstly,it takes information-driven sources as starting point to summarize the saliency detection model,and then analyzes several typical saliency detection algorithms.The models are divided into 3 categories according to whether they are based on learning models,which are based on non-learning models,based on traditional machine learning models and based on deep learning models.For the first category,the paper compares in more details the saliency detection algorithms based on local contrast and global contrast,and points out their respective advantages and disadvantages.For the latter two categories,this paper analyzes the application of machine learning algorithms and deep learning in saliency detection.Finally,this paper summarizes and compares the existing saliency detection algorithms and prospects the future development direction of the research in this aspect.

Injection-molded Bottle Defect Detection Using Semi-supervised Deep Convolutional Generative Adversarial Network

XIE Yuan, MIAO Yu-bin, XU Feng-lin, ZHANG Ming

Computer Science. 2020, 47 (7): 92-96. doi:10.11896/jsjkx.190700093

Abstract

PDF(2129KB) ( 1207 )

References | Related Articles | Metrics

Defect detection of injection-molded bottles is an important part of injection molding.Due to the relatively few defective samples in production,the model tends to over-fit when using deep learning algorithm.In order to solve this problem,a defect detection model based on semi-supervised deep convolutional generative adversarial network(DCGAN) is proposed.Firstly,the model preprocesses the original images using HSV color space transformation and Otsu threshold segmentation methods.Then,the learning tasks are combined so that the unsupervised discriminator and the supervised classifier share convolutional parameters.At the same time,the loss function is modified,which consists of cross entropy and wasserstein distance.Finally,the model is fine-tuned using Adam optimizer.The experimental results show that the model can distinguish the defective samples,achieving an accuracy of 98.65%.Compared with traditional machine learning algorithm and CNN model with data augmentation,the proposed model avoids over-fitting.

Vehicle Self-localization Based on Standard Road Sign

ZHANG Shan-bin, YUAN Jin-zhao, CHEN Hui, WANG Yu-rong, WANG Jie, TU Chang-he

Computer Science. 2020, 47 (7): 97-102. doi:10.11896/jsjkx.190900011

Abstract

PDF(2013KB) ( 1388 )

References | Related Articles | Metrics

Vehicle self-localization is one of the key technologies of automatic driving and advanced assistant driving.Fast and accurate vehicle self-localization can provide vehicle location information for navigation or intelligent driving system in time.Aiming at the problem of vehicle positioning in complex environment in the field of automatic driving and advanced assistant driving system,a vehicle self-localization method based on standard road signs is proposed.A simple database containing standard road signs is designed,in which information such as fonts,sizes and control point coordinates of the road signs are pre-stored.The video stream images containing the standard road signs are captured by a vehicle-mounted monocular camera.Centroid coordinates of the identification area are extracted as control points,and the planar projection transformation matrix between each frame of the video stream image and the database reference image is calculated.Motion constrains and matrix decomposition are used to obtain the stable position of the on-board camera.Experimental tests are performed in the real road environment.The results show that the positioning accuracy of proposed method within 30 meters can reach 0.1 meters,and 0.05 meters within 20 meters.This method is low-cost,simple and reliable,and can use on-board monocular camera and standard road signs to realize precise self-localization of vehicles in complex traffic sections.

Hand Gesture Recognition Based on Self-adaptive Multi-classifiers Fusion

LIU Xiao, YUAN Guan, ZHANG Yan-mei, YAN Qiu-yan, WANG Zhi-xiao

Computer Science. 2020, 47 (7): 103-110. doi:10.11896/jsjkx.200100073

Abstract

PDF(3637KB) ( 1476 )

References | Related Articles | Metrics

In order to improve the performance of hand gesture recognition based on wearable devices,a hand gesture recognition method (SAMCF) based on self-adaptive multi-classifiers fusion is proposed to solve the bias of single classifier in hand gesture recognition.First,for statistical features that cannot characterize intra-class variability and similarity between complex gestures,SAMCF uses a convolutional neural network (CNN) to automatically extract depth features with strong representation capabilities.Then,SAMCF uses multiple basic classifiers to recognize the extracted feature vectors,and determines the optimal recognition result through self-adaptive fusion algorithm,which solves the bias of single classifier.After that,the robustness and genera-lization ability of the model are verified by using the data set collected by data glove.The experimental results show that SAMCF can effectively extract the depth features of gesture,solve the bias of single classifier,and improve the efficiency of hand gesture recognition and enhance the performance of hand gesture recognition.The recognition accuracy of character level hand gesture (American Sign Language and Arabic numerals) is 98.23%,which is 5% higher than other algorithms on average;the recognition accuracy of word level gesture (Chinese Sign Language) is 97.81%,which is 4% higher than other algorithm on average.

Face Recognition Based on Cluster Analysis in Complex Environment

GAO Yu-tong, LEI Wei-min, YUAN Yue

Computer Science. 2020, 47 (7): 111-117. doi:10.11896/jsjkx.190500004

Abstract

PDF(2560KB) ( 1249 )

References | Related Articles | Metrics

In modern society,the use of face recognition technology in a variety of fields is increasing.Meanwhile,the problems of social security environment and international security are becoming more serious,thus face recognition is confronted with more severe challenges.Detection target and background are complex and dynamic in a complicated environment,so the traditional face recognition technology can not meet the growing demand.Therefore,in this paper,the traditional SIFT (Scale,Invariant,Feature,Transform) algorithm is optimized by clustering analysis method,and the object features are classified according to the principle of clustering analysis,so as to make the clustering results more in line with the set threshold and improve the matching efficiency.The results show that the improved SIFT algorithm can eliminate the interference of irrelevant books and realize the complete connection of image matching points.In order to verify the effectiveness of the improved SIFT algorithm, it is compared with the common algorithms based on several commonly-used databases,and the results show that the clustering algorithm SIFT is better than other algorithms in CASPEALG R1,CFP,MultiGPIE,and has better application effect and applicability.

Automatic Recognition of ECG Based on Stacked Bidirectional LSTM

WANG Wen-dao, WANG Run-ze, WEI Xin-lei, QI Yun-liang, MA Yi-de

Computer Science. 2020, 47 (7): 118-124. doi:10.11896/jsjkx.190600161

Abstract

PDF(3409KB) ( 1916 )

References | Related Articles | Metrics

For the growing demand of ECG data analysis,a new ECG classification algorithm is proposed.Firstly,the original data are truncated by fixed length,sample equilibrium is obtained,and the pre-processing operations such as instantaneous frequency and spectral entropy of the signal are obtained.After the data is preprocessed,the model can better extract features from the data for learning.In training progress,a two-way LSTM network stacking model is adopted.The stacked two-way LSTM model is an improved cyclic neural network model.Compared with convolutional neural networks,the cyclic neural network is more sui-table for processing sequence data such as electrocardiogram.The experiment is conducted using MATLAB2018b under Windows for training and testing.The CUDA version is 9.0.The classification accuracy rate is used as an indicator to measure the performance of the model.The model is tested on two data sets,one is the data of the 2017 Physiological Signal Challenge(hereinafter referred to as the 2017 dataset),the final classification accuracy rate is 97.4%;the other is the data of the 2018 Physiological Signal Challenge (hereinafter referred to as the 2018 dataset),and the final classification accuracy rate is 77.6% on this dataset.The MATLAB group to which it belongs has achieved the third place.This algorithm improves the accuracy of 5.6% in the 2017 dataset and 7.6% in the 2018 dataset compared to the results of the traditional LSTM network.Compared to the results of a single-layer bidirectional LSTM network,in the 2017 data set,the accuracy rate improves 4.2%,and the accuracy rate improves 5.7％ in the 2018 data set,which fully verifies the feasibility and advantages of the algorithm.

Text-Video Feature Learning Based on Clustering Network

ZHANG Heng, MA Ming-dong, WANG De-yu

Computer Science. 2020, 47 (7): 125-129. doi:10.11896/jsjkx.190700006

Abstract

PDF(1539KB) ( 1018 )

References | Related Articles | Metrics

Comprehensive understanding of video content and text semantics has been widely researched in many fields.The early research is mainly to map text-video to a common vector space.However,one of the problems faced by this method is the lack of a large-scale text-video datasets.Because of the large information redundancy of the video data,extracting the whole video feature directly through 3D network will lead to more network parameters and poor real-time performance,which is not conducive to vi-deo tasks.In order to solve the above problems,this paper proposes that the local characteristics of video can be aggregated by good clustering network,and the network model can be trained by image and video datasets at the same time to effectively solve the problem of video modal missing.At the meantime,the influence of face mode on recall task is compared.The attention mechanism is added to the clustering network,which makes the network pay more attention to the modes strongly related to the text semantics,so as to improve the similarity value of the text-video and improve the accuracy of the model.The experimental result shows that text-video feature learning based on clustering network can map text-video to a common vector space,so that text and video with similar semantics are close to each other,text and video with different distances are far away.In this paper,the performance of the text-video recall task evaluation model based on MPII and MSR-VTT datasets is improved compared with other models.From the experimental result,it is fully proved that the text-feature learning based on clustering network can map the text-video to a common vector space,which can be used in the text-video recall task.

Image Denoising by Mixing 3D Block Matching with Harmonic Filtering in Transform Domain

WU Jing, ZHOU Xian-chun, XU Xin-ju, HUANG Jin

Computer Science. 2020, 47 (7): 130-134. doi:10.11896/jsjkx.190600120

Abstract

PDF(3688KB) ( 1296 )

References | Related Articles | Metrics

Aiming at remedy the defeat that the current denoising algorithms lack of analyses of integral structure and excessive computational complexity,this paper proposes an improved denoising algorithm using the harmonic filtering diffusion model in wave-domain to amend BM3D technology.Firstly,the algorithm uses the Euclidean distance to merge similar 2-D image fragments thus obtaining 3-D date arrays.Then it is dealt by collaborative filtering,and the pre-estimation data of the image would be obtained by inverse 3-D transformation.Wavelet decomposition is used to extract high frequency part of pre-denoised image to filter.Lastly,wavelet reconstruction is conducted to estimate the original image,in order to avoid edge ambiguity.Laplacian of Gassian is used to construct a new operator into the diffusion model for filtering,so as to balance the operation speed and denoising performance,and protect the complete structureof the image information.The experimental results show that the new algorithm has perfect denoising performance,more integrity of internal information protection,and short running time,which is beneficial to practical applications.

Complex Scene Text Detection Based on Attention Mechanism

LIU Yan, WEN Jing

Computer Science. 2020, 47 (7): 135-140. doi:10.11896/jsjkx.190600157

Abstract

PDF(3555KB) ( 1847 )

References | Related Articles | Metrics

Most of the traditional text detection methods are developed in the bottom-up manner,which usually start with low-level semantic character or stroke detection,followed by non-text component filtering,text line construction,and text line validation.However,the modeling,scale,typesetting and surrounding environment of the characters in the complex scene change drastically,and the task of detecting text is carried up by human under variety of visual granularities.It’s difficult for these bottom-up traditional methods to maintain the text features under different resolution,due to their dependency on the low lever features.Recently,deep learning methods have been widely used in text detection in order to extract more features under different scale.However,in the existing methods,the key feature information is not emphasized during the feature extraction process of each layer,and will be lost in the layer-to-layer feature mapping process.Therefore,the missing information will also lead to a lot of false-alarm and leak detection,which causes much more time-consuming.This paper proposes a complex scene text detection method based on the attention mechanism.The main contribution of this method is to introduce a visual attention layer in VGG16,and use the attention mechanism to enhance the significant information in the global information in the network.Experiments show that in the Ubuntu environment with GPU,this method can ensure the integrity of the text area in the detection of complex scene text pictures,reduce the fragmentation of the detection area and can achieve up to 87% recall rate and 89% precision rate.

Review of Information Cascade Prediction Methods Based on Deep Learning

ZHANG Zhi-yang, ZHANG Feng-li, TAN Qi, WANG Rui-jin

Computer Science. 2020, 47 (7): 141-153. doi:10.11896/jsjkx.200300130

Abstract

PDF(2128KB) ( 2516 )

References | Related Articles | Metrics

Online social media greatly promotes the generation and transmission of information,exacerbates the communication and interaction between massive amounts of information,and highlights the importance of predicting information cascades.In recent years,deep learning has been widely used in the field of information cascade prediction.This paper mainly classifies,sorts,and summarizes the current research status of deep learning-based information cascade prediction methods and classic algorithms.According to the different emphasis of information cascade feature characterization,the information cascade prediction method based on deep learning is divided into time series information cascade prediction method and topology information cascade prediction method.The time series information cascade prediction method is further divided into methods based on random walks and methods based on diffusion paths,and the topology information cascade prediction method is divided into methods based on global topological structure and methods based on neighborhood aggregation.This paper details the principles and advantages and disadvantages of each type of method,and introduces the data sets and evaluation indicators commonly used in the field of information cascade prediction,and compares the information cascade prediction algorithms based on deep learning in the macro and micro information cascade prediction scenarios,and discusses some technical details commonly used in information cascade prediction algorithms.Finally,this paper summarizes the field possible future research directions and development trends.

Improved Salp Swarm Algorithm Based on Levy Flight Strategy

ZHANG Yan, QIN Liang-xi

Computer Science. 2020, 47 (7): 154-160. doi:10.11896/jsjkx.190600068

Abstract

PDF(1874KB) ( 1331 )

References | Related Articles | Metrics

Aiming at the shortcomings of slow convergence speed and easy to fall into local optimum in the optimization process of the Salp swarm algorithm (SSA),a Levy Flight-based Conditional Updating Salp Swarm Algorithm (abbreviated as LECUSSA) is proposed and it is used in the feature subset selection of classification algorithm.Firstly,the leader position is updated randomly by using the long and short jump characteristics of Levy Flight strategy,which enhances the global optimal search ability.Secondly,the conditional updating condition to the follower’s position is added to make the follower no longer follow blindly,thus accelerating the convergence speed.The performance of LECUSSA algorithm is compared with other algorithms on 23 benchmark functions.The algorithm is applied to the selection of classification feature subset of SVM algorithm,and 8 UCI datasets are used to compare the performance of the classification results after feature selection.The experimental results show that LECUSSA has good global optimal search ability and fast convergence speed.After feature selection using LECUSSA algorithm,the feature subset with the best classification accuracy can be found.

Multimodal Optimization Algorithm for Protein Conformation Space

LI Zhang-wei, XIAO Lu-qian, HAO Xiao-hu, ZHOU Xiao-gen, ZHANG Gui-jun

Computer Science. 2020, 47 (7): 161-165. doi:10.11896/jsjkx.190600100

Abstract

PDF(2542KB) ( 1299 )

References | Related Articles | Metrics

Due to the inaccuracy of protein energy model,the optimal solution in mathematics does not consistently correspond to the stable natural structure of a given protein.The existing methods tend to converge to local optimal solutions because of the huge conformational space.To address the problem of the inaccuracy of energy model in the field of protein structure prediction and the reliability of high-dimensional conformational space sampling,a dihedral angle similarity based multimodal conformation optimization method (DASOM) for ab-initio protein structure prediction is proposed in the framework of population based algorithm.Firstly,the modal exploration is conducted,knowledge-based Rosetta coarse-grained energy model is used as the standard to select new individuals with high quality,thus the diversity of the population can be increased.Then,a dihedral angular similarity model is established to meet the requirements of similar individual determination in the multi-modal optimization algorithm.Crowding update strategy is used for optimizing the existing modal to achieve the modal enhancement and more reasonable conformation is obtained.Experimental results on 10 test proteins show that the proposed method not only achieves high prediction accuracy,but also obtains many metastable protein conformations as well.

Pyramid Evolution Strategy Based on Differential Evolution for Solving One-dimensional Cutting Stock Problem

HOU Gai, HE Lang, HUANG Zhang-can, WANG Zhan-zhan, TAN Qing

Computer Science. 2020, 47 (7): 166-170. doi:10.11896/jsjkx.190500014

Abstract

PDF(1601KB) ( 1332 )

References | Related Articles | Metrics

One-dimensional cutting stock problem is a classic NP-hard problem in combinatorial optimization,which is widely used in mechanical manufacturing and engineering applications.It is difficult for traditional swarm intelligence algorithm to ba-lance the contradictions of mining and exploration,competition and cooperation between individuals and populations with in a po-pulation when solving one-dimensional cutting stock problem.On the basis of pyramid evolution strategy,an algorithm based on pyramid evolution strategy and differential evolution is proposed in this paper.The algorithm makes full use of the advantages of the PES algorithm and solves these two contradictions well,but the PES algorithm does not consider the cooperative relationship between the current individual and the optimal individual,it may lead the convergence speed of the algorithm to be slowly.To this end,the mutation and crossover operations of differential evolution are introduced into the accelerating process of PES algorithm,and the invalid individuals are corrected by the repair operator.On the one hand,it is conducive to speed up the convergence speed of the algorithm,on the other hand,it can utilize the differences among individuals and realize the local mining of the area near the individuals in the population,which is beneficial to improve the accuracy of the algorithm.The proposed algorithm is applied to six examples.The experimental results show that the algorithm has high optimization accuracy and convergence speed compared with other eight algorithms,which verifies the feasibility and effectiveness of the algorithm for solving the one-dimensional cutting stock problem.

Evolution Analysis of Household Car Supply Chain Based on Multi-Agent

SUN Jun-yan, ZHANG Yuan-yuan, WU Bing-ying, NIU Ya-ru, CHEN Chan-juan

Computer Science. 2020, 47 (7): 171-178. doi:10.11896/jsjkx.190600038

Abstract

PDF(2662KB) ( 112 )

References | Related Articles | Metrics

In order to explore the business strategy of multi-model product supply chain,this paper used Multi-Agent modeling and simulation method to establish a multi-model home vehicle supply chain network model.Aiming at maximizing the profit of manufacturer Agent,stimulus-response learning mechanism and the particle swarm learning mechanism are used to analyze the evolution of the model.The simulations show that,firstly,by using the stimulus-response learning mechanism,the sales volume of some models of the manufacturer will be greatly reduced.For manufacturers pursuing multiple models,it is not conducive to the promotion of multiple models,but the total sales volume and profits will increase.Secondly,by using the particle swarm learning mechanism,no matter what combination of C₁ and C₂ is,it is difficult to obtain the optimal sales and profit at the same time.For mass-consumer models,manufacturers can choose to strengthen C₂ and expand the market with “small profits but quick turnover” strategy to increase sales.For high-end consumer models,manufacturers can choose to strengthen C₁ and create high-quality products with a “benefit and profitable” strategy to increase profits.Relatively speaking,by using particle swarm learning mechanisms,manufacturers can quickly adjust strategies to cope with market changes,and the strategies are more stable after learning.This study has practical guidance for supply chain management with multiple models.

Improved Speedy Q-learning Algorithm Based on Double Estimator

ZHENG Shuai, LUO Fei, GU Chun-hua, DING Wei-chao, LU Hai-feng

Computer Science. 2020, 47 (7): 179-185. doi:10.11896/jsjkx.190500143

Abstract

PDF(2109KB) ( 1041 )

References | Related Articles | Metrics

Q-learning algorithm is a classical reinforcement learning algorithm.However,due to overestimation and the conservative updating strategy,there exists a problem of slow convergence.Speedy Q-learning algorithm and Double Q-learning algorithm are two variants of the Q-learning algorithm which are used to solve the problems of slow convergence and over-estimation in Q-learning algorithm respectively.Based on the updating rule of Q value in Speedy Q-learning algorithm and the updating strategy of Monte Carlo reinforcement learning,the equivalent form of the updating rule of Q value is proposed through theoretical analysis and mathematical proof.According to the equivalent form,Speedy Q-learning algorithm takes the estimation function of current Q value as the estimation of the historical Q value.Although the overall convergence speed of the agent is improved,Speedy Q-learning also has the problem of overestimation,which leads to a slow convergence at the beginning of iterations.In order to solve the problem of slow convergence at the initial stage of iterations in the Speedy Q-learning algorithm,an improved algorithm named Double Speedy Q-learning is proposed based on the fact that the double estimator in the Double Q-learning algorithm can improve the convergence speed of agents.By using double estimator,the selection of optimal action and maximum Q value is separated,so that the learning strategy of Speedy Q-learning algorithm in the initial iteration period can be improved and the overall convergence speed of Speedy Q-learning algorithm can be improved.Through grid world experiments of different scales,linear learning rate and polynomial learning rate are used to compare the convergence speed of Q-learning algorithm and its improved algorithm in the initial iteration stage and the overall convergence speed.The results show that the convergence speed of the Double Speedy Q-learning algorithm is faster than that of Speedy Q-learning algorithm in the initial iteration stage and its overall convergence speed is significantly faster than that of comparison algorithms.Its difference between the actual average reward value and the expected reward value is also the smallest.

Novel Artificial Bee Colony Algorithm for Solving Many-objective Scheduling

ZHENG You-lian, LEI De-ming, ZHENG Qiao-xian

Computer Science. 2020, 47 (7): 186-191. doi:10.11896/jsjkx.190600089

Abstract

PDF(2027KB) ( 1003 )

References | Related Articles | Metrics

Many-objective continuous optimization problem has been considered extensively while there are few studies on many-objective combination optimization problem.Artificial bee colony(ABC) algorithm has been successfully applied to solve various production scheduling problem,but ABC is seldom used to solve many-objective scheduling problem and many-objective scheduling problem itself is also seldom handled.Aiming at multi-objective flexible job shop scheduling problem,a new ABC algorithm is proposed to optimize simultaneously maximum completion time,total tardiness,total energy consumption and total workload.Unlike the general flexible job shop scheduling problem,the above problem is green scheduling one because of the inclusion of total energy consumption.The new ABC has new characteristics which are obviously different from the existing ABC algorithm.Its number of onlooker bees is less that of employed bees,employed bee focuses on global search while onlooker bee only carries out local search,which avoids the algorithm from falling into local optimization through the different search methods of two kinds of bees.At the same time,onlooker bee just selects some best employed bees or members of external file,and some employed bees cannot become follower objects to avoid wasting computing resources on search for poor solutions.A new strategy is adopted to handle scout.The simulation results show that the ratio of the number of non-dominated solutions to population scale for many-objective scheduling problem is notably less than the same ratio for many-objective continuous optimization problem.Compared with multi-objective genetic algorithm and variable neighborhood search,the computational results show that ABC has better results than two comparative algorithms on solving the considered many-objective scheduling.

Point Cloud Deep Learning Network Based on Dynamic Graph Convolution and Spatial Pyramid Pooling

ZHU Wei, SHENG Rong-jin, TANG Ru, HE De-feng

Computer Science. 2020, 47 (7): 192-198. doi:10.11896/jsjkx.190700180

Abstract

PDF(3273KB) ( 1622 )

References | Related Articles | Metrics

The classification and semantic segmentation of point cloud data have important applications in automatic driving,intelligent robot and holographic projection.While using the traditional method of manually extracting point cloud features or the feature learning method of firstly transforming three-dimensional point cloud data into data forms of multi-view and volumetric grid,there exist problems such as many processing links and great loss of three-dimensional features,resulting in low accuracy of classification and segmentation.The existing deep neural network PointNet,which can directly process point cloud data,ignoresthe local fine-grained features of point cloud and is weak in processing complex point cloud scenarios.To solve the above problems,this paper proposes a point cloud deep learning network based on dynamic graph convolution and spatial pyramid pooling.On the basis of PointNet,the dynamic graph convolution module GraphConv is used to replace the feature learning module in PointNet,which enhances the network’s ability to learn local topological structure information.At the same time,a point-based spatial py-ramid pooling structure PSPP is designed to capture multi-scale local features.Compared with the multi-scale sampling point cloud of PointNet++ and the repeated grouping method for multi-scale local features learning,it is simpler and more efficient.Experimental results show that,on the three benchmark data sets of point cloud classification and semantic segmentation task,the proposed network has higher classification and segmentation accuracy than the existing network.

Epileptic EEG Signals Detection Based on Tunable Q-factor Wavelet Transform and Transfer Learning

LUO Ting-rui, JIA Jian, ZHANG Rui

Computer Science. 2020, 47 (7): 199-205. doi:10.11896/jsjkx.200200104

Abstract

PDF(2645KB) ( 1252 )

References | Related Articles | Metrics

Aiming at the detection of epileptic EEG signals,a method of detecting epileptic EEG signals based on Tunable Q-factor wavelet transform and transfer learning is proposed.Firstly,the EEG signals are transformed by Tunable Q-factor wavelet transform,and the subbands with large energy differences are selected for partial reconstruction.The reconstructed signals are rearranged and expressed as two-dimensional color image data.Secondly,through the analysis of the existing automatic seizure detection algorithm and the Xception model of deep separable convolutional networks,the parameters of the pre-training model classified by the ImageNet dataset are used to initialize the network parameters,and the pre-training model of the depth separable convolution network Xception is obtained.Finally,the transfer learning method is used to transfer the pre-training results of the Xception model to the automatic seizure detection task.The performance of this method is verified on the BONN epilepsy dataset,and the accuracy,sensitivity and specificity reaches 99.37%,100% and 98.48%respectively,proving that the model has good generalization ability in automatic seizure detection task.Compared with traditional detection methods and other deep lear-ning methods based,the automatic detection method proposed in this paper achieves higher accuracy,avoids the process of artificial design and feature extraction,and has better application value.

Review on Placement of Multiple Controllers in SDN

JIA Wu-cai, LV Guang-hong, WANG Gui-zhi, SONG Yuan-long

Computer Science. 2020, 47 (7): 206-212. doi:10.11896/jsjkx.200200075

Abstract

PDF(1964KB) ( 1371 )

References | Related Articles | Metrics

With the rapid development of software defined network,deployment of the inherent defects of single controller gra-dually revealed,multiple controller deployment has become an inevitable trend.However,the number and location of controllers have a decisive influence on network performance,and the high complexity of weighing factors in solving this problem,which seriously hinders the application of SDN in data centers and wide area networks.Firstly,the essence of placement problem and its general solving steps are described.Secondly,based on the network model,the core components of the deployment strategy,namely the optimization objectives and search algorithm are described in detail.Then,based on the research at home and abroad,the deployment strategies are divided into static deployment and dynamic deployment,and the advantages and disadvantages of typical strategies are compared.Finally,the future research direction is prospected.

Multi-source Tree-based Scheduling Algorithm for Deadline-aware P2MP Inter-datacenter Transfers

ZHUANG Yi, YANG Jia-hai

Computer Science. 2020, 47 (7): 213-219. doi:10.11896/jsjkx.200300069

Abstract

PDF(2261KB) ( 1108 )

References | Related Articles | Metrics

With the growth of the data volume for cloud applications,more and more cloud service providers pay attention to inter-datacenter bulk transfer.The main challenge of inter-datacenter bulk transfer is how to find the best resource scheduling algorithm,which uses the least resources to transfer the user’s data to the specified destinations before the specified deadline.This paper proposes MSTB(Multi-Source Tree-Based) algorithm,an effective scheduling solution for deadline-aware P2MP inter-da-tacenter transfers.With the help of multi-source mechanism and multicast forwarding tree,MSTB outperforms the state-of-the-art method.Simulation experiments show that MSTB can increase the number of transfer requests accepted by up to 91% and increase effective throughput by up to 54% with short transfer completion time and low computation complexity.

4Bit-based Gradient Compression Method for Distributed Deep Learning System

JIANG Wen-bin, FU Zhi, PENG Jing, ZHU Jian

Computer Science. 2020, 47 (7): 220-226. doi:10.11896/jsjkx.200300097

Abstract

PDF(2745KB) ( 1345 )

References | Related Articles | Metrics

In order to reduce the communication overhead of distributed deep learning system,compression of gradient data before transmission is an effective method,such as 2Bit method in MXNet.However,there is a problem in this kind of method,that is,too high compression ratio will lead to decline in accuracy and convergence speed,especially for larger network models.To address this problem,a new gradient compression strategy called 4Bit is proposed.Four bits are used to represent a specific gradient value.Compared with 2Bit,this method can approximate the gradient more finely,thus improving the accuracy of training results and convergence speed.Furthermore,different approximation thresholds are selected according to the gradient characteristics of each layer of the network model,which makes the compressed values more reasonable,and finally improves the convergence speed and final accuracy of the model.The experimental results show that,although 4Bit is slightly lower than the 2Bit method in terms of acceleration,its accuracy is higher and practicability is better by using more bits and multiple thresholds.It is very meaningful to reduce the communication overhead of the distributed deep learning system while maintaining high accuracy by 4Bit.

Unknown Link Layer Protocol Frame Segmentation Algorithm Based on Longest Common Substrings Mining

CHEN Qing-chao, WANG Tao, FENG Wen-bo, YIN Shi-zhuang

Computer Science. 2020, 47 (7): 227-230. doi:10.11896/jsjkx.190600073

Abstract

PDF(1745KB) ( 858 )

References | Related Articles | Metrics

In the increasingly fierce field of modern electronic countermeasures,the original data intercepted by the listener is ge-nerally in the form of bit stream.Dividing the bit stream into data frames is the primary task to process the intercepted data.Although the existing methods can accurately extract the related sequence to achieve frame segmentation,when the amount of data to be processed is large,the consumption of time and space is unacceptable and some thresholds need to be set in advance.For this reason,an unknown link layer protocol frame segmentation algorithm based on the longest common substring mining is proposed in this paper.By counting the longest common substring of the bit stream of a certain length,the preamble and the frame start delimiter become iteratively accurate.Thus,the frame segmentation is realized.The experimental data show that compared with the algorithm based on frequent sequence mining to achieve frame segmentation,the number of candidate sequences of the proposed algorithm is reduced exponentially,and the final candidate sequence is unique.The time complexity of the proposed algorithm is O(n),and only a single scan is required,which fully shows that the proposed algorithm can realize frame segmentation efficiently.

Graph Convolution of Fusion Meta-path Based Heterogeneous Network Representation Learning

JIANG Zong-li, LI Miao-miao, ZHANG Jin-li

Computer Science. 2020, 47 (7): 231-235. doi:10.11896/jsjkx.190600085

Abstract

PDF(1874KB) ( 1588 )

References | Related Articles | Metrics

In recent years,network representation learning has received more and more attention as an effective method for analyzing heterogeneous information networks by representing nodes in a low-dimensional space.Random walk based methods are currently popular methods to learn network embedding,however,most of these methods are based on shallow neural networks,which make it difficult to capture heterogeneous network structure information.The graph convolutional network (GCN) is a popular method for deep learning of graphs,which is known to be capable of better exploitation of network topology,but current design of GCN is intended for homogenous networks,ignoring the rich semantic information in the network.In order to effectively mine the semantic information and highly nonlinear network structure information in heterogeneous information networks,this paper proposes a heterogeneous network representation learning algorithm based on graph convolution of fusion meta-path(MG2vec)to improve the effect of network representation.Firstly,the algorithm obtains rich semantic information in heterogeneous information networks through relevance measurement based on meta-paths.Then the graph convolution network is used for deep learning to capture the characteristics of nodes and neighbor nodes,to make up for the deficiency of shallow model in capturing the information of the network structure,so as to better integrate rich semantic information and structural information into the low-dimensional node representation.Experiments are carried out on DBLP and IMDB,compared with DeepWalk,node2vec and Metapath2vec classical algorithms,the proposed MG2vec algorithm has higher classification accuracy and better performance in multi-label classification tasks,the precision and Macro-F1 value can be respectively up to 94.49% and 94.16%,and the both of values are up to 26.05% and 28.73% higher respectively than DeepWalk.The experimental results show that the performance of MG2vec algorithm is better than that of classical network representation learning algorithms,and MG2vec has better heterogeneous information network representation effect.

Virtual Network Embedding Algorithm Based on Topology Comprehensive Evaluation and Weight Adaptation

SHI Chao-wei, MENG Xiang-ru, MA Zhi-qiang, HAN Xiao-yang

Computer Science. 2020, 47 (7): 236-242. doi:10.11896/jsjkx.190600022

Abstract

PDF(2209KB) ( 943 )

References | Related Articles | Metrics

The existing virtual network embedding algorithms do not consider the topological features of nodes comprehensively,the evaluation method of nodes is relative simple and the weights cannot be adaptively adjusted according to the network.To solve these problems,a virtual network embedding algorithm based on topology comprehensive evaluation and weight adaptation is proposed.In the node embedding stage,by considering the centrality,proximity and adjacent aggregation of nodes,this paper establishes a node multi-metric evaluation model combined with the node resource properties such as the node CPU and the sum of adjacent bandwidth.The weights are adjusted adaptively according to the change of network environment by using the entropy weight method.Simulation results show that compared with the latest and classical virtual network embedding algorithms,the acceptance ratio of the proposed algorithm is improved by 2%~23%,and the long-term average revenue-to-cost ratio is increased by 3%~17%.Moreover,the proposed algorithm can maintain good performance for different types of virtual network requests with different resource requirements.

WSN Coverage Optimization Based on Adaptive Particle Swarm Optimization

QI Wei, YU Hui-qun, FAN Gui-sheng, CHEN Liang

Computer Science. 2020, 47 (7): 243-249. doi:10.11896/jsjkx.200200133

Abstract

PDF(2830KB) ( 1290 )

References | Related Articles | Metrics

The wireless sensor network (WSN) coverage of data sensing layer has great significance on the quality of sensing services.In view of the problems of coverage redundancy,coverage void and premature convergence of particle swarm optimization caused by the randomness of initial deployment of wireless sensor network,an adaptive virtual force particle swarm optimization algorithm based on binomial perception coverage is proposed,which optimizes the effective coverage of the network.By adding mobile nodes to the network,the algorithm performs the redeployment distribution of position scheduling,adjusts the inertia weight by calculating the degree of population evolution and the degree of relative aggregation,and ueses the threshold of fitness variance to judge whether the intergerence of virtual force strategy is needed in the current state.This paper focuses on the analysis of the impact of the initial deployment category and mobile node proportion on the redeployment coverage performance,and gives the corresponding implementation algorithm.Simulation results show that compared with ACPSO,DACPSO and DVPSO,the improved PSO has 98.33% coverage and high mobile efficiency,which fully proves the effectiveness of the algorithm.

Approximate Algorithm for Minimum Virtual Backbone in 3D Wireless Ad Hoc Networks

YI Meng, LIANG Jia-rong, QIN Bin

Computer Science. 2020, 47 (7): 250-256. doi:10.11896/jsjkx.190700059

Abstract

PDF(2021KB) ( 1023 )

References | Related Articles | Metrics

In homogeneous wireless ad hoc networks,the size of VB (Virtual Backbone) is an important factor to measure the quality of ad hoc networks,the smaller the virtual backbone,the less the network routing overhead.The problem of minimum virtual backbone can be abstracted into the problem of minimum connected dominating set.There have been a lot of research achievements on the problem of minimum connected dominating set in UDG (Unit Disk Graphs,UDG) on two-dimensional wireless sensor networks,but in some practical cases,unit disk graphs cannot accurately abstract the network.So how to construct high quality CDS(Connected Dominating Set,CDS) in UBG (Unit Ball Graph,UBG) is proposed.In this paper,an optimal upper bound of the number of independent nodes in UBG is given,and the performance ratio of CDS is obtained by using the upper bound.In the proposed algorithm ST-CDS,the method of constructing minimum steiner tree with minimum number of steiner nodes is mainly used to optimize the connecting parts between nodes.Theoretical analysis shows that the performance ratio of ST-CDS algorithm is 11.8080＋ln11,which is the best result among all known researches in this direction.Simulation results also verify the feasibility of the proposed algorithm ST-CDS.

New Device Fingerprint Feature Selection and Model Construction Method

WANG Meng, DING Zhi-jun

Computer Science. 2020, 47 (7): 257-262. doi:10.11896/jsjkx.190900107

Abstract

PDF(1838KB) ( 2375 )

References | Related Articles | Metrics

In recent years,with the rapid development of mobile Internet,more and more businesses have moved from the browser to the mobile.But the black industry chain that is parasitic on the mobile Internet has reached the point of flooding.To solve this problem,the device fingerprint,that is,the use of the device’s characteristic attributes to generate a unique identifier for each device came into being.Many algorithms based on machine learning methods for device uniqueness authentication have emerged,most of which focus on the establishment of models.Few of them have in-depth research on feature selection.However,feature selection is directly related to the performance of the final model.Aiming at this problem,this paper proposes a new device fingerprint feature selection and model construction method (FSDS-WSC),which is based on the feature discrimination of different devices and the feature stability of the same device to select some of the most valuable features.The importance of the selected features’ weights is applied to the later model establishment.The FSFS-WSC is compared with other mainstream feature selection methods on 6424 Android devices in the real sence.The results show that FSFS-WSC has a great improvement compared with other methods,and the accuracy of device uniqueness authentication reaches 99.53%,which shows the superiority of FSFS-WSC.

Revised Impossible Differential Cryptanalysis of PFP Block Cipher

SHEN Xuan, WANG Xin-mei, HE Jun, SUN Zhi-yuan

Computer Science. 2020, 47 (7): 263-267. doi:10.11896/jsjkx.200200034

Abstract

PDF(2489KB) ( 1379 )

References | Related Articles | Metrics

Nowadays,the application scenarios in the resource-constrained terminal system appear more and more,and the data encryption requirement of them also needs to be satisfied.There are many lightweight block ciphers designed such as PRESENT which is an international standard block cipher.PFP cipher is an ultra-lightweight block cipher which takes Feistel structure,and its round function is designed by using the experience of PRESENT cipher for reference.The block size of PFP is 64-bit,the key size of PFP is 80-bit and its round number is 34.For PFP,this paper studies its ability against impossible differential cryptanalysis.In the design document,the designers proposed a 5-round impossible differential and attacked reduced 6-round PFP cipher with this distinguisher.Moreover,the designers can recover 32-bit master key.Comparing with this result,by exploiting the differential property of the S-box in PFP,this paper constructs a 7-round impossible differential distinguisher and attack reduced 9-round PFP.Moreover,it can recover 36-bit master key.Therefore,the result is much better than the known one in terms of either the round number or the recovered key.So far as I know,the result in this paper is the best impossible differential cryptanalysis of PFP cipher.

Multi-subblock Incentive Consensus Mechanism Based on Topology and Distribution Mechanism

LIU Shuai, GAN Guo-hua, LIU Ming-xi, FANG Yong, WANG Shou-yang

Computer Science. 2020, 47 (7): 268-277. doi:10.11896/jsjkx.200200027

Abstract

PDF(1628KB) ( 1217 )

References | Related Articles | Metrics

First of all,this paper proposes an modified consensus mechanism based on the classic PoW (Proof of Work) consensus mechanism by changing the condition of take the miners’ blocks into blockchain and income distribution strategy.To be specific,on the one hand,according to the rule of this modified consensus mechanism,the first generated sub-chain made up of N sub-blocks will be integrated into the main chain,which is different from PoW,the modified consensus mechanism replaces the simple single chain structure of PoW with a more complex network structure;on the other hand,the modified consensus mechanism improves on the revenue distribution strategy of the traditional consensus mechanism,its distribution strategy is divided into three steps in order to improve the expected earnings of miners with low computational power,so as to encourage those miners with low computational power to actively participate in mining and supervise the safety of blockchain.In addition,the network structure introduced by the modified consensus mechanism enables miners to have more strategies of mining.This paper also discusses the influence of selecting different mining strategies,splitting calculation power of malicious miners and collusion on the safety and efficiency of the block chain.Finally,a variety of market scenarios are set up to simulate the improved algorithm so as to analyze the mining benefits of various miners under different market characteristics,which are hoping to guide miners.

Encrypted Dynamic Configuration Method of FPGA Based on Cloud

CHEN Li-feng, ZHU Lu-ping

Computer Science. 2020, 47 (7): 278-281. doi:10.11896/jsjkx.190700110

Abstract

PDF(1914KB) ( 1073 )

References | Related Articles | Metrics

In the field of parallel computing which needs a lot of data,such as cloud computing,machine learning algorithm,artificial intelligence computing,etc.,as an important technical means to improve performance,FPGA has been widely used.In the configuration of FPGA,configuration data need to be read from memory and then written into the FPGA.As a practical embodiment of technological achievements,configuration data has the problem of how to prevent data from being illegally acquired,lea-ding to the leakage of research property.In order to deal with this problem,this paper proposes an effective method of FPGA configuration based on cloud encryption.This method encrypts and manages the configuration data file by cloud-based encryption APP.When configuring the FPGA,the microprocessor obtains the encrypted configuration data through the access port of the cloud-based server,and decrypts it using the decryption algorithm built in the microprocessor.Then,the decrypted data are used dynamically to config the FPGA.The method described in this paper stores the configuration data of the FPGA in the cloud ser-ver,and carries out strict data protection and file protection through encryption means on the cloud server,thus providing a flexible and powerful encryption protection capability.The microprocessor obtains data from the cloud through encryption channel,decrypts the encrypted data and then uses it for the configuration of FPGA.In the whole process,the configuration data are encrypted,and the risk of data leakage is effectively controlled.Thus,the configuration data can be protected to the maximum extent to prevent illegal acquisition and use,meanwhile the remote dynamic configuration of the FPGA can be realized.The proposed method has been verified in Aliyun and Tencent cloud platforms,which achieves good confidentiality and flexible configuration.

Reinforcement Learning Based Cache Scheduling Against Denial-of-Service Attacks in Embedded Systems

HUANG Jin-hao, DING Yu-zhen, XIAO Liang, SHEN Zhi-rong, ZHU Zhen-min

Computer Science. 2020, 47 (7): 282-286. doi:10.11896/jsjkx.200100135

Abstract

PDF(1904KB) ( 1096 )

References | Related Articles | Metrics

The sharing last level cache (LLC) scheduling of the central processor determines the instructions per cycle (IPC) of the user processes and the robustness of denial-of-service (DoS) attacks in the multicore embedded operating systems.However,existing scheduling schemes rely on the specific LLC scheduling model and DoS attack model,which makes it difficult for the processor to obtain the running information of the user processes in each scheduling cycle under different scheduling environments.Therefore,this paper proposes a reinforcement learning (RL) based LLC scheduling scheme to against DoS attacks in embedded systems,which optimizes the occupied position and the occupied space based on the measured occupied start and end positions,the previous IPC,load miss rate and store miss rate.The processor can jointly increase the IPC and reduce the success rate of the DoS attack from the malicious process without knowing the DoS attack model in the dynamic LLC scheduling environment.Simulations are implemented on the multicore embedded operating systems where multitenant virtual machines participate together,which show that the proposed scheme can significantly increase the IPC and reduce the success rate of the DoS attack.

Network Security Situation Assessment Method Based on Improved Hidden Markov Model

LI Xin, DUAN Yong-cheng

Computer Science. 2020, 47 (7): 287-291. doi:10.11896/jsjkx.190300045

Abstract

PDF(1743KB) ( 1346 )

References | Related Articles | Metrics

Cyber security situation awareness,as an effective supplement in cyber security protection measures,is one of the research focus in recent years.In particular,network security situation assessment has become an important research topic in the field of network security.Hidden Markov Model (HMM) can be used in network security situation assessment,which can evalua-te network status in real time,but there are problems such as difficult to configure model parameters and low evaluation accuracy.Therefore,this paper proposes a situation assessment method for improving the Hidden Markov Model,combining the Baum-Welch (BW) parameter optimization algorithm with the Seeker Optimization Algorithm (SOA).Taking advantage of the strong random search ability of SOA,the traditional parameter optimization algorithm is easy to fall into local optimal solution.The optimized parameters are substituted into the HMM,and the network security situation value is obtained through quantitative analysis.Based on the DARPA2000 dataset,this paper uses MATLAB software to verify the proposed method.The experimental results show that compared with BW algorithm,this method can improve the accuracy of the model,and it makes the quantification of the network security situation more reasonable.

Network Representation Learning Algorithm Based on Vulnerability Threat Schema

HUANG Yi, SHEN Guo-wei, ZHAO Wen-bo, GUO Chun

Computer Science. 2020, 47 (7): 292-298. doi:10.11896/jsjkx.190600156

Abstract

PDF(3415KB) ( 1065 )

References | Related Articles | Metrics

Threat intelligence analysis can provide effective attack and defense information for network attack and defense,and fine-grained mining,that is,the relationship between security entities and entities in network threat intelligence data,is a hotspot of network threat intelligence analysis research.Traditional machine learning algorithms,when applied to large-scale network threat intelligence data analysis,face sparse,high-dimensional and other issues,and thus it is difficult to effectively capture network information.To this end,a network representation learning algorithm based on vulnerability threat schema——HSEN2vec for the classification of network security vulnerabilities is proposed.The algorithm aims to capture the structure and semantic information of the heterogeneous security entity network to the maximum extent,and obtains the low-dimensional vector representation of the security entity.In the algorithm,the structural information of the heterogeneous security entity network is obtained based on the vulnerability threat schema,and then modeled by the Skip-gram model,and the effective prediction is performed by the negative sampling technique to obtain the final vector representation.The experimental results show that in the national security vulnerability data,compared with other methods,the learning algorithm proposed in this paper improves the accuracy of vulnerability classification and other evaluation indicators.

Method Based on Traffic Fingerprint for IoT Device Identification and IoT Security Model

YANG Wei-chao, GUO Yuan-bo, LI Tao, ZHU Ben-quan

Computer Science. 2020, 47 (7): 299-306. doi:10.11896/jsjkx.190700199

Abstract

PDF(2338KB) ( 3771 )

References | Related Articles | Metrics

The large-scale deployment of the Internet of Things makes it possible for vulnerable IoT devices to be connected to the network.When an attacker uses a vulnerable device to access the target internal network,it can lurk to wait for an attack.To prevent such attacks,it is necessary to develop a security mechanism for access control of suspicious devices and management of internal devices.Firstly,in order to realize the access control of suspicious devices,a device identification method is given in this paper.By setting a white list,a communication traffic feature fingerprint is constructed,and a random forest method is used to train the device identification model.Secondly,to manage internal devices,an intelligent security management model is proposed to build an ontology threat model based on assets,vulnerabilities and security mechanisms.Finally,the experimental results verify the detection performance of the device recognition model,the recognition accuracy rate is above 96%.Compared with theexisting similar methods,the results prove that the proposed method has better detection stability.

WSN Source-location Privacy Protection Based on Improved Ant Colony Algorithm

GUO Rui, LU Tian-liang, DU Yan-hui, ZHOU Yang, PAN Xiao-qin, LIU Xiao-chen

Computer Science. 2020, 47 (7): 307-313. doi:10.11896/jsjkx.200100056

Abstract

PDF(1991KB) ( 1013 )

References | Related Articles | Metrics

Wireless Sensor Network (WSN) oriented to target monitoring tasks is usually deployed in unsupervised,critical and sensitive environments.The openness of Wireless communication seriously threatens the security of monitoring targets,so it is necessary to effectively protect the location privacy of source nodes.Considering the low computational complexity of ant colony algorithm and its unique advantages in path planning,a source location privacy protection scheme based on improved ant colony algorithm EESLP-ACA (Energy Efficient Source Location Privacy based on Ant Colony Algorithm) is proposed.When the sensor node receives the packet,it will select the forwarding node according to the pheromone concentration and the improved path heuristic content to minimize and balance the network energy consumption.At the same time,by introducing the reference distance and improving the pheromone update mechanism,the possibility of the unselected node becoming the forwarding node is increased,and the repeated dynamic route with low probability is constructed to reduce the number of packets an attacker can receive and increase the difficulty of reverse tracking.The performance analysis shows that it not only effectively improves the performance of ant colony algorithm and makes it better applied in the field of WSN source location privacy protection,but also effectively enhances the security of source location privacy while prolonging the network life cycle and shortening the transmission delay compared with CDR and ELSP schemes.

Anti-eavesdropping Network Coding Based on Quantum GHZ State

XU Guang-xian, CUI Jun-jie

Computer Science. 2020, 47 (7): 314-321. doi:10.11896/jsjkx.190500168

Abstract

PDF(2704KB) ( 1083 )

References | Related Articles | Metrics

The coding efficiency of classical network coding is inefficient during transmission and easy to be bugged.Although Hayashi’s proposal of sharing EPR at the transmitter improves the coding efficiency for information transmission,it can not rea-lize the complete recovery of transmitted information,and the information is not safely disposed,so it is easy to be bugged.To this end,based on the three-particle maximally entangled GHZ state,a quantum network coding scheme is proposed to prevent information being eavesdropped by using the quantum no-cloning theorem and teleportation.Starting from the classical butterfly network coding,the direct product operation is carried out on particles to be sent and the GHZ state particles at the sending end.Then bell measurement is performed on the calculated particles,and the precoding information can be obtained according to the measurement results.The precoding information is transmitted to intermediate nodes,and auxiliary particles used for measurement are introduced at the receiving end.Then,quantum entanglement computation is performed on the received particles,and joint unitary operation is performed on particle clusters.Finally,based on the coding information,corresponding unitary matrix is selected to restore particle clusters.Experimental data shows that the single transmission success rate of GHZ-based quantum network coding has been significantly improved,in terms of encoding efficiency,the cross transmission of quantum information in the butterfly network can be completed with only 6 qubits.In terms of security,the probability of detecting whether the information is bugged has improved.It proves that the proposed scheme improves the coding efficiency of the network coding system and effectively solves the bugging problem of information transmission.

Antenna Selection for Spatial Modulation Based on Physical Layer Security

DING Qing-feng, XI Tao, LIAN Yi-chong, WU Ze-xiang

Computer Science. 2020, 47 (7): 322-327. doi:10.11896/jsjkx.190600133

Abstract

PDF(2130KB) ( 975 )

References | Related Articles | Metrics

For the high complexity of antenna selection algorithm based on optimal secrecy capacity,an improved antenna selection algorithm with low complexity based on the difference of column norm squared is proposed.First,the analytic formula of secrecy capacity is simplified by normalizing the fixed quantity with comparison of the difference between legitimate channel and eavesdropper channel,which is expanded and expressed by channel coefficient.Then the difference of channel coefficient square is traversed and sorted,and the antenna combination with the largest difference of channel norm square is selected.Meanwhile,combining the algorithm with the artificial noise,which is designed at the remaining legitimate channel null space,can obtain the optimal secrecy capacity.The simulation results show that compared with tranditional algorithm,the proposed algorithm can achieve optimal secrecy capacity with low complexity.In addition,the bit error rate of legitimate receiver is maintained and the bit error rate of eavesdropper is restricted maximumly.Meanwhile,the safety performance of the system is enhanced effectively.

Zero-high-resolution Information Hiding Algorithm for 3D Mesh Model

REN Shuai, WANG Meng, FAN Ao-xiong, GAO Ze, XU Jie, Shahzad KHURRAM, ZHANG Tao

Computer Science. 2020, 47 (7): 328-334. doi:10.11896/jsjkx.190800021

Abstract

PDF(3395KB) ( 1174 )

References | Related Articles | Metrics

Aiming at weak anti-analysis of current information hiding algorithm for 3D mesh model,a zero-high-resolution information hiding algorithm was proposed.Firstly,multi-resolution analysis based on the improved half-fold mesh simplification algorithm is manipulated on mesh model to decompose it into High-layer,Mid-layer,Low-layer.Secondly,the feature vectors are extracted by concentric sphere segmentation in High-layer,and the feature points embedded in information are determined by calculating vertex saliency in Mid-layer.Thirdly,the highest bit of the eigenvalue is obtained circularly from the feature vector and then correlatedwith the encrypted information scrambled by Chebyshev to form the associated information.Finally,the associated information is embedded into the DCT transform AC coefficient of the feature point spherical coordinate r value using the piecewise mapping function.The algorithm hides the associated information constructed by feature vector of High-layer and scrambled information intothe affine transform invariant of robust feature points in Mid-layer with less than 15% energy,which is beneficial to the invisibility,robustness and anti-analysis of algorithm.The obvious changes of point and surface features are undetected by the first-order Laplacian smoothing steganalysis method,which means that the proposed algorithm is of good anti-analysis ability.