Computer Science

Survey on High-performance Computing Technology and Standards

LU Pingjing, XIONG Zeyu, LAI Mingche

Computer Science. 2023, 50 (11): 1-7. doi:10.11896/jsjkx.221100021

Abstract

As an indispensable support for knowledge and technological innovation, high-performance computing (HPC) is an important component of the scientific and technological innovation system. In the new era, as an alternative means of scientific research, it is as important as theory and experiment. Over the past thirty years, HPC has made remarkable progress and has entered the era of exascale computing. China has developed rapidly in HPC, with a series of achievements represented by Tianhe, Sunway, and Dawning, and its high-performance system development ranks among the best in the world. With the end of Moore's law, performance gains through semiconductor miniaturization are becoming difficult to achieve. In the post-Moore era, growth in computing performance will increasingly come from software, algorithms, and hardware architecture. Meanwhile, the development of HPC standards still has many deficiencies. This paper analyzes the current status and development trends of HPC technology and standards, reviews the state of the art of current HPC standards, and argues for the necessity and importance of developing national HPC standards.

Acceleration Design and FPGA Implementation of CNN Scene Matching Algorithm

WANG Xiaofeng, LI Chaoran, LU Kunfeng, LUAN Tianjiao, YAO Na, ZHOU Hui, XIE Yujia

Computer Science. 2023, 50 (11): 8-14. doi:10.11896/jsjkx.221100104

Abstract

Compared with traditional methods, the CNN-based scene matching algorithm offers higher matching accuracy, better adaptability, and stronger anti-interference ability. However, its massive computing and storage requirements make it difficult to deploy at the edge. To improve real-time performance, an efficient edge-side acceleration scheme is designed and implemented. Based on an analysis of the computational characteristics and overall architecture of the algorithm, a correlation specific accelerator (CSA) is designed using the Winograd fast convolution method, and an acceleration scheme is proposed in which the CSA and a deep-learning processor unit (DPU) compute the feature correlation layer and the feature extraction network in a pipelined manner. Experiments on Xilinx's ZCU102 development board show that the peak performance of the CSA reaches 576 GOPS, its actual performance reaches 422.08 GOPS, and its DSP usage efficiency reaches 4.5 operations/clock. The peak performance of the acceleration system reaches 1,600 GOPS, and the throughput delay of the algorithm is reduced to 157.89 ms. The results show that the acceleration scheme uses the computing resources of the FPGA efficiently and enables real-time computation of the CNN-based scene matching algorithm.
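
For readers unfamiliar with the Winograd fast convolution method mentioned in the abstract above, the following is a minimal sketch of the standard one-dimensional F(2,3) case, which produces two outputs of a 3-tap filter with four multiplications instead of six. It illustrates the general technique only; the function names are hypothetical, and nothing here is taken from the paper's CSA design, which the abstract does not detail.

```python
def winograd_f23(d, g):
    """Two outputs of a 3-tap filter over four inputs, using 4 multiplications."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform: in practice this is precomputed once per filter.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2
    # Input transform followed by the four element-wise multiplications.
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_f23(d, g):
    """Reference result: sliding dot product with 6 multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


if __name__ == "__main__":
    data, taps = [1.5, 2.0, -3.0, 4.0], [0.5, 2.0, 1.0]
    print(winograd_f23(data, taps), direct_f23(data, taps))
```

The saving comes from moving work into additions and a per-filter precomputation, which is one reason the method tends to suit hardware where multipliers (such as FPGA DSP slices) are the scarce resource.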

Fast Performance Evaluation Method for Processor Design

DENG Lin, ZHANG Yao, LUO Jiahao

Computer Science. 2023, 50 (11): 15-22. doi:10.11896/jsjkx.220900250

Abstract

Faced with increasingly complex processor designs and limited design cycles, every processor design team must find a way to evaluate performance quickly and efficiently. A complete performance test suite takes a long time to run, especially in the pre-silicon validation phase, and this high time cost makes it impractical for the design team to use the full suite for performance evaluation and analysis. This paper proposes Fast-Eval, a general processor performance evaluation method based on the SimPoint technique. By using the Fast Parallel-BBV method, selecting optimal simulation points, and applying thermal migration of the simulation points, it significantly reduces the performance test time and the BBV generation time. Experimental results show that the performance evaluation time of the ARM64 processor is reduced to 16.88% of the original, and the performance evaluation time is reduced to 1.26% of the original, and the average relative error of the performance evaluation results is 0.53%. The average relative error of the test set on the FPGA board can reach 0.40%, and the running time is only 0.93% of the full running time.
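
As background for the SimPoint technique named in the abstract above, the sketch below shows how a whole-program metric is commonly estimated from a few simulation points: each point carries a weight (the fraction of execution its cluster represents), and the estimate is the weighted average of the per-point measurements. This is a generic illustration under that assumption; the function names and sample numbers are hypothetical and are not the paper's Fast-Eval implementation or data.

```python
def estimate_cpi(simulation_points):
    """Weighted-average CPI from (weight, measured_cpi) pairs.

    Weights are normalized here, so they do not need to sum to 1.
    """
    total_weight = sum(weight for weight, _ in simulation_points)
    if total_weight <= 0:
        raise ValueError("simulation point weights must sum to a positive value")
    return sum(weight * cpi for weight, cpi in simulation_points) / total_weight


def relative_error(estimate, reference):
    """Relative error of an estimate against a full-run reference value."""
    return abs(estimate - reference) / abs(reference)


if __name__ == "__main__":
    # Hypothetical points: three clusters covering 50%, 25%, and 25% of execution.
    points = [(0.5, 1.0), (0.25, 2.0), (0.25, 4.0)]
    print(estimate_cpi(points))
```

Only the selected intervals need to be simulated, which is where the time savings of SimPoint-style evaluation come from; the relative error against a full run measures how much accuracy is given up.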

Convergence Analysis of Multigrid Solver for Cahn-Hilliard Equation

GUO Jing, QI Deyu

Computer Science. 2023, 50 (11): 23-31. doi:10.11896/jsjkx.220800030

Abstract

The Cahn-Hilliard (CH) equation is a fundamental nonlinear equation in the phase field model and is usually analyzed using numerical methods. Numerical discretization yields a system of nonlinear equations, for which the full approximation scheme (FAS) is an efficient multigrid iterative solver. The many articles on solving the CH equation focus mainly on the convergence of the numerical scheme and do not address the stability of the solver. This paper establishes the convergence of the multigrid algorithm applied to the nonlinear system obtained from the discrete CH equation, so that the reliability of the calculation process is guaranteed theoretically. For a difference scheme for the CH equation that is second-order in both space and time, we use the fast subspace descent (FASD) framework to estimate the convergence constant of its FAS multigrid solver. First, we transform the original difference problem into a fully equivalent finite element problem and show that this finite element problem arises from the minimization of a convex energy functional. Then we verify that the energy functional and the space decomposition satisfy the assumptions of the FASD framework. Finally, we obtain an estimate of the convergence coefficient of the original multigrid algorithm. The results show that, in the nonlinear case, the parameter ε in the CH equation imposes restrictions on the grid size, and the numerical calculation fails to converge when it is too small. The spatial and temporal accuracy of the numerical scheme is verified by numerical experiments, and the dependence of the convergence coefficient on the equation parameters and grid size is analyzed.

Study on Cross-platform Heterogeneous Parallel Computing for Lattice Boltzmann Multi-phase Flow Simulations Based on SYCL

DING Yue, XU Chuanfu, QIU Haozhong, DAI Weixi, WANG Qingsong, LIN Yongzhen, WANG Zhenghua

Computer Science. 2023, 50 (11): 32-40. doi:10.11896/jsjkx.230300123

Abstract

Heterogeneous parallel architecture is an important technology trend in current high-performance computing. Because different heterogeneous platforms usually support different programming models, developing cross-platform, performance-portable heterogeneous parallel applications is difficult. SYCL is an open standard for single-source, cross-platform parallel programming based on C++. Current research on SYCL focuses mainly on performance comparisons with other parallel programming models, and there are few studies of the different parallel kernel implementations provided in SYCL and their performance optimization. To address this, the open source multi-phase flow simulation software openLBMflow is implemented with the SYCL programming model for cross-platform heterogeneous parallel simulation. Performance optimization methods for SYCL parallel applications are systematically summarized by comparing the basic parallel version, the fine-grained tuned ND-range parallel version, and the method of many-to-one mapping of computation to work-items. The results show that, on an Intel Xeon Platinum 9242 CPU and an NVIDIA Tesla V100 GPU, the basic parallel kernel achieves a speedup of 2.91 on the CPU without additional tuning compared with the optimized OpenMP parallel implementation, indicating the out-of-the-box performance advantage of SYCL. With the basic parallel version as a baseline, the ND-range parallel version achieves up to 1.45x speedup on the CPU and 2.23x speedup on the GPU by changing the work-group size and shape. By changing and optimizing the number and shape of lattices processed per work-item, the many-to-one mapping method achieves up to 1.57x speedup on the CPU and 1.34x speedup on the GPU compared with the basic parallel version. The results show that, to improve performance, SYCL parallel applications are better served by the many-to-one mapping of computation to work-items on the CPU and by ND-range parallel kernels on the GPU.

Geo-sensory Time Series Prediction Based on Joint Model of Auto Regression and Deep Neural Network

DONG Hongbin, HAN Shuang, FU Qiang

Computer Science. 2023, 50 (11): 41-48. doi:10.11896/jsjkx.230500231

Abstract

Geo-sensory time series contain complex and dynamic semantic spatio-temporal correlations and geographic spatio-temporal correlations. Although a variety of deep learning models have been developed for time series prediction, few of them focus on capturing the multiple types of spatio-temporal correlations within geo-sensory time series. In addition, it is challenging to simultaneously predict the future values of multiple sensors at a given time step. To address these challenges, this paper proposes a joint model of autoregression and deep neural network (J-ARDNN) for the multi-objective prediction task of geo-sensory time series. In this model, a spatial module captures the multiple types of spatial correlations between different series, and a temporal module introduces a temporal convolutional network to extract the temporal dependencies within a single series. Moreover, an autoregression model is introduced to improve the robustness of J-ARDNN. To demonstrate the effectiveness of J-ARDNN, the model is evaluated on three real-world datasets from different fields. Experimental results show that the proposed model achieves better prediction performance than state-of-the-art comparison models.

Rumor Detection Model on Social Media Based on Contrastive Learning with Edge-inference Augmentation

LIU Nan, ZHANG Fengli, YIN Jiaqi, CHEN Xueqin, WANG Ruijin

Computer Science. 2023, 50 (11): 49-54. doi:10.11896/jsjkx.221000043

Abstract

In recent years, to deal with the social problems caused by the wide spread of rumors, researchers have developed many deep learning-based rumor detection methods. Although these methods improve detection performance by learning a high-level representation of a rumor from its propagation structure, they still suffer from low reliability and cumulative errors because they ignore the uncertainty of edges when constructing the propagation network. To address this problem, this paper proposes the edge-inference contrastive learning (EICL) model. EICL first constructs a propagation graph for a given message based on the timestamps of its retweets (comments). Then, it augments the event propagation graph with a newly designed edge-weight adjustment strategy to capture the edge uncertainty of the propagation structure. Finally, it employs contrastive learning to alleviate the sparsity of the original dataset and improve model generalization. Experimental results show that, compared with other state-of-the-art baselines, EICL improves accuracy by 2.0% and 3.0% on Twitter15 and Twitter16, respectively, which demonstrates that it can significantly improve the performance of rumor detection on social media.

Parallel Mining Algorithm of Frequent Itemset Based on N-list and DiffNodeset Structure

ZHANG Yang, WANG Rui, WU Guanfeng, LIU Hongyi

Computer Science. 2023, 50 (11): 55-61. doi:10.11896/jsjkx.221000011

Abstract

Frequent itemset mining is a basic problem in data mining and plays an important role in many data mining applications. The parallel frequent itemset mining algorithm MrPrePost suffers in big data environments from degraded efficiency, unbalanced load across computing nodes, and redundant search. To solve these problems, this paper proposes a parallel frequent itemset mining algorithm (PFIMND) based on N-list and DiffNodeset. Firstly, to exploit the respective advantages of the N-list and DiffNodeset data structures, a dataset sparsity estimation function (SE) is designed, and one of the two structures is selected to store the data according to the sparsity of the dataset. Secondly, a computational estimation function (CE) is proposed to estimate the load of each item in the frequent 1-itemset F-list, and the items are grouped evenly according to computational cost. Finally, the set enumeration tree is used as the search space. To avoid combinatorial explosion and redundant search, a superset pruning strategy and a pruning strategy based on breadth-first search are designed to generate the final mining results. Experimental results show that, compared with the similar algorithm HP-FIMND, PFIMND improves frequent itemset mining on the Susy dataset by 12.3%.

Clustering Method Based on Contrastive Learning for Multi-relation Attribute Graph

XIE Zhuo, KANG Le, ZHOU Lijuan, ZHANG Zhihong

Computer Science. 2023, 50 (11): 62-70. doi:10.11896/jsjkx.220900166

Abstract

In the real world, much complex graph data includes multiple relations between nodes; such a graph is called a multi-relation attribute graph. Graph clustering is one approach to mining similar information from graph data. However, most existing graph clustering methods assume that only a single type of relation exists between nodes. Even those that consider multiple relations either use only node attributes for training or treat graph representation learning and clustering as two completely independent processes. Recently, Deep Graph Infomax (DGI) has shown promising results on many downstream tasks, but it has two major limitations. Firstly, DGI does not fully explore the various relations among nodes. Secondly, DGI does not jointly optimize the graph representation learning and clustering tasks, resulting in suboptimal clustering results. To address these problems, this paper proposes a novel framework, called clustering method based on contrastive learning for multi-relation attribute graph (CCLMAG), for learning node embeddings suitable for clustering in an unsupervised way. Specifically, 1) a community-level mutual information mechanism is applied to solve the problem that DGI ignores cluster information; 2) an embedding fusion module is added to aggregate the embeddings of nodes under different relations; 3) a clustering optimization module is added to link graph representation learning and clustering, so that the learned node representation is more suitable for the clustering task, thus enhancing the interpretability of the clustering results. Extensive experimental results on three multi-relation attribute graph datasets and a real-world futures dataset demonstrate the superiority of CCLMAG over state-of-the-art methods.

Study on Short Text Clustering with Unsupervised SimCSE

HE Wenhao, WU Chunjiang, ZHOU Shijie, HE Chaoxin

Computer Science. 2023, 50 (11): 71-76. doi:10.11896/jsjkx.220900214

Abstract

Traditional shallow text clustering methods face challenges when clustering short texts, such as limited context information, irregular word usage, and few words with actual meaning, which result in sparse embedding representations and difficulty in extracting key features. To address these issues, this paper proposes SSKU (SBERT SimCSE Kmeans Umap), a deep clustering model that incorporates simple data augmentation methods. The model uses SBERT to embed short texts and fine-tunes the text embedding model using the unsupervised SimCSE method in conjunction with the deep clustering KMeans algorithm, improving the embedding representation of short texts so that it is suitable for clustering. To mitigate the sparse features of short text and optimize the embedding results, the Umap manifold dimension reduction method is used to learn the local manifold structure. The K-Means algorithm then clusters the dimensionality-reduced embeddings to obtain the clustering results. Extensive experiments are carried out on four publicly available short text datasets, such as StackOverFlow and Biomedical, with comparisons against the latest deep clustering algorithms. The results show that the proposed model exhibits good clustering performance in terms of both accuracy and normalized mutual information.

Graph Clustering Algorithm Based on Node Clustering Complexity

ZHENG Wenping, WANG Fumin, LIU Meilin, YANG Gui

Computer Science. 2023, 50 (11): 77-87. doi:10.11896/jsjkx.230600003

Abstract

Graph clustering is an important task in the analysis of complex networks and can reveal the community structure within a network. However, the clustering complexity of nodes varies throughout the network. To address this issue, a graph clustering algorithm based on node clustering complexity (GCNCC) is proposed. It calculates the clustering complexity of nodes and assigns pseudo-labels to nodes with low clustering complexity. It then uses these pseudo-labels as supervised information to lower the clustering complexity of other nodes and obtain the community structure of the network. The GCNCC algorithm consists of three main modules: node representation, node clustering complexity assessment, and graph clustering. The node representation module represents nodes in a low-dimensional space that preserves their clustering structure. The node clustering complexity assessment module identifies nodes with low clustering complexity and assigns them pseudo-labels, which can be used to update the clustering complexity of other nodes. The graph clustering module uses label propagation to spread the pseudo-labels from nodes with low clustering complexity to those with high clustering complexity. Compared with 9 classic algorithms on 3 real citation networks and 3 biological datasets, the proposed GCNCC performs well in terms of ACC, NMI, ARI, and F1.
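
The label propagation step described in the abstract above can be illustrated with a deliberately simple sketch: nodes that already hold pseudo-labels act as seeds, and in each round every unlabeled node adopts the most common label among its labeled neighbors. This is an assumed, simplified variant for illustration only; the function name, the majority-vote rule, and the toy graph are hypothetical, and the paper's actual propagation rule is not given in the abstract.

```python
from collections import Counter


def propagate_labels(adjacency, seeds, max_iters=10):
    """Spread seed labels outward by majority vote among labeled neighbors.

    adjacency: dict mapping node -> iterable of neighbor nodes
    seeds:     dict mapping node -> pseudo-label (kept fixed)
    """
    labels = dict(seeds)
    for _ in range(max_iters):
        updates = {}
        for node, neighbors in adjacency.items():
            if node in labels:
                continue  # already labeled; labels are never overwritten
            votes = Counter(labels[n] for n in neighbors if n in labels)
            if votes:
                updates[node] = votes.most_common(1)[0][0]
        if not updates:
            break  # nothing left to label, or remaining nodes are unreachable
        labels.update(updates)  # synchronous update at the end of each round
    return labels


if __name__ == "__main__":
    # Toy path graph 0-1-2-3-4-5 with one pseudo-labeled node at each end.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
    print(propagate_labels(path, {0: "A", 5: "B"}))
```

In this variant, labels spread one hop per round, so nodes near confident seeds are labeled first and harder nodes inherit labels later.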

Time-aware Transformer for Traffic Flow Forecasting

LIU Qidong, LIU Chaoyue, QIU Zixin, GAO Zhimin, GUO Shuai, LIU Jizhao, FU Mingsheng

Computer Science. 2023, 50 (11): 88-96. doi:10.11896/jsjkx.221000201

Abstract

As a key part of intelligent transportation systems, traffic flow forecasting faces the challenge of inaccurate long-term prediction, mainly because traffic flow has complicated spatial and temporal correlations. Recently, Transformer has shown promising results in time series analysis. However, there are two obstacles to applying Transformer to traffic flow forecasting: 1) it is difficult for static attention mechanisms to capture the dynamic changes of traffic flow along the space and time dimensions; 2) the autoregressive decoder in Transformer can cause error accumulation. To address these problems, this paper proposes a time-aware Transformer (TAformer) for traffic flow forecasting. Firstly, it proposes a time-aware attention mechanism that customizes the attention calculation according to time features, so as to estimate spatial and temporal dependencies more accurately. Secondly, it discards the teacher forcing mechanism during training and proposes a non-autoregressive inference method to avoid error accumulation. Finally, extensive experiments on two real traffic datasets show that the proposed method can effectively capture the spatial-temporal dependence of traffic flow. Compared with the state-of-the-art baseline method, the proposed method improves long-term prediction performance by 2.09% to 4.01%.

Self-supervised Action Recognition Based on Skeleton Data Augmentation and Double Nearest Neighbor Retrieval

WU Yushan, XU Zengmin, ZHANG Xuelian, WANG Tao

Computer Science. 2023, 50 (11): 97-106. doi:10.11896/jsjkx.230500158

Abstract

Traditional self-supervised methods based on skeleton data often take different data augmentations of a sample as positive examples and regard all other samples as negative examples, which makes the ratio of positive to negative samples seriously unbalanced and limits the usefulness of samples with the same semantic information. To solve these problems, this paper proposes a double nearest neighbor retrieval action recognition algorithm named DNNCLR, in which positive samples are not limited by data augmentation. First, a new joint-level spatial data augmentation, namely Bodypart augmentation, is designed based on the physical connection of human joints. The input skeleton sequence is randomly replaced with a normal distribution array to obtain high-level semantic embedding. Secondly, to avoid the limitation of positive samples by data augmentation, a more reasonable double nearest neighbor retrieval (DNN) positive sample augmentation strategy is proposed, together with a double nearest neighbor retrieval contrastive loss (DNN Loss). Specifically, by using support sets for global retrieval, the search range of the positive sample set is expanded to new data points that cannot be covered by ordinary data augmentation. The negative sample set also contains misjudged positive samples, which are skeleton samples with the same semantic information but from different videos. Nearest neighbor retrieval is therefore applied a second time to find these potential positive examples in the negative sample set and further expand the positive sample set. The resulting contrastive loss forces the model to learn more general feature representations and makes model optimization more reasonable. Finally, the DNNCLR algorithm is applied to the AimCLR model to obtain the AimDNNCLR model, which is evaluated under the linear evaluation protocol on the NTU-RGB+D dataset. Compared with the first line model, the proposed method achieves an average improvement of 3.6% in accuracy.

Community Discovery Algorithm for Attributed Networks Based on Bipartite Graph Representation

ZHAO Xingwang, XUE Jinfang

Computer Science. 2023, 50 (11): 107-113. doi:10.11896/jsjkx.221000226

Abstract

Community discovery in attributed networks is an important research topic in network data analysis. To improve the accuracy of community discovery, most existing algorithms learn a low-dimensional representation of the attributed network by fusing topological and attribute information, and then perform community discovery on the low-dimensional features. Such algorithms, however, typically rely on deep learning models for representation learning, which lack interpretability. Therefore, to improve both the accuracy and the interpretability of community discovery results, this paper proposes a community discovery algorithm for attributed networks based on bipartite graph representation. Firstly, the topological and attribute information of the attributed network is used to calculate the probability of each node serving as a representative point, and a certain proportion of nodes are chosen as representative points. Secondly, based on the topological structure and node attributes, the distance from each node to the representative points is calculated to construct a bipartite graph. Finally, spectral clustering is applied to the bipartite graph to obtain the community discovery result. Experiments on artificial and real attributed networks compare the proposed algorithm with existing algorithms. In terms of evaluation indices such as normalized mutual information and adjusted Rand index, the results show that the proposed algorithm outperforms the existing algorithms.

Road Network Topology-aware Trajectory Representation Learning

CHEN Jiajun, CHEN Wei, ZHAO Lei

Computer Science. 2023, 50 (11): 114-121. doi:10.11896/jsjkx.221000058

Abstract

The approaches developed for the task of trajectory representation learning (TRL) on road networks fall into two categories: sequence models based on recurrent neural networks (RNN) and long short-term memory (LSTM), and learning models based on the self-attention mechanism. Despite the significant contributions of these studies, they still suffer from the following problems. (1) The methods designed for road network representation learning in existing work ignore the transition probability between connected road segments and cannot fully capture the topological structure of the given road network. (2) The self-attention based learning models perform better than sequence models on short and medium trajectories but underperform on long trajectories, as they fail to characterize the long-term semantic features of trajectories well. Motivated by these findings, this paper proposes a new trajectory representation learning model, namely trajectory representation learning on road networks via masked sequence to sequence network (TRMS). Specifically, the model extends the traditional DeepWalk algorithm with a probability-aware walk to fully capture the topological structure of road networks, and then uses the Masked Seq2Seq learning framework and the self-attention mechanism in a unified manner to capture the long-term semantic features of trajectories. Finally, experiments on real-world datasets demonstrate that TRMS outperforms state-of-the-art methods in embedding short, medium, and long trajectories.
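
To make the idea of a probability-aware walk concrete, the sketch below generates a walk over road segments in which the next segment is sampled in proportion to a transition probability instead of uniformly, as plain DeepWalk would do. It is an assumed illustration only: the function name, the `transitions` data layout, and the sample road graph are hypothetical, and the abstract does not specify how TRMS estimates or uses the transition probabilities.

```python
import random


def probability_aware_walk(transitions, start, length, rng=None):
    """Random walk that follows segment-to-segment transition probabilities.

    transitions: dict mapping segment -> {next_segment: probability or count}
    start:       segment where the walk begins
    length:      maximum number of segments in the walk
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        successors = transitions.get(walk[-1])
        if not successors:
            break  # dead end: no outgoing segment
        candidates = list(successors)
        weights = [successors[c] for c in candidates]
        # Sample the next segment in proportion to its transition weight.
        walk.append(rng.choices(candidates, weights=weights, k=1)[0])
    return walk


if __name__ == "__main__":
    # Hypothetical road graph: from s1, drivers turn onto s2 far more often than s3.
    roads = {"s1": {"s2": 0.9, "s3": 0.1}, "s2": {"s4": 1.0}, "s3": {"s4": 1.0}, "s4": {}}
    print(probability_aware_walk(roads, "s1", 4, random.Random(0)))
```

Walks produced this way could then be fed to a skip-gram style embedding model, so that frequently traveled transitions influence the segment embeddings more than rare ones.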

NeuronSup:Deep Model Debiasing Based on Bias Neuron Suppression

NI Hongjie, LIU Jiawei, ZHENG Haibin, CHEN Yipeng, CHEN Jinyin

Computer Science. 2023, 50 (11): 122-131. doi:10.11896/jsjkx.220900169

Abstract

With the wide application of deep learning, researchers need to pay attention not only to the classification performance of a model but also to whether its decisions are fair and credible. A deep learning model with decision bias may cause serious negative effects, so maintaining classification accuracy while improving decision fairness is very important. Many methods have been proposed to improve the individual fairness of models, but they still have shortcomings in debiasing effect, the usability of the debiased model, and debiasing efficiency. To this end, this paper analyzes the abnormal activation of neurons when individual bias is present in a deep model and proposes NeuronSup, a model debiasing method based on the suppression of biased neurons, which significantly reduces individual bias, has little impact on main task performance, and has low time complexity. Specifically, the concept of a bias neuron is first proposed, based on the phenomenon that some neurons in a deep model are abnormally activated due to individual bias. Then, the bias neurons are located using discrimination samples, and the individual bias of the deep model is greatly reduced by suppressing their abnormal activation. The main task performance neurons are determined according to the maximum weight edge of each neuron. By keeping the parameters of the main task performance neurons unchanged, the influence of the debiasing operation on the classification performance of the deep model is reduced. Because NeuronSup debiases only specific neurons in the deep model, its time complexity is lower and its efficiency is higher. Finally, in debiasing experiments on three real datasets with six sensitive attributes and in comparison with five competing algorithms, NeuronSup reduces the individual fairness index THEMIS by more than 50%, while the impact of the debiasing operation on the classification accuracy of the deep model is kept below 3%, which verifies the effectiveness of NeuronSup in reducing individual bias while preserving the classification ability of the deep model.

Natural Noise Filtering Algorithm for Point-of-Interest Recommender Systems

ZHU Jun, HAN Lixin, ZONG Ping, XU Yiqing, XIA Ji’an, TANG Ming

Computer Science. 2023, 50 (11): 132-142. doi:10.11896/jsjkx.230400045

Abstract

The natural noise inherent in the original datasets of recommender systems (RSs) causes error and interference in recommendation algorithms. Existing studies pay more attention to the malicious noise represented by various security attacks, while natural noise, which is more subtle and difficult to deal with, has rarely been documented. Most research on natural noise has been conducted for conventional RSs. However, the data features of point-of-interest (POI) RSs, as well as the causes and forms of their natural noise, differ from those of conventional RSs. To filter the natural noise for POI RSs, a novel natural noise filtering method (NFDC) based on dispersion quantification and clustering distance analysis is proposed. The dispersion of a subset of the original check-in dataset is defined and calculated to indicate the data-driven uncertainty, and the accuracy metric F1 is adopted to represent the prediction-driven uncertainty. The dispersion measures and accuracy metric vectors are empirically categorized to identify the proportion of potential noise. A fuzzy C-means-based denoising algorithm analyzes the similarity of user behavior patterns and then screens potentially noisy points based on clustering distance analysis. A customized rule is designed to further verify and delete the natural noise. Extensive experiments are conducted on two real-world location-based social network datasets, Brightkite and Gowalla. The datasets processed by NFDC and by four other benchmark algorithms are each input into five representative POI recommendation algorithms for comparison. Experimental results show that NFDC effectively filters the natural noise and provides reliable input for RSs. Compared with the highest accuracy achieved with other denoising methods, the accuracy on the NFDC-processed Brightkite and Gowalla datasets is improved by 15.95% and 5.00% on average, respectively.

Transformer Object Detection Algorithm Based on Multi-granularity

XU Fang, MIAO Duoqian, ZHANG Hongyun

Computer Science. 2023, 50 (11): 143-150. doi:10.11896/jsjkx.230600028

Abstract

Unlike objects at other scales, small objects carry less semantic information and have fewer training samples. As a result, current object detection algorithms have low detection accuracy for small objects. To address this problem, a Transformer object detection algorithm based on multi-granularity is proposed. Firstly, following the multi-granularity idea, a new Transformer serialization method is designed to predict the object position from coarse to fine, thereby improving the localization performance of the model. Then, based on the three-way decision idea, fine-grained mining of small object samples and regular-scale object samples increases the number of small object samples and hard negative samples. Finally, experimental results on the COCO dataset show that the small object detection average accuracy (APs) of the algorithm reaches 31.5% and the mean average accuracy (mAP) reaches 49.1%. Compared with the baseline model, APs is improved by 1.4% and mAP by 2.2%. The algorithm effectively improves the detection of small objects and significantly improves the overall accuracy of object detection.

Surface Anomaly Detection Based on Image Reconstruction and Semantic Difference Discrimination

WANG Shangshang, JIN Cheng

Computer Science. 2023, 50 (11): 151-159. doi:10.11896/jsjkx.221100023

Abstract

Reconstruction-based methods are widely used for surface anomaly detection. These methods are expected to reconstruct only normal patterns well, and to detect and localize anomalies by the larger reconstruction error in anomalous areas. Previous methods either tend to “generalize” too well, resulting in high-fidelity reconstruction of anomalies, or measure reconstruction differences in image space, which does not capture the semantic differences. To tackle these problems, this paper proposes a model consisting of a reconstruction network and a discrimination network. In the reconstruction network, we design a multiscale location-augmented dynamic prototype unit to reinforce the learning of normal patterns. In the discrimination network, we fuse the multiscale deep features of the input image and its anomaly-free reconstruction to utilize the multiscale semantic difference information before and after reconstruction, which reinforces the discrimination of semantic differences. On the MVTec dataset, our method reaches 99.5% AUROC in the detection task, and 98.5% AUROC and 95.0% PRO in the localization task, outperforming previous reconstruction-based methods by a large margin.

Deepfake Face Tampering Video Detection Method Based on Non-critical Masks and Attention Mechanism

YU Yang, YUAN Jiabin, CAI Jiyuan, ZHA Keke, CHEN Zhangyu, DAI Jiawei, FENG Yuxiang

Computer Science. 2023, 50 (11): 160-167. doi:10.11896/jsjkx.221100109

Abstract

Since the introduction of Deepfake technology, its illegal use has harmed individuals, society, and national security, and it poses serious hidden dangers. Deepfake detection for face videos has therefore become a hot and difficult problem in computer vision. To address it, this paper proposes a deepfake video detection method based on non-critical masks and the CA_S3D model. First, the face image is divided into key regions and non-critical regions; masking the non-critical regions increases the attention that the deep neural network pays to the key regions and reduces the interference of irrelevant information. Then, a contextual attention module is introduced into the S3D network, which enhances the ability to capture long-range dependencies in the sample data and improves the attention to key channels and features. Experimental results show that the proposed method improves the performance of the deep neural network on the DFDC dataset: the accuracy increases from 83.85% to 90.10%, and the AUC value increases from 0.931 to 0.979. A comparison with existing deepfake video detection methods shows that the proposed method performs better, which verifies its effectiveness.

Robust Video Watermarking Scheme Based on QDCT Global Equalization Strategy

TAO Xinyu, XIONG Lizhi, ZHANG Xiang

Computer Science. 2023, 50 (11): 168-176. doi:10.11896/jsjkx.221000228

Abstract

As a promising technology for copyright protection, video watermarking has attracted more and more attention in recent years. Unlike original-domain schemes, compressed-domain schemes do not need to fully encode and decode the video, so they are more efficient; moreover, video generally has to be compressed and encoded for storage and transmission. Therefore, robust video watermarking in the compressed domain has become a research hotspot. However, most existing compressed-domain schemes embed the watermark in individual QDCT coefficients, which makes the algorithms less robust. To improve the robustness of compressed-domain algorithms, a robust video watermarking scheme based on the QDCT global equalization strategy is proposed in this paper. Firstly, blocks with both texture and high spatial complexity are selected as watermark blocks by using the number of non-zero coefficients, and the sum of all coefficients in each of the two blocks is calculated. According to the coefficient sums and the watermark information, all the non-zero coefficients in the sequence block are modified by the global equalization strategy to satisfy the block-pair coefficient rule, and the watermark is thereby embedded. Experimental results show that the proposed scheme is more robust than existing robust video watermarking schemes against both recompression and noise attacks, improving by 8% and 9% respectively, while ensuring high visual quality of the watermarked video.

Three-dimensional AI Clone Speech Source Identification Method Based on Improved MFCC Feature Model

WANG Xueguang, ZHU Junwen, ZHANG Aixin

Computer Science. 2023, 50 (11): 177-184. doi:10.11896/jsjkx.221000024

Abstract

The emergence of AI cloned voice technology will have a severe impact on the legal order of modern society. In recent years, researchers have focused on AI-synthesized speech whose content is the same as that of the sample speech, while little research has been done on identifying AI-synthesized speech whose content differs from the sample. Thus, this paper proposes a three-dimensional model to identify AI cloned speech sources based on an improved MFCC feature model. Firstly, it verifies the characteristics of AI cloned speech that previous scholars identified by manual analysis, and summarizes the characteristics of “abnormally active formant F5” and “abnormal mutation of energy, formant and pitch curve” for computer identification. Secondly, it uses the second-order difference to correct the MFCC coefficients based on the characteristics of AI cloned speech, uses the “inverse logic deduction method” to further quantify and sample the mutation characteristics of the energy, formant, and pitch curves, and defines them as the feature vector triple for speech recognition. After that, it takes the feature vector triples as input and uses the D-S evidence synthesis rule to fuse the results of comparing the three groups of inspection materials with the samples. Finally, a three-dimensional material evaluation model based on improved MFCC characteristic parameters is formed. In a random sampling experiment on a group of people, the AI clone source identification method achieves an average probability of 67.324% with a standard deviation of 7.32% for identifying AI clones synthesized from the same human clone source, which indicates that the method is effective.
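The D-S evidence synthesis step can be illustrated with Dempster's rule of combination. The following is a minimal sketch, not the paper's implementation: the function name `dempster_combine`, the hypothesis labels, and the mass values are all hypothetical, and the three feature channels (energy, formant, pitch) are assumed to each yield a mass function over the same frame of discernment.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is discarded.
            conflict += mass_a * mass_b
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical evidence from three feature channels.
CLONE, HUMAN = frozenset({"clone"}), frozenset({"human"})
EITHER = CLONE | HUMAN
energy = {CLONE: 0.6, HUMAN: 0.1, EITHER: 0.3}
formant = {CLONE: 0.5, HUMAN: 0.2, EITHER: 0.3}
pitch = {CLONE: 0.4, HUMAN: 0.2, EITHER: 0.4}
fused = dempster_combine(dempster_combine(energy, formant), pitch)
```

Because Dempster's rule is associative, the three channels can be fused pairwise in any order.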

End-to-End Event Coreference Resolution Based on Core Sentence

HUAN Zhigang, JIANG Guoquan, ZHANG Yujian, LIU Liu, DING Kun

Computer Science. 2023, 50 (11): 185-191. doi:10.11896/jsjkx.221000078

Abstract

Most previous event coreference resolution models are pairwise similarity models, which judge whether two events are coreferent by calculating the similarity between them. However, when two event mentions appear close to each other in the document, encoding the contextual representation of one event introduces information from the other, which degrades the performance of the model. To solve this problem, an end-to-end event coreference resolution method based on core sentences (ECR-CS) is proposed. The model automatically extracts event information, constructs a core sentence for each event mention according to a preset template, and uses the core sentence representation instead of the event representation. Since the core sentence contains only the information of a single event, the model can eliminate the interference of other events when encoding the event representation. In addition, limited by the performance of event extraction, the core sentence may lose some important information about the event, so the contextual representation of the event in the document is used to compensate. To supplement the core sentence with the missing information from the context, a gating mechanism is introduced to filter the noise in the contextual representation. Experiments on the ACE2005 dataset show that the CoNLL and AVG scores of ECR-CS improve by 1.76 and 1.04, respectively, compared with the state-of-the-art baseline model.
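The gating idea can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the gate's exact form, so the per-dimension gate `sigmoid(w_core*core + w_ctx*context + bias)` and the additive fusion `core + gate*context` are assumptions, and `gated_fusion` and its parameters are hypothetical names.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(core, context, w_core, w_ctx, bias):
    """Add gated contextual features to a core-sentence representation.

    All arguments are equal-length lists of floats. The gate lies in (0, 1)
    and decides, per dimension, how much context is let through.
    """
    fused = []
    for c, h, wc, wh, b in zip(core, context, w_core, w_ctx, bias):
        gate = sigmoid(wc * c + wh * h + b)
        fused.append(c + gate * h)
    return fused


# With zero weights and zero bias the gate is 0.5 in every dimension.
half_open = gated_fusion([1.0, 2.0], [4.0, -2.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

In a trained model the gate parameters would be learned, so noisy context dimensions would receive gates close to zero.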

Chat Dialogue Summary Model Based on Multi-granularity Contrastive Learning

KANG Mengyao, LIU Yang, HUANG Junheng, WANG Bailing, LIU Shulong

Computer Science. 2023, 50 (11): 192-200. doi:10.11896/jsjkx.230300241

Abstract

The development of social networks brings convenience but also generates massive amounts of chat data, and filtering key information from chat conversations has become a major difficulty. Chat summarization is an effective tool for such problems, as it allows users to quickly obtain important content without having to repeatedly browse lengthy chat records. Currently, pre-trained models are widely used on various types of text, including unstructured, semi-structured, and structured text. However, for chat dialogue text, common pre-trained models often fail to capture its unique structural features, and further exploration and improvement are still needed. To address these issues, this paper proposes a chat summary model, MGCSum, which is based on multi-granularity contrastive learning and does not require manual annotation of the datasets, making it easy to learn and transfer. Firstly, a stop word list for chat text is constructed using document frequency, term frequency and entropy to remove interfering information from chats. Then, self-supervised contrastive learning is performed at the granularity of words and topics to identify the structure of the conversation and uncover keywords and distinct topic information in chats. Experimental results on the publicly available chat summary dataset SAMSum and the financial fraud dialogue summary dataset FINSum show that, compared with current mainstream chat summary methods, this algorithm significantly improves coherence, information content and ROUGE evaluation metrics.

QubitE:Qubit Embedding for Knowledge Graph Completion

LIN Xueyuan, E Haihong, SONG Wenyu, LUO Haoran, SONG Meina

Computer Science. 2023, 50 (11): 201-209. doi:10.11896/jsjkx.221100217

Abstract

Knowledge graph completion fills in a knowledge graph by predicting its missing facts. Quantum-based knowledge graph embedding (KGE) models use variational quantum circuits to score triples by measuring the probability distribution of qubit states, and triples with high scores are taken as the missing facts. However, current quantum-based KGE models either lose the quantum advantage during optimization because the unitary property of the matrices is destroyed, or require a large number of parameters to store quantum states, resulting in overfitting and low performance. Furthermore, these methods ignore the theoretical analysis that is essential for understanding model performance. To solve the performance problem and bridge the theoretical gap, we propose QubitE: entities are embedded as qubits (unit complex vectors), relations are embedded as quantum gates (unitary matrices), the scoring process is complex matrix multiplication, and kernel methods are used for optimization. The parameterization of the model maintains the quantum advantage during optimization, its time and space complexity is linear, and it can even further realize semantic-based quantum logic calculation. In addition, the model can be theoretically proved to have full expressiveness, relational schema reasoning ability, inclusiveness, and other properties, which helps in understanding the model performance. Experiments show that QubitE achieves results comparable to state-of-the-art classical models on some benchmark knowledge graphs.
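The embedding scheme can be pictured with a single qubit. The sketch below is an assumption-laden illustration rather than the QubitE model itself: the Bloch-angle parameterization in `qubit`, the three-angle gate in `gate`, and the squared-overlap `score` are choices made here for clarity, and all names are hypothetical.

```python
import cmath
import math


def qubit(theta, phi):
    """Entity as a unit complex 2-vector given by two Bloch-sphere angles."""
    return [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]


def gate(alpha, beta, gamma):
    """Relation as a 2x2 unitary matrix parameterized by three angles."""
    c, s = math.cos(beta / 2), math.sin(beta / 2)
    return [
        [cmath.exp(-1j * (alpha + gamma) / 2) * c, -cmath.exp(-1j * (alpha - gamma) / 2) * s],
        [cmath.exp(1j * (alpha - gamma) / 2) * s, cmath.exp(1j * (alpha + gamma) / 2) * c],
    ]


def apply(u, q):
    """Complex matrix-vector product u @ q."""
    return [u[0][0] * q[0] + u[0][1] * q[1], u[1][0] * q[0] + u[1][1] * q[1]]


def score(head, relation, tail):
    """Squared overlap between the transformed head and the tail, in [0, 1]."""
    moved = apply(relation, head)
    overlap = tail[0].conjugate() * moved[0] + tail[1].conjugate() * moved[1]
    return abs(overlap) ** 2


h = qubit(0.3, 1.1)
identity_score = score(h, gate(0.0, 0.0, 0.0), h)
```

Because the gate is built from angles, it stays unitary for any parameter values, so gradient updates on the angles cannot break the unit norm of the transformed entity.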

Multi-elite Interactive Learning Based Particle Swarm Optimization Algorithm with Adaptive Bound-handling Technique

XU Jie, ZHOU Xinzhi

Computer Science. 2023, 50 (11): 210-219. doi:10.11896/jsjkx.221000129

Abstract

The particle swarm optimization (PSO) algorithm relies on cooperation between particles, which makes it highly effective on many optimization problems. However, due to its optimization mechanism, particles easily break through the boundary of the feasible region. If this behavior can be given clear guidance during optimization, it will help improve the performance of the algorithm. More importantly, in the original PSO algorithm the particles mainly learn from the global optimal particle. This updating mechanism accelerates the loss of population diversity and makes the population prone to falling into local optima. To further improve population diversity and convergence accuracy on complex problems, an elite interactive learning particle swarm optimization algorithm (A-EIPSO) based on an adaptive strategy is proposed. Firstly, the algorithm introduces a new bound-handling technique into the original PSO algorithm: it adaptively assigns the distribution characteristics of particles in the solution space by using historical location information and the out-of-bounds distance of particles, and modifies particle positions accordingly so that out-of-bounds particles are handled effectively. Then, based on multi-swarm technology, an elite learning strategy is designed to promote the exchange of social information among subswarms, with elite particles instead of the global optimal particle guiding the optimization behavior of particles in each subswarm. Experimental results show that, in most cases, the adaptive strategy ensures uniform exploration of the search space and significantly improves the performance of the PSO algorithm. In addition, A-EIPSO is compared with five advanced PSO variants and two mainstream evolutionary algorithms on the CEC2017 benchmark suite. The results show that A-EIPSO has superior performance on different types of functions, improves the convergence accuracy on most optimization problems, and is superior to other representative PSO variants and evolutionary algorithms.
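For orientation, the sketch below shows the baseline that A-EIPSO modifies: a standard global-best PSO with a simple clamping bound handler. It is not the proposed algorithm; the adaptive bound-handling technique and the multi-elite learning strategy are not reproduced, and the parameter values (`w`, `c1`, `c2`, swarm size) are conventional defaults chosen here as assumptions.

```python
import random


def sphere(x):
    return sum(v * v for v in x)


def pso(objective, dim, lower, upper, particles=20, iters=100, seed=0):
    """Standard global-best PSO; out-of-bounds coordinates are clamped."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social weights
    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
                # Baseline bound handling: clamp to the box and stop the particle.
                if pos[i][d] < lower or pos[i][d] > upper:
                    pos[i][d] = min(upper, max(lower, pos[i][d]))
                    vel[i][d] = 0.0
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best, best_val = pso(sphere, dim=2, lower=-5.0, upper=5.0)
```

Every particle here learns from the single global best, which is the source of the diversity loss that the abstract describes.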

Adaptive Heavy-Ball Momentum Method Based on AdaGrad+ and Its Optimal Individual Convergence

WEI Hongxu, LONG Sheng, TAO Wei, TAO Qing

Computer Science. 2023, 50 (11): 220-226. doi:10.11896/jsjkx.220900131

Abstract

Adaptive strategies and momentum methods are commonly used to improve the performance of optimization algorithms. At present, most adaptive gradient methods use the AdaGrad-type strategy. The AdaGrad+ method, which is better suited to constrained problems, was proposed to address the inefficiency of the AdaGrad-type strategy on constrained optimization. However, like SGD, it does not reach the optimal individual convergence rate in non-smooth convex settings, and combining it with NAG momentum fails to achieve the expected acceleration. To address these problems, this paper proposes an adaptive momentum method based on AdaGrad+. The method uses the AdaGrad+ strategy to adjust the step size and inherits the acceleration of the Heavy-Ball momentum method. It is proved that the method achieves the optimal individual convergence rate for non-smooth convex problems by setting a weighted momentum term, carefully selecting time-varying parameters, and handling the adaptive matrix flexibly. Finally, experiments are conducted on a typical optimization problem, the hinge loss function with an l_∞ norm constraint, and the experimental results verify the correctness of the theoretical analysis. In addition, deep learning experiments confirm that the proposed method also performs well in practical applications.
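The general shape of such an update can be sketched as a projected subgradient step with a per-coordinate adaptive step size plus a Heavy-Ball term. The code below is a simplified stand-in, not the proposed method: it uses plain AdaGrad accumulation with Euclidean clipping onto the l_∞ ball and a constant momentum coefficient `beta`, whereas the paper relies on the AdaGrad+ strategy, a weighted momentum term and time-varying parameters. All names are hypothetical.

```python
import math


def project_linf(x, radius):
    """Project onto the l-infinity ball by clipping each coordinate."""
    return [max(-radius, min(radius, v)) for v in x]


def adaptive_heavy_ball(subgrad, x0, radius, steps, lr=1.0, beta=0.5, eps=1e-8):
    """Projected subgradient method with AdaGrad-style steps and momentum."""
    x_prev, x = list(x0), list(x0)
    sq_sum = [0.0] * len(x0)  # running sum of squared subgradients
    for _ in range(steps):
        g = subgrad(x)
        sq_sum = [a + gi * gi for a, gi in zip(sq_sum, g)]
        x_new = [
            xi - lr * gi / (math.sqrt(a) + eps) + beta * (xi - xp)
            for xi, xp, gi, a in zip(x, x_prev, g, sq_sum)
        ]
        x_prev, x = x, project_linf(x_new, radius)
    return x


def example_subgrad(x):
    # Subgradient of the non-smooth convex function |x0 - 3| + |x1 + 3|.
    return [math.copysign(1.0, x[0] - 3.0), math.copysign(1.0, x[1] + 3.0)]


solution = adaptive_heavy_ball(example_subgrad, [0.0, 0.0], radius=1.0, steps=50)
```

On this toy problem the constrained minimizer is the corner (1, -1) of the unit l_∞ ball, which the iterates reach after a few steps.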

Deep Hashing-based Retrieval Framework for KBQA

LIU Shuo, ZHOU Gang, LI Zhufeng, WU Hao

Computer Science. 2023, 50 (11): 227-233. doi:10.11896/jsjkx.220900206

Abstract

Question answering over knowledge bases usually involves three sub-tasks: topic entity recognition, entity linking and relation detection. Given that a knowledge base usually contains an enormous number of entities and relationships, previous approaches rely on sophisticated rules and inverted indexes to retrieve candidate items. In this paper, a new retrieval framework for question answering over knowledge bases is proposed to address the limited search space, low recall and difficulty of incorporating semantic information in previous approaches. The framework consists of a text retrieval module and a hash retrieval module. A cascaded retrieval model, which combines traditional text retrieval with hash retrieval that preserves semantic information, is constructed by recalling twice. Experiments on the datasets provided by KgCLUE and NLPCC2016 demonstrate that this deep hashing-based retrieval framework acquires high-quality candidates efficiently and accesses the knowledge base easily at limited time cost.
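The hash retrieval stage can be illustrated by ranking candidates by Hamming distance between binary codes. This is a minimal sketch under assumptions: sign thresholding in `to_code` stands in for the learned deep hash function, and the candidate names and embedding values are made up for the example.

```python
def to_code(vec):
    """Binarize a real-valued embedding into an integer bit code by sign."""
    code = 0
    for v in vec:
        code = (code << 1) | (1 if v > 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


def retrieve(query_vec, index, k):
    """Return the k candidate names whose codes are closest to the query."""
    q = to_code(query_vec)
    ranked = sorted(index.items(), key=lambda item: hamming(q, item[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical index of candidate relations with 3-bit codes.
index = {
    "born_in": to_code([1.0, 1.0, 1.0]),
    "capital_of": to_code([1.0, -1.0, 1.0]),
    "author_of": to_code([-1.0, 1.0, -1.0]),
}
top2 = retrieve([0.9, -0.2, 0.4], index, k=2)
```

Comparing short binary codes costs one XOR and a bit count per candidate, which is what keeps the second recall stage cheap.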

Bayesian Rule-based Knowledge Completion with Hierarchical Attention

SHAN Xiaohuan, ZHAO Xue, CHEN Tingwei

Computer Science. 2023, 50 (11): 234-240. doi:10.11896/jsjkx.221000056

Abstract

As a part of artificial intelligence in the big data era, knowledge graphs are widely used in many fields, but they generally suffer from incompleteness and sparsity. As a sub-task of knowledge acquisition, knowledge completion aims to predict missing links from known triples in the knowledge base. However, existing knowledge completion methods generally ignore the auxiliary role of entity types combined with neighborhood information, which can improve completion accuracy. Other problems remain: feature information is tightly coupled to the objective function, and integration operations depend heavily on the training process. To this end, a Bayesian rule-based knowledge completion method with hierarchical attention is proposed. Firstly, it regards entity type and neighborhood information as hierarchical structures grouped by relation, and calculates the attention weights of each type of information independently. Then the encoding of entity types and neighborhood information is regarded as the prior probability, the encoding of instance information is regarded as the likelihood, and the two are combined according to the Bayesian rule. Experimental results show that the mean reciprocal rank (MRR) on the FB15k dataset improves by 14.4% over ConvE and 10.7% over TKRL, and the MRR on the FB15k-237 dataset improves by 2.1% over TACT. On the FB15k, FB15k-237 and YAGO26K-906 datasets, its Hits@1 reaches 77.5%, 73.8% and 95.1% respectively, which demonstrates that introducing hierarchically structured type and neighborhood information embeds richer and more accurate descriptive information for entities and thus improves the accuracy of knowledge completion.
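The combination step follows the usual form of the Bayesian rule: the posterior over candidate entities is proportional to the prior times the likelihood. A minimal sketch, assuming both encodings have already been turned into non-negative scores per candidate (the function name and the numbers are hypothetical):

```python
def bayes_combine(prior, likelihood):
    """Posterior over candidates, proportional to prior * likelihood."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    if total == 0:
        raise ValueError("prior and likelihood have no common support")
    return [j / total for j in joint]


# Prior from type/neighborhood information, likelihood from instance information.
posterior = bayes_combine(prior=[0.2, 0.5, 0.3], likelihood=[0.7, 0.2, 0.1])
```

Keeping the two factors separate until this final product is what allows each encoder to be computed independently of the other.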

Attention Based Concept Enhanced Cognitive Diagnosis

YUAN Dongxue, SUN Quansen, FU Peng

Computer Science. 2023, 50 (11): 241-247. doi:10.11896/jsjkx.221100169

Abstract

Cognitive diagnosis is a fundamental problem in intelligent education systems, which aims to evaluate students' mastery levels of different knowledge concepts. Although the performance of current deep learning-based cognitive diagnosis methods has improved greatly compared with traditional methods, they cannot fully exploit the potential correlations between concepts. To this end, this paper proposes an attention-based concept enhanced cognitive diagnosis (ACECD) model that obtains more accurate diagnostic results by modeling the relationships between related concepts. Specifically, we first project students, exercises, and concepts to factor vectors to perform complex interactions, and then feed the concept factors into a self-attention network to capture the implicit correlations between concepts, so that the concept factor vectors are enhanced with the captured implicit relations. Finally, the enhanced concept factors interact with the student factors and the exercise factors, and the results are input into the diagnosis module to obtain the final diagnosis. In addition, we use the interaction between the exercise factors and the concept factors to correct the bias of the manually labeled Q matrix. The proposed model is compared with other methods on two real-world datasets, and the experimental results show that the ACECD model effectively improves the diagnostic results.

Bidirectional Learning Equilibrium Optimizer Combining Sparrow Search and Random Difference

HOU Xinyu, LU Haiyan, LU Mengdie, XU Jie, ZHAO Jinjin

Computer Science. 2023, 50 (11): 248-258. doi:10.11896/jsjkx.221100143

Abstract

To address the low solution accuracy and slow convergence of the equilibrium optimizer, a bidirectional learning equilibrium optimizer combining sparrow search and random difference is presented. Firstly, an adaptive population division strategy based on the sparrow search algorithm is proposed to balance the global exploration and local exploitation of the algorithm, so as to improve its convergence accuracy and convergence speed. Secondly, a random difference strategy is introduced to reconstruct the equilibrium pool and increase the information exchange between individuals, which helps the algorithm jump out of local optima. Finally, a bidirectional chaotic opposition learning strategy is designed and applied to the updated population to increase population diversity and hence further improve the convergence accuracy of the algorithm. Simulation experiments are conducted with 14 test functions, the performance of the algorithm is evaluated using the Wilcoxon rank-sum test and mean absolute error, and the improved algorithm is applied to two engineering design problems. Experimental results show that the three improvement strategies are effective and that the convergence accuracy, convergence speed and robustness of the improved algorithm are significantly enhanced.

Similarity and Consistency by Self-distillation Method

WAN Xu, MAO Yingchi, WANG Zibo, LIU Yi, PING Ping

Computer Science. 2023, 50 (11): 259-268. doi:10.11896/jsjkx.221000009

Abstract

To address the high data pre-processing costs and missed local feature detection of self-distillation methods for model compression, a similarity and consistency by self-distillation (SCD) method is proposed to improve model classification accuracy. Firstly, the sample images are processed by different layers to obtain feature maps, and attention maps are obtained from the distribution of feature weights. Then, the similarity of the attention maps between samples within the mini-batch is calculated to obtain the similarity consistency knowledge matrix. This similarity consistency-based knowledge is constructed without distorting the instance data or extracting data of the same class to obtain additional inter-instance knowledge, which avoids a large amount of data pre-processing work. Finally, the similarity consistency knowledge matrix is passed unidirectionally between intermediate layers of the model, allowing shallow layers to mimic deep layers and capture richer contextual scenes and local features, which solves the problem of missed local feature detection. Experimental results show that the proposed SCD method improves the classification accuracy on the public dataset CIFAR100. Compared with the self attention distillation (SAD) method and the similarity-preserving knowledge distillation (SPKD) method, the average improvement is 1.42%. Compared with the be your own teacher (BYOT) method and the on-the-fly native ensemble (ONE) method, the average improvement is 1.13%. Compared with the data-distortion guided self-distillation (DDGSD) method and the class-wise self-knowledge distillation (CS-KD) method, the average improvement is 1.23%.
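The similarity matrix over a mini-batch can be sketched as below. The abstract does not specify the similarity measure or the loss, so cosine similarity between flattened attention maps and a mean squared difference between the shallow and deep matrices are assumptions made for illustration; the function names are hypothetical.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(attention_maps):
    """Pairwise similarity between the flattened attention maps of a batch."""
    return [[cosine(a, b) for b in attention_maps] for a in attention_maps]


def consistency_loss(shallow_maps, deep_maps):
    """Mean squared gap between shallow-layer and deep-layer similarity matrices."""
    shallow = similarity_matrix(shallow_maps)
    deep = similarity_matrix(deep_maps)
    n = len(shallow)
    return sum((shallow[i][j] - deep[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)


# Three samples, each with a flattened 2-element attention map.
deep_batch = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
shallow_batch = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
loss = consistency_loss(shallow_batch, deep_batch)
```

The matrix has one row per sample regardless of feature map size, so layers with different resolutions can be compared without resizing.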

Multi-ship Coordinated Collision Avoidance Decision Based on Improved Twin Delayed Deep Deterministic Policy Gradient

HUANG Renxian, LUO Liang, YANG Meng, LIU Weiqin

Computer Science. 2023, 50 (11): 269-281. doi:10.11896/jsjkx.221000131

Abstract

At present, most collision avoidance models treat each ship as a single agent when making collision avoidance decisions, without considering coordinated avoidance between ships. In multi-ship encounter scenarios, relying on single ships leads to poor avoidance. Therefore, this paper proposes a softmax deep double deterministic policy gradients (SD3) multi-ship cooperative collision avoidance model based on an improved twin delayed deep deterministic policy gradient (TD3). A time collision model and a space collision model are constructed to quantitatively analyze the ship collision risk based on the time and space factors of ship navigation safety. On this basis, a ship domain model based on the collision situation and the dynamic change of the ship speed vector is used to qualitatively analyze the ship collision risk. The reward function is designed using the constraints of ship objective guidance, course angle change, course keeping, collision risk and the international regulations for preventing collisions at sea (COLREGs). Combined with the typical encounter situations in COLREGs, collision avoidance simulations are carried out for scenes in which head-on, overtaking and crossing encounters coexist. An ablation experiment shows that the softmax operator improves the performance of the SD3 algorithm, giving it better decision-making in coordinated ship collision avoidance, and the algorithm is compared with other reinforcement learning algorithms in terms of learning efficiency and learning effect. Experimental results show that the SD3 algorithm makes accurate collision avoidance decisions and outperforms other reinforcement learning algorithms in complex multi-situation encounter scenarios.
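The softmax operator mentioned in the ablation can be illustrated on a finite set of sampled action values: each value is weighted by `exp(beta * q)`, so the result moves from the mean (at `beta = 0`) towards the maximum as `beta` grows. This is a sketch of the operator only, with hypothetical names, and it does not reproduce how the paper samples actions or trains its critics.

```python
import math


def softmax_value(q_values, beta):
    """Softmax-weighted average of Q-values with inverse temperature beta."""
    q_max = max(q_values)
    # Shift by the maximum for numerical stability; the result is unchanged.
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


sampled_q = [1.0, 2.0, 3.0]
mean_like = softmax_value(sampled_q, beta=0.0)
max_like = softmax_value(sampled_q, beta=50.0)
```

Intermediate values of `beta` give an estimate between the mean and the hard maximum.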

Research Developments of 5G Network Slicing

TIAN Chenjing, XIE Jun, CAO Haotong, LUO Xijian, LIU Yaqun

Computer Science. 2023, 50 (11): 282-295. doi:10.11896/jsjkx.221100044

Abstract

As the key enabling technology for fifth-generation communication networks and beyond, network slicing (NS) has received a surge of attention and recognition from network operators and academia for its promising abilities in vertical industry customization, quality of service (QoS) assurance, isolation, flexibility, and reliability. In recent years, many institutions have presented their understanding of and development plans for NS through special reports or white papers. However, these works have varying focuses, and the terminology has not been standardized, which hampers researchers' overall grasp of the NS picture. To facilitate researchers' understanding of the developmental context, technology architecture, management and orchestration, and other relevant aspects of NS, this paper presents a comprehensive review of recent related work. First, it provides an overview of NS by examining its historical background, definitions, and key characteristics. Subsequently, it discusses end-to-end NS realization in three components, namely access NS, carrier NS, and core NS. For each component, the network architecture developments, technological breakthroughs, and standardization achievements of recent years are presented and analyzed. Afterward, network slice management and orchestration is introduced, and the relevant research is discussed according to the slicing scenario. Finally, in view of the development requirements and practical dilemmas of NS, several open research problems are identified.

RFID Multi-tag Relative Location Method Based on RSSI Sequence Features

HE Yong, GUO Zhengxin, GUI Linqing, SHENG Biyun, XIAO Fu

Computer Science. 2023, 50 (11): 296-305. doi:10.11896/jsjkx.230300165

Abstract

High-precision indoor multi-target localization technology is crucial for implementing customized intelligent services. Currently, indoor localization based on radio frequency identification (RFID) has received extensive attention from both academia and industry due to advantages such as low cost, easy deployment, and multi-target sensing. However, traditional RFID-based multi-target relative localization systems require multiple receiving antennas for data transmission and reception, leading to high deployment costs. Additionally, the received signal strength indication (RSSI) sequence also suffers from data interruptions. To address these problems, this paper proposes an RFID multi-tag relative localization method based on the features of RSSI sequences. The method first uses uniformly moving antennas to obtain the RSSI sequences of multiple target tags. Then, the received RSSI sequence data is pre-processed to fill in missing data, and a sequence similarity measurement table is constructed based on cosine similarity. Finally, different tag grouping algorithms are designed along multiple group dimensions to achieve relative localization of multiple RFID tags. In a large number of relative localization tests on a typical indoor multi-group RFID tag array, experimental results show that the proposed method has an average accuracy of over 92% for RFID tag relative localization, and the average localization calculation time for a 5×5 antenna array is less than 1 s. Compared with other relative localization works, the computational efficiency of this method is improved by nearly 10 times.
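The pre-processing and similarity steps can be sketched as follows. The abstract does not say how missing readings are filled, so linear interpolation (with nearest-value filling at the ends) is an assumption here, and the function names and RSSI values are hypothetical.

```python
import math


def fill_gaps(seq):
    """Fill missing readings (None) by linear interpolation between neighbors."""
    known = [i for i, v in enumerate(seq) if v is not None]
    if not known:
        raise ValueError("sequence has no valid readings")
    filled = list(seq)
    for i, v in enumerate(seq):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = seq[right]  # leading gap: copy the first valid reading
        elif right is None:
            filled[i] = seq[left]   # trailing gap: copy the last valid reading
        else:
            t = (i - left) / (right - left)
            filled[i] = seq[left] + t * (seq[right] - seq[left])
    return filled


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical RSSI sequences (dBm) of two tags as the antenna moves past them.
tag_a = fill_gaps([-60.0, None, -50.0, -55.0])
tag_b = fill_gaps([-61.0, -56.0, None, -54.0])
similarity = cosine_similarity(tag_a, tag_b)
```

Computing this similarity for every pair of tags yields the kind of measurement table that a grouping algorithm can then operate on.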

Adaptive Model Quantization Method for Intelligent Internet of Things Terminal

WANG Yuzhan, GUO Bin, WANG Hongli, LIU Sicong

Computer Science. 2023, 50 (11): 306-316. doi:10.11896/jsjkx.230300078

Abstract

With the rapid development of deep learning and the Internet of Everything, the combination of deep learning and mobile terminal devices has become a major research hotspot. While deep learning improves the performance of terminal devices, deploying models on resource-constrained terminal devices faces many challenges, such as the limited computing and storage resources of the devices and the inability of deep learning models to adapt to changing device context. We focus on resource-adaptive quantization of deep models. Specifically, a resource-adaptive mixed-precision model quantization method is proposed. It first uses a gated network and a backbone network to construct the model, partitions the model at layer granularity to find the best quantization policy, and combines edge devices to reduce the resource consumption of the model. To find the optimal model quantization policy, FPGA-based deep learning model deployment is adopted. When the model needs to be deployed on resource-constrained edge devices, adaptive training is performed according to the resource constraints, and a quantization-aware method is adopted to reduce the accuracy loss caused by model quantization. Experimental results show that our method reduces the storage space by 50% while retaining 78% accuracy, and reduces the energy consumption by 60% on the FPGA device with no more than 2% accuracy loss.
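Mixed-precision quantization assigns a different bit width to each layer. The sketch below shows the basic building block under assumptions: symmetric uniform quantization with one scale per layer, and a hand-picked bit assignment. It does not reproduce the gated network that selects the policy, and the names and values are hypothetical.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization of one layer to `bits` bits (bits >= 2).

    Returns the dequantized weights and the scale that was used.
    """
    q_max = 2 ** (bits - 1) - 1
    peak = max(abs(w) for w in weights)
    scale = peak / q_max if peak > 0 else 1.0
    return [round(w / scale) * scale for w in weights], scale


def model_size_bits(layer_sizes, layer_bits):
    """Storage cost of a model whose layers use different bit widths."""
    return sum(n * b for n, b in zip(layer_sizes, layer_bits))


layer = [0.5, -1.0, 0.25]
dequantized, scale = quantize(layer, bits=8)
# Two layers with 100 and 200 weights, quantized to 8 and 4 bits respectively.
mixed = model_size_bits([100, 200], [8, 4])
full = model_size_bits([100, 200], [32, 32])
```

The rounding error of each weight is at most half the scale, so layers that tolerate larger error can be given fewer bits.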

Efficient Distributed Training Framework for Federated Learning

FENG Chen, GU Jingjing

Computer Science. 2023, 50 (11): 317-326. doi:10.11896/jsjkx.221100224

Abstract

PDF(3035KB) ( 2555 )

References | Related Articles | Metrics

Federated learning effectively solves the problem of isolated data islands, but some challenges remain. First, the training nodes in federated learning are highly heterogeneous in hardware, which affects training speed and model performance. Existing research mainly focuses on federated optimization, but most methods do not solve the resource waste caused by the different computing times of the nodes in synchronous communication mode. In addition, most training nodes in federated learning are mobile devices, so poor network conditions lead to high communication overhead and serious network bottlenecks. Existing methods reduce the communication overhead by compressing the gradients uploaded by the training nodes, but they inevitably degrade model performance, and it is difficult to achieve a good balance between quality and speed. To solve these problems, for the computing stage, this paper proposes adaptive federated averaging (AFA), which adaptively coordinates the number of local iterations according to the hardware performance of each node, minimizes the idle time spent waiting for the global gradient download, and improves the computational efficiency of federated learning. For the communication stage, it proposes double sparsification (DS), which minimizes the communication overhead through gradient sparsification on both the training nodes and the parameter server. In addition, each training node compensates for the error according to the lost values of the local gradient and the global gradient, greatly reducing the communication cost in exchange for only a small loss of model performance. Experimental results on an image classification dataset and a spatio-temporal prediction dataset show that the proposed method effectively improves the training speedup ratio and also benefits model performance.
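
The idea of sparsifying a gradient while compensating for the values that were dropped can be illustrated with a generic top-k scheme with error feedback. This sketch covers only the training-node side and is not the paper's DS algorithm, whose details the abstract does not give; `topk_sparsify`, `ErrorFeedbackWorker`, and the choice of top-k selection are assumptions.

```python
def topk_sparsify(grad, k):
    """Keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]

class ErrorFeedbackWorker:
    """Training node that remembers what sparsification dropped and
    adds it back to the next gradient (generic error feedback)."""

    def __init__(self, dim, k):
        self.residual = [0.0] * dim   # values lost in earlier rounds
        self.k = k

    def compress(self, grad):
        # Compensate with the accumulated residual before sparsifying.
        corrected = [g + r for g, r in zip(grad, self.residual)]
        sparse = topk_sparsify(corrected, self.k)
        # Whatever was not sent becomes the new residual.
        self.residual = [c - s for c, s in zip(corrected, sparse)]
        return sparse
```

Only the non-zero entries of the returned vector need to be uploaded, so with `k` much smaller than the gradient dimension the per-round traffic shrinks accordingly, while small entries are delayed rather than discarded.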

Joint Layered Message Passing Detection for Multi-user Large-scale LDPC-SM-MIMO System

ZOU Xin, ZHANG Shunwai

Computer Science. 2023, 50 (11): 327-332. doi:10.11896/jsjkx.220900103

Abstract

PDF(2547KB) ( 2562 )

References | Related Articles | Metrics

Message passing detection (MPD) is the most commonly used detection algorithm in multi-user large-scale spatial modulation multi-input multi-output (SM-MIMO) systems, but the traditional MPD algorithm still has high complexity. To overcome this problem, the layered MPD (LMPD) algorithm is used to accelerate convergence. Then, low-density parity-check (LDPC) codes are combined with SM-MIMO systems, and a joint LMPD-belief propagation (JLMPD-BP) algorithm, in which the LMPD can use the feedback information of BP decoding, is proposed to further improve the detection performance of the system. Theoretical analysis and simulation results show that, compared with the traditional MPD algorithm, the LMPD algorithm converges faster without losing bit error rate (BER) performance. For example, when the signal-to-noise ratio (SNR) is 4 dB, the LMPD algorithm needs only 2 iterations, while the MPD algorithm needs 3 iterations. At the same time, thanks to the great advantages of LDPC codes, the JLMPD-BP algorithm greatly reduces the BER of the system. When the iteration number is (2,2,2) and SNR = 2 dB, compared with the LMPD-BP algorithm with iteration number (4,4,0), the BER of the JLMPD-BP algorithm decreases from 10^-2 to 5×10^-3.

Design of Dynamic S-box Based on Anti-degradation Chaotic System and Elementary Cellular Automata

ZHAO Geng, GAO Shirui, MA Yingjie, DONG Youheng

Computer Science. 2023, 50 (11): 333-339. doi:10.11896/jsjkx.220900026

Abstract

PDF(3091KB) ( 1417 )

References | Related Articles | Metrics

The S-box is the basic non-linear module of most block cipher algorithms and provides the confusion and diffusion they require. To improve the security of chaotic S-boxes, this paper uses an anti-degradation chaotic system to generate the S-box elements and elementary cellular automata to generate the S-box lookup table. The anti-degradation chaotic system prevents the Skew Tent system from entering a fixed point and eliminates the short-period behavior that the system exhibits at low precision. Because elementary cellular automata operate on the binary domain and are discrete in both time and space, they can be applied to chaotic block ciphers without having to consider dynamical degradation. Under globally chaotic rules, if the number of cells is large enough, the pseudorandomness of the output can be guaranteed. Using elementary cellular automata to generate the lookup table for the S-box not only ensures the confusion principle of S-box design, but also simplifies the steps of S-box generation. Finally, security analysis and comparison of the designed S-box show that the S-box generated by the proposed method has good security, satisfies the confusion and diffusion principles of block ciphers, and can be used in the design of chaotic block cipher algorithms.
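
For readers unfamiliar with the two building blocks named above, the following sketch shows the standard skew tent map and one update step of an elementary cellular automaton. It illustrates only the textbook definitions; the paper's anti-degradation construction and its S-box generation procedure are not described in the abstract and are not reproduced here. The function names and the periodic boundary condition are assumptions.

```python
def skew_tent(x, p):
    """Standard skew tent map on [0, 1] with parameter 0 < p < 1."""
    return x / p if x < p else (1.0 - x) / (1.0 - p)

def eca_step(cells, rule):
    """One synchronous update of an elementary cellular automaton.

    cells: list of 0/1 values, treated as a ring (assumed boundary).
    rule:  Wolfram rule number 0..255, e.g. 30 for a chaotic rule.
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right   # neighborhood as 3 bits
        out.append((rule >> index) & 1)               # look up the rule bit
    return out
```

Note that `skew_tent(0.0, p)` returns `0.0`, a fixed point, which is the kind of degenerate orbit a finite-precision implementation has to avoid.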

VPN Traffic Hijacking Defense Technology Based on Mimic Defense

GAO Zhen, CHEN Fucai, WANG Yawen, HE Weizhen

Computer Science. 2023, 50 (11): 340-347. doi:10.11896/jsjkx.221000091

Abstract

PDF(2459KB) ( 1416 )

References | Related Articles | Metrics

VPN technology can effectively guarantee the confidentiality and integrity of communication traffic. However, the blind in/on-path traffic hijacking attack that has emerged in recent years exploits VPN protocol rules to inject forged messages into encrypted tunnels, which seriously threatens the security of VPN technology. To counter such threats, this paper proposes a VPN traffic hijacking defense technology based on mimic defense and designs a mimic VPN architecture (Mimic VPN, M-VPN). The architecture consists of a tuner and a node pool containing multiple heterogeneous VPN encryption and decryption nodes. First, the tuner dynamically selects several encryption and decryption nodes, according to their credibility, to process the encrypted traffic in parallel. Then the processing results of the selected nodes are judged comprehensively. The decision result is used as the basis for the response message and for updating the credibility. By comparing the responses from different nodes, the architecture effectively prevents attackers from injecting forged packets. Experimental simulation shows that, compared with the traditional VPN architecture, M-VPN can reduce the success rate of blind in/on-path attacks by about 12 orders of magnitude.
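
The select, judge, and update-credibility loop described above can be sketched as follows. This is only an illustration under stated assumptions: credibility-weighted random selection, strict-majority voting, and the `reward` and `penalty` constants are not taken from the paper, and `select_nodes` and `adjudicate` are hypothetical names.

```python
import random
from collections import Counter

def select_nodes(credibility, n, rng):
    """Pick n distinct nodes, favoring higher credibility (assumed policy)."""
    pool = dict(credibility)
    chosen = []
    for _ in range(min(n, len(pool))):
        nodes = list(pool)
        # Small epsilon keeps zero-credibility nodes selectable at all.
        weights = [pool[node] + 1e-6 for node in nodes]
        pick = rng.choices(nodes, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]
    return chosen

def adjudicate(outputs, credibility, reward=0.05, penalty=0.2):
    """Return the strict-majority output, or None if there is none.

    outputs:     {node_id: processing result of that node}
    credibility: {node_id: score in [0, 1]}, updated in place
    """
    winner, votes = Counter(outputs.values()).most_common(1)[0]
    if votes * 2 <= len(outputs):
        return None                      # no agreement: drop the message
    for node, result in outputs.items():
        if result == winner:
            credibility[node] = min(1.0, credibility[node] + reward)
        else:
            credibility[node] = max(0.0, credibility[node] - penalty)
    return winner
```

Under this sketch a forged packet is accepted only if it produces the same result on a majority of the heterogeneous nodes chosen for that message, which is the intuition behind the reduced attack success rate.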

USPS:User-space Cross Protocol Proxy System for Efficient Collaboration of Computing Power Resources

XIA Jingxuan, SHEN Guowei, GUO Chun, CUI Yunhe

Computer Science. 2023, 50 (11): 348-355. doi:10.11896/jsjkx.230300171

Abstract

PDF(3061KB) ( 1599 )

References | Related Articles | Metrics

With the rapid development of computing power networks, computing power resources such as general computing power, artificial intelligence computing power, and supercomputing are widely distributed. Collaborative service of computing power resources is a key issue in computing power network research. Computing power resource collaboration must, on the one hand, cope with the highly concurrent requests and low-latency response requirements of massive terminal computing power services; on the other hand, it is difficult to give full play to the high-throughput and low-latency advantages of computing power resources in the data center, and therefore difficult to provide efficient computing power services to users. To address these challenges, a user-space proxy system (USPS) based on a user-space protocol stack and remote direct memory access (RDMA) technology is proposed. The user-space protocol stack is used to respond to clients' highly concurrent computing power requests, and high-throughput, low-latency data center computing power services based on RDMA are realized in coordination with a dynamic batch processing strategy. In terms of communication, USPS implements an efficient remote procedure call (RPC) communication mechanism, which can make full use of the RDMA NIC bandwidth and provide high-speed message communication. In terms of request processing, a dynamic batch processing scheduling method is proposed, which maximizes batch processing efficiency while meeting users' delay requirements. Experiments show that the service response latency of USPS is only 7.8%~23.1% of that of the traditional kernel-space Nginx proxy system, and 17.3%~24.7% of that of other user-space proxy systems. The throughput is 3.4~8.9 times higher than that of the traditional kernel-space Nginx proxy system, and 3.2~4.2 times higher than that of other user-space proxy systems.
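
One simple way to "maximize batch efficiency while meeting the delay requirement" is to pick the largest batch whose estimated service time still lets the oldest queued request finish before its deadline. The sketch below assumes a linear service-time model (`base_ms + per_item_ms * size`); that model, the parameter names, and the function `choose_batch_size` are illustrative assumptions, not the scheduling method of USPS.

```python
def choose_batch_size(waits_ms, deadline_ms, base_ms, per_item_ms, max_batch):
    """Largest batch size that keeps the oldest request within its deadline.

    waits_ms:    time each queued request has already waited, oldest first
    deadline_ms: end-to-end latency budget per request
    base_ms, per_item_ms: assumed linear cost model for one batch
    Returns 0 when even a batch of one would miss the deadline.
    """
    best = 0
    limit = min(len(waits_ms), max_batch)
    for size in range(1, limit + 1):
        service_ms = base_ms + per_item_ms * size
        if waits_ms[0] + service_ms <= deadline_ms:
            best = size
        else:
            break   # service time only grows with size
    return best
```

A return value of 0 signals that the oldest request can no longer meet its budget; a real scheduler would have to decide whether to dispatch it immediately anyway or report a violation.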

Backdoor Defense of Horizontal Federated Learning Based on Random Cutting and Gradient Clipping

XU Wentao, WANG Binjun

Computer Science. 2023, 50 (11): 356-363. doi:10.11896/jsjkx.221200005

Abstract

PDF(3046KB) ( 1541 )

References | Related Articles | Metrics

Federated learning is a methodology that resolves the conflict between user privacy and data sharing in big data, and realizes the concept of "data is invisible but available". However, the federated model is at risk of backdoor attacks during training. The attacker trains an attack model containing a backdoor task locally, and amplifies the model parameters by a certain proportion to implant the backdoor into the federated model. To counter the backdoor threat in the training process of horizontal federated learning, this paper proposes, from the perspective of game theory, a backdoor defense strategy and technical scheme based on the combination of random cutting and gradient clipping. After receiving the gradients from the participants, the central server randomly chooses neural network layers from each participant and aggregates the gradient contributions of the participants layer by layer. Then, the central server clips the gradient parameters according to a gradient threshold. Gradient clipping and random cutting weaken the influence of abnormal data from a minority of participants. The federated model enters a plateau state when it learns backdoor features, so that it keeps failing to learn them without affecting the learning of the target task. If the central server completes federated learning during the plateau state, it can defend against backdoor attacks. Experimental results show that the proposed method can effectively defend against potential backdoor threats in federated learning while preserving the accuracy of the model. Therefore, it can be applied in horizontal federated learning scenarios to provide security protection for federated learning.
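
A rough sketch of server-side aggregation that combines random per-layer selection with gradient clipping is shown below. The abstract leaves the details open, so several choices here are assumptions: for each layer the server samples `keep` participants at random, averages their gradients for that layer, and clips the result by its L2 norm. The names `clip_by_norm` and `aggregate` are hypothetical.

```python
import math
import random

def clip_by_norm(vec, threshold):
    """Scale vec down so that its L2 norm does not exceed threshold."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= threshold:
        return list(vec)
    scale = threshold / norm
    return [v * scale for v in vec]

def aggregate(updates, keep, threshold, rng):
    """Layer-by-layer aggregation with random selection and clipping.

    updates: one dict per participant, {layer_name: gradient as a list}
    keep:    number of participants sampled for each layer (assumed knob)
    """
    result = {}
    for layer in updates[0]:
        chosen = rng.sample(range(len(updates)), keep)
        size = len(updates[0][layer])
        mean = [sum(updates[p][layer][j] for p in chosen) / keep
                for j in range(size)]
        result[layer] = clip_by_norm(mean, threshold)
    return result
```

Because an attacker's amplified update is included for only some layers and the aggregate is bounded by `threshold`, a single malicious participant cannot dominate the global update in this sketch.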

FL_Raft:Election Consensus Programme Based on Federated Learning Model

RONG Baojun, ZHENG Zhaohui

Computer Science. 2023, 50 (11): 364-373. doi:10.11896/jsjkx.221000134

Abstract

PDF(2780KB) ( 1735 )

References | Related Articles | Metrics

To address the low throughput, high consensus delay, and low security caused by vote splitting and frequent leader changes of the Raft consensus algorithm in heterogeneous clusters, a Raft election consensus programme based on a federated learning model, FL_Raft, is proposed. First, federated learning aggregation runs after each leader iteration, invokes the local characteristic data of the nodes, and filters out high-performance node groups through the federated learning training model. Second, a behavior-based equity calculation model is established to dynamically adjust the equity value of each node in the cluster. Finally, an equity election model is established to elect the quasi-leader node, which becomes the final leader node after all nodes vote. Experimental results show that, while ensuring the data privacy of each node, FL_Raft reduces the election delay by 50% compared with Raft, achieves a leader reliability of more than 95%, reduces the consensus delay by 20%, and increases the throughput by 13%. The FL_Raft consensus algorithm ensures the efficiency and security of the election, and improves the stability of the cluster and the availability of leaders.

ZUC High Performance Data Encryption Scheme Based on FPGA

ZHANG Bolin, LI Bin, YAN Yunfei, WEI Yuanxin, ZHOU Qinglei

Computer Science. 2023, 50 (11): 374-382. doi:10.11896/jsjkx.221100070

Abstract

PDF(2131KB) ( 1604 )

References | Related Articles | Metrics

The ZUC algorithm is a stream cipher independently developed by China, which has been adopted by 3GPP LTE as a fourth-generation mobile communication encryption standard. To meet the high performance requirements that the big data era places on domestic cryptographic algorithms, a high-performance data encryption scheme with the ZUC algorithm as its core is designed. The scheme includes two encryption algorithm cores with different structures. Targeting the two application scenarios of short messages and long messages respectively, semi-pipelined and fully pipelined ZUC stream cipher circuit structures are designed on the FPGA platform using CLA and CSA adders. With the improved ZUC encryption mode, combined with high-speed memory communication and multi-IV parallel encryption, the high-performance encryption scheme is realized, which greatly improves encryption and decryption efficiency. When the scheme is running, the encryption algorithm can be configured through the control module. Experimental results show that, compared with other schemes, the working frequency of the proposed algorithm is increased by 40.8%~209.5% and 62.1%~445.4% respectively, and the data throughput reaches 25.728 Gb/s and 46.08 Gb/s, meeting the needs of high-performance encryption scenarios such as edge devices and Internet of Vehicles data encryption.