Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 52 Issue 3, 15 March 2025
  
CONTENTS
3D Vision and Metaverse
Survey of Metaverse Technology Development and Applications
CAO Mingwei, ZHANG Di, PENG Shengjie, LI Ning, ZHAO Haifeng
Computer Science. 2025, 52 (3): 4-16.  doi:10.11896/jsjkx.241000095
The metaverse is a virtual world that integrates virtual reality, augmented reality, artificial intelligence, and internet technologies, offering not only a digital immersive environment but also a new paradigm for social interaction and economic activity. With the active investment of technology giants and innovative companies, the rapid development of the metaverse has garnered widespread attention across various fields. This paper reviews the origins, technological foundations, current applications, and socio-economic impacts of the metaverse, while also examining the privacy and security challenges it faces. By providing a comprehensive analysis of different aspects of the metaverse, this paper aims to offer a framework for understanding and exploring this complex frontier and to serve as a reference for future research and practice.
Survey on 3D Scene Reconstruction Techniques in Metaverse
SONG Xingnuo, WANG Congyan, CHEN Mingkai
Computer Science. 2025, 52 (3): 17-32.  doi:10.11896/jsjkx.241000043
With the development of various technologies such as virtual reality (VR), augmented reality (AR), blockchain, and artificial intelligence (AI), the metaverse is gradually being applied in many fields such as gaming, education, healthcare, and business. As the core technology of the metaverse, 3D reconstruction technology has attracted attention due to its extremely high research value and wide application prospects. Traditional 3D reconstruction techniques perform poorly in processing metaverse tasks characterized by real-time interactivity, with significant room for improvement in computational efficiency and reconstruction model accuracy. Therefore, how to optimize 3D reconstruction technology, improve accuracy and robustness, and provide users with a more realistic and real-time interactive experience has become a current research hotspot. This paper tracks and summarizes the 3D reconstruction techniques based on scene generation in the metaverse in recent years. Firstly, we review the development history of the metaverse, point out the challenges faced by 3D reconstruction technology, and propose solutions based on two different 3D representations. Then, 3D reconstruction techniques based on 3D Gaussian and Neural Radiance Field (NeRF) representations are sorted out separately. Next, the innovative fusion methods of 3D reconstruction technology with tactile signals and large language models are mainly analyzed. Finally, the challenges faced by scene-based 3D reconstruction technology in the metaverse are discussed in detail, and corresponding future research directions are proposed.
LpDepth: Self-supervised Monocular Depth Estimation Based on Laplace Pyramid
CAO Mingwei, XING Jingjie, CHENG Yifeng, ZHAO Haifeng
Computer Science. 2025, 52 (3): 33-40.  doi:10.11896/jsjkx.240800069
Self-supervised monocular depth estimation has attracted widespread attention from researchers both domestically and abroad. Existing self-supervised monocular depth estimation methods based on deep learning mainly use encoder-decoder structures. However, these methods perform down-sampling operations on the input image during the encoding process, resulting in the loss of some image information, particularly boundary information, which leads to degradation of the accuracy of the estimated depth map. To address this issue, this paper proposes a new self-supervised monocular depth estimation method based on the Laplacian pyramid. Specifically, the method enriches the encoded features using Laplacian residual images, compensates for the loss of information during down-sampling, and highlights and amplifies features during the down-sampling process using maximum-pooling layers, which facilitates feature extraction for model training by the encoder. The method also leverages residual modules to mitigate potential overfitting issues and improve the decoder's efficiency in feature utilization. Finally, we test the proposed method on benchmark datasets such as KITTI and Make3D and compare its performance with state-of-the-art methods, with experimental results demonstrating the effectiveness of the proposed method.
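As a minimal illustration of the Laplacian-residual idea described above (this is not the authors' implementation; the function and tensor names are hypothetical), the sketch below builds per-level high-frequency residual images that an encoder could consume to compensate for detail lost during down-sampling:

```python
import torch
import torch.nn.functional as F

def laplacian_pyramid(image: torch.Tensor, levels: int = 4):
    """Return per-level Laplacian residuals for a (B, C, H, W) image batch."""
    residuals = []
    current = image
    for _ in range(levels):
        down = F.avg_pool2d(current, kernel_size=2)                    # coarser level
        up = F.interpolate(down, size=current.shape[-2:],
                           mode="bilinear", align_corners=False)       # back to current size
        residuals.append(current - up)                                 # high-frequency residual
        current = down
    return residuals, current                                          # residuals + coarsest image

if __name__ == "__main__":
    x = torch.rand(1, 3, 128, 416)          # KITTI-like input resolution
    res, low = laplacian_pyramid(x)
    print([r.shape for r in res], low.shape)
```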
Selective Feature Fusion for 3D CT Image Segmentation of Renal Cancer Based on Edge Enhancement
WANG Tao, BAI Xuefei, WANG Wenjian
Computer Science. 2025, 52 (3): 41-49.  doi:10.11896/jsjkx.240300091
Aiming at the problems of multi-scale lesion areas, sparse edge pixels, low contrast, and complex, irregular tumor shapes in 3D CT images of renal cancer, this paper proposes a selective feature fusion 3D CT image segmentation network based on edge enhancement (EE-SFF U-Net). EE-SFF U-Net adopts the symmetric encoder-decoder architecture of U-Net, and its encoding path contains an edge enhancement module for strengthening edge information, which can effectively mine and utilize shallow feature information to alleviate the sparsity of edge pixels and avoid missed detection of small targets. In addition, a selective feature fusion module is designed in the skip connections of the network to make the deep and shallow features complement each other and realize the effective aggregation of different information. Finally, a hybrid loss function combining Generalized Dice Loss and Focal Loss is proposed. A dynamic weight adjustment strategy is used to realize optimal training of the loss function and to mitigate the impact of multi-scale lesions and irregular tumor shapes and sizes. The proposed method not only ensures the accuracy of the overall localization of the lesion area, but also strengthens the mining and utilization of small-target feature information, thereby improving the accuracy and robustness of segmentation. Experimental results on the KiTS19 public dataset show that the proposed method performs well on various indexes and significantly improves segmentation performance compared with other segmentation algorithms.
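The hybrid loss described above can be pictured with the following sketch, which combines a Generalized Dice term and a Focal term for 3D volumes under a simple linear weight schedule; the schedule `alpha = epoch / max_epoch` and all names are illustrative assumptions, not the paper's actual dynamic weighting strategy:

```python
import torch
import torch.nn.functional as F

def generalized_dice_loss(probs, target_onehot, eps=1e-6):
    # class weights inversely proportional to the squared class volume
    w = 1.0 / (target_onehot.sum(dim=(0, 2, 3, 4)) ** 2 + eps)
    inter = (probs * target_onehot).sum(dim=(0, 2, 3, 4))
    union = (probs + target_onehot).sum(dim=(0, 2, 3, 4))
    return 1.0 - 2.0 * (w * inter).sum() / ((w * union).sum() + eps)

def focal_loss(logits, target, gamma=2.0):
    logp = F.log_softmax(logits, dim=1)
    logp_t = logp.gather(1, target.unsqueeze(1)).squeeze(1)    # log-prob of true class
    p_t = logp_t.exp()
    return -((1 - p_t) ** gamma * logp_t).mean()               # down-weight easy voxels

def hybrid_loss(logits, target, epoch, max_epoch):
    """Linearly shift emphasis from region overlap (Dice) to hard voxels (Focal)."""
    num_classes = logits.shape[1]
    probs = F.softmax(logits, dim=1)
    onehot = F.one_hot(target, num_classes).permute(0, 4, 1, 2, 3).float()
    alpha = epoch / max_epoch                                  # hypothetical schedule
    return (1 - alpha) * generalized_dice_loss(probs, onehot) + alpha * focal_loss(logits, target)

# example: logits (B, C, D, H, W), target (B, D, H, W) integer labels
loss = hybrid_loss(torch.randn(2, 3, 16, 32, 32), torch.randint(0, 3, (2, 16, 32, 32)), epoch=10, max_epoch=100)
```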
Animatable Head Avatar Reconstruction Algorithm Based on Region Encoding
WANG Jie, WANG Chuangye, XIE Jiucheng, GAO Hao
Computer Science. 2025, 52 (3): 50-57.  doi:10.11896/jsjkx.240200060
Traditional head avatar reconstruction methods are mostly based on 3D Morphable Models (3DMM), which, while convenient for animating, cannot represent non-rigid structures like hair. Recently, head avatar approaches based on the neural radiance field have achieved impressive visual results but suffer from shortcomings in animation and training efficiency. To address these issues, monocular videos are used as raw data, and a dynamically expanding point cloud is utilized to construct an animatable virtual head avatar. The point cloud can be rapidly rendered into images by rasterization, significantly reducing training time. In terms of texture representation, color is decoupled into albedo and shading, with shading further decomposed into normals and a combination of region features obtained through sparse encoding of points, resulting in more precise textures. However, the inherent discreteness of point clouds can lead to holes. Therefore, a normal smoothing strategy is employed to enhance texture continuity, successfully eliminating texture holes in regions like teeth and tongue. Extensive experiments on multiple subjects show that, compared to state-of-the-art head avatar construction algorithms such as IMavatar, PointAvatar, NerFace, and StyleAvatar, the animatable avatars constructed from point clouds, combined with region encoding and the normal smoothing strategy, exhibit an average improvement of 3.41% on the PSNR metric. Ablation experiments show that the PSNR metric is improved by approximately 3.50% and 3.44% over variants without region encoding and without the normal smoothing strategy, respectively.
Talking Portrait Synthesis Method Based on Regional Saliency and Spatial Feature Extraction
WANG Xingbo, ZHANG Hao, GAO Hao, ZHAI Mingliang, XIE Jiucheng
Computer Science. 2025, 52 (3): 58-67.  doi:10.11896/jsjkx.240300030
Audio-driven talking portrait synthesis endeavors to convert arbitrary input audio sequences into realistic talking portrait videos. Recently, several works on synthesizing talking portraits leveraging neural radiance fields (NeRF) have achieved superior visual results. However, such works still generally suffer from poor audio-lip synchronization, torso jitter, and low clarity in the synthesized videos. To address these issues, a method based on regional saliency features and spatial volume features is proposed to achieve high-fidelity synthesis of talking portraits. On one hand, a regional saliency-aware module is developed, dynamically adjusting the volumetric attributes of spatial points in the head region with multimodal input data and optimizing feature storage through hash tables, thus improving the precision and efficiency of facial detail representation. On the other hand, a spatial feature extraction module is designed for independent torso modeling. Unlike conventional methods that estimate color and density directly from torso surface spatial points, this module constructs a torso field using reference images to provide relevant texture and geometric priors, thereby achieving more precise torso rendering and natural movements. Experiments on multiple subjects demonstrate that, in self-reconstruction scenarios, the proposed method improves image quality (PSNR, LPIPS, FID, LMD) by 10.15%, 12.12%, 0.77%, and 1.09% respectively, and enhances lip-sync accuracy (AUE) by 14.20% compared to the current state-of-the-art baseline model. Concurrently, there is a notable increase of 14.20% in lip synchronization accuracy as measured by the Sync metric. Under cross-driving conditions with out-of-domain audio sources, lip synchronization accuracy improves by 4.74%.
Multi-view Multi-person 3D Human Pose Estimation Based on Center-point Attention
JIANG Yiheng, LI Yang, LIU Chunyan, ZHAO Yunlong
Computer Science. 2025, 52 (3): 68-76.  doi:10.11896/jsjkx.240600063
Multi-view multi-person 3D human pose estimation is widely used in various computer vision tasks. Current spatial voxel-based methods struggle to achieve real-time computation on edge computing devices due to their huge resource consumption, while regression-based methods have limited generalization ability due to the lack of geometric constraints: in a new environment they cannot be applied directly and require collecting data for fine-tuning. By combining the spatial voxel method and the regression-based pose estimation method, we propose a multi-view multi-person 3D human pose estimation model based on center-point attention regression. The model roughly estimates the position of the human body center through a small-scale voxel network and constructs the initial pose based on it. Regression prediction is then carried out within the range of the human body center point to obtain a more accurate human pose. In this study, by incorporating the spatial key point positions, the regression prediction of the model becomes more accurate, and the average accuracy is improved by 1.16% on large scales. At the same time, the model is very easy to train, and the accuracy is improved by up to 12% in small-sample fine-tuning. This allows regression-based models to be rapidly deployed in new scenarios with small amounts of training data, greatly expanding their generalization performance and versatility.
3D Reconstruction of Single-view Sketches Based on Attention Mechanism and Contrastive Loss
ZHONG Yue, GU Jieming
Computer Science. 2025, 52 (3): 77-85.  doi:10.11896/jsjkx.240200102
The metaverse is a three-dimensional (3D) virtual space that is immersive and interconnected. With the development of technologies such as virtual reality and artificial intelligence, the metaverse is reshaping human lifestyles. 3D reconstruction is a core technique for the metaverse, and deep learning-based 3D reconstruction has become a popular research direction in computer vision. To address the problems of inevitable foreground and background ambiguity, drawing style variations, and viewpoint differences in hand-drawn sketches, a single-view sketch 3D reconstruction model based on attention mechanisms and contrastive losses is proposed that requires no additional annotations or user interactions. The model first rectifies the spatial layout of the input sketch using spatial transformers, and then uses a normalized attention module to establish long-distance and multi-level dependencies on the sketch. The global structure information of the sketch is used to alleviate the reconstruction difficulty caused by the ambiguity of the foreground and background. Furthermore, a contrastive loss function is designed to encourage the model to learn view-invariant and style-invariant latent space features of the sketches, so as to improve robustness. Experimental results on multiple datasets demonstrate the effectiveness and superiority of the proposed model.
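The view- and style-invariance objective mentioned above is in the spirit of standard contrastive losses; a minimal InfoNCE-style sketch (an illustrative stand-in, not the paper's exact loss function) over paired sketch embeddings could look like this:

```python
import torch
import torch.nn.functional as F

def info_nce(z_a, z_b, temperature=0.1):
    """z_a, z_b: (N, d) embeddings of the same objects drawn in two different
    sketch styles / viewpoints; matching rows are treated as positive pairs."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature              # (N, N) similarity matrix
    labels = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, labels)            # pull positives together, push others apart
```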
Triplet Interaction Mechanism in Cross-view Geo-localization
ZHOU Bowen, LI Yang, WANG Jiabao, MIAO Zhuang, ZHANG Rui
Computer Science. 2025, 52 (3): 86-94.  doi:10.11896/jsjkx.240500020
Cross-view geo-localization refers to inferring the geographical location from images of different viewpoints, which is usually viewed as an image retrieval task. However, most existing methods neglect global position information and feature completeness, which prevents the model from capturing deep semantic information. Additionally, current two-dimensional interaction methods do not fully utilize the relationships between dimensions, leading to insufficient cross-dimensional interaction. To address these issues, this paper designs a triplet interaction mechanism for cross-view geo-localization. The method uses ConvNeXt as the feature extraction network, followed by the proposed triplet interaction mechanism for feature enrichment operations. Finally, a joint loss function is utilized to guide model training. The method performs multiple dimensional interactions within the model, reducing the information loss that occurs in two-dimensional feature projection. The triplet interaction mechanism applies different attention mechanisms in three channels, making the model robust to translations, scaling, and rotations across different cross-view images. Experimental results demonstrate that the proposed method significantly outperforms other methods for both drone-view localization and drone navigation tasks on the University-1652 dataset.
Study on Active Privacy Protection Method in Metaverse Gaze Communication Based on Split Federated Learning
LUO Zhengquan, WANG Yunlong, WANG Zilei, SUN Zhenan, ZHANG Kunbo
Computer Science. 2025, 52 (3): 95-103.  doi:10.11896/jsjkx.240500038
In the rapidly evolving metaverse, gaze interaction has emerged as a pivotal mode of communication. However, gaze data encompasses more than mere gaze orientation and ocular mobility. It can also be applied for identification and recognition of soft biometrics, including age, gender, and ethnicity. Furthermore, it has the potential to disclose an individual's emotion, cognitive processes, and decision-making patterns. Given its sensitive nature, the development of robust gaze data privacy protection mechanisms has become imperative, attracting considerable interest. Additionally, numerous gaze-driven applications necessitate specific privacy attributes for functional support, yet active selection and protection of gaze privacy remains unexplored in current research. To this end, this study initially conducts hierarchical and quantitative analyses to uncover the severe state of gaze privacy breaches. Subsequently, it introduces an innovative gaze privacy safeguarding framework that integrates federated learning with split learning, significantly mitigating leakage risks. Moreover, this research proposes an active privacy protection strategy employing adversarial training and the information bottleneck technique, which ensures targeted privacy filtration alongside enhancements in model generalization. Comprehensive experiments confirm that the devised APSFGaze approach excels in both privacy protection and performance. This study offers a novel pathway and technological framework for privacy preservation in metaverse gaze interactions.
3D Object Detection with Dynamic Weight Graph Convolution
LI Zongmin, RONG Guangcai, BAI Yun, XU Chang, XIAN Shiyang
Computer Science. 2025, 52 (3): 104-111.  doi:10.11896/jsjkx.240700041
3D object detection is one of the most critical technologies in autonomous driving, and LiDAR-based 3D object detection is usually carried out in scenes constructed from point clouds. Current methods cannot fully use the point cloud's structural information, leading to false and missed target detections. To solve this problem, we propose DEG R-CNN, which is based on dynamically weighted graph convolution. Firstly, primary and subordinate neighbours are set for each node in the RoI, and the graph structure of the point cloud is constructed, restoring the geometric information of the object. Then, Gaussian and 1D convolutions are used in the graph to efficiently aggregate the point cloud's structural features. Finally, a cross-attention mechanism adaptively fuses image features of different granularities to supplement the image semantic information. Experiments are conducted on the KITTI dataset, and the effectiveness of the modules is verified. The 3D mAP of the method reaches 88.80%, which is 1.22% higher than that of the baseline model. At the same time, the 3D object detection results are visualized and analyzed in detail to better understand the performance and accuracy of the method.
Database & Big Data & Data Science
Survey on Deep Learning-based Meteorological Forecasting Models
WANG Yuan, HUO Peng, HAN Yi, CHEN Tun, WANG Xiang, WEN Hui
Computer Science. 2025, 52 (3): 112-126.  doi:10.11896/jsjkx.240900095
Accurate and timely weather forecasting is crucial for people's livelihoods, environmental ecology, and military decision-making, attracting extensive attention and focused research from various sectors. Numerical weather prediction (NWP) is currently the mainstream forecasting method. Over long-term development, the accuracy and reliability of NWP have continuously improved. However, it still faces significant challenges, such as unavoidable systematic errors, ineffective utilization of historical observation data, and substantial computational costs. With the rapid rise of deep learning, data-driven artificial intelligence methods are gradually being applied to the field of weather forecasting, offering novel techniques to overcome these challenges. Against this backdrop, this paper comprehensively summarizes the current research status of NWP and deep learning-based weather forecasting. It systematically reviews the relevant concepts and input data for deep learning-based weather forecasting models, thoroughly explains representative models applied to various weather forecasting tasks, and provides a detailed comparison of the technical architectures and performance metrics of different models. Additionally, it analyzes and discusses the existing challenges and the future directions in this field. The ultimate purpose of this survey is to provide reference information for related research.
Study on Data Entry Transaction and Trusted Circulation System Construction Based on Multi-agent Evolutionary Game Equilibrium Model
ZHANG Lili, ZHANG Zheng
Computer Science. 2025, 52 (3): 127-136.  doi:10.11896/jsjkx.240200003
Compared with traditional production factors, the non-exclusivity, non-competitiveness, and non-standardization of data factors determine that a trustworthy on-exchange trading system is the key to achieving sustainable development of the data factor market. Guiding both parties to enter the market for trading is a breakthrough in addressing the current lack of a reliable data element circulation system, insufficient data flow, and inactive on-exchange trading in China. By constructing an evolutionary game model that includes the government, data providers, and data demanders, and conducting policy simulation analysis, it is found that government measures such as increasing the penalty amount for over-the-counter transactions, strengthening the punishment intensity for over-the-counter transactions, reducing entry transaction costs, and increasing entry transaction profits can help promote the implementation of entry trading behavior by both parties. However, policy variables such as the subsidy incentive, the entry incentive or the disciplinary intensity of OTC transactions, and the OTC transaction income of the data demander are not significant in promoting the entry transactions of the relevant subjects. The influence of these variables can be ignored in the design of the policy mechanism.
Risk Minimization-Based Weighted Naive Bayesian Classifier
OU Guiliang, HE Yulin, ZHANG Manjing, HUANG Zhexue, Philippe FOURNIER-VIGER
Computer Science. 2025, 52 (3): 137-151.  doi:10.11896/jsjkx.240600045
The naive Bayesian classifier (NBC), famous for its sound theoretical basis and simple model structure, is a classical classification algorithm that has been deemed one of the top 10 algorithms in the fields of data mining and machine learning. However, the attribute independence assumption of NBC limits its prediction performance when attribute dependence exists. The weighted NBC (WNBC) is an improved version of NBC with good generalization performance and low training complexity. This paper proposes a risk minimization-based WNBC (RM-WNBC) that considers both empirical risk and structural risk, in which the empirical risk measures the classification performance of RM-WNBC and the structural risk depicts its dependence expression capability. Unlike existing improvements to NBC, RM-WNBC alleviates the independence assumption and further enhances the generalization capability of NBC by considering the internal characteristics of NBC rather than its external characteristics. The empirical risk is represented by the estimation quality of posterior probabilities, while the structural risk is represented by the mean squared error of joint probabilities. The minimization of empirical and structural risks guarantees that RM-WNBC achieves both good classification performance and appropriate dependence representation. To obtain the optimal weights of marginal probabilities, an efficient and convergent updating strategy is designed by minimizing the empirical and structural risks. A series of persuasive experiments is conducted to validate the feasibility, rationality, and effectiveness of RM-WNBC on 31 benchmark data sets. The experimental results show that the optimization process of the RM-WNBC weights is convergent and that RM-WNBC not only deals well with attribute dependence but also obtains better training and testing accuracies than the classical NBC, three typical Bayesian networks, four WNBCs, and a feature selection-based NBC.
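For intuition about where the weights act, the following is a generic weighted naive Bayes prediction sketch (illustrative only; the RM-WNBC weight optimization via empirical and structural risk minimization is not shown, and all names are hypothetical):

```python
import numpy as np

def wnbc_predict(x, priors, cond_probs, weights):
    """
    x          : 1-D array of discrete attribute values (indices)
    priors     : (C,) class priors P(c)
    cond_probs : list of (C, V_j) arrays, P(x_j = v | c) for attribute j
    weights    : (J,) attribute weights w_j applied as exponents
    Returns the class maximizing  log P(c) + sum_j w_j * log P(x_j | c).
    """
    log_post = np.log(priors)
    for j, v in enumerate(x):
        log_post += weights[j] * np.log(cond_probs[j][:, v])
    return int(np.argmax(log_post))
```

In RM-WNBC the weights would come from the risk-minimizing update strategy described above rather than being set by hand.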
Federated Learning Evolutionary Multi-objective Optimization Algorithm Based on Improved NSGA-III
HU Kangqi, MA Wubin, DAI Chaofan, WU Yahui, ZHOU Haohao
Computer Science. 2025, 52 (3): 152-160.  doi:10.11896/jsjkx.240600014
Federated learning is a novel distributed machine learning method that can train models without sharing raw data. Current federated learning methods find it difficult to optimize multiple objectives simultaneously, such as model accuracy, communication costs, and the balance of the participants' performance distribution. A four-objective optimization model and a solution algorithm for federated learning are proposed to address this problem. Data usage cost, global model error rate, model accuracy distribution variance, and communication cost are taken as the optimization objectives to construct the optimization model. Since the solution space of this model is large and the traditional NSGA-III algorithm struggles to find the optimal solution, the improved NSGA-III federated learning multi-objective optimization algorithm GPNSGA-III, based on the good point set initialization strategy, is proposed to find the Pareto optimal solution. The algorithm uniformly distributes the limited initial population in the objective solution space through the good point set initialization strategy, so that the first-generation solutions are maximally close to the optimal values and the ability to find the optimum is improved compared with the original algorithm. Experimental results show that the hypervolume value of the Pareto solution obtained by the GPNSGA-III algorithm is improved by 107% on average compared with the NSGA-III algorithm; the Spacing value is reduced by 32.3% on average compared with the NSGA-III algorithm; and compared with other multi-objective optimization algorithms, GPNSGA-III is more effective in achieving model accuracy while guaranteeing the balance of model accuracy distribution variance, communication cost, and data cost.
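One common good point set construction, shown below as an assumed stand-in for the paper's initialization step (the prime-based generating vector is a standard variant of the method, not necessarily the exact one used), spreads the initial population uniformly over the decision space:

```python
import numpy as np

def good_point_set(n, dim, lower, upper):
    """Generate n initial individuals in [lower, upper]^dim via a good point set."""
    def next_prime(m):                       # smallest prime >= m
        c = m
        while True:
            if all(c % k for k in range(2, int(c ** 0.5) + 1)):
                return c
            c += 1
    p = next_prime(2 * dim + 3)
    r = 2 * np.cos(2 * np.pi * np.arange(1, dim + 1) / p)   # generating vector
    k = np.arange(1, n + 1).reshape(-1, 1)
    points = np.mod(k * r, 1.0)                             # fractional parts in [0, 1)
    return lower + points * (upper - lower)                 # map to the decision space

# example: 100 individuals in a 4-dimensional unit hypercube
pop = good_point_set(100, 4, np.zeros(4), np.ones(4))
```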
Negative Sampling Method for Fusing Knowledge Graph
LU Haiyang, LIU Xianhui, HOU Wenlong
Computer Science. 2025, 52 (3): 161-168.  doi:10.11896/jsjkx.240500015
In order to solve the problem of information overload, recommender systems have been widely studied. Since it is difficult to obtain a large amount of high-quality explicit feedback data, implicit feedback data has become the mainstream choice for training recommender systems. Sampling negative instances from unlabeled data, i.e. negative sampling, is crucial for training recommendation models based on implicit feedback data. Previous negative sampling methods often focus on how to select hard negative instances that contain more user preference information, without considering the false negative problem. In order to reduce the false negative probability of sampled negative instances and make them more informative, a negative sampling method that integrates a knowledge graph is proposed. Firstly, a candidate instance set is constructed based on the user-item knowledge graph. Then, the negative instance with the lowest false negative probability is selected from the candidate set through a Bayesian classification approach. Finally, based on the Mixup strategy, positive mixing is introduced to construct hard negative instances. To evaluate the effectiveness of the proposed method, validation is conducted on two public datasets. The results show that, compared with previous methods, the proposed method performs better.
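The positive mixing step can be sketched as a Mixup-style interpolation between a positive item embedding and the sampled negative embeddings; the coefficient range and names below are illustrative assumptions, not the paper's exact formulation:

```python
import torch

def positive_mixing(neg_emb, pos_emb, alpha_max=0.5):
    """
    neg_emb : (M, d) embeddings of candidate negative items
    pos_emb : (d,)  embedding of the positive item
    Returns hard negatives obtained by injecting positive information
    into each sampled negative with a per-negative mixing coefficient.
    """
    alpha = torch.rand(neg_emb.size(0), 1, device=neg_emb.device) * alpha_max
    return alpha * pos_emb.unsqueeze(0) + (1 - alpha) * neg_emb
```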
Mobility Data-driven Location Type Inference Based on Crowd Voting
XIONG Keqin, RUAN Sijie, YANG Qianyu, XU Changwei, YUAN Hanning
Computer Science. 2025, 52 (3): 169-179.  doi:10.11896/jsjkx.240600164
Geographic information serves as fundamental data for economic and social development. One common and vital type of data in this field is point-of-interest (POI) data. Traditionally, POI data are collected by map manufacturers, which is costly, has limited spatial coverage, and is not fine-grained enough, affecting the effectiveness of downstream applications. Fortunately, the popularization of the mobile Internet has generated vast amounts of mobility data that reveal the existence of POIs and have the potential to infer their location types. However, this potential is challenged by the sparsity of locations visited by users, complex contextual dependencies, and random individual behaviors, which are not adequately addressed by existing work. Therefore, we propose a mobility data-driven location type inference method based on crowd voting, namely Milotic. This method refines the task of predicting location types down to each trajectory, models complex relationships between locations with graph models, fully retains and integrates fine-grained trajectory context information through check-in embeddings and Bi-LSTM, and overcomes the randomness of individual behaviors through a voting mechanism. Experimental results demonstrate that Milotic achieves weighted F1 score improvements of 7.5% and 13.3% over the best baseline on two real-world mobility datasets, respectively.
Heterogeneous Graph Attention Network Based on Data Augmentation
YANG Yingxiu, CHEN Hongmei, ZHOU Lihua, XIAO Qing
Computer Science. 2025, 52 (3): 180-187.  doi:10.11896/jsjkx.231200138
A heterogeneous graph is a graph composed of different types of nodes and edges, which can model various types of objects and their relationships in the real world. Heterogeneous graph embedding aims to learn the embedding vectors of nodes by capturing rich attribute, structural and semantic information in the graph; these embeddings can be used in tasks such as node classification and link prediction, further enabling applications such as user recognition and product recommendation. Existing embedding methods exploit meta-paths to capture high-order structural and semantic information between nodes. However, they ignore the differences between different types of nodes in meta-path instances or different types of neighbor nodes in the graph, resulting in information loss, which in turn affects the quality of node embedding. In view of the above issues, this paper proposes a heterogeneous graph attention network based on data augmentation (HANDA) to better learn the embedding vectors of nodes. Firstly, an edge augmentation method based on meta-path neighbors is proposed. The method obtains neighbors of nodes based on meta-paths and generates semantic edges between nodes and their meta-path neighbors. These edges not only contain high-order structural and semantic information between nodes, but also alleviate the sparsity of the graph. Secondly, a node embedding method incorporating node type attention is presented. The method adopts multi-head attention incorporating node types to obtain the importance of neighbors formed by both original edges and semantic edges. Further, the method simultaneously captures attribute, high-order structural and semantic information through message passing over the two kinds of neighbors, thereby improving the embedding vectors of nodes. Experimental results on real datasets show that the proposed HANDA outperforms the baselines in both node classification and link prediction.
FedRCD: A Clustering Federated Learning Algorithm Based on Distribution Extraction and Community Detection
WANG Ruicong, BIAN Naizheng, WU Yingjun
Computer Science. 2025, 52 (3): 188-196.  doi:10.11896/jsjkx.240100213
Clustering clients and conducting federated learning within clusters is an effective way to mitigate the poor performance of traditional federated learning algorithms in non-independently and identically distributed (Non-IID) data scenarios. Such methods primarily utilize the parameters of a client's local model to characterize its data features, and evaluate similarity through the “distance” between parameters, thereby realizing client clustering. However, due to the permutation invariance of neurons in neural networks, this can lead to inaccurate clustering results. Moreover, these methods typically require a predetermined number of clusters, which might result in unreasonable clusters, or they may require clustering during the algorithm's iterative process, leading to substantial communication overhead. After an in-depth analysis of the shortcomings of existing methods, a novel federated learning algorithm named FedRCD is proposed. This algorithm combines autoencoders and the K-Means algorithm, directly extracting distribution information from a client's dataset to represent its characteristics, thereby reducing reliance on model parameters. FedRCD also organizes the relationships between clients into a graph structure and employs the Louvain algorithm to construct client clustering relationships. This process does not require pre-setting the number of clusters, which makes the clustering results more reasonable. Experimental results show that FedRCD can more effectively uncover latent clustering relationships between clients. In a variety of Non-IID data scenarios, compared to other federated learning algorithms, it significantly improves the training effect of neural networks. On the CIFAR10 dataset, the accuracy of FedRCD surpasses the classical FedAvg algorithm by 37.08% and even outperforms the newly released FeSEM algorithm by 1.89%, while demonstrating superior fairness performance.
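A schematic sketch of the clustering idea (assumption-heavy and not the authors' implementation): each client contributes a distribution vector, the server builds a client-similarity graph, and a community detection step groups clients. networkx's greedy_modularity_communities is used here as a stand-in for the Louvain step described above, and the threshold is a hypothetical parameter:

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def cluster_clients(client_distributions, sim_threshold=0.8):
    """client_distributions: (K, D) array, one distribution vector per client."""
    d = np.asarray(client_distributions, dtype=float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    sim = d @ d.T                                        # pairwise cosine similarity
    g = nx.Graph()
    g.add_nodes_from(range(len(d)))
    for i in range(len(d)):
        for j in range(i + 1, len(d)):
            if sim[i, j] >= sim_threshold:
                g.add_edge(i, j, weight=float(sim[i, j]))
    if g.number_of_edges() == 0:                         # no similar pairs: singleton clusters
        return [[i] for i in range(len(d))]
    return [sorted(c) for c in greedy_modularity_communities(g, weight="weight")]
```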
Knowledge Tracing Model Based on Exercise-Knowledge Point Heterogeneous Graph and Multi-feature Fusion
XIE Peizhong, LI Guanjin, LI Ting
Computer Science. 2025, 52 (3): 197-205.  doi:10.11896/jsjkx.240700151
Knowledge tracing (KT) aims to predict a learner's future performance based on their historical responses and to assess changes in their knowledge state. Exploring these changes facilitates personalized services, such as course and exercise recommendations. However, most existing knowledge tracing models do not consider comprehensive features, leading to incomplete assessments of changes in a learner's knowledge state. To address this issue, a new knowledge tracing model based on an exercise-knowledge point heterogeneous graph and multi-feature fusion (EKMFKT) is proposed. Specifically, we study two behavioral features (attempt and hint) and two temporal features (response time and interval time) and their impact on the knowledge state. To simulate knowledge acquisition and forgetting, we design learning and forgetting gates, which comprehensively update the knowledge state. For model inputs, a graph embedding method based on the question-knowledge point heterogeneous graph is designed to pre-train question representations, preserving the associations between questions and knowledge points. Experimental results on two public datasets demonstrate that EKMFKT outperforms existing models in predictive performance. By incorporating multiple features and preserving the connection between questions and knowledge points, EKMFKT provides a more reasonable representation of knowledge state changes, enhancing the interpretability of the model.
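A minimal gated-update sketch of the learning/forgetting idea (hypothetical module and names, not the paper's exact formulation): the knowledge-state vector is partly retained by a forgetting gate and partly refreshed by a learning gate driven by the current interaction features:

```python
import torch
import torch.nn as nn

class LearnForgetCell(nn.Module):
    """Illustrative gated update of a knowledge-state vector h given an
    interaction embedding x (exercise plus behavioral/temporal features)."""
    def __init__(self, dim):
        super().__init__()
        self.learn = nn.Linear(2 * dim, dim)
        self.forget = nn.Linear(2 * dim, dim)

    def forward(self, h, x):
        z = torch.cat([h, x], dim=-1)
        l = torch.sigmoid(self.learn(z))       # how much new knowledge is acquired
        f = torch.sigmoid(self.forget(z))      # how much old knowledge is retained
        return f * h + l * torch.tanh(x)       # updated knowledge state
```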
Class-incremental Source-free Domain Adaptation Based on Multi-prototype Replay and Alignment
TIAN Qing, KANG Lulu, ZHOU Liangyu
Computer Science. 2025, 52 (3): 206-213.  doi:10.11896/jsjkx.240100166
Traditional source-free domain adaptation usually assumes that all the target domain data is available, but in practice the target domain data often arrives in the form of streams, that is, the classes in the unlabeled target domain increase sequentially, which brings new challenges. First, at each time step the label space of the target domain is a subset of that of the source domain, and blind alignment will cause the performance of the model to deteriorate. Second, learning new classes can destroy previously learned knowledge, resulting in catastrophic forgetting. To solve these problems, this paper proposes a method based on multi-prototype replay and alignment (MPRA). In this method, the shared classes in the target domain are detected via cumulative prediction probabilities to solve the label space inconsistency problem, and multi-prototype replay is used to deal with catastrophic forgetting and improve the memory ability of the model. Additionally, the method incorporates cross-domain contrastive learning based on multi-prototypes and source model weights to align feature distributions and improve model robustness. Extensive experiments show that the proposed method achieves superior performance on 3 benchmark datasets.
Computer Graphics & Multimedia
Speaker Verification Method Based on Sub-band Front-end Model and Inverse Feature Fusion
WANG Mengwei, YANG Zhe
Computer Science. 2025, 52 (3): 214-221.  doi:10.11896/jsjkx.240100222
Two problems with the time delay neural networks (TDNN) used to extract frame-level features in existing speaker verification methods are the lack of the ability to model local frequency features and the inability of the multilayer feature fusion approach to effectively model the complex relationships between high-level and low-level features. Therefore, a new front-end model as well as a new multilayer feature fusion approach are proposed. In the front-end model, by dividing the input feature map into multiple sub-bands and expanding the frequency range of the sub-bands layer by layer, the TDNN can model local frequency features progressively. Meanwhile, a new inverse path passing from higher to lower layers is added to the backbone model to model the relationship between the output features of adjacent layers, and the outputs of each layer in the inverse path are concatenated to serve as the fused features. In addition, an inverse bottleneck layer design is used in the backbone model to further improve the performance of the model. Experimental results on the VoxCeleb1 test set show that, compared with the current TDNN method, the proposed method achieves a relative reduction of 9% in the equal error rate and 14% in the minimum detection cost function, while its number of parameters is only 52% of that of the current method.
Semi-supervised Sound Event Detection Based on Meta Learning
SHEN Yaxin, GAO Lijian, MAO Qirong
Computer Science. 2025, 52 (3): 222-230.  doi:10.11896/jsjkx.240100191
Existing semi-supervised sound event detection methods directly utilize strongly labeled synthetic samples, weakly labeled real samples, and unlabeled real samples for training to alleviate the issue of insufficient labeled samples. However, there is an inevitable distribution gap between the synthetic and real domains, which can interfere with the direction of model gradient optimization, thereby restricting the generalization ability of these models. To address this challenge, a novel semi-supervised sound event detection learning paradigm, meta mean teacher (MMT), is proposed based on meta-learning. Specifically, each batch of training data is divided into a meta-training set consisting of synthetic samples and a meta-test set consisting of real samples. The meta-gradient calculated on the meta-training set serves as guidance for updating the meta-test gradient, allowing the model to perceive and learn more generalizable knowledge. Experimental results on the DCASE2021 Task 4 dataset show that, compared to the official baseline, the proposed learning paradigm MMT achieves relative improvements of 8.9%, 6.6%, and 1.1% in the F1, PSDS1, and PSDS2 metrics, respectively. Compared to the current state-of-the-art methods in the field, the proposed learning paradigm MMT still demonstrates a significant performance advantage.
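The meta-training/meta-test split can be sketched with a first-order meta-update (a rough, assumption-laden approximation of the scheme described above, not the authors' exact procedure; batch dictionaries and the inner learning rate are hypothetical):

```python
import copy
import torch

def mmt_step(model, loss_fn, optimizer, synth_batch, real_batch, inner_lr=1e-3):
    """One illustrative meta-step: a virtual update on synthetic (meta-train)
    data is evaluated on real (meta-test) data, and the resulting gradient is
    folded back into the original weights (first-order approximation)."""
    fast = copy.deepcopy(model)                          # virtual learner
    train_loss = loss_fn(fast(synth_batch["x"]), synth_batch["y"])
    grads = torch.autograd.grad(train_loss, list(fast.parameters()))
    with torch.no_grad():                                # one virtual SGD step
        for p, g in zip(fast.parameters(), grads):
            p -= inner_lr * g
    test_loss = loss_fn(fast(real_batch["x"]), real_batch["y"])
    test_grads = torch.autograd.grad(test_loss, list(fast.parameters()))

    optimizer.zero_grad()
    loss_fn(model(synth_batch["x"]), synth_batch["y"]).backward()
    with torch.no_grad():                                # add meta-test guidance
        for p, g in zip(model.parameters(), test_grads):
            p.grad = g.clone() if p.grad is None else p.grad + g
    optimizer.step()
```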
Multi-view Stereo Reconstruction with Context-guided Cost Volume and Depth Refinement
CHEN Guangyuan, WANG Zhaohui, CHENG Ze
Computer Science. 2025, 52 (3): 231-238.  doi:10.11896/jsjkx.231200111
In response to the challenges in deep learning-based multi-view stereo (MVS) reconstruction algorithms, which include incomplete image feature extraction, ambiguous cost volume matching, and the accumulation of depth errors leading to poor reconstruction results in textureless and repetitive-texture regions, a cascaded MVS network based on context-guided cost volume construction and depth refinement is proposed. First, the feature fusion module based on non-reference attention is used to filter out irrelevant features and address the inconsistency in multi-scale features through feature fusion. Then, the context-guided cost volume module is used to fuse global information to enhance the accuracy and robustness of cost volume matching. Finally, the depth refinement module is employed to learn and reduce depth errors, to improve the accuracy of the low-resolution depth maps. The experimental results show that compared with MVSNet, the integrity error of the network on the DTU dataset is reduced by 24.4%, the accuracy error is reduced by 4.1%, and the overall error is reduced by 14.3%. The performance on the Tanks and Temples dataset is also better than most algorithms, showing strong competitiveness.
Artificial Intelligence
Study on Evaluation Framework of Large Language Model’s Financial Scenario Capability
CHENG Dawei, WU Jiaxuan, LI Jiangtong, DING Zhijun, JIANG Changjun
Computer Science. 2025, 52 (3): 239-247.  doi:10.11896/jsjkx.240900123
With the rapid development of large language models (LLMs), their application in the financial sector has become a driving force for industry transformation. Establishing a standardized and systematic evaluation framework for financial capabilities is a crucial way to assess large language models' abilities in financial scenarios. However, current evaluation methods have limitations, such as weak generalization of evaluation datasets and narrow coverage of task scenarios. To address these issues, this paper proposes a financial large language model benchmark, named CFBenchmark, which consists of four core assessment modules: financial natural language processing, financial scenario computation, financial analysis and interpretation, and financial compliance and security. High-quality tasks and systematic evaluation metrics are designed based on multi-task scenarios within each module, providing a standardized and systematic approach to assessing large models in the financial domain. Experimental results indicate that the performance of large language models in financial scenarios is closely related to their parameters, architecture, and training process. As the application of LLMs in the financial sector becomes more widespread in the future, the financial LLM benchmark will need to include more real-world application designs and high-quality evaluation data collection to help enhance the generalization ability of LLMs across diverse financial scenarios.
Generative Task Network: New Paradigm for Autonomic Task Planning and Execution Based on LLM
HUANG Xueqin, ZHANG Sheng, ZHU Xianqiang, ZHANG Qianzhen, ZHU Cheng
Computer Science. 2025, 52 (3): 248-259.  doi:10.11896/jsjkx.241100068
Owing to the development of generative artificial intelligence, the intelligent planning technology of unmanned systems is set to undergo a new transformation. This paper first analyzes the shortcomings of traditional intelligent task planning paradigms in terms of generalization, transferability and coherence. In response, it proposes a new paradigm for task planning and execution based on large language models, namely the generative task network. This method enables unmanned systems to autonomously discover tasks, intelligently plan, and automatically execute them, forming a closed loop from problem to solution. It also endows the task planning process of unmanned systems with the advantages of generalization and ease of transfer. This paper then introduces the concept of the generative task network, defines its key elements and models its process, and subsequently designs a general application architecture. Finally, an application analysis is conducted taking the aviation materials warehouse of N Airlines as the scenario, effectively enhancing the intelligence and automation levels of unmanned systems in warehouse management.
Learning Rule with Precise Spike Timing Based on Direct Feedback Alignment
NING Limiao, WANG Ziming, LIN Zhicheng, PENG Jian, TANG Huajin
Computer Science. 2025, 52 (3): 260-267.  doi:10.11896/jsjkx.240100195
Due to the complex spatiotemporal dynamics of spiking neurons and synapses, training spiking neural networks (SNNs) is relatively challenging, and there are currently no widely accepted core training algorithms and techniques. In this paper, we propose a learning rule with precise spike timing based on direct feedback alignment (PREST-DFA). Inspired by the spike layer error reassignment (SLAYER) learning algorithm, PREST-DFA uses error signals based on spike convolution differences. The output layer iteratively calculates the error values and utilizes direct feedback alignment (DFA) to broadcast the error to hidden layer neurons, finally achieving synaptic weight updates. We implement a time-driven PREST-DFA, and simulation experiments demonstrate that PREST-DFA has precise spike timing learning capabilities and good biological plausibility. Based on a literature search, this is the first work to verify that a DFA-based learning algorithm can control the precise firing times of spikes in deep networks, indicating that the DFA mechanism can be applied to algorithm designs based on spike timing. We also compare learning performance and training speed. Experimental results show that PREST-DFA can achieve good learning performance with lower inference latency and can accelerate training compared to learning algorithms trained using backpropagation with the same learning rule.
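PREST-DFA itself operates on spike timings with spike-convolution errors, which is not reproduced here; the underlying DFA mechanism it relies on, broadcasting the output error to hidden layers through fixed random feedback matrices instead of back-propagating through transposed weights, can be sketched on a toy rate-based network (all sizes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
def sigmoid(x): return 1.0 / (1.0 + np.exp(-x))

d_in, d_h, d_out, lr = 20, 32, 5, 0.1
W1 = rng.normal(0, 0.1, (d_in, d_h))
W2 = rng.normal(0, 0.1, (d_h, d_h))
W3 = rng.normal(0, 0.1, (d_h, d_out))
B1 = rng.normal(0, 0.1, (d_out, d_h))   # fixed random feedback to layer 1
B2 = rng.normal(0, 0.1, (d_out, d_h))   # fixed random feedback to layer 2

def dfa_step(x, y):
    """x: (batch, d_in) inputs, y: (batch, d_out) one-hot targets."""
    global W1, W2, W3
    h1 = sigmoid(x @ W1); h2 = sigmoid(h1 @ W2); out = h2 @ W3
    e = out - y                                   # output-layer error
    d2 = (e @ B2) * h2 * (1 - h2)                 # error broadcast directly to layer 2
    d1 = (e @ B1) * h1 * (1 - h1)                 # error broadcast directly to layer 1
    W3 -= lr * h2.T @ e / len(x)
    W2 -= lr * h1.T @ d2 / len(x)
    W1 -= lr * x.T @ d1 / len(x)

x = rng.normal(size=(8, d_in))
y = np.eye(d_out)[rng.integers(0, d_out, 8)]
dfa_step(x, y)
```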
Automatic Scheduling Search Optimization Method Based on TVM
HAN Lin, WANG Yifan, LI Jianan, GAO Wei
Computer Science. 2025, 52 (3): 268-276.  doi:10.11896/jsjkx.240100126
With the rapid development of artificial intelligence and the continuous emergence of new operators and hardware, the development and maintenance of operator libraries face enormous challenges. Relying solely on manual optimization can no longer meet the needs of improving AI model performance. Ansor is an automatic operator scheduling technique based on TVM, which can search for the best scheduling schemes for deep learning models or operators on different backends and generate high-performance code without requiring users to manually define templates. However, its huge search space results in low search efficiency. Therefore, two optimization schemes are proposed: one selects the optimal performance sketch based on a reinforcement learning algorithm, and the other predicts mutation rules based on machine learning models. Both schemes aim to reduce the search time for the optimal scheduling scheme and quickly generate high-performance operators. To evaluate the effectiveness of the optimization schemes, three models including ResNet-50 and three operators including conv2d are tested and evaluated. The results show that the optimized Ansor can generate target programs with the same or even better performance than before in only 70%~75% of the search time. Moreover, under the optimal number of iterations, the inference speed of the target program can be improved by up to 5%.
Rumor Detection on Potential Hot Topics with Bi-directional Graph Attention Network
LI Shao, JIANG Fangting, YANG Xinyan, LIANG Gang
Computer Science. 2025, 52 (3): 277-286.  doi:10.11896/jsjkx.240100204
Most existing methods for detecting rumors on social media networks focus on individual posts as the target of detection, which leads to a cold start problem due to insufficient data, adversely affecting detection performance. Moreover, these methods do not filter out the vast amount of irrelevant information in social media networks, resulting in longer detection latency and poorer performance. Additionally, current methods tend to emphasize static features of rumor propagation when analyzing its characteristics, making it difficult to fully leverage the dynamic relationships between nodes to model the complex propagation process. To address these issues, a rumor detection method based on potential hot topics and graph attention neural networks is proposed. The method employs a neural topic model and a potential hot topic discovery model for topic-level rumor detection to overcome the cold start problem. Furthermore, a detection model named TPC-Bi-GAT is designed to analyze the dynamic features of rumor topic propagation for authenticity detection. Experiments on 3 public datasets show that the proposed method achieves a significant improvement of 3%~5% in accuracy compared with existing methods, which verifies its effectiveness.
Joint Relational Patterns and Analogy Transfer Knowledge Graph Completion Method
SONG Baoyan, LIU Hangsheng, SHAN Xiaohuan, LI Su, CHEN Ze
Computer Science. 2025, 52 (3): 287-294.  doi:10.11896/jsjkx.240700156
In recent years, knowledge graph embedding (KGE) has emerged as a mainstream approach and achieved significant results in the task of knowledge graph completion. However, existing KGE methods only consider the information of triplets at the data level, neglecting the semantic relational patterns that exist between different triplets at the logical level, leading to certain performance deficiencies in current methods. To address this issue, a knowledge graph completion method (RpAT) that integrates relational patterns and analogy transfer is proposed. Firstly, at the logical level, different relational patterns are refined according to the semantic hierarchy of entity relationships. Secondly, at the data level, a method for generating pattern analogy objects is proposed, which utilizes the properties of relational patterns to generate similar analogy objects for target triplets and transfers missing information based on these analogy objects. Finally, a comprehensive scoring function that integrates the reasoning capability of the original knowledge graph embedding model and the analogy transfer capability is proposed to enhance completion performance. Experimental results show that, compared with other baseline models, the RpAT method improves the MRR values by 15.5% and 1.8% on the FB15k-237 and WN18RR datasets, respectively, demonstrating its effectiveness in the task of knowledge graph completion.
Multi-hop Knowledge Base Question Answering Based on Differentiable Knowledge Graph
WEI Qianqiang, ZHAO Shuliang, ZHANG Siman
Computer Science. 2025, 52 (3): 295-305.  doi:10.11896/jsjkx.240600095
Knowledge base question answering (KBQA) is a challenging and popular research direction. Currently, embedding-based methods obtain the answer to a question through implicit reasoning and cannot generate complete reasoning paths. Models based on differentiable knowledge graphs only need question-answer pairs as weak supervision signals to generate explainable results. An end-to-end encoder-decoder model based on differentiable knowledge graphs is proposed. The encoder uses a multi-head attention mechanism and LSTM to model the fine-grained sequence of the question, generating query vectors that effectively represent the semantic features of each step of the question. The decoder uses feedforward neural networks to effectively represent the weight of each hop over the entire question. The model solves the problem of information loss caused by previous coarse-grained and non-sequential modeling methods. Experiments are conducted on five datasets, MetaQA-1hop, MetaQA-2hop, MetaQA-3hop, WebQSP and CWQ, on which the model achieves accuracies of 97.5%, 100%, 100%, 77.8% and 51.4%, respectively. Ablation experiments show that each module contributes to the overall performance improvement of the model.
Computer Network
Overview of Neighbor Discovery Algorithms in Directional Wireless Ad Hoc Networks
LI Xiang, ZHU Xiaojun, FENG Simeng, DONG Chao, ZHANG Lei
Computer Science. 2025, 52 (3): 306-317.  doi:10.11896/jsjkx.240600108
A systematic summary is provided for the research achievements in directional neighbor discovery algorithms within the current domain of wireless directional ad hoc networks. Initially, stemming from the crucial significance of directional ad hoc networks in wireless communication, relevant background knowledge and fundamental concepts are introduced, delineating its research prospects in the realm of wireless communication. Subsequently, based on distinct technical standards, directional neighbor discovery algorithms are categorized and compared across multiple dimensions, delving into various applicable scenarios and associated limitations. Specific classifications include deterministic and random algorithms based on scanning sequence design, synchronous and asynchronous algorithms, purely directional and omnidirectional-assisted discovery algorithms, blind and semi-blind algorithms, as well as direct and indirect neighbor discovery algorithms. Moreover, by integrating the proposed classification methodology with the practical application scenarios of algorithms, the design principles and convergence processes of significant directional neighbor discovery algorithms are elaborately elucidated, including deterministic and random neighbor discovery algorithms, asynchronous neighbor discovery algorithms, and optimization algorithms utilizing machine learning techniques. Lastly, the future research directions and application trends of directional neighbor discovery algorithms are deliberated upon.
Study on MAC Protocol of LoRa Network Hidden Terminal Based on BTMA
WANG Hao, CAI Yuhang, CHEN Guojie, WANG Lu
Computer Science. 2025, 52 (3): 318-325.  doi:10.11896/jsjkx.240700203
The emergence of low power wide area network (LPWAN) technology allows for longer-distance communication while minimizing power consumption and reducing transmission costs. LoRa (long range) technology, as a standout in this field, is highly favored in both industrial and academic circles due to its long range, low power consumption, high capacity, strong anti-interference, and high reception sensitivity. However, the widely used ALOHA-based LoRaWAN protocol struggles to effectively address the severe data packet collisions resulting from the massive access of terminal devices to the LoRa network, as well as the hidden terminal problem caused by the LoRa CAD (channel activity detection) feature. This paper proposes a BTMA (busy tone multiple access)-based MAC protocol for LoRa networks, known as the BT-MAC protocol. This protocol leverages LoRa's high reception sensitivity: the gateway uses "busy tone" beacons to inform each node of the gateway's operational status, thereby reducing the transmission of invalid packets. Meanwhile, nodes maintain a logical channel matrix combining "busy tone" information and local information. By employing an optimal channel selection algorithm, nodes select the best logical channel for transmission, reducing collisions among uplink data packets from end nodes. This effectively mitigates the hidden terminal problem and congestion in LoRa networks. A LoRa network MAC protocol testing platform is built to test the effectiveness of BT-MAC, and extensive concurrency experiments and energy consumption tests are conducted in both indoor and outdoor environments. The experimental results show that the throughput of the BT-MAC protocol is 1.6 times that of the LMAC-2 protocol and 5.1 times that of the ALOHA protocol. Its packet reception rate is 1.53 times that of the LMAC-2 protocol and 17.2 times that of the ALOHA protocol. The average energy consumption per packet is approximately 64.1% of that of the LMAC-2 protocol and 14.2% of that of the ALOHA protocol.
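A toy, assumption-laden sketch of the node-side behavior described above (field names and states are hypothetical; this is not the BT-MAC specification): the node fuses the gateway's busy-tone beacon with its own CAD observations into a logical channel table and transmits only on a channel believed idle:

```python
import random

FREE, BUSY = 0, 1

class BTMacNode:
    def __init__(self, num_channels):
        self.state = [FREE] * num_channels          # this node's logical channel table

    def on_busy_tone(self, busy_channels):
        """Update channel states from the gateway's busy-tone beacon."""
        for ch, busy in enumerate(busy_channels):
            self.state[ch] = BUSY if busy else FREE

    def on_local_cad(self, ch, activity_detected):
        """Merge local channel activity detection (CAD) results."""
        if activity_detected:
            self.state[ch] = BUSY

    def select_channel(self):
        """Pick a channel currently believed idle; back off if none is available."""
        idle = [ch for ch, s in enumerate(self.state) if s == FREE]
        return random.choice(idle) if idle else None
```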
Edge-side Federated Continuous Learning Method Based on Brain-like Spiking Neural Networks
WANG Dongzhi, LIU Yan, GUO Bin, YU Zhiwen
Computer Science. 2025, 52 (3): 326-337.  doi:10.11896/jsjkx.240900070
Abstract PDF(5090KB) ( 146 )   
References | Related Articles | Metrics
Mobile edge computing has become an important computing model adapted to the needs of smart IoT applications,with advantages such as low communication cost and fast service response.In practical application scenarios,on the one hand,the data acquired by a single device is usually limited;on the other hand,the edge computing environment is usually dynamic and variable.Aiming at the above problems,this paper focuses on edge federated continuous learning and innovatively introduces spiking neural networks(SNNs) into the edge federated continuous learning framework to solve the catastrophic forgetting problem faced by local devices in dynamic edge environments while reducing the consumption of device computation and communication resources.The use of SNNs to solve the edge federated continuous learning problem faces two main challenges.First,traditional spiking neural networks do not take into account continuously increasing input data,and it is difficult for them to store and update knowledge over a long time span,so effective continuous learning cannot be realized.Second,there are variations among the SNN models learned by different devices,and the global model obtained by traditional federated aggregation fails to achieve good performance on each edge device.Therefore,a new spiking neural network-enhanced edge federated continuous learning(SNN-Enhanced Edge-FCL) method is proposed.To address challenge I,a brain-like continuous learning algorithm for edge devices is proposed,which employs a brain-like spiking neural network for local training on a single device and adopts a sample selection strategy based on the flocking effect to save representative samples of historical tasks.To address challenge II,a global adaptive aggregation algorithm with multi-device collaboration is proposed:based on the working principle of SNNs,a spiking data quality index is designed,and a data-driven dynamic weighted aggregation method assigns corresponding weights to different device models during aggregation to enhance the generalization of the global model.The experimental results show that compared with the edge federated continuous learning method based on traditional neural networks,the communication and computational resources consumed by the proposed method on the edge devices are reduced by 92%,and the accuracy of the edge devices on the test set for five continuous tasks is above 87%.
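The data-driven weighted aggregation idea can be illustrated with a minimal Python sketch. The inputs are assumptions: the "quality" values stand in for the spiking data quality index and the per-device models are tiny parameter vectors, so this is only the weighted-averaging skeleton, not the paper's actual index or aggregation rule.

import numpy as np

def weighted_aggregate(models, quality_scores):
    """models: list of 1-D parameter vectors from edge devices.
    quality_scores: non-negative per-device quality indices.
    Returns the global model as a quality-weighted average."""
    scores = np.asarray(quality_scores, dtype=float)
    weights = scores / scores.sum()            # normalise to sum to 1
    stacked = np.stack(models)                 # shape: (n_devices, n_params)
    return weights @ stacked                   # weighted average of parameters

device_models = [np.array([0.2, 1.0, -0.5]),
                 np.array([0.4, 0.8, -0.3]),
                 np.array([0.1, 1.2, -0.7])]
quality = [0.9, 0.5, 0.2]                      # hypothetical quality indices
print("global model:", weighted_aggregate(device_models, quality))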
Graph Reinforcement Learning Based Multi-edge Cooperative Load Balancing Method
ZHENG Longhai, XIAO Bohuai, YAO Zewei, CHEN Xing, MO Yuchang
Computer Science. 2025, 52 (3): 338-348.  doi:10.11896/jsjkx.240100091
Abstract PDF(3933KB) ( 157 )   
References | Related Articles | Metrics
In mobile edge computing,devices can effectively reduce latency and energy consumption by offloading computation-intensive tasks to nearby edge servers.In order to improve the quality of service,edge servers need to collaborate with each other rather than work alone.For the load balancing problem of multi-edge collaboration,existing solutions often depend on accurate mathematical models or fail to make full use of edge topological relationships.To solve this problem,an offloading decision-making method based on graph reinforcement learning is proposed in this paper.Firstly,the load balancing scenario with multi-edge collaboration is abstracted as graph data;then a graph embedding process based on a graph convolutional neural network is used to extract the information features of the graph,assisting the deep Q-network in making offloading decisions;finally,the target load balancing plan is found through a centralized feedback-control mechanism.Simulation experiments are conducted in multiple scenarios,and the results verify the effectiveness of the proposed method in shortening the average response latency of tasks;a load balancing effect that is better than the comparison algorithms and close to the ideal plan can be obtained in a short period of time.
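The coupling of a graph embedding with a Q-value head can be sketched as follows. This is a rough numpy-only illustration with untrained, randomly initialized weights and a hypothetical 4-server topology; the paper's network architecture, features, and training procedure are not reproduced here.

import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0, 1, 1, 0],        # adjacency of 4 collaborating edge servers
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))        # per-server features, e.g. load, CPU, queue length

def gcn_layer(A, X, W):
    """One graph-convolution layer: symmetrically normalised propagation
    followed by ReLU, in the style of standard GCNs."""
    A_hat = A + np.eye(len(A))                 # add self loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

W1 = rng.normal(size=(3, 8))       # untrained layer weights
Wq = rng.normal(size=(8, 1))       # Q head: one offloading score per server

H = gcn_layer(A, X, W1)            # node embeddings for the 4 servers
q_values = (H @ Wq).ravel()        # Q(s, a) for "offload task to server a"
print("chosen server:", int(q_values.argmax()), "q-values:", np.round(q_values, 3))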
Network Slicing End-to-end Latency Prediction Based on Heterogeneous Graph Neural Network
HU Haifeng, ZHU Yiwen, ZHAO Haitao
Computer Science. 2025, 52 (3): 349-358.  doi:10.11896/jsjkx.240800067
Abstract PDF(3453KB) ( 177 )   
References | Related Articles | Metrics
End-to-end latency,as a crucial performance metric for network slicing,is difficult to predict accurately via modeling due to the influences of network topology,traffic models,and scheduling policies.To tackle the above issues,we propose a heterogeneous graph neural network-based network slicing latency prediction(HGNN) algorithm,in which a hierarchical heterogeneous graph of slice-queue-link is constructed to implement the hierarchical feature representation of the slice.Then,considering the attribute characteristics of the three types of nodes in the hierarchical graph,i.e.,slices,queues,and links,a heterogeneous graph neural network is presented to extract the underlying slice-related features such as topological dynamic changes,edge feature information,and long dependency relationships.Specifically,the graph neural networks GraphSAGE and EGRET and the gated recurrent unit(GRU) are respectively adopted to extract the features of slices,queues,and links.Meanwhile,the iterative update of the network slice feature representation and accurate prediction of slice latency are achieved using deep regression based on the heterogeneous graph neural network.Finally,a slice database with various topologies,traffic models,and scheduling policies is constructed using OMNeT++,and the effectiveness of HGNN in predicting slice end-to-end latency is validated on this database.Additionally,by comparing with other graph deep learning-based slice latency prediction methods,the superiority of HGNN in terms of prediction accuracy and generalization is further verified.
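As a small illustration of how a slice node might aggregate the features of the queues it traverses in such a slice-queue-link hierarchy, the following sketch performs one GraphSAGE-style mean-aggregation step. All dimensions, mappings, and weights are hypothetical and untrained; HGNN's actual layers, including the EGRET and GRU components, are not reproduced here.

import numpy as np

rng = np.random.default_rng(1)

queue_feats = rng.normal(size=(5, 4))            # 5 queues, 4 features each
slice_to_queues = {0: [0, 1, 2], 1: [2, 3, 4]}   # which queues each slice uses
slice_feats = rng.normal(size=(2, 4))            # initial slice node features

W_self  = rng.normal(size=(4, 6))
W_neigh = rng.normal(size=(4, 6))

def sage_step(slice_feats, queue_feats, mapping):
    """Transform each slice's own features and the mean of its queues'
    features, sum the two, and apply ReLU (a simple SAGE-style update)."""
    out = np.zeros((len(mapping), 6))
    for s, queues in mapping.items():
        neigh_mean = queue_feats[queues].mean(axis=0)
        out[s] = np.maximum(slice_feats[s] @ W_self + neigh_mean @ W_neigh, 0.0)
    return out

print(sage_step(slice_feats, queue_feats, slice_to_queues).round(3))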
Self-learning Star Chain Space Adaptive Allocation Method
DU Likuan, LIU Chen, WANG Junlu, SONG Baoyan
Computer Science. 2025, 52 (3): 359-365.  doi:10.11896/jsjkx.240700140
Abstract PDF(2946KB) ( 169 )   
References | Related Articles | Metrics
Blockchain sharding technology is an effective method to improve the throughput of blockchain systems.Existing blockchain sharding methods mostly adopt parallel-architecture sharding schemes,which have not solved the problem of high cross-shard transaction ratios,leading to reduced throughput and potentially unbounded transaction confirmation delays.To address these issues,a self-learning-based star chain space adaptive allocation structure is proposed.Firstly,to address the issue of high cross-shard transaction ratios in blockchain sharding systems,a beacon chain-shard chain architecture throughput model is proposed.Secondly,considering the relationship between the throughput and latency of a sharded blockchain,a star chain space dynamic decision-making process is designed,together with a reward function for the star chain space.Finally,a distributed multi-agent reinforcement learning dynamic clustering method is proposed,treating each shard as an agent to collectively learn cooperative strategies.Experimental results show that,compared with existing methods,the proposed method improves throughput,cross-shard transaction ratio,and transaction confirmation delay by approximately 31.74%,35.96%,and 37.13%,respectively.
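The general shape of a reward function that trades throughput against cross-shard ratio and confirmation delay can be sketched as below. The weights and normalising constants are invented for illustration; the paper's actual reward function for the star chain space may differ.

def shard_reward(throughput_tps, cross_shard_ratio, confirm_delay_s,
                 w_tps=1.0, w_cross=0.5, w_delay=0.5,
                 tps_scale=1000.0, delay_scale=10.0):
    """Return a scalar reward for one shard agent in one decision epoch:
    throughput is rewarded, cross-shard ratio and delay are penalised."""
    return (w_tps * throughput_tps / tps_scale
            - w_cross * cross_shard_ratio
            - w_delay * confirm_delay_s / delay_scale)

# Example: 800 TPS, 30% cross-shard transactions, 4 s confirmation delay.
print(round(shard_reward(800, 0.30, 4.0), 3))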
Electric Taxi Charging Pile Rental Model and Cost Optimization
XU Jia, ZHANG Yiming, CHEN Wenbin, YU Xinshi
Computer Science. 2025, 52 (3): 366-376.  doi:10.11896/jsjkx.240100121
Abstract PDF(3089KB) ( 174 )   
References | Related Articles | Metrics
In recent years,the proliferation of private electric vehicles has largely intensified the competition for public charging stations between electric taxis and private electric vehicles,consequently decreasing the charging efficiency of electric taxis.This paper proposes the electric taxi charging pile rental mode,which can meet the charging demand of electric taxis by renting widely distributed public charging piles as temporary exclusive charging piles.This charging model can reduce the construction cost of charging stations,offer priority charging services for electric taxis,and mitigate the charging competition between electric taxis and private electric vehicles.This paper proposes two charging rental cost models for electric taxis.We first formalize the task number based charging allocation(NCA) problem with the objective of minimizing the total charging cost,present a task number based charging allocation algorithm(NCAA),and provide its approximation ratio.Furthermore,we formalize the task finishing time based charging allocation(TCA) problem and propose the task finishing time based charging allocation algorithm(TCAA).Simulation results based on real data sets show that,compared with the baseline algorithms,NCAA and TCAA can reduce the total charging cost by up to 16.15% and 17.49%,respectively.
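A minimal greedy sketch in the same spirit as task-number-based allocation is given below: each charging task is assigned to the rented pile with the smallest marginal cost (per-task electricity cost plus a one-off rental fee the first time a pile is used). The prices, capacities, and the greedy rule itself are illustrative assumptions, not the paper's NCAA algorithm.

def greedy_allocate(n_tasks, piles):
    """piles: list of dicts with 'rent' (one-off rental cost), 'per_task'
    (electricity cost per task) and 'capacity' (max tasks per pile).
    Returns (assignment list, total cost); assumes total capacity suffices."""
    assignment, total = [], 0.0
    used = [0] * len(piles)          # tasks already assigned to each pile
    for _ in range(n_tasks):
        best, best_cost = None, float("inf")
        for i, p in enumerate(piles):
            if used[i] >= p["capacity"]:
                continue
            # the rental cost is only paid the first time a pile is used
            cost = p["per_task"] + (p["rent"] if used[i] == 0 else 0.0)
            if cost < best_cost:
                best, best_cost = i, cost
        assignment.append(best)
        used[best] += 1
        total += best_cost
    return assignment, total

piles = [{"rent": 20.0, "per_task": 3.0, "capacity": 4},
         {"rent": 5.0,  "per_task": 6.0, "capacity": 2}]
print(greedy_allocate(5, piles))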
Information Security
Survey of Federated Incremental Learning
XIE Jiachen, LIU Bo, LIN Weiwei, ZHENG Jianwen
Computer Science. 2025, 52 (3): 377-384.  doi:10.11896/jsjkx.240300035
Abstract PDF(2025KB) ( 273 )   
References | Related Articles | Metrics
Federated learning,with its unique distributed training mode and secure aggregation mechanism,has become a research hotspot in recent years.However,in real-life scenarios,local model training often faces new data,leading to catastrophic forgetting of old data.Therefore,effectively integrating federated learning with incremental learning is crucial for achieving sustainable development of federated ecosystems.This paper first conducts an in-depth investigation and analysis of federated incremental learning,exploring its concepts.Subsequently,it elaborates on federated incremental learning methods based on data,model,architecture,and multi-aspect joint optimization,while also categorizing and comparing various existing methods.Finally,building upon this foundation,it analyzes and summarizes future research directions for federated incremental learning,such as scalability,small sample handling,security,reliability,and multi-task scenarios.
Malicious Code Detection Method Based on Hybrid Quantum Convolutional Neural Network
XIONG Qibing, MIAO Qiguang, YANG Tian, YUAN Benzheng, FEI Yangyang
Computer Science. 2025, 52 (3): 385-390.  doi:10.11896/jsjkx.240800006
Abstract PDF(2335KB) ( 186 )   
References | Related Articles | Metrics
Quantum computing is a new computing model based on quantum mechanics,with powerful parallel computing capability far beyond classical computing.The hybrid quantum convolutional neural network combines the dual advantages of quantum computing and classical convolutional neural networks,and has gradually become one of the research hotspots in the field of quantum machine learning.Currently,the scale of malicious code is still growing at a high speed,its detection models are becoming more and more complex,the number of parameters is getting larger and larger,and there is an urgent need for an efficient and lightweight detection model.To this end,this paper designs a hybrid quantum convolutional neural network model,which integrates quantum computing into a classical convolutional neural network to improve the computational efficiency of the model.The model contains a quantum convolutional layer,a pooling layer,and a classical fully connected layer.The quantum convolutional layer is implemented using a low-depth,strongly entangled,and lightweight parameterized quantum circuit,using only two types of quantum gates,the quantum rotation gate Ry and CNOT(controlled-NOT),and using only two qubits to implement the convolutional computation.The pooling layer implements three pooling methods based on classical and quantum computation.The simulation experiments in this paper are conducted on Google TensorFlow Quantum.Experimental results show that the classification performance(accuracy,F1-score) of the proposed model on the open-source malicious code datasets DataCon2020 and Ember reaches(97.75%,97.71%) and(94.65%,94.78%),respectively,both of which are significant improvements.
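A two-qubit circuit built only from Ry rotations and a CNOT, of the kind described above, can be simulated directly with numpy as a sanity check. The rotation angles below are arbitrary placeholders (in the paper they would be trained parameters), and reading out the Z expectation of the first qubit is one plausible convention for the convolution output, not necessarily the paper's.

import numpy as np

def ry(theta):
    """Single-qubit Ry rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def quantum_conv(theta0, theta1):
    """Apply Ry(theta0) x Ry(theta1) then CNOT to |00> and return <Z0>."""
    state = np.zeros(4); state[0] = 1.0            # |00>
    state = CNOT @ np.kron(ry(theta0), ry(theta1)) @ state
    z0 = np.kron(np.diag([1.0, -1.0]), np.eye(2))  # Z on the first qubit
    return float(state @ z0 @ state)

print(round(quantum_conv(0.3, 1.1), 4))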
Tor Network Path Selection Algorithm Based on Similarity Perception
SUI Jiaqi, HU Hongchao, SHI Xin, ZHOU Dacheng, CHEN Shangyu
Computer Science. 2025, 52 (3): 391-399.  doi:10.11896/jsjkx.240100151
Abstract PDF(2521KB) ( 171 )   
References | Related Articles | Metrics
Due to the low-threshold construction conditions and open participation mechanism of Tor,attackers can conduct Sybil attacks on Tor networks by controlling a significant number of malicious Sybil nodes,posing a serious threat to user privacy.One class of defense methods works by identifying malicious nodes;it suffers from a lack of accuracy in evaluating node similarities and has difficulty recognizing malicious nodes that deliberately conceal themselves.The other class protects users by strengthening the security of the Tor path selection algorithm,but it cannot withstand repeated Sybil attacks and finds it challenging to satisfy the needs of both performance and security.To make up for the vulnerabilities of the existing defense methods themselves,this paper proposes to apply malicious node identification methods and path selection algorithms in combination.First,the information of relay nodes is collected from multiple data sources,and the data from these sources are verified,filtered,and fused to improve security at the data level.Second,the selection tendency toward dependable nodes with long-term bandwidth stability is moderately increased through the optimization of bandwidth measurements based on historical data,increasing the cost of deploying malicious Sybil nodes for attackers.Then,the relay node similarity assessment method is optimized,and a nearest-neighbor sorting algorithm based on aggregated similarity scores is proposed to improve the accuracy of node similarity analysis.Finally,the optimized similarity assessment method is integrated into the path selection algorithm design,and a path selection algorithm based on similarity perception is proposed.Experimental results show that the algorithm not only provides a better defense against multiple Sybil attacks but also ensures that the performance requirements of the link are met.
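The aggregated-similarity idea can be sketched as follows: each relay is described by a few attributes, per-attribute similarities are combined into one score, and for a given relay the others are sorted by that score so that suspiciously similar groups surface first. The attribute names, weights, and toy relay records are illustrative assumptions, not the paper's scoring function or data sources.

def attribute_similarity(a, b):
    """Combine simple per-attribute similarities into one aggregated score."""
    same_asn    = 1.0 if a["asn"] == b["asn"] else 0.0
    same_family = 1.0 if a["contact"] == b["contact"] else 0.0
    bw_ratio    = min(a["bw"], b["bw"]) / max(a["bw"], b["bw"])
    # hypothetical weights for the aggregated similarity score
    return 0.4 * same_asn + 0.3 * same_family + 0.3 * bw_ratio

def nearest_neighbors(target, relays, k=2):
    """Sort the other relays by aggregated similarity to the target relay."""
    others = [r for r in relays if r is not target]
    return sorted(others, key=lambda r: attribute_similarity(target, r),
                  reverse=True)[:k]

relays = [{"name": "r1", "asn": 64500, "contact": "a@x", "bw": 900},
          {"name": "r2", "asn": 64500, "contact": "a@x", "bw": 850},
          {"name": "r3", "asn": 64501, "contact": "b@y", "bw": 100}]
print([r["name"] for r in nearest_neighbors(relays[0], relays)])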
Windows Domain Penetration Testing Attack Path Generation Based on Deep Reinforcement Learning
HUO Xingpeng, SHA Letian, LIU Jianwen, WU Shang, SU Ziyue
Computer Science. 2025, 52 (3): 400-406.  doi:10.11896/jsjkx.231200074
Abstract PDF(1931KB) ( 174 )   
References | Related Articles | Metrics
The Windows domain is a prime target for intranet penetration.However,the scenarios and methods of Windows domain penetration testing differ greatly from those of conventional intranet penetration,so existing research on intelligent path discovery is not suitable for the intricacies of Windows domain environments.In order to enhance the security protection of Windows domains,an automatic generation method for Windows domain penetration testing paths based on deep reinforcement learning is proposed.Firstly,the Windows domain penetration testing scenario is modeled as a Markov decision process,and a simulator suitable for reinforcement learning is designed using the Gymnasium framework.Secondly,in response to the challenge of limited exploration in large action and observation spaces,prior knowledge is leveraged to eliminate redundant actions and streamline the observation space.Lastly,virtual machine technology is used to deploy the Windows domain environment on a small server,and NDD-DQN is used as the basic algorithm to realize whole-process automation from information collection and model construction to path generation in a real environment.Experimental results show that the proposed method exhibits effective simulation and training performance in complex,real-world Windows domain environments.
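A skeleton of a Gymnasium environment in the spirit of such a simulator is sketched below: observations record which hosts have been compromised, actions encode (technique, target host) pairs, and the episode ends when a hypothetical domain-controller host is owned. The host count, action encoding, success rule, and rewards are all toy assumptions; the paper's simulator and its observation/action design are far more detailed.

import gymnasium as gym
import numpy as np
from gymnasium import spaces

N_HOSTS, N_TECHNIQUES = 4, 3
DC_HOST = 3                                   # index of the domain controller

class ToyDomainEnv(gym.Env):
    def __init__(self):
        self.observation_space = spaces.MultiBinary(N_HOSTS)
        self.action_space = spaces.Discrete(N_HOSTS * N_TECHNIQUES)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.owned = np.zeros(N_HOSTS, dtype=np.int8)
        self.owned[0] = 1                     # initial foothold
        return self.owned.copy(), {}

    def step(self, action):
        target = action % N_HOSTS
        # toy success rule: a technique only works against a not-yet-owned
        # host that is "adjacent" to an owned one (here: index distance 1)
        reachable = any(self.owned[h] and abs(h - target) == 1
                        for h in range(N_HOSTS))
        reward = -1.0                         # per-step cost
        if reachable and not self.owned[target]:
            self.owned[target] = 1
            reward += 10.0
        terminated = bool(self.owned[DC_HOST])
        return self.owned.copy(), reward, terminated, False, {}

env = ToyDomainEnv()
obs, _ = env.reset(seed=0)
obs, r, done, trunc, _ = env.step(1 * N_HOSTS + 1)   # try technique 1 on host 1
print(obs, r, done)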