Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Editors
Current Issue
Volume 53 Issue 4, 15 April 2026
  
Interdisciplinary Integration of Artificial Intelligence and Theoretical Computer Science
Formal Theorem Proving Empowered by Large Language Model: Survey and Perspectives
HU Junjie, CHEN Yujie, HU Yikun, WEN Cheng, CAO Jialun, MA Zhi, SU Jie, SUN Weidi, TIAN Cong, QIN Shengchao
Computer Science. 2026, 53 (4): 1-23.  doi:10.11896/jsjkx.251000067
Abstract
Theorem proving, as the intersection of logic and computer science, not only lays the formal foundation for modern mathematical reasoning, but also serves as a litmus test for evaluating the logical reasoning ability of artificial intelligence, while underpinning the fundamental demand for high reliability in software engineering. However, traditional theorem proving relies heavily on rigorous logical reasoning and labor-intensive human-computer interaction, and has long been constrained by limited automation, insufficient reasoning efficiency, and strong dependence on expert experience. With the rapid development of large language models (LLMs), their breakthrough capabilities in natural language understanding, code generation, and logical reasoning provide new opportunities to enhance the automation and intelligence of theorem proving. This paper systematically reviews the current state and emerging trends of LLM-driven formal theorem proving, focusing on two major application scenarios: 1) In interactive theorem proving, we analyze how existing works alleviate the significant manual overhead, and take the Prover series in the Lean language as an example to systematically summarize its technical evolution path. 2) In automated theorem proving, we explore how large language models combine techniques such as static analysis and verifier feedback to automatically generate formal specifications, including function contracts and loop invariants, thereby significantly lowering the verification barrier. Finally, the paper highlights common challenges in this field, including specification completeness, reasoning reliability, data scarcity, and tool-chain integration, and discusses future directions for research and development.
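For readers unfamiliar with interactive theorem proving, a minimal Lean 4 example (ours, not taken from the surveyed paper) shows the kind of machine-checked statement and proof that LLM-based provers such as the Prover series aim to produce automatically:

```lean
-- A toy goal of the kind an LLM-based prover is asked to close:
-- commutativity of addition on natural numbers.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b    -- closed by an existing library lemma
```

An interactive prover requires a human (or model) to supply such proof terms or tactics; the survey's first scenario concerns reducing exactly this manual effort.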
Recent Advances in Efficient Algorithms for k-Means Clustering on High-dimensional Big Data
GAO Guichen, JIANG Shaofeng
Computer Science. 2026, 53 (4): 24-32.  doi:10.11896/jsjkx.251000037
Abstract
Clustering is a classic task in machine learning. The goal of clustering is to partition data points into groups with respect to a similarity measure. As one of the most fundamental models for clustering, k-means has been extensively studied and widely applied. This paper focuses on the computational issue of solving k-means efficiently, and discusses the progress of (near-)linear time approximation algorithms for k-means from the perspective of theoretical computer science. It also briefly discusses the status of clustering algorithms in various big data computational models, including dynamic, streaming, and distributed computing.
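The abstract does not commit to a single algorithm; as a hedged illustration of the near-linear-time approximation idea, the classical k-means++ seeding (an O(nkd)-time procedure giving an O(log k)-approximation in expectation) can be sketched as follows (function names are ours):

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_seed(points, k, rng=None):
    """k-means++ seeding: pick k centers, each new center drawn with
    probability proportional to its squared distance to the nearest
    center chosen so far (O(nkd) time overall)."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest center
    d2 = [sq_dist(p, centers[0]) for p in points]
    for _ in range(k - 1):
        total = sum(d2)
        r = rng.random() * total
        acc = 0.0
        for i, w in enumerate(d2):
            acc += w
            if acc >= r:
                centers.append(points[i])
                break
        d2 = [min(d2[i], sq_dist(p, centers[-1]))
              for i, p in enumerate(points)]
    return centers
```

A single Lloyd-style refinement pass over the seeded centers also runs in O(nkd), which is the regime the surveyed (near-)linear-time results target.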
Rethinking Deep Generalization Mechanisms:Establishment of Uniform Convergence Bounds Under Overparameterization and High-dimensional Noise Perturbations
LI Pengqi, DING Lizhong, ZHANG Chunhui, FU Jiarun
Computer Science. 2026, 53 (4): 33-39.  doi:10.11896/jsjkx.250600129
Abstract
Deep neural networks demonstrate both powerful expressive capabilities and exceptional generalization performance, which fundamentally conflicts with the classical statistical learning tenet that “model complexity harms generalization”, rendering the analysis of deep generalization mechanisms under traditional frameworks intractable. Classic uniform convergence theory, constrained by its reliance on parameter space dimensionality and its neglect of algorithmic implicit bias, fails to directly align with the core characteristics of deep networks. To address this theoretical gap, this paper constructs a novel statistical learning framework that integrates key features of deep models, thereby redefining the explanatory paradigm of uniform convergence theory for deep generalization mechanisms. It derives the first effective uniform convergence bound for deep networks by introducing a surrogate linear model that preserves overparameterization and high-dimensional noise-perturbation features, which reveals a benign role of high-dimensional noise in improving generalization beyond classical low-dimensional theory. Building on this deep generalization mechanism, it further proposes a scale-sensitive regularized training scheme and shows that the bound and the generalization error decay with increasing sample complexity. Supported by both theoretical and empirical evidence, this work breaks through the adaptability bottleneck of uniform convergence bounds and reopens the door for uniform convergence theory to analyze the generalization of deep models.
Automatic Theorem Proving Based on Pre-trained Language Models and Unification
CHEN Hongxiu, ZENG Xia, LIU Zhiming, ZHAO Hengjun
Computer Science. 2026, 53 (4): 40-47.  doi:10.11896/jsjkx.251000066
Abstract
Pre-trained language models (PLMs) have demonstrated considerable potential in proving theorems within formal environments such as Metamath. However, this potential remains difficult to translate into reliable reasoning ability. Existing approaches typically require PLMs to directly predict substitutions, which come from an open and potentially infinite expression space. The absence of logical constraints in this process limits the generalization capability of the models. To address this challenge, a new proof paradigm is introduced: work variables temporarily substitute for concrete terms during inference, and their specific instances are derived through the unification algorithm. This design enables the model to focus on theorem selection rather than generating unguided substitutions. Based on this paradigm, a dataset is extracted from the Metamath library to construct UnProver (Unification-Driven Prover), which is trained on this data. In addition, several data augmentation strategies are designed to further enhance UnProver's performance. Experimental results show that UnProver consistently outperforms baseline methods and GPT-4o on test sets, achieving superior performance in both proving capability and efficiency. Moreover, UnProver discovers six new and shorter proofs, which have been formally accepted into the official Metamath library.
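The unification step that instantiates work variables is a standard algorithm; a textbook Robinson-style sketch (not UnProver's implementation; term encoding and names are ours) over terms represented as nested tuples, with strings starting with `?` as work variables:

```python
def is_var(t):
    return isinstance(t, str) and t.startswith('?')

def walk(t, subst):
    """Follow variable bindings to the term's current representative."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v appear inside term t?"""
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t)

def bind(v, t, subst):
    if occurs(v, t, subst):
        return None  # reject cyclic bindings such as ?x = f(?x)
    s = dict(subst)
    s[v] = t
    return s

def unify(t1, t2, subst=None):
    """Syntactic unification; returns a substitution dict or None."""
    if subst is None:
        subst = {}
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return bind(t1, t2, subst)
    if is_var(t2):
        return bind(t2, t1, subst)
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None
```

For example, unifying `add(?x, s(0))` with `add(0, ?y)` yields the instances `?x = 0` and `?y = s(0)`, which is the kind of mechanical instantiation the paradigm delegates away from the language model.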
Dynamic Ensemble Stacking Broad Learning System for High-dimensional Data
YUN Fan, YU Zhiwen, YANG Kaixiang
Computer Science. 2026, 53 (4): 48-56.  doi:10.11896/jsjkx.251000068
Abstract
In high-dimensional small-sample classification tasks, the broad learning system (BLS) has garnered much attention due to its efficiency. However, the feature extraction capability of the single-layer BLS is limited, making it difficult to handle complex high-dimensional data. The random node generation mechanism induces node redundancy when BLS hidden layers are stacked directly, thereby hindering improvements in model performance. To address these issues, an ensemble stacking BLS (E-SBLS) algorithm is proposed. E-SBLS utilizes the output of the previous BLS layer as enhanced features, concatenates them with the original features weighted by classification confidence, and feeds them into the subsequent BLS to continuously enhance the feature representation capability in deeper layers. By integrating the outputs of multiple BLS layers through a meta-learner pool, the high-dimensional feature extraction ability of the original single-layer BLS is augmented, thereby improving the generalization performance of the proposed model. Furthermore, considering the complex and variable characteristics of high-dimensional data, a dynamic ensemble framework is designed to adjust the complexity of the model dynamically based on data difficulty. The proposed method further enhances ensemble efficiency while maintaining model performance. Ablation experiments validate the effectiveness of each module in the proposed algorithm, and comparative experiments demonstrate the superior classification performance of the proposed model on high-dimensional disease data.
Research on Efficient Construction of Plateaued Functions Based on DQN-enhanced Genetic Algorithm
WU Yansheng, CAO Xinyi, FAN Weibei
Computer Science. 2026, 53 (4): 57-65.  doi:10.11896/jsjkx.251100083
Abstract
Plateaued functions, as an important generalization of Bent functions, inherit many of the desirable cryptographic properties of Bent functions and hold significant application value. However, traditional methods for constructing plateaued functions suffer from issues such as high computational complexity and limited flexibility. To address these challenges, this paper proposes an adaptive genetic algorithm enhanced by a deep Q-network (DQN). This algorithm deeply integrates the DQN with the genetic algorithm, constructing a multi-dimensional state space to perceive population evolutionary characteristics. Through a group consensus mechanism, it intelligently selects from six combinations of crossover and mutation strategies, enabling adaptive control of genetic parameters. Experimental results demonstrate that the proposed algorithm achieves a fitness improvement of 0.20~0.35, exhibits faster convergence and higher stability, and can generate an average of 230~300 valid plateaued-function truth sequences, significantly outperforming both the standard genetic algorithm and the basic Q-learning-enhanced genetic algorithm. The algorithm intelligently adjusts the mutation rate within the range of 0.235~0.276 and maintains the crossover operation usage rate between 70% and 90%, effectively optimizing the Walsh spectrum distribution while preserving population diversity. Although the computational overhead increases slightly, the significant advantages in solution quality, convergence performance, and strategic adaptability validate the effectiveness of deep reinforcement learning in the construction of cryptographic functions, providing a novel approach for the intelligent design of Boolean functions.
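The paper's DQN-driven strategy selection is not reproduced here; as a hedged toy illustration of the underlying idea (reinforcement learning choosing among crossover/mutation combinations per generation), the following sketch uses tabular Q-learning on the OneMax problem rather than DQN on plateaued-function fitness; every name and parameter below is ours:

```python
import random

def onemax(bits):                       # toy fitness stand-in
    return sum(bits)

def qga(n_bits=32, pop_size=20, gens=60, seed=1):
    """Genetic algorithm whose (crossover, mutation-rate) combination is
    chosen per generation by tabular Q-learning with epsilon-greedy
    exploration; the reward is the improvement in best fitness."""
    rng = random.Random(seed)
    actions = [(cx, mr) for cx in ('one_point', 'uniform')
                        for mr in (0.01, 0.05, 0.1)]
    Q = [[0.0] * len(actions) for _ in range(3)]   # 3 diversity buckets
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    best = max(onemax(p) for p in pop)
    for _ in range(gens):
        # state = coarse bucket of population diversity
        div = len({tuple(p) for p in pop}) / pop_size
        state = min(2, int(div * 3))
        a = (rng.randrange(len(actions)) if rng.random() < 0.2
             else max(range(len(actions)), key=lambda i: Q[state][i]))
        cx, mr = actions[a]
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = (max(rng.sample(pop, 3), key=onemax)
                      for _ in range(2))              # tournament selection
            cut = rng.randrange(1, n_bits)
            child = (p1[:cut] + p2[cut:] if cx == 'one_point'
                     else [rng.choice(pair) for pair in zip(p1, p2)])
            child = [b ^ (rng.random() < mr) for b in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
        new_best = max(onemax(p) for p in pop)
        # Q-learning update with reward = fitness improvement
        Q[state][a] += 0.1 * ((new_best - best)
                              + 0.9 * max(Q[state]) - Q[state][a])
        best = max(best, new_best)
    return best
```

The paper replaces the tabular Q with a neural network over a richer state space and adds a group consensus mechanism; the control loop shape, however, is the same.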
Causal Disentangled Representation Learning with Integrated Sparse Coding
HUANG Beibei, LIU Jinfeng
Computer Science. 2026, 53 (4): 66-77.  doi:10.11896/jsjkx.251000012
Abstract
Deep learning models often lack interpretability in their feature representations due to their “black-box” nature. Although existing disentangled representation learning methods can enhance interpretability to some extent by identifying independent factors within the data, they usually neglect complex correlations and potential causal structures, which limits their applicability in critical domains such as autonomous driving and medical diagnosis, especially in scenarios that require understanding and intervening on causal relationships. To address the insufficient causal modeling in current disentangled representation learning, a disentanglement framework integrating sparse coding with causal inference is constructed. Under appropriate supervision, this framework leverages a causal inference mechanism to precisely model causal relationships within the data, thereby not only generating high-quality and structured representations but also enabling the modeling of, and intervention on, potential causal mechanisms, which significantly improves the model's adaptability and robustness in causal tasks. Meanwhile, the embedded convolutional sparse coding layer imposes sparsity constraints to effectively filter key representations highly relevant to causal structures, further enhancing the model's sensitivity and expressive capacity for higher-order causal relationships. Experimental results demonstrate that the proposed framework performs excellently on both the Pendulum and CelebA datasets, achieving a sample efficiency of 98.65% on the Pendulum dataset and 99.55% on the CelebA dataset. Moreover, it outperforms existing methods in terms of causal intervention effectiveness and distribution robustness, confirming its superiority in complex causal scenarios.
Mobile Robot Two-dimensional Full Coverage Path Planning Algorithm Based on Maklink Diagram and Boustrophedon Path
LI Boyao, ZHAO Binbin, TAO Mingjie, CHEN Lu
Computer Science. 2026, 53 (4): 78-87.  doi:10.11896/jsjkx.250700190
Abstract
With the increasingly widespread application of complete coverage path planning for mobile robots in production inspection and home cleaning, the problems of current algorithms are becoming increasingly obvious, such as a high repetitive coverage rate, non-optimal transition paths between sub-regions, and insufficient adaptability to concave polygonal obstacles. Therefore, this paper proposes a two-dimensional complete coverage path planning algorithm for mobile robots that integrates Maklink graph theory, an improved ant colony algorithm, and Boustrophedon paths. Firstly, the method employs Maklink graph theory to construct an environmental model by generating link lines. These lines are then used to partition the two-dimensional space into multiple convex polygonal sub-regions and establish an initial feasible path network. Secondly, the method transforms the connection sequence of the sub-regions into a generalized traveling salesman problem and uses a one-dimensional ant colony algorithm to compute the sequence in which the robot visits the sub-regions. Then, an ant colony algorithm minimizing the cost function, combined with a triangular-pruning geometric optimization algorithm, is applied to obtain the optimal transition paths between sub-regions. Finally, the method performs a zigzag traversal within each sub-region, following the visiting sequence, using the Boustrophedon path, in order to achieve global coverage path planning. Simulation experiments conducted in multiple two-dimensional environments of varying complexity demonstrate that the proposed method can effectively adapt to scenarios containing a variety of polygonal obstacles, achieving a coverage rate of 100% with zero redundancy. Meanwhile, in comparative experiments against methods using only the traditional ant colony algorithm or the improved ant colony algorithm, the proposed algorithm performs well in three aspects: the length of the transition path, the length of the traversal path, and the repetition rate. The comparison with the traditional full-traversal method that constructs environmental models using the grid method shows that the proposed algorithm has high modeling accuracy and excellent storage efficiency.
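The in-cell zigzag sweep can be sketched as a hedged minimal example (an axis-aligned rectangular cell with fixed lane spacing; in the paper the cells are general convex polygons, and the function name is ours):

```python
def boustrophedon(width, height, step):
    """Generate back-and-forth (boustrophedon) waypoints covering a
    width x height rectangle with lane spacing `step`: sweep left-to-right,
    shift up one lane, sweep right-to-left, and so on."""
    path = []
    y, going_right = 0, True
    while y <= height:
        row = [(0, y), (width, y)]
        path.extend(row if going_right else row[::-1])
        going_right = not going_right
        y += step
    return path
```

Choosing the sweep direction per convex cell (typically perpendicular to the cell's longest edge) minimizes the number of turns, which is why the method first decomposes the space into convex sub-regions.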
Multi-objective Intelligent Warehousing Path Planning Based on Conflict Free Path Algorithm
GONG Jing, YANG Yufa, ZHENG Yifan, SUN Zhixin
Computer Science. 2026, 53 (4): 88-100.  doi:10.11896/jsjkx.250200035
Abstract
Research on warehouse path planning plays a crucial role in intelligent warehousing, as reasonable path planning can effectively avoid AGV path conflicts and improve in-warehouse transportation efficiency. To address the limitations of simplistic warehouse layouts and the lack of effective path conflict resolution strategies for complex environments, this paper proposes a multi-objective AGV path planning algorithm based on a coordinate reservation table and conflict classification. Firstly, a grid-based fish-bone layout scheme for intelligent warehousing is constructed. A distance calculation model between storage nodes is developed using a partition mechanism, forming a directed graph representing the one-way storage path network. Next, an AGV coordinate reservation table and a path conflict classification method are established, followed by the formulation of a hierarchical conflict resolution strategy. Then, a multi-objective intelligent warehouse path planning model is constructed with the goals of minimizing the total transportation distance, minimizing the maximum single transportation distance, and minimizing the waiting time for conflict resolution. Based on the proposed conflict resolution mechanism, a set of mutation operators and crossover operations is designed under an evolutionary genetic search framework. On top of the preference-guided multi-objective combinatorial optimization (P-MOCO) algorithm, an enhanced algorithm named CF-MOWVRP is proposed. This algorithm integrates preference-driven stochastic strategies, multi-objective dimensionality reduction, and reinforcement learning to obtain approximate Pareto-optimal solutions to the conflict-free multi-objective path planning model. Experimental results demonstrate that the proposed algorithm achieves faster convergence and better solution quality, successfully resolves AGV path conflicts, and provides feasible conflict-free path planning solutions.
KGMamba:Gene Regulatory Network Prediction Model Based on Kolmogorov-Arnold Network Optimizing Graph Convolutional Network and Mamba
GAO Tai, REN Yanzhang, WANG Huiqing, LI Ying, WANG Bin
Computer Science. 2026, 53 (4): 101-111.  doi:10.11896/jsjkx.250500097
Abstract
Gene regulatory network (GRN) inference is pivotal for deciphering cell development mechanisms and propelling precision medicine research. However, existing deep learning approaches confront the challenges of high computational complexity and inadequate global feature capture. To tackle this, a novel efficient prediction model integrating a Kolmogorov-Arnold network (KAN) driven graph convolutional network (KGCN) and a Mamba module is proposed. Firstly, the multi-layer perceptron (MLP) in traditional graph convolution is replaced by KAN's learnable spline functions, which retain local feature extraction while reducing redundancy through restructured computation, significantly improving efficiency. Secondly, the Mamba module is innovatively incorporated to prioritize attention to gene nodes critical for global regulation via its selective mechanism. Together, these components enable a unified optimization of both local and global feature modeling. Experimental comparisons with six other deep learning models on public datasets are performed. The results demonstrate that this model outperforms the others in the AUC and AUPR performance metrics, while also showcasing remarkable robustness and computational efficiency, further demonstrating the superiority of the model.
High Frequency-Dense Quantum Gate Set Optimization Algorithm for Quantum Circuit in NISQ Era
LI Hui, LIU Shujuan, JU Mingmei, WANG Jiepeng, JI Yingsong
Computer Science. 2026, 53 (4): 112-120.  doi:10.11896/jsjkx.241200213
Abstract
In NISQ era,considering the hardware coupling constraint limitations,not all quantum gates can be directly executed,and it is usually necessary to utilize the additional introduction of SWAP operation to realize the qubits exchange before the logical circuit can directly run on the physical hardware.In order to overcome the extra overhead of quantum gates brought about by the introduction of SWAP operation in the traditional quantum circuit mapping process,the qubit frequency is investigated,and the high frequency-dense quantum gate set strategy(HF-DQGS) is proposed and applied to the quantum circuit mapping.Based on the qubit frequency,the CNOT gate is prioritized,and the high frequency-dense quantum gate set is defined.The actual overhead of candidate SWAP gates is evaluated using a multivariate cost function to determine the SWAP operations to be performed.According to the evaluation criterion of optimal SWAP gate based on qubit frequency,the evaluation function after SWAP operation is compared to select the optimal SWAP gate.Experimental results show that HF-DQGS can significantly reduce the number of additional SWAP gates and,to some extent,the number of CNOT gates.Specifically,the test results on the t|ket〉 and Qiskit compilers show that the number of additional SWAP gates is reduced by an average of 36.6% and 47.8%,respectively,and the number of CNOT gates is reduced by an average of 13% and 13.4%,respectively.
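The paper's multivariate, frequency-weighted cost function is not reproduced here; as a hedged baseline sketch of how candidate SWAPs are typically scored in circuit mapping, the following ranks a SWAP by how much it shortens the coupling-graph distance between the physical qubits of pending CNOTs (all names are ours):

```python
from collections import deque

def dists_from(graph, src):
    """BFS shortest-path distances on the hardware coupling graph."""
    d = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in d:
                d[v] = d[u] + 1
                q.append(v)
    return d

def swap_gain(graph, layout, pending_cnots, swap):
    """Cost change from applying a SWAP on the physical edge `swap`.
    `layout` maps logical qubits to physical qubits; the cost is the sum,
    over pending CNOTs, of the coupling-graph distance between the
    physical qubits holding the two logical operands (lower is better)."""
    dist = {q: dists_from(graph, q) for q in graph}
    def cost(lay):
        return sum(dist[lay[a]][lay[b]] for a, b in pending_cnots)
    before = cost(layout)
    p, q = swap
    inv = {phys: log for log, phys in layout.items()}
    new_layout = dict(layout)
    if p in inv:
        new_layout[inv[p]] = q
    if q in inv:
        new_layout[inv[q]] = p
    return before - cost(new_layout)
```

HF-DQGS additionally weights this kind of distance term by qubit frequency and restricts attention to the high frequency-dense gate set, so high-traffic qubits are routed first.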
Database & Big Data & Data Science
Analysis of Data Trading Models and Transaction Challenges
CUI Jinjia, ZENG Chen, WANG Lu, PENG Xiaohui
Computer Science. 2026, 53 (4): 121-133.  doi:10.11896/jsjkx.250900002
Abstract
With the acceleration of digitalization, data have become a core resource across industries, driving continuous market growth. However, the current data trading market remains underdeveloped, mainly for two reasons: first, individuals face high barriers to trading behavioral data; second, inter-enterprise data transactions lack sound compliance review mechanisms, and incomplete trading rules restrict market vitality. The fundamental difficulties in data trading lie in the unique characteristics of data, which differ from traditional “cash-and-carry” goods, leading to challenges in pricing, data rights confirmation, data quality assurance, non-repudiation of transactions, and the safeguarding of data sovereignty. This paper collects and organizes relatively mature data trading frameworks, classifies and compares them from the perspective of trading models, and provides a detailed introduction to the solutions proposed in the literature for the five major challenges mentioned above. Finally, in light of the current development of the data trading market, this study puts forward suggestions for future development.
Efficient Semantic-aware Trajectory Representation Learning Method via State Space Model
LIU Yichen, LIN Yan, ZHOU Zeyu, GUO Shengnan, LIN Youfang, WAN Huaiyu
Computer Science. 2026, 53 (4): 134-142.  doi:10.11896/jsjkx.250600130
Abstract
Vehicle trajectories provide crucial movement information for various traffic service applications. To better utilize vehicle trajectories, it is essential to develop trajectory representation learning methods that can effectively and efficiently extract travel semantics, including movement behaviors and travel purposes, to support accurate downstream applications. However, this task presents two major challenges: 1) movement behaviors are inherently spatio-temporally continuous, making them difficult to extract effectively from discrete trajectory points; 2) travel purposes are related to the functionalities of the areas and road segments traversed by vehicles, but these functionalities cannot be obtained directly from the raw spatio-temporal trajectory features, nor can they be extracted from the relevant complex textual features. To address these challenges, this paper proposes an efficient semantic-aware trajectory representation learning method called ESTRL. Firstly, a Mamba-based trajectory encoder is introduced. It uses high-order movement features to parameterize the trajectory state space model (Traj-SSM), which effectively and efficiently models the continuous movement behaviors of vehicles. Secondly, a travel purpose-aware pre-training procedure is proposed. It integrates travel purposes into the learned trajectory embeddings through contrastive learning without introducing extra overhead to the embedding calculation process. Extensive experiments on real-world datasets demonstrate that the proposed method outperforms state-of-the-art baseline models in both efficiency and accuracy.
Key Node Identification in Temporal Social Networks Based on Deep Learning and Multi-feature Fusion
ZHANG Xueqin, WANG Zhineng, LI Jinsheng, LU Yisong, LUO Fei
Computer Science. 2026, 53 (4): 143-154.  doi:10.11896/jsjkx.250300147
Abstract
Social networks are the main channel of information dissemination, and identifying key nodes in social networks is important for discovering information dissemination hubs and controlling information dissemination. Realistic social networks are time-varying, and reasonable modeling of temporal networks, with a comprehensive description and deep mining of the spatial and temporal relationships of nodes, is an important factor for accurately identifying key nodes in the network. In order to improve the accuracy of key node identification, a deep learning and multi-feature fusion based method, MCNN (Multidimensional CNN), for key node identification in temporal social networks is proposed. The method first models the temporal network as a multidimensional relational network based on snapshots. For each node, in each snapshot, the spatial, temporal, and spatio-temporal contexts of the node are extracted from the spatial structure, the temporal coupling, and three types of spatio-temporal propagation relations, respectively, and the node feature matrix is constructed. In order to deeply analyze the spatio-temporal relationships of nodes in each snapshot, the three types of node features are each extracted using a CNN and fused into the node snapshot features using the self-attention mechanism. To capture the evolution of node behaviors between snapshots, the node snapshot features of all snapshots are combined as a time sequence, and an LSTM is used to mine the snapshot sequence features. Finally, the influence of nodes is predicted using a fully connected layer. Experiments on six real temporal social networks show that MCNN outperforms the baseline approaches for key node identification in temporal social networks.
Pre-trained Spatio-Temporal Decoupling-based Traffic Flow Prediction Model
LI Jing, DU Shengdong, SHI Haochen, HU Jie, YANG Yan, LI Tianrui
Computer Science. 2026, 53 (4): 155-162.  doi:10.11896/jsjkx.250600047
Abstract
Traffic flow prediction, as a core technology for dynamic decision-making in smart cities, plays a crucial role in traffic signal control, route planning, and emergency management. With the expansion of urban road networks and the rapid growth of traffic data, traditional methods face challenges in accurately modeling the complex spatio-temporal interactions among road network nodes. Although pre-trained models can transfer knowledge across domains, they still encounter limitations when applied to traffic flow prediction, primarily due to coupled spatio-temporal features and the mismatch between pre-trained representations and traffic-specific characteristics. To address these issues, this paper proposes the pre-trained spatio-temporal decoupling-based traffic flow prediction model (PT-STD). The method employs a spatio-temporal decoupling module to disentangle the deep feature learning of spatial topological relationships and multi-granularity temporal patterns. Furthermore, it designs a hierarchical adaptive fine-tuning strategy that progressively unfreezes the normalization layers and attention parameters of the pre-trained model, gradually transferring the general knowledge learned by the pre-trained model to spatio-temporal feature modeling. Experimental results demonstrate that PT-STD achieves significant improvements on standard benchmark datasets, with a 3.89% reduction in mean absolute error (MAE) under data-scarce scenarios.
Denoising Diffusion Model-enhanced Algorithm for Battery Swap Demand Data Generation
LIU Dehua, YU Saixuan, QIAO Jinlan, HUANG Heqing, CHENG Wenhui
Computer Science. 2026, 53 (4): 163-172.  doi:10.11896/jsjkx.250600205
Abstract
In recent years, electric vehicle battery swap services have rapidly gained popularity due to their fast and convenient energy replenishment capabilities. Accurate prediction of user battery swap demand is crucial for optimizing the operational efficiency of battery swap platforms. However, in cities with newly deployed battery swap stations, conventional prediction models often suffer from insufficient training due to the lack of historical data, resulting in degraded prediction accuracy. To address this challenge, this paper proposes a denoising diffusion model-enhanced algorithm for generating battery swap demand data. By synthesizing samples that preserve the statistical distribution of real-world battery swap demand data, the proposed approach effectively augments the training dataset, thereby significantly improving the model's prediction accuracy. Specifically, it first employs analog bit encoding to represent mixed-type battery swap demand data in continuous space, enabling it to be processed by the diffusion model. Furthermore, it designs a conditional denoising network incorporating a cross-attention mechanism, utilizing station information to guide the generation of high-quality battery swap demand data. Finally, the proposed algorithm is evaluated on a real-world battery swap dataset collected from 40 battery swap stations in Chengdu over a one-month period. Experimental results demonstrate that combining the data generated by the proposed algorithm with the original training data reduces the MAE, RMSE, and MAPE of battery swap demand prediction by 9.29%, 8.56%, and 8.23%, respectively, compared to using the original training data alone.
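The analog bit encoding mentioned above is a standard trick for running continuous diffusion models on discrete data: write a discrete value in binary and map each bit {0, 1} to a real value {-1, +1}. A minimal sketch (function names ours, independent of the paper's network):

```python
def analog_bits(x, n_bits):
    """Analog bit encoding: write integer x in binary (most significant
    bit first), then map each bit {0,1} to a real value {-1.0, +1.0} so
    a continuous-space diffusion model can operate on discrete data."""
    return [2.0 * ((x >> i) & 1) - 1.0 for i in reversed(range(n_bits))]

def decode_bits(vals):
    """Decode (possibly noisy) analog bits by thresholding each at 0."""
    x = 0
    for v in vals:
        x = (x << 1) | (1 if v > 0 else 0)
    return x
```

Because decoding only thresholds at zero, samples produced by the denoising network need not land exactly on {-1, +1} to round back to valid discrete demand values.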
Semi-supervised Learning Algorithm Based on Pointwise Manifold Structures and Uniform Regularity Constraints
XU Yamin, LI Xiaobin, ZHANG Run
Computer Science. 2026, 53 (4): 173-179.  doi:10.11896/jsjkx.250300086
Abstract
Manifold regularization (MR) provides a powerful framework for semi-supervised classification using labeled and unlabeled datasets. Under the manifold assumption, it enforces that similar instances should have similar classification results on the sample graph. Notably, the core of MR lies in pairwise smoothing on the sample graph, where smoothing constraints are applied to all instance pairs, treating each pair of instances as a whole. However, smoothness is in essence point-to-point, meaning that smoothness should hold “everywhere”, correlating the behavior of each point or instance with that of its neighboring points. Therefore, this paper proposes a novel semi-supervised learning algorithm based on pointwise manifold and uniform regularity constraints (URC-PW-MR), which achieves semi-supervised learning through dual constraints: the regularization of individual local instances and a fusion consistency regularization. This approach not only captures the pointwise nature of smoothness but also accounts for the importance of individual instances, by considering each instance rather than pairs of instances; this importance can be quantitatively characterized through factors such as local density. Empirical evaluations demonstrate that URC-PW-MR exhibits more refined performance characteristics compared with conventional MR frameworks.
Modeling of Behavior-guided Multi-scale Bi-level Group Consensus Under Social Networks
CHANG Wenxia, ZHANG Chao, LI Wentao, ZHAN Jianming, LI Deyu
Computer Science. 2026, 53 (4): 180-187.  doi:10.11896/jsjkx.250500066
Abstract
As a key element of complex decision-making in the era of intelligence, group consensus aims to alleviate conflicts and reach agreement through the interaction of opinions. To address the limitation of single-scale methods in fully reflecting information characteristics, and to resolve conflicts arising from behavioral heterogeneity and unfairness, a behavior-guided multi-scale bi-level group consensus model is constructed. Firstly, a Choquet integral-based scale fusion model is proposed, where fuzzy measures characterize non-linear scale interactions and enable deep coupling. Next, social network evaluation is applied to assess decision-maker behavior, with internal performance measured by reliability and propagation strength, and external performance measured by interaction density and cooperation intensity, providing a quantitative basis for behavior-guided strategies. Then, a bi-level consensus model under multi-granularity perspectives is constructed based on behavioral feature indicators, combining optimization models with rule mechanisms to balance the minimum cost and maximum fairness of opinion adjustments, optimizing resource allocation. Additionally, scoring functions that integrate cardinal and ordinal rankings are designed, breaking through the limitations of traditional single-dimensional evaluations. Finally, a decision analysis of the service quality of the 5A-rated Jinci Scenic Area is performed based on online reviews from the Ctrip platform.
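The discrete Choquet integral used for scale fusion is a standard aggregation operator; a minimal sketch (data representation ours, not the paper's), where a fuzzy measure assigns a weight in [0, 1] to every subset of criteria:

```python
def choquet(values, mu):
    """Discrete Choquet integral of `values` (criterion -> score) with
    respect to a fuzzy measure `mu` mapping frozensets of criteria to
    [0, 1]; mu must be monotone with mu(empty set) = 0 and mu(all) = 1.
    Sort scores ascending and weight each increment by the measure of
    the set of criteria still at or above that level."""
    items = sorted(values.items(), key=lambda kv: kv[1])
    total, prev = 0.0, 0.0
    remaining = set(values)
    for crit, v in items:
        total += (v - prev) * mu[frozenset(remaining)]
        prev = v
        remaining.discard(crit)
    return total
```

Unlike a weighted average, the measure of a coalition need not equal the sum of its members' measures, which is exactly how non-linear interactions between scales are encoded.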
STWD-DLFRD: Multi-granularity Fake Review Detection via Sequential Three-way Decisions and Deep Learning
GU Bokai, LIU Dun, SUN Yang
Computer Science. 2026, 53 (4): 188-196.  doi:10.11896/jsjkx.250500088
Abstract PDF(2194KB) ( 53 )   
With the increasing influence of online reviews on consumer decision-making, fake review detection has become a critical task for maintaining the integrity of e-commerce ecosystems. Existing methods predominantly rely on static single-step detection, which overlooks dynamic feature evolution and cost-sensitive decision-making, leading to suboptimal efficiency. To address these limitations, this paper proposes a sequential three-way decisions and deep learning-based multi-granularity fake review detection framework (STWD-DLFRD). The framework constructs a multi-granularity feature space by extracting textual, behavioral, and social relationship features from reviews. Leveraging the hierarchical decision mechanism of sequential three-way decisions, it dynamically identifies fake reviews of varying complexity. Experimental results demonstrate that STWD-DLFRD outperforms baseline models in both F1-score and accuracy, while significantly reducing total classification costs. This study provides an effective solution for cost-sensitive fake review detection in dynamic environments, balancing detection precision and decision efficiency through adaptive granularity refinement.
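The sequential three-way mechanism can be sketched as a cascade of accept/reject/defer rules: easy reviews are decided at a coarse granularity, and only deferred cases pay for finer (costlier) features. The thresholds, the forced decision at the last stage, and the function names below are illustrative assumptions, not the paper's exact rules:

```python
def three_way_decide(p_fake, alpha=0.8, beta=0.3):
    """Three-way rule: accept as fake if P >= alpha, accept as genuine
    if P <= beta, otherwise defer to the next granularity level."""
    if p_fake >= alpha:
        return "fake"
    if p_fake <= beta:
        return "genuine"
    return "defer"

def sequential_detect(p_by_stage, alpha=0.8, beta=0.3):
    """Walk through stages of increasing feature granularity, stopping
    early when a definite decision is reached; force a binary decision
    at the final stage (an assumed tie-break, for illustration)."""
    for p in p_by_stage:
        decision = three_way_decide(p, alpha, beta)
        if decision != "defer":
            return decision
    return "fake" if p_by_stage[-1] >= 0.5 else "genuine"
```

Early stopping on confident cases is what lets such frameworks reduce total classification cost without sacrificing accuracy on hard cases.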
Study on Influence Mechanism and Control of Cross-diffusion on Rumor Spreading Pattern
FAN Xiaoling, DAI Shilong, XIAO Min, SUN Yonghui, XU Fengyu
Computer Science. 2026, 53 (4): 197-207.  doi:10.11896/jsjkx.250300071
Abstract PDF(6045KB) ( 59 )   
Turing instability is a key mechanism for generating spatial patterns in rumor spreading systems, where Turing patterns represent a self-organizing phenomenon in rumor propagation. However, most current research on rumor spreading models focuses on self-diffusion in one-dimensional space, which fails to accurately capture the complexities of rumor spreading and clarification. This paper proposes a two-dimensional cross-diffusion rumor spreading model and studies its Turing instability. The expansion of dimensions and the introduction of cross-diffusion mechanisms can effectively simulate the interactions between different groups. Through stability analysis of the model, it explores how self-diffusion and cross-diffusion contribute to the onset of Turing instability, while also examining the suppressive effect of a PD controller on pattern formation. Finally, numerical simulations are conducted to validate the theoretical findings. This paper provides a theoretical basis for rumor control and has important practical significance for maintaining social order and enhancing the reliability of information dissemination.
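The generic structure of such a model can be written down explicitly; the symbols here ($u$ for spreaders, $v$ for clarifiers, reaction terms $f,g$) are illustrative stand-ins, not the paper's exact notation:

```latex
\partial_t u = f(u,v) + d_{11}\nabla^2 u + d_{12}\nabla^2 v,
\qquad
\partial_t v = g(u,v) + d_{21}\nabla^2 u + d_{22}\nabla^2 v.
```

Turing instability arises when the homogeneous steady state is stable for the diffusion-free kinetics, yet for some wavenumber $k$ the matrix $J - k^2 D$ (with $J$ the Jacobian of $(f,g)$ and $D=(d_{ij})$) acquires an eigenvalue with positive real part. The off-diagonal cross-diffusion coefficients $d_{12}, d_{21}$ enlarge the parameter region in which this destabilization occurs, which is why cross-diffusion changes the resulting patterns.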
Concept-cognitive Learning and Incremental Learning in Complex Networks
QIN Haiqi, MI Jusheng
Computer Science. 2026, 53 (4): 208-214.  doi:10.11896/jsjkx.250600216
Abstract PDF(1500KB) ( 48 )   
In data analysis, concept-cognitive learning in networks is an important issue for machine learning and artificial intelligence applied to network contexts. This paper applies the cognitive operator to complex networks, proposes the concept of network cognition, and quantifies network characteristics through the adjacency matrix and node degrees. It also discusses dynamic weighted networks, analyzes the situation where the connection strength of nodes changes over time, and proposes a definition of dynamic weighted network cognition. In addition, it proposes an incremental computation mechanism for object-oriented, attribute-oriented, and hybrid updates to cope with scenarios such as dynamic expansion of network nodes, evolution of edge attributes, and composite updates. For dynamic weighted networks, the paper proposes a local update method, which efficiently handles changes in edge weights through a sliding window mechanism and a trigger-based update method, reducing the computational burden and improving efficiency. Overall, by introducing cognitive operators and dynamic weighted networks, this paper provides a new method for analyzing and updating the influence of nodes in complex networks.
NMTF-based Adaptive Algorithm for Community Detection in Complex Networks
LI Xilong, LIU Yan, JIA Mengmeng, ZHANG Zilin
Computer Science. 2026, 53 (4): 215-223.  doi:10.11896/jsjkx.250500057
Abstract PDF(2834KB) ( 64 )   
To address the limitations of existing non-negative matrix factorization (NMF)-based community detection methods, such as the need to preset the number of communities, susceptibility to local optima, and limited model generalization, this paper proposes Adp-NMTF, an adaptive community detection algorithm based on non-negative matrix tri-factorization (NMTF). The algorithm incorporates a dynamic evaluation and feedback mechanism to automatically search for and determine the optimal number of communities without manual intervention. It introduces graph regularization, sparsity constraints, and inter-community independence constraints to balance generalization capability and interpretability. Additionally, semi-supervised initialization and warm-start strategies are employed to accelerate NMTF convergence and improve computational efficiency. Experimental results demonstrate that Adp-NMTF can autonomously determine a reasonable number of communities and outperforms mainstream baseline methods on both synthetic and real-world networks across evaluation metrics including modularity (Q), normalized mutual information (NMI), and adjusted Rand index (ARI). Furthermore, the convergence rate of matrix factorization is significantly improved.
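The core tri-factorization can be illustrated with classic multiplicative updates for a symmetric affinity matrix A ≈ H S Hᵀ, where H gives soft community memberships and S captures inter-community interaction. This is a generic Ding-style sketch under those assumptions, not the Adp-NMTF algorithm itself:

```python
import numpy as np

def nmtf_step(A, H, S, eps=1e-9):
    """One multiplicative update for symmetric NMTF A ≈ H S H^T;
    eps guards against division by zero."""
    H = H * (A @ H @ S) / (H @ S @ H.T @ H @ S + eps)
    S = S * (H.T @ A @ H) / (H.T @ H @ S @ H.T @ H + eps)
    return H, S

rng = np.random.default_rng(0)
A = rng.random((8, 8)); A = (A + A.T) / 2     # toy symmetric affinity matrix
H = rng.random((8, 2)); S = rng.random((2, 2))
err0 = np.linalg.norm(A - H @ S @ H.T)
for _ in range(200):
    H, S = nmtf_step(A, H, S)
err1 = np.linalg.norm(A - H @ S @ H.T)
communities = H.argmax(axis=1)  # each node joins its strongest community
```

Adp-NMTF layers graph regularization, sparsity, and independence constraints on top of such updates and wraps the whole loop in a search over the number of communities.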
Multi-channel Graph Kolmogorov-Arnold Network Based on WL Graph Kernel
WANG Jinghong, LI Pengchao, MI Jusheng, WANG Wei
Computer Science. 2026, 53 (4): 224-234.  doi:10.11896/jsjkx.250600033
Abstract PDF(5958KB) ( 69 )   
As an emerging deep learning method, graph neural networks have demonstrated powerful capabilities in modeling and representing graph-structured data across various graph learning tasks. However, most existing graph neural networks focus on single-channel graph convolution and fail to make full use of the rich and diverse relational information in real-world graph data. To mine multi-relational features in graph data and enhance the modeling capabilities of graph neural networks, this paper proposes a multi-channel graph Kolmogorov-Arnold network based on the Weisfeiler-Lehman graph kernel (KMCGKN). The method extracts node-neighborhood subgraphs and constructs feature maps with the Weisfeiler-Lehman graph kernel, and replaces the feature transformation function in the original graph convolution layer with a Kolmogorov-Arnold network. Two graph convolution channels then learn the characteristics of different relationship graphs respectively, yielding both feature encodings and structural encodings of the graph. At the same time, a multi-view loss enforces difference between the channels, which alleviates the overfitting problem of deep models. KMCGKN is evaluated on six public node classification datasets. Experimental results show that its performance on node classification tasks is better than single-channel GCN and other benchmark models, effectively improving modeling and representation capabilities.
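One refinement step of the Weisfeiler-Lehman relabeling that such kernels are built on can be sketched directly; the adjacency-list representation and label compression scheme here are illustrative:

```python
def wl_iteration(adj, labels):
    """One Weisfeiler-Lehman refinement step: each node's new label is
    determined by its own label plus the sorted multiset of its
    neighbours' labels, then compressed to small integer ids."""
    signatures = [
        (labels[i], tuple(sorted(labels[j] for j in nbrs)))
        for i, nbrs in enumerate(adj)
    ]
    table = {}
    return [table.setdefault(sig, len(table)) for sig in signatures]
```

Iterating this step and histogramming the resulting labels yields the WL kernel's structural feature maps, which KMCGKN feeds into its structural-encoding channel.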
Long-term Causal Effect Estimation Based on Deep Reinforcement Learning
LIU Jiaqi, WANG Yujie, XIANG Guodu, YU Kui, CAO Fuyuan
Computer Science. 2026, 53 (4): 235-244.  doi:10.11896/jsjkx.250600043
Abstract PDF(3319KB) ( 67 )   
Causal effect estimation aims to calculate the magnitude of the causal effect of a treatment variable on an outcome variable. Prevalent existing methods are mainly applicable to static data or a single time point in a time series, and cannot effectively estimate the cumulative impact of the treatment variable on the outcome variable over a long period. To address this, long-term causal effect estimation methods based on traditional reinforcement learning fit long-term potential outcomes through linear basis functions and thereby calculate the long-term causal effect. However, due to the limited expressive power of linear basis functions in complex scenarios, existing methods cannot accurately identify weak causal effects, and their performance degrades significantly as the data dimension increases. In response, this paper proposes a long-term causal effect estimation method based on deep reinforcement learning. The method uses a dueling network to estimate long-term potential outcomes, which effectively estimates the impact of the treatment variable on the outcome variable and thereby greatly improves the algorithm's ability to identify weak causal effects. Meanwhile, the proposed method avoids the biases that arise when long-term potential outcomes are estimated with an improperly chosen basis function. Experimental results show that the proposed method outperforms existing algorithms on statistical synthetic datasets and order scheduling simulation datasets.
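The dueling aggregation the method relies on is a one-liner: the network splits its estimate into a state value and per-action advantages, recombined with a mean-centering constraint for identifiability. A minimal sketch of that aggregation step (not the paper's full architecture):

```python
import numpy as np

def dueling_q(value, advantages):
    """Dueling-network aggregation: Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a').
    Centering the advantages makes the V/A decomposition identifiable."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()
```

The centering guarantees that the mean of Q over actions equals V(s), so the value stream alone carries the state's long-term outcome estimate.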
Phase-preserved MinMax Framework for Graph Augmentation in Frequency Domain
HUA Yu, ZHOU Xiaocheng, SHEN Xiangjun, LIU Zhifeng, ZHOU Conghua
Computer Science. 2026, 53 (4): 245-251.  doi:10.11896/jsjkx.250700069
Abstract PDF(1645KB) ( 62 )   
Graph data augmentation enhances the generalization and robustness of graph neural networks (GNNs) by performing local or global transformations on graph structures or node features. While existing studies have shown that graph augmentation techniques can effectively leverage low-frequency information to capture the global topology of graphs, they often fail to preserve the high-frequency components that encode fine-grained structural details. This shortcoming may result in information loss or feature distortion when learning local representations. To address this challenge, this paper proposes a phase-preserving frequency-domain MinMax framework for graph augmentation. The proposed method integrates frequency-domain analysis with the MinMax optimization paradigm, decomposing graph signals into low- and high-frequency components. The low-frequency part captures global topological patterns, whereas the high-frequency part represents rich local structural information. By applying the MinMax strategy in the frequency domain, the framework simultaneously preserves global structure and enhances high-frequency details, leading to more expressive multi-scale graph representations. In addition, it adopts an adaptive augmentation strategy that dynamically adjusts the perturbation amplitude based on the characteristics of different frequency components, thereby improving training efficiency. The phase information, which encodes intrinsic structural relations between graph nodes, is explicitly preserved to further enrich the expressive capacity of node representations. Through this frequency-aware design, the proposed method maintains essential topological structures while effectively enhancing node-level features, improving the GNN's ability to capture both global and local semantics. Extensive experiments on multiple benchmark datasets demonstrate that the proposed method achieves an accuracy gain of over 2 percentage points on node classification tasks compared with existing approaches. Moreover, it delivers superior computational efficiency, validating its effectiveness and scalability for large-scale graph learning scenarios.
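The low/high-frequency split of a graph signal is conventionally defined through the eigenvectors of the graph Laplacian: small eigenvalues correspond to smooth (global) components, large ones to oscillatory (local) components. A minimal sketch of that decomposition (generic spectral filtering, not the paper's MinMax framework):

```python
import numpy as np

def split_graph_signal(W, x, k):
    """Split graph signal x into low- and high-frequency parts using the
    eigenvectors of L = D - W belonging to the k smallest eigenvalues."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    U_low = vecs[:, :k]              # smooth, low-frequency basis
    low = U_low @ (U_low.T @ x)      # projection onto the low-frequency subspace
    return low, x - low
```

Because the split is an orthogonal projection, the two parts always sum back to the original signal, so augmentations can perturb each band independently without losing information.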
Fast Map Matching Method Based on Trajectory Micro-segment Model
KANG Jun, GAO Shengkai, LAI Jiabao
Computer Science. 2026, 53 (4): 252-259.  doi:10.11896/jsjkx.250900115
Abstract PDF(2288KB) ( 55 )   
Map matching is one of the core technologies in intelligent transportation systems, aiming to map GPS trajectory data to urban road networks, eliminate positioning errors, and reconstruct actual driving paths. With the explosive growth of GPS trajectory data volume, traditional HMM (Hidden Markov Model)-based map matching methods struggle to meet real-time processing requirements due to high computational costs and temporal dependency constraints. To address this, this paper proposes a fast map matching method based on a trajectory micro-segment model (Micro-Segment Fast Matching, MSFM). By leveraging a sliding window mechanism, the proposed method decomposes trajectories into fixed-length micro-segments and employs vectorized computation techniques in a distributed computing environment. This approach significantly improves computational efficiency while maintaining map matching accuracy. Experimental results demonstrate that, under a distributed cluster configuration, MSFM achieves a map matching speed of approximately 110 000 points per second, 7 times faster than baseline algorithms, while retaining 95.86% matching accuracy. By optimizing the storage structure of trajectory data, MSFM exhibits significant performance advantages in real-time processing of large-scale trajectory data.
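The sliding-window decomposition is what breaks the HMM's long temporal dependency chain: overlapping fixed-length micro-segments can be matched independently and in parallel. A minimal sketch of the windowing step (window length and stride here are illustrative parameters, not the paper's settings):

```python
def micro_segments(points, length, step):
    """Slide a fixed-length window over a trajectory, yielding overlapping
    micro-segments that can each be map-matched independently."""
    stop = max(len(points) - length + 1, 1)
    return [points[i:i + length] for i in range(0, stop, step)]
```

The overlap between consecutive windows (step < length) is what lets independently matched segments be stitched back into a consistent path.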
Computer Graphics & Multimedia
Unsupervised Infrared Image Generation Method Based on Dual Semantic Contrastive Learning
CHENG Zimeng, YANG Xinyue, AI Haojun, WANG Zhongyuan
Computer Science. 2026, 53 (4): 260-268.  doi:10.11896/jsjkx.250700172
Abstract PDF(4728KB) ( 63 )   
Infrared images are widely used in computer vision, but high-quality infrared image datasets are limited in scale due to restricted acquisition conditions. To address this problem, converting visible-light datasets into infrared datasets has become an effective approach. Existing generation methods generally rely on supervised learning, which requires a large amount of paired data that is extremely difficult to obtain in practical applications. This paper proposes an unsupervised infrared image generation method named DSCGAN. The method adopts a bidirectional transformation architecture and introduces semantic contrastive learning to enhance the ability to preserve image content and learn discriminative infrared features. A geometric consistency loss is introduced to effectively preserve the original structure and details of visible images. Meanwhile, a multi-scale PatchGAN discriminator is constructed to improve discriminative capability and enhance the realism of generated images. Experimental results on the AVIID-1, AVIID-2, and Day-DroneVehicle datasets show that DSCGAN outperforms the comparison methods on several metrics, and the generated infrared images exhibit a more reasonable thermal radiation distribution and better visual quality. On the AVIID-1 dataset, the SSIM value increases to 0.814 4 and the FID score decreases to 0.145 6. On the Day-DroneVehicle dataset, the PSNR value improves to 18.14, while the LPIPS value drops to 0.294 9. This study provides a new approach to unsupervised infrared image generation, with potential applications in infrared target detection, infrared scene segmentation, and other downstream tasks.
LegoViT: Block-grained Scaling Techniques for ViT Models in Edge-side Visual Inference
ZHOU Haojie, WU Xiaoning, GAO Zhiqiang, HAN Rui, ZHANG Qinglong, LIU Chi, CHEN Zheng, ZHAO Yu, WANG Shuo
Computer Science. 2026, 53 (4): 269-276.  doi:10.11896/jsjkx.250900024
Abstract PDF(5162KB) ( 77 )   
In recent years, Vision Transformer (ViT) models have been widely deployed in edge-based visual applications because of their powerful image understanding capabilities. To achieve an optimal accuracy-latency balance in resource-constrained edge-side inference, it is essential to scale ViT models effectively based on available resources. However, existing inference model scaling techniques can only perform scaling at whole-model granularity, leading to the loss of critical information and often requiring more computational resources or higher inference latency to achieve equivalent accuracy. This paper proposes LegoViT, a method that identifies scalable model blocks in the feedforward networks of ViT models, thus supporting runtime block-level model scaling. Comparative test results demonstrate that LegoViT achieves a 22.37% reduction in the memory footprint of ViT models, a 21.1% decrease in computational overhead, and an average 61.05% reduction in inference latency.
Image Classification Based on Hybrid Quantum-Classical Long-Short Range Feature Extension Network
ZHENG Yi, JIA Xinghao, ZHANG Junwen, REN Shuang
Computer Science. 2026, 53 (4): 277-283.  doi:10.11896/jsjkx.250600108
Abstract PDF(2178KB) ( 61 )   
Classical neural networks face hard limits on scale and computation time, making it difficult to achieve lightweight design and high performance simultaneously; this has become a bottleneck in solving large-scale image classification problems in the era of big data. In contrast, hybrid quantum-classical neural networks combine the advantages of quantum and classical computing, enabling efficient parallel computation and strong generalizability. To this end, this paper proposes the hybrid quantum-classical long-short range feature extension neural network (HQC-LSNet), a multi-branch architecture composed of multiple hybrid modules. It incorporates a quantum decoupled fully connected attention mechanism built with various quantum rotation gates and controlled-Z gates to efficiently extract long-range features from a quantum-enhanced feature space. Simultaneously, a classical convolutional module is employed to capture short-range features, and feature extension is performed by combining the resulting feature maps. The model achieves classification accuracies of 99.42% on the ten-class MNIST dataset and 91.42% on a three-class CIFAR-10 subset, outperforming the corresponding classical and hybrid quantum-classical models. Additionally, HQC-LSNet reduces both parameter count and time complexity compared to purely classical models.
Tensor-based Multimodal Fusion Technique to Diagnose Microvascular Invasion
WANG Shaodong, LI Liujun, LI Rui, SU Zhongzhen, LU Yao
Computer Science. 2026, 53 (4): 284-290.  doi:10.11896/jsjkx.250600188
Abstract PDF(2282KB) ( 52 )   
Microvascular invasion (MVI) is a critical prognostic factor for postoperative recurrence and reduced survival in hepatocellular carcinoma (HCC), making its precise preoperative localization essential for treatment planning. To address the limitations of existing radiomics approaches, including poor feature generalizability, weak interpretability, and neglect of the spatial peritumoral MVI distribution crucial for surgical strategies, this study proposes: 1) multimodal fusion-based 3D localization of MVI via spatial alignment of whole-slide pathology images (WSIs) with 3D ultrasound (3D US); and 2) a feature tensor fusion deep learning model integrating multiscale features, tensor fusion, and orthogonal loss functions to extract semantic features of the peritumoral MVI distribution. Model performance is evaluated on the curated dataset using metrics such as the area under the receiver operating characteristic curve (AUC), accuracy, and F1-score. Experimental validation demonstrates strong performance (AUC: 0.910, accuracy: 0.930, F1-score: 0.852), confirming the proposed model's clinical potential for precise preoperative MVI diagnosis.
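The idea behind tensor fusion of two modality embeddings can be sketched with an outer product: augmenting each vector with a constant 1 keeps the unimodal features alongside all pairwise cross-modal interactions. This is a generic sketch of that fusion pattern, not the paper's exact fusion layer:

```python
import numpy as np

def tensor_fusion(a, b):
    """Outer-product tensor fusion of two modality feature vectors;
    the appended 1 preserves unimodal terms in the fused tensor."""
    a1 = np.concatenate(([1.0], np.asarray(a, dtype=float)))
    b1 = np.concatenate(([1.0], np.asarray(b, dtype=float)))
    return np.outer(a1, b1)
```

The fused tensor's first row and first column carry the original unimodal features; the remaining entries are the bimodal interaction terms that a simple concatenation would miss.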
White Matter Hyperintensity Segmentation Method Combining Local and Global Perception and Semantic Flow Alignment
ZHANG Xinfeng, GUO Yihai, LIU Xiaomin, XU Zhonghe, LI Xiangsheng
Computer Science. 2026, 53 (4): 291-298.  doi:10.11896/jsjkx.250700057
Abstract PDF(3468KB) ( 60 )   
A white matter hyperintensity segmentation method called PGF-Net is proposed, which combines local and global perception with semantic flow alignment to address the small-target nature of white matter hyperintensities. Firstly, it proposes the PAA (Patch Aware Attention) module, which enhances local feature extraction by dividing images into small local patches for feature selection. Secondly, it proposes a combined local and global aware attention module (PGAA), which utilizes the global perception of the Transformer to establish long-range dependencies. Lastly, it proposes a gated flow alignment module (GFAM) that predicts a semantic flow offset field in the decoding stage, guiding the upsampling of high-level features in the decoder to achieve precise alignment and fusion with the corresponding low-level features in the encoder. Experimental results show that PGF-Net achieves optimal performance on a self-collected dataset, with a mean intersection-over-union (mIoU) of 0.876 9, a Dice coefficient of 0.842 3, a Hausdorff distance (HD) of 32.61, and an average surface distance (ASD) of only 1.7. The model also achieves optimal performance on two small-target public datasets, verifying its generalization and robustness. This method shows promise for assisting doctors in diagnosis.
Lightweight Camouflaged Object Detection Model Based on Structured Knowledge Distillation
SONG Jianhua, LIU Chun, ZHANG Yan
Computer Science. 2026, 53 (4): 299-307.  doi:10.11896/jsjkx.250100105
Abstract PDF(2489KB) ( 56 )   
Camouflaged object detection (COD) plays a crucial role in natural scene analysis and security monitoring. However, the complexity and diversity of camouflaged objects pose significant challenges to the performance of detection models. Existing knowledge distillation methods are primarily used for model compression, aligning the output features of teacher and student networks to obtain lightweight models. Nonetheless, these methods often overlook the rich semantic information contained in the intermediate features of teacher networks. Additionally, fixed learning rate strategies struggle to adapt to the significant scale differences between teacher and student models, leading to instability during the distillation process. To address these issues, this paper proposes a lightweight camouflaged object detection model based on structured knowledge distillation. The method leverages structured knowledge to improve the traditional soft- and hard-label loss calculation, significantly enhancing distillation performance. Furthermore, learning rate scheduling is formulated as an optimization task to stabilize performance fluctuations during distillation. Experimental results demonstrate that the proposed method achieves an Sm of 82.9% and 81.0% on the COD10K-V3 and CAMO camouflaged object detection datasets, respectively, while reducing training time to 6.5 hours.
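The soft/hard-label loss that such distillation methods start from is the classic Hinton formulation: a cross-entropy term on ground-truth labels plus a temperature-scaled KL term against the teacher's softened outputs. A minimal numpy sketch of that baseline (generic distillation, not the paper's structured variant):

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())          # shift for numerical stability
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, hard_label, T=2.0, alpha=0.5):
    """Classic distillation loss: alpha * CE(student, hard label)
    + (1 - alpha) * T^2 * KL(teacher_T || student_T)."""
    ps = softmax(student_logits, T)
    pt = softmax(teacher_logits, T)
    hard = -np.log(softmax(student_logits)[hard_label])
    soft = np.sum(pt * (np.log(pt) - np.log(ps)))
    return alpha * hard + (1.0 - alpha) * T * T * soft
```

The T² factor keeps the gradient magnitudes of the soft term comparable across temperatures; the structured-knowledge improvement described above replaces the plain output-level KL with losses over intermediate feature structure.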
Vehicle-mounted Video Compression Algorithm for Collaborative Vehicle Crowdsensing
JIANG Zixian, YU Saixuan, HUANG Ruixue, SHEN Xin, HUANG Heqing
Computer Science. 2026, 53 (4): 308-317.  doi:10.11896/jsjkx.250400103
Abstract PDF(4246KB) ( 54 )   
Collaborative vehicle crowdsensing significantly extends the perception range of individual cars, thereby greatly enhancing the safety of autonomous and assisted driving. However, it also faces the challenge of high transmission latency when dealing with high-precision, large-volume sensory data such as vehicle-mounted video. Transmission delay can be effectively reduced by removing redundant frames that carry no useful information, but the dynamics and complexity of key information in vehicle-mounted video make it challenging both to represent key and redundant information between frames and to balance the key information retention rate against the compression rate. To solve these challenges, this paper proposes a vehicle-mounted video compression algorithm for collaborative vehicle crowdsensing, aiming to balance information fidelity and compression efficiency. Specifically, it first employs target detection and multi-target tracking algorithms to extract continuous features of key information across video frames. Then, exploiting the low-rank property of video features, it converts the complex representation of key and redundant information into a low-rank sparse matrix decomposition problem, which it solves with the inexact augmented Lagrangian method. Finally, it evaluates the proposed algorithm on a real road dataset from Chongqing and selected data from the public BDD100K dataset. Experimental results show that the proposed algorithm achieves an average 12.99% improvement in key information retention over four baseline methods under different traffic conditions, while reducing transmission delay by 61.24% on average compared to transmitting the original video.
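The low-rank-plus-sparse split solved by the inexact augmented Lagrangian method is the classic robust PCA decomposition D = L + S. A compact sketch of the generic IALM iteration (default parameters follow the common RPCA convention; this is not the paper's full algorithm):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(X, tau):
    """Soft thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_ialm(D, lam=None, mu=None, iters=100):
    """Split D into low-rank L and sparse S by minimizing
    ||L||_* + lam * ||S||_1 s.t. L + S = D, via inexact ALM."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 1.25 / np.linalg.norm(D, 2)
    Y = np.zeros_like(D)   # Lagrange multiplier
    S = np.zeros_like(D)
    for _ in range(iters):
        L = svt(D - S + Y / mu, 1.0 / mu)
        S = shrink(D - L + Y / mu, lam / mu)
        Y = Y + mu * (D - L - S)
        mu *= 1.5          # penalty growth schedule
    return L, S
```

In the video setting, stacking vectorized frames as the columns of D lets L capture the slowly varying (redundant) background while S isolates the sparse, frame-specific key content.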
Multi-stage Grasping Method for Unordered Mixed Objects Grasping Based on GraspNet
YU Lingxin, CHEN Yibo, QU Haojun, LI Guangwei, LI Jinping
Computer Science. 2026, 53 (4): 318-325.  doi:10.11896/jsjkx.250600124
Abstract PDF(3232KB) ( 58 )   
Mechanical devices used in industrial sorting are typically designed for specific application scenarios and products, and often exhibit poor versatility and intelligence when faced with unordered mixed-object grasping. Current point cloud matching grasping technologies based on 3D structured-light cameras have improved flexible production capabilities to a certain extent. However, they are constrained by high hardware costs, limited feature description capabilities, high computational complexity, and sensitivity to occlusions, making it difficult to meet the demands of unordered mixed-object grasping. In recent years, deep learning-based grasping technologies, represented by GraspNet, have developed rapidly, achieving pose estimation through binocular cameras. Nevertheless, these methods still suffer from suboptimal target selection strategies, limitations in pose scoring mechanisms, and significant pose localization errors. To address these challenges, this study proposes an improved three-stage grasping algorithm. In the first stage, the YOLOv10 object detection model is fused with the SAM segmentation model, combined with an optimized target selection algorithm that prioritizes unobstructed and closer targets, effectively solving the problem of poor target selection in stacked and occluded scenes. In the second stage, the GraspNet pose estimation framework is enhanced by introducing a pose filtering mechanism based on point cloud surface normals and by reconstructing the scoring mechanism to obtain high-precision grasping poses. In the third stage, a pose fine-tuning strategy is designed using a hierarchical “hover alignment-vertical grasping” control architecture to eliminate cumulative errors during execution, ultimately addressing inaccurate real-world grasping. Experimental results demonstrate that this method significantly improves grasping efficiency, operational reliability, and cross-scenario generalization in complex environments. Moreover, by replacing 3D structured-light cameras with binocular cameras, the system cost is significantly reduced, providing a cost-effective solution for industrial automation.
Improved Facial Animation Generation Algorithm Based on EchoMimic and Its Application Specifications
ZHAN Qiwei, REN Haojia, XIAO Tiantian
Computer Science. 2026, 53 (4): 326-336.  doi:10.11896/jsjkx.251200015
Abstract PDF(6185KB) ( 69 )   
In recent years, diffusion model-based approaches for speech-driven facial animation generation have achieved breakthrough progress and can efficiently produce high-resolution talking videos with long temporal sequences and precise audio-lip synchronization. However, videos generated by current methods generally suffer from noticeable blurring and artifacts in the mouth region, which seriously impairs the realism and visual credibility of the synthesized videos. To address this issue, this paper proposes LiveEchoMimic, an improved facial animation generation algorithm based on EchoMimic, and further explores its standardized application paradigm. On the technical side, it constructs an end-to-end framework for natural talking video generation, with the EchoMimic diffusion model and an implicit keypoint model serving as the dual-core architecture. Specifically, the EchoMimic diffusion model leverages a joint constraint mechanism of audio features and facial keypoints to generate coarse-grained talking videos, while the implicit keypoint model adopts a video-driven paradigm, achieving refined generation of high-quality facial animation by regulating displacement features in the implicit keypoint space. Furthermore, an audio-lip mapping model is constructed to accurately capture the intrinsic correlation between audio features and mouth motion states, and a dedicated mapping network is designed to enhance the audio-lip synchronization accuracy of the generated videos. Finally, extensive experimental evaluations are conducted on two public datasets (CelebV-HQ and MEAD) and one private dataset (Avatar). Both quantitative and qualitative results demonstrate that LiveEchoMimic significantly outperforms state-of-the-art approaches on core metrics such as visual quality and audio-lip synchronization. On the application-norms side, considering that highly realistic speech-driven facial animation technology may give rise to identity forgery and behavioral distortion, this paper puts forward operable recommendations covering challenges, application principles, and implementation measures, intended to promote the sound development of speech-driven facial animation technology under controllable and secure premises.
Artificial Intelligence
Category-Theoretic Semantic Representation: Systematic Review and Compositional Mechanism Analysis
LI Yidan, CUI Jianying, XIONG Minghui
Computer Science. 2026, 53 (4): 337-346.  doi:10.11896/jsjkx.251000136
Abstract PDF(1894KB) ( 61 )   
Semantic representation is a central challenge in natural language processing (NLP). Existing approaches fall broadly into two paradigms: symbolic and connectionist methods. Although the latter have achieved remarkable practical success, they suffer from theoretical limitations in compositional modeling and semantic interpretability, commonly referred to as the “compositionality crisis”. Among existing methods, categorical compositional distributional semantics provides a principled mathematical framework for unifying symbolic syntactic structure with distributed semantics via type-driven composition. From a categorical perspective, this paper surveys category-theoretic approaches to semantic representation along the conceptual line of “category theory-composition-quantum computation”. Unlike surveys organized by models or tasks, it focuses on semantic composition mechanisms, comparing sentence-level models from a compositional viewpoint, analyzing the limitations of distributed approaches, and outlining the theoretical shift toward compositional distributional semantics. Building on this, string diagram-based frameworks such as DisCoCat and DisCoCirc are presented, clarifying their formal properties and quantum extensions and offering a unified view of symbolic, connectionist, and quantum semantics.
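The type-driven composition in DisCoCat-style models can be illustrated as plain tensor contraction: a transitive verb lives in the space N ⊗ S ⊗ N, and applying it to subject and object vectors yields a vector in the sentence space S. The dimensions and random tensors below are toy stand-ins for learned representations:

```python
import numpy as np

N, S = 4, 3  # hypothetical noun-space and sentence-space dimensions
rng = np.random.default_rng(0)
subj = rng.random(N)            # subject noun vector in N
obj = rng.random(N)             # object noun vector in N
verb = rng.random((N, S, N))    # transitive verb as an order-3 tensor in N ⊗ S ⊗ N

# Type-driven composition: contract the verb tensor with subject and object,
# leaving a meaning vector in the sentence space S.
sentence = np.einsum('i,isj,j->s', subj, verb, obj)
```

The grammatical type of the verb (n → s ← n) dictates exactly which indices are contracted, which is the sense in which syntax drives the semantic computation.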
Agent4Stu: Efficient LLM-based Student Answer Behavior Simulation Agent
LIU Suyi, LIU Qi, GAO Weibo
Computer Science. 2026, 53 (4): 347-355.  doi:10.11896/jsjkx.250800012
Personalized learning has become a core focus of the digital transformation in education, with its success hinging on the precise understanding and modeling of student response data. However, real-world educational scenarios often face challenges such as sparse learning behaviors, dynamic evolution of learning states, and privacy compliance constraints. Additionally, discrepancies between offline data and online learning behaviors result in insufficient behavioral data and distribution shifts, significantly limiting the modeling capabilities and generalization performance of intelligent education systems. To alleviate these dilemmas, previous studies have attempted to simulate student response behaviors to expand data scale and improve model performance. However, existing methods struggle to balance generation quality, efficiency, and resource costs simultaneously. To overcome these limitations, this paper proposes Agent4Stu, a student response behavior simulation framework that integrates large language models (LLMs) with retrieval-augmented generation (RAG) techniques, enabling low-cost, efficient, and highly generalizable personalized response generation. The framework comprises a pre-built retrieval database with retrieval strategies and an LLM-based agent. The retrieval database is constructed from student response behaviors, and two retrieval strategies are designed: similar-student collaborative retrieval and relevant-fact retrieval. These are combined with each student’s short-term memory to dynamically generate highly relevant prompts. Internally, the agent integrates three core modules, profile, memory, and action, which are responsible for modeling students’ learning characteristics and cognitive abilities, integrating historical experiences with knowledge from the retrieval database, and simulating students’ responses on specific items based on their profile, memory, and knowledge mastery. Compared with existing LLM-based student agents, Agent4Stu has a smaller memory footprint and simplified action reasoning, while leveraging a behavior-oriented structured retrieval database to provide auxiliary information. This design enables low-cost, efficient, and highly generalizable personalized response generation. Quantitative and qualitative experiments on two real-world learning datasets demonstrate the effectiveness and superiority of Agent4Stu in simulating student learning response behaviors.
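As a rough illustration of how the two retrieval strategies and the student's short-term memory might feed into a dynamically generated prompt, the sketch below assembles one; the field names, layout, and sample data are hypothetical, not the paper's template:

```python
def build_prompt(profile, short_term_memory, similar_responses, facts, item):
    """Assemble an LLM prompt from the agent's profile, its short-term
    memory, and the two retrieval channels (similar-student collaborative
    retrieval and relevant-fact retrieval). All field names are illustrative."""
    lines = [f"Student profile: {profile}",
             "Recent responses (short-term memory):"]
    lines += [f"  - {m}" for m in short_term_memory]
    lines.append("Similar students' responses to related items:")
    lines += [f"  - {r}" for r in similar_responses]
    lines.append("Relevant knowledge facts:")
    lines += [f"  - {f}" for f in facts]
    lines.append(f"Item: {item}")
    lines.append("Predict this student's response (correct/incorrect) and explain briefly.")
    return "\n".join(lines)

prompt = build_prompt(
    profile="Grade 8, weak in geometry, strong in algebra",
    short_term_memory=["Q17 (algebra): correct", "Q18 (geometry): incorrect"],
    similar_responses=["Student S042 answered Q21 incorrectly"],
    facts=["Q21 requires the triangle inequality"],
    item="Q21: Can sides 2, 3, 6 form a triangle?",
)
```

The point of the design is that only a short, targeted context reaches the LLM at inference time, which is what keeps the per-response cost low.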
Cross-model Collaborative Unsupervised Representation Method for Legal Texts
XU Shenjian
Computer Science. 2026, 53 (4): 356-365.  doi:10.11896/jsjkx.251100003
Legal text representation is a fundamental component of legal artificial intelligence systems, directly affecting the performance of downstream tasks such as legal article prediction and case retrieval. However, the professional terminology, complex structure, and reasoning patterns of legal texts often lead to semantic drift in general pre-trained models. Open-source models lack sufficient legal domain knowledge, while closed-source models, despite their strong semantic understanding capabilities, provide representations that are difficult to directly access and reuse. To address these challenges, this paper proposes a cross-model collaborative legal representation framework (CMCLR), which enables collaborative learning between open-source and closed-source models to enhance legal semantic modeling. Specifically, closed-source models are employed to perform dynamic text segmentation and key paragraph identification, producing structured domain-aware signals that guide the fine-tuning of open-source models under collaborative constraints. In addition, unsupervised clustering is introduced to model structural relationships among paragraph-level embeddings, capturing latent semantic associations between legal texts. Experiments conducted on the CAIL2018 legal article classification task demonstrate that CMCLR achieves an accuracy of 90.3%, outperforming representative baseline methods by 2.4 percentage points, while maintaining robust performance across different dataset scales and settings. These results confirm the effectiveness of cross-model collaborative representation learning for deep semantic modeling of legal texts.
SM-PHT:Robust,Scalable,and Efficient Method for Multi-task Reinforcement Learning
PAN Jiahao, FENG Xiang, YU Huiqun
Computer Science. 2026, 53 (4): 366-376.  doi:10.11896/jsjkx.250700198
In recent years, reinforcement learning has achieved remarkable success in various domains. However, traditional RL methods often struggle with adaptability when facing dynamic environments or multiple tasks. To address this challenge, this paper introduces SM-PHT, a robust, scalable, and efficient method for multi-task reinforcement learning. The primary objective of this research is to enhance the adaptability and generalization capabilities of reinforcement learning agents in multi-task environments by enabling them to learn and transfer knowledge across multiple tasks. SM-PHT integrates three key mechanisms: priority-weighted knowledge distillation (PWKD), a hierarchical buffer, and task embedding. PWKD leverages a weighted distillation process to assimilate knowledge from multiple high-performing models, improving the robustness and stability of the student network. Moreover, the hierarchical buffer employs dual buffers to store low-level experiential data and high-level model parameters, optimizing offline learning efficiency. Finally, task embedding enriches task representations by capturing detailed environmental characteristics, facilitating effective knowledge transfer. Experiments conducted in the Meta-World environment demonstrate SM-PHT’s superior performance compared to state-of-the-art methods. In the MT10 challenge, SM-PHT achieves double the success rate and a 30% increase in average rewards. In the more complex MT50 challenge, it improves the success rate by approximately 10% and increases average rewards by around 10%. These results highlight SM-PHT’s ability to handle complex tasks with remarkable stability and minimal fluctuation, making it a promising approach for real-world MTRL applications.
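The priority-weighted distillation idea can be sketched as a weighted sum of per-teacher KL terms; the softmax-over-returns weighting below is an illustrative assumption, not necessarily the exact PWKD formula:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def pwkd_loss(student, teachers, teacher_returns):
    """Priority-weighted knowledge distillation, sketched: each teacher's
    output distribution is distilled into the student with a weight taken
    from a softmax over the teachers' task returns, so higher-performing
    teachers dominate. The weighting scheme is an assumption for
    illustration only."""
    exps = [math.exp(r) for r in teacher_returns]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * kl_divergence(t, student) for w, t in zip(weights, teachers))
```

When the student already matches every teacher the loss vanishes; otherwise gradient descent on this loss pulls the student toward the best-scoring teachers first.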
LLM-augmented Training Framework with Cycle-Consistency Constraints
WU Qiaorui, LUO Li, ZHAO Cairong
Computer Science. 2026, 53 (4): 377-383.  doi:10.11896/jsjkx.250600032
This paper proposes a training framework termed LACC (Large Language Model-Augmented Consistency-Constrained), designed to address key challenges in patent abstract generation, including incomplete coverage of technical features, insufficient legal compliance, and inefficiency in edge deployment. The LACC framework constructs a bidirectional reversible task structure between abstract generation and claim expansion, incorporating a cycle-consistency constraint to jointly optimize technical expression and legal formulation. On this basis, LACC integrates a controllable data augmentation strategy powered by large language models (LLMs) to automatically generate high-quality patent text pairs. A dynamic verification mechanism is further introduced to enhance the technical accuracy and regulatory reliability of generated content. Experimental results on the Chinese patent dataset CPTD demonstrate that LACC achieves a ROUGE-L score of 56.74, outperforming the baseline by 8.99 percentage points, and shows significant improvements in the recurrence consistency score (RCS). Moreover, the framework supports efficient edge deployment, with inference latency controlled within 420 ms and single-GPU memory usage limited to 4.5 GB. Overall, LACC offers a practical and scalable solution for downstream tasks such as patent drafting assistance, legal text generation, and intelligent intellectual property (IP) management, and shows strong potential in enabling the automation of the full lifecycle of IP processing.
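The cycle-consistency constraint can be pictured as a round-trip penalty between the two directions of the task: claims → abstract → claims and abstract → claims → abstract. In the sketch below, the token-overlap F1 and the identity stubs are placeholders for the LLM-backed generators and the paper's actual consistency score:

```python
def token_f1(a, b):
    """Crude token-overlap F1 standing in for the real consistency measure
    (illustrative placeholder only)."""
    ta, tb = set(a.split()), set(b.split())
    inter = len(ta & tb)
    if inter == 0:
        return 0.0
    p, r = inter / len(tb), inter / len(ta)
    return 2 * p * r / (p + r)

def cycle_loss(claims, abstract, summarize, expand):
    """Round-trip penalty in both directions: information lost going
    claims -> abstract -> claims, plus abstract -> claims -> abstract.
    In LACC the two directions would be learned generators."""
    loss_claims = 1.0 - token_f1(claims, expand(summarize(claims)))
    loss_abstract = 1.0 - token_f1(abstract, summarize(expand(abstract)))
    return loss_claims + loss_abstract

# Perfectly reversible generators incur zero cycle loss:
identity = lambda text: text
assert cycle_loss("claim 1 claim 2", "short abstract", identity, identity) == 0.0
```

Any generator pair that drops technical features on the round trip pays a positive loss, which is the lever used to jointly optimize the two directions.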
Multi-view Local Language Feature and Global Feature Fusion for Conversational Aspect-based Sentiment Quadruple Analysis
PENG Juhong, ZHANG Zhengyue, DING Zixu, FAN Xinyu, HU Changyu, ZHAO Mingjun
Computer Science. 2026, 53 (4): 384-392.  doi:10.11896/jsjkx.250900032
Conversational aspect-based sentiment quadruple analysis (DiaASQ) is an emerging research direction in the field of ABSA (Aspect-Based Sentiment Analysis), which aims to identify and extract sentiment quadruples, namely target, aspect, opinion, and sentiment polarity, from a given dialogue. Compared with traditional ABSA tasks on static texts, DiaASQ faces two major challenges: 1) dialogue texts are often lengthy, with sentiment elements such as targets, aspects, and opinions scattered across multiple utterances, making it difficult to capture long-range dependencies; 2) dialogue structures are more complex, typically involving multiple speakers and reply relationships, where information frequently spans sentences and speakers, leading to intricate interaction patterns. To address these challenges, this paper proposes MVLLF-GF, a model that integrates multi-view local language features with global contextual representations for dialogue-based sentiment quadruple extraction. Specifically, a multi-view linguistic knowledge encoder is employed to enhance token-level interactions from multiple perspectives, including syntactic dependency and semantic information, thereby learning rich local features. A global utterance encoder is then introduced to capture global features by modeling speaker identities and reply relationships at the utterance level. Furthermore, a multi-granularity fusion module is designed to deeply integrate features across different levels, enhancing the model’s contextual understanding. Finally, an end-to-end grid tagging mechanism is applied to decode sentiment quadruples. Experimental results on the public DiaASQ Chinese dataset (ZH) and English dataset (EN) demonstrate that the proposed method achieves Micro-F1 improvements of 9.13 percentage points and 6.50 percentage points, respectively, over the baseline model MVQPN, verifying its effectiveness.
Enhancing Temporal Knowledge Graph Reasoning Method with Graph Information Bottleneck and Transformer
XIN Yichen, LI Shichong, CHEN Bin, CHENG Zhangtao, LI Ye, ZHOU Fan
Computer Science. 2026, 53 (4): 393-405.  doi:10.11896/jsjkx.250400050
Temporal knowledge graphs (TKGs) dynamically record event knowledge in the form of quadruples (subject, relation, object, timestamp), effectively capturing the dynamic evolution of knowledge in the real world. As a result, they have been widely applied in various domains such as recommender systems, large language models, and knowledge-based question answering. However, their inherent incompleteness poses significant challenges for further development and application. Temporal knowledge graph reasoning aims to predict missing event knowledge in TKGs, and has thus attracted considerable attention from both academia and industry. Existing methods for TKG reasoning mainly focus on extracting structural information within graph snapshots and modeling temporal dependencies between them. Nonetheless, they still suffer from two major limitations: 1) insufficient handling of noise and redundancy present in the snapshots during the modeling process; 2) an overreliance on local temporal patterns within short time windows, while ignoring global temporal dependencies across the entire TKG. To address these issues, this paper proposes GIBformer, a novel temporal knowledge graph reasoning framework that integrates the graph information bottleneck principle with a Transformer architecture. Specifically, it first introduces the graph information bottleneck to compress structural information in TKGs, preserving key information that is highly relevant to downstream prediction tasks while effectively filtering out noise and redundancy. Then, a Transformer with multi-head attention is employed to capture global temporal dependencies across snapshots, while also incorporating local temporal dynamics to enhance the prediction of missing event knowledge. Extensive experiments conducted on four widely used benchmark datasets demonstrate the effectiveness of the proposed model.
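The quadruple representation and its grouping into per-timestamp snapshots, the structural unit that snapshot encoders (and, in GIBformer, the information-bottleneck compression) operate on before temporal modeling, can be sketched as follows; the entities and relations are made up:

```python
from collections import defaultdict

# A TKG stores facts as quadruples (subject, relation, object, timestamp);
# the entities and relations below are purely illustrative.
facts = [
    ("CountryA", "negotiates_with", "CountryB", 1),
    ("CountryA", "signs_treaty", "CountryB", 2),
    ("CountryB", "visits", "CountryC", 2),
]

def to_snapshots(quadruples):
    """Group facts into per-timestamp graph snapshots, ordered by time.
    Each snapshot is the (subject, relation, object) graph at one step."""
    snaps = defaultdict(list)
    for s, r, o, t in quadruples:
        snaps[t].append((s, r, o))
    return dict(sorted(snaps.items()))

snapshots = to_snapshots(facts)
```

Reasoning then amounts to predicting a missing element of a future quadruple, e.g. the object of ("CountryA", "signs_treaty", ?, 3), from the snapshot sequence.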
Knowledge-assisted and Reinforced Syntax-driven Network for Aspect-based Sentiment Analysis
ZHENG Cheng, BAN Qingqing
Computer Science. 2026, 53 (4): 406-414.  doi:10.11896/jsjkx.250600117
Aspect-based sentiment analysis aims to align aspects with their corresponding opinion expressions to identify the sentiment polarity of specific aspects. Existing dependency tree-based graph neural network models have achieved significant performance improvements in aspect-based sentiment analysis. However, most studies fail to fully exploit the complete information of the syntactic dependency tree, often overlooking syntactic dependency distance or dependency label information. This limitation may prevent effective alignment between opinion words and their corresponding aspect terms, particularly in sentences containing multiple aspects. To address these issues, a knowledge-assisted and reinforced syntax-driven network model is constructed. Specifically, an opinion word perception module is designed by incorporating external knowledge to enhance the model’s ability to recognize opinion expressions in sentences. Then, reinforcement learning is employed to guide the construction of the syntactic distance graph. This graph is then heuristically integrated with the dynamic syntactic label graph, which is built based on word relations and dependency labels, thereby improving the accuracy and comprehensiveness of capturing relevant opinion expressions for a given aspect. Additionally, an aspect-focused attention mechanism is employed to better handle sentences with ambiguous syntactic structures. Extensive experiments conducted on three public datasets validate the effectiveness of the proposed model.
Information Security
Network Traffic Generation Method for Malicious Traffic Identification
ZHANG Can, LI Weixun, WANG Ming, ZHAN Xiong, XIE Ziguang, HAN Dongqi, WANG Zhiliang, YANG Jiahai
Computer Science. 2026, 53 (4): 415-423.  doi:10.11896/jsjkx.250900139
Malicious traffic identification is a key task in cybersecurity, and the quality of training data directly determines the accuracy of detection models. However, obtaining real traffic data is challenging due to privacy concerns, high annotation costs, and class imbalance. To address these challenges, this paper proposes a fine-grained network traffic generation method based on a pre-training and fine-tuning paradigm. The method first introduces a static tokenization scheme that preserves protocol structure, converting raw traffic into sequence representations that maintain protocol semantics and are suitable for autoregressive model learning. On this basis, a two-stage generation framework is constructed: pre-training on large-scale benign traffic to capture general protocol and temporal patterns, followed by fine-tuning on task-specific labeled malicious traffic to generate high-fidelity samples with explicit attack semantics. To evaluate the effectiveness of the proposed method, multi-dimensional experiments are conducted. The results show that the method outperforms mainstream baselines in protocol compliance (achieving a 99.95% pass rate in expert knowledge checks), distribution similarity (with an Earth Mover’s Distance of 0.0059 between generated and real distributions), and generation diversity (with real neighborhood coverage exceeding 50%). In malicious traffic identification tasks, the generated traffic uniquely improves the detection performance of multiple classifiers compared with baseline methods. In addition, malicious functionality verification experiments confirm that the generated traffic successfully reproduces attack effects in two attack scenarios. Overall, the results demonstrate that the proposed method can generate fine-grained malicious traffic that is syntactically compliant, statistically consistent, and semantically functional, providing an effective technical approach to alleviating the data scarcity problem in cybersecurity.
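The distribution-similarity metric reported above, Earth Mover's Distance, reduces in one dimension to a simple sorted-sample computation; a sanity-check sketch of comparing real and generated traffic features (not the paper's evaluation code):

```python
def emd_1d(xs, ys):
    """Wasserstein-1 (Earth Mover's) distance between two equal-sized 1-D
    samples, e.g. packet-length distributions of real vs. generated flows:
    sort both sides and average the pointwise gaps. The equal-size
    restriction keeps the sketch short; general EMD solves a transport
    problem instead."""
    if len(xs) != len(ys):
        raise ValueError("expected equal-sized samples")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Identical distributions cost nothing; uniformly shifted ones pay the shift.
assert emd_1d([60, 1500, 400], [400, 60, 1500]) == 0.0
assert emd_1d([0, 0, 0], [1, 1, 1]) == 1.0
```

A value near zero, like the 0.0059 reported in the abstract, indicates the generated feature distribution sits almost on top of the real one.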
Lightweight Federated Continual Learning Method Based on Double Anti-forgetting Mechanism
WANG Pan, WANG Ji, ZHONG Zhengyi, BAO Weidong, ZHANG Yaohong
Computer Science. 2026, 53 (4): 424-434.  doi:10.11896/jsjkx.250500116
Federated learning (FL) enables knowledge sharing among different clients by uploading and aggregating client models without sharing data. However, existing FL methods generally assume that client data are known and fixed. In reality, clients continuously receive tasks with new category data and update their models, which leads to a continuous decline in model performance on old tasks, known as catastrophic forgetting. To address this severe challenge, researchers have introduced continual learning (CL) into FL, giving rise to the research direction of federated continual learning (FCL). Nevertheless, as the number of tasks received by clients increases, existing FCL methods become less effective in alleviating catastrophic forgetting, especially for tasks that are relatively distant in time, where accuracy drops significantly. Moreover, the increasing degree of data heterogeneity further weakens model accuracy. To address this issue, this paper proposes a local-global anti-forgetting mechanism to mitigate the forgetting problem on distant tasks. Specifically, it introduces task-specific lightweight modules at the client level to effectively overcome catastrophic forgetting caused by data changes and model updates. At the server level, it generates and filters category-balanced pseudo-images through model inversion to alleviate the decline in model performance due to data distribution differences. Through a series of experiments conducted on the CIFAR-10, CIFAR-100, and TinyImageNet datasets, the results strongly demonstrate the superiority of the proposed mechanism. Compared with existing methods, it shows significant advantages in improving model performance and alleviating catastrophic forgetting.
Cross-modal Fusion Few-sample Ransomware Classifier:Multimodal Encoding Based on Pre-trained Models
YIN Chuang, LIU Jianyi, ZHANG Ru
Computer Science. 2026, 53 (4): 435-444.  doi:10.11896/jsjkx.250500078
Ransomware, defined by its mechanism of encrypting critical data to extort payment from victims, resulted in global ransom payments exceeding $1 billion in 2023. Precise classification of ransomware is crucial for effective security defense. However, labeled ransomware samples are often scarce. To address this challenge, this paper proposes a cross-modal fusion few-shot ransomware classifier named CMFu, comprising a feature construction module, an encoding module, and a fusion module. The feature construction module generates cross-modal features. The encoding module employs two pre-trained models to construct encoders that encode features from different modalities. The fusion module integrates the encoded data to achieve the final classification. Experimental evaluation assesses model performance under training sample ratios of 10%, 30%, and 50%. CMFu outperforms all baseline models across all metrics. At a 30% sample ratio, CMFu achieves precision, recall, and F1-score of 0.91, 0.91, and 0.90, respectively, demonstrating superior performance. When the sample ratio decreases to 10%, these metrics remain high at 0.78, 0.84, and 0.80, confirming its capability in few-shot ransomware classification. Furthermore, ablation studies validate both the viability of the pre-training-based encoders and the necessity of employing backbone networks for fusion.
Deepfake Detection Method Based on Positional Enhancement and Frequency Domain Component Interaction
MENG Siyu, NIU Chunxiang, TAN Quange, WANG Rong
Computer Science. 2026, 53 (4): 445-453.  doi:10.11896/jsjkx.250700070
With the rapid development of Deepfake technology, forged facial images and videos generated by such techniques have become increasingly prevalent on social media platforms. These technologies are also being maliciously exploited, posing serious threats to social security. Although existing detection methods perform well in detecting Deepfake faces on in-domain datasets, their performance degrades significantly when applied to unseen datasets. To address this issue, a Deepfake detection method based on positional enhancement and frequency domain component interaction is proposed, aiming to improve the robustness and generalization of facial forgery detection. Firstly, a Vision Transformer is employed as the backbone network to capture forgery traces from a global perspective. Secondly, a dynamic local feature extraction module is designed, utilizing channel-wise and point-wise convolutional operations for local feature extraction. This module dynamically weights features based on pixel-level importance in the feature representation, thereby refining local features and strengthening the ability to perceive them. Concurrently, a multi-scale feature extraction and positional enhancement module is constructed, which acquires multi-scale features through multi-dilated convolutions and introduces a positional enhancement mechanism to strengthen positional correlations between pixels, effectively extracting multi-scale information from different regions. Then, a global-local frequency domain component interaction module is developed, implementing information exchange between different frequency components through a frequency domain decomposition attention mechanism. This captures dependencies between global and local features to identify artifacts that disappear in RGB space when fake facial image quality degrades. Finally, a pixel relationship similarity loss function is designed to calculate positional relationship losses between pixels and is combined with cross-entropy loss to construct a joint loss function that improves detection accuracy. Experimental results demonstrate that the proposed method achieves AUC scores of 99.29% and 78.62% on the FF++ and Celeb-DF datasets respectively, proving its effectiveness in enhancing the robustness and generalization of facial forgery detection.
Smart Medical Secure Authentication Protocol for Cloud and Fog Leakage Resistance
YANG Xin, GUO Yimin
Computer Science. 2026, 53 (4): 454-468.  doi:10.11896/jsjkx.250100087
While smart healthcare enhances the convenience of people’s lives, it also poses significant challenges for the secure transmission of massive medical data in open wireless network communication environments. These data are susceptible to various internal and external attacks during transmission. To ensure timely and effective medical data transmission, the cloud-fog architecture widely adopted in smart healthcare significantly shortens the communication distance between the cloud and terminal devices by extending cloud computing with fog computing, thereby effectively mitigating the network latency and jitter caused by excessive distance. However, most existing authentication and communication schemes based on the cloud-fog architecture adopt a centralized architecture of a single cloud, multiple fogs, and multiple devices, which is prone to the risk of single-point failure. More seriously, these schemes often assume that the cloud is completely trustworthy, whereas in reality cloud servers also face the risk of internal attacks, enabling attackers to compute session keys during the identity authentication and key agreement phase, leading to the leakage of communication data privacy and severely impacting communication security. In response to these communication security challenges, this paper proposes a secure authentication and key agreement protocol for smart healthcare that is resistant to cloud-fog compromise attacks. Leveraging blockchain technology to ensure the security of protocol data, this protocol can withstand various known attacks, including cloud-fog leakage attacks. The semantic security of the proposed protocol is demonstrated using the extended Random Oracle Model. A heuristic security analysis shows that the proposed protocol satisfies all eight security properties, and its security is further verified using the AVISPA security analysis tool. Performance analysis indicates that, compared with existing related protocols, the proposed protocol has lower communication overhead, lower computational cost, lower energy consumption, and stronger resistance to security attacks.