Computer Science

Study on Intelligent Teaching Mode Driven by AI Teachers and Digital Humans

JIANG Jie, YANG Ruoli, QI Rui, WAN Baiyan

Computer Science. 2026, 53 (6): 1-9. doi:10.11896/jsjkx.250600156

Abstract

AI digital human teachers, as an important component of educational intelligence, are gradually being applied in various teaching scenarios. This paper focuses on their general application across multiple courses. Starting from the principle of modular teaching design, it analyzes key technologies such as speech synthesis and video generation, and explores ways to improve the effectiveness of AI digital human teachers in teaching environments such as content presentation, motivation stimulation, personalized adaptation, and post-learning feedback. Taking a programming course as an example, it proposes specific teaching adaptation mechanisms and improvement methods. The study shows that AI digital human teachers have potential advantages in improving the adaptability and interactivity of teaching processes, providing feasible ideas for optimizing and exploring future intelligent teaching models.

Teaching Evaluation Sentiment Analysis Method Based on Capsule Network

KE Changbo, LI Tianhao, ZHANG Bolei, XIAO Fu, XU Kang

Computer Science. 2026, 53 (6): 10-18. doi:10.11896/jsjkx.251200107

Abstract

With the deep advancement of smart education, intelligent analysis of teaching evaluation texts has become a crucial research direction for improving educational quality. Teaching evaluation texts typically exhibit features such as multidimensional coexistence, implicit emotional expression, and imbalanced category distribution, which impose higher demands on fine-grained sentiment analysis methods. To address this, a sentiment analysis model named CrossAtt-CapsNet-RoBERTa is proposed, integrating the RoBERTa pretrained language model, a cross-attention mechanism, and a capsule network. The model first employs RoBERTa to obtain deep semantic representations of texts and aspect categories. Then it strengthens the correlation between aspect categories and context through a cross-semantic cross-attention mechanism. Finally, it introduces learnable category-guided capsules and achieves joint modeling of aspect detection and sentiment classification via dynamic routing. To validate the model's performance, a real teaching evaluation dataset encompassing nine aspect categories is independently constructed. Experimental results show that the proposed model achieves a sentiment classification accuracy of 91.3% on the public dataset Res14 and 83.68% on the public dataset MAMS-ACSA, both surpassing baseline models; on the independently constructed real teaching evaluation dataset, the aspect detection F1 score reaches 79.95% and sentiment classification accuracy reaches 92.76%, both outperforming comparative models. Ablation experiments further confirm the effectiveness of design elements such as the cross-attention mechanism and category-guided capsules. Additionally, the model demonstrates strong adaptability and generalization capabilities in few-shot scenarios. These findings provide effective technical insights for the intelligent processing of teaching evaluations.
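The dynamic routing step named in this abstract can be illustrated with a minimal sketch. It follows the standard routing-by-agreement procedure for capsule networks, not the authors' implementation; the function names, the three-iteration default, and the plain-list data layout are assumptions made for illustration only.

```python
import math

def squash(s):
    """Capsule non-linearity: keeps the direction of s, maps its norm into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, iterations=3):
    """Routing by agreement.

    u_hat[i][j] is the prediction vector that input capsule i makes
    for output capsule j. Returns the output capsule vectors and the
    coupling coefficients used in the last iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(iterations):
        # Each input capsule distributes its vote over the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output capsule.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c
```

In this sketch, output capsules on which several input capsules agree receive larger coupling coefficients over the iterations, which is the property a joint aspect-and-sentiment capsule layer would rely on.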

Academic Early Warning Prediction Model Based on Attention Mechanism and Feature Interaction

LIU Ruyi, LYU Xiaohan, MIAO Qiguang, LU Zixiang, WANG Di

Computer Science. 2026, 53 (6): 19-29. doi:10.11896/jsjkx.250600192

Abstract

Under the "Internet+Education" context, higher education platforms have accumulated vast student behavior data, which is critical for academic early warning research. However, these data exhibit significant class imbalance. Additionally, structured student behavior data lack inherent spatial correlations among features, making it challenging for traditional deep learning methods to uncover potential feature interactions. Semantic differences among features can also lead to ineffective or misleading associations if not properly addressed, further impacting early warning accuracy. To address these challenges, a novel academic early warning model based on attention mechanisms and feature interactions is proposed. The model first employs a non-linear minority oversampling algorithm to augment data and mitigate class imbalance. It then uses residual connections and learnable multivariate Gaussian kernels to encode heterogeneous features into uniform vectors, reducing differences in data types and distributions. Feature interactions are modeled using a graph-based approach with semantic matching and attention mechanisms. Self-attention explores intra-sample feature relationships, while inter-sample attention captures correlations among different samples. An improved neural additive module based on Taylor formulas is introduced to provide interpretable predictions by explicitly representing model outputs as combinations of linear and non-linear feature contributions. Tensor decomposition is used to reduce computational complexity and enhance high-dimensional data processing efficiency, improving the model's generalization ability.

Public Opinion Analysis in Universities Based on GNN Multimodal Fusion

LI Zhen, ZHANG Yang, LI Zhichao, ZHAN Peng, CHEN Lin

Computer Science. 2026, 53 (6): 30-38. doi:10.11896/jsjkx.250600158

Abstract

Currently, social media platforms have become crucial sources of information for identifying campus public opinion events. However, the analysis of such events still faces unique challenges, including sparse domain terminology, platform-specific structures, and diverse event types. To address these issues, this paper proposes a multimodal fusion framework based on graph neural networks (GNN). By integrating DOM topological structures, knowledge-enhanced text, and cross-modal dynamic interactions, the framework provides campus administrators with a highly robust and accurate public opinion analysis tool. The approach enhances the robustness of public opinion analysis by mining the topological semantics of the HTML DOM and cross-modal dynamic interaction information to obtain deeply fused feature representations. The model parses the DOM tree into a multi-level graph structure and employs a GNN to model topological relationships between nodes. It incorporates a gated cross-modal attention module to dynamically adjust the fusion intensity among DOM, text, and visual modalities. Additionally, a Wikipedia knowledge enhancement strategy is introduced to expand short-text semantics through entity associations. Experimental results demonstrate that the proposed model achieves improvements over baseline models across multiple benchmark datasets. This framework effectively addresses issues such as semantic fragmentation across modalities and insufficient information in short texts, delivering a high-precision multimodal fusion solution for campus public opinion analysis.

Research on Adaptive Disciplinary Learning Effectiveness Evaluation Model Driven by Prefrontal EEG

XIE Hui, LIANG Dan, YANG Huiting, JIA Chunli, HE Jiangshan, DONG Zexiao, REN Ziqi, JIANG Mingzhe, CHEN Xueli

Computer Science. 2026, 53 (6): 39-49. doi:10.11896/jsjkx.250600153

Abstract

China currently hosts diverse online learning platforms serving over 300 million users, making digital education an integral part of modern life. However, challenges persist in ensuring learning effectiveness and evaluating proficiency levels through conventional grade-based assessment methods, which often fail to capture neural response differences across disciplines and lack dynamic evaluation metrics integrating multidimensional behavioral data. This study designs a multidisciplinary simulated online learning experiment based on the Biglan discipline classification model to address these limitations. EEG signals are recorded from participants during learning sessions, with comparative analysis performed on neural patterns across different courses and disciplines. A composite learning effectiveness metric integrating response time and answer accuracy is developed to label the EEG feature dataset. Classification models are trained to predict learning outcomes at three granularity levels: 16 instructional videos, 8 courses, and 4 major disciplines. Key findings reveal distinct prefrontal cortex activation patterns between humanities and STEM subjects (natural/applied sciences). The discipline-level classification achieves 90% accuracy in predicting learning effectiveness. These results demonstrate the feasibility of portable EEG devices for educational assessment and provide methodological insights for developing personalized learning profiles in intelligent evaluation systems. The experimental protocol successfully captures neurocognitive differences across academic domains while maintaining practical applicability in real-world educational settings.

Multi-task Classroom Title Generation Method Integrating Core Sentences and Keyword Guidance

SHANG Yi, YING Di, ZHAO Hui

Computer Science. 2026, 53 (6): 50-58. doi:10.11896/jsjkx.250600151

Abstract

Title generation, as a fundamental component of text generation tasks, frequently encounters challenges such as inadequate information coverage and semantic deviation. To address this issue, this paper proposes a multi-task title generation model guided by core sentences. This model emphasizes the critical role of core sentences in capturing the main idea of the source text and improving title generation quality. The model utilizes the original text, core sentences, and keywords as inputs, employing annotated core sentences during the training phase and acquiring them automatically through a core sentence classification task during the testing phase. By integrating the training of core sentence classification and title generation, the model is capable of identifying key content while generating titles that more accurately align with the semantic meaning of the source text. Furthermore, to enhance the quality of generation, a similarity loss between keywords and titles is introduced to reinforce thematic consistency. During the decoding phase, the model explicitly distinguishes between two cognitive processes, content understanding and conceptual focus, in an educational context. It employs a dual cross-attention mechanism to generate concise, fluent, and highly summarized titles. Experimental results demonstrate that, within the multi-task framework, the outcomes of the core sentence classification task assist the title generation task. The presence of shared information enables collaborative optimization between tasks, leading to a substantial enhancement in title generation quality and providing novel insights for the automated construction of educational resources.

Automatic Knowledge Point Annotation for Student Code Based on Multi-agent Collaboration: A Case Study of C Language

LIU Jiaqi, GAO Zhizezhang, MENG Xianjia, SUN Xia, FENG Jun

Computer Science. 2026, 53 (6): 59-68. doi:10.11896/jsjkx.250600150

Abstract

In intelligent education systems, knowledge point annotation is a key module for organizing teaching resources, enabling personalized recommendations, and modeling students' cognitive states. However, traditional exercise-oriented approaches to knowledge point annotation have limitations in reflecting the individual differences exhibited by students in programming learning processes. To address this issue, this paper proposes an automatic student code knowledge point annotation method based on multi-agent collaboration. This method shifts the focus from exercise-driven to code-driven annotation, constructing a three-level knowledge framework encompassing a statement layer, a code-block layer, and a function layer. It also introduces a collaborative system composed of three agents (knowledge annotation, task analysis, and integrated feedback) with internal self-inspection and iterative optimization capabilities. Experimental evaluation is conducted on 363 student code submissions from an introductory programming course. The system demonstrates strong interpretability and group analysis capability in real-world educational scenarios, effectively revealing students' knowledge mastery status and common cognitive deficiencies. Moreover, the study employs an LLM-based peer-review mechanism for performance assessment. Results indicate that the multi-agent collaborative approach outperforms methods based directly on LLMs across five evaluation dimensions (completeness, accuracy, reasonableness, error identification ability, and educational guidance), with significantly more selections as the preferred solution. This research achieves automated and interpretable knowledge point annotation for student code, providing technical support and practical foundations for fine-grained student modeling, personalized assessment, and other downstream tasks.

From Recognition to Generation: Natural Language Expression of Student Attention in Online Learning Contexts

XIE Congcong, AN Yuxuan, WANG Di, LUO Xuemei, WANG Yifeng

Computer Science. 2026, 53 (6): 69-76. doi:10.11896/jsjkx.250600189

Abstract

The continuous advancement of artificial intelligence technologies has accelerated the intelligent transformation of education, with student behavior analysis emerging as a key research area supporting precision teaching and personalized learning. However, existing approaches often rely on specialized models for behavior feature extraction and classification, with outputs typically presented as abstract labels, lacking interpretability and intuitiveness. To enable natural language descriptions of students' attentiveness in online learning scenarios, this paper constructs a vision-language alignment dataset for online education contexts, consisting of student learning images paired with corresponding attention-related behavior descriptions. The dataset includes both single-frame images and multi-frame image sequences. Building upon this dataset, the paper proposes a multimodal fine-tuning method tailored for the task of describing students' attentive behaviors. Experiments are conducted on the Qwen2.5-VL-3B and Qwen2.5-VL-7B vision-language models. The proposed method incorporates prompt design based on head pose, gaze direction, and facial expressions to guide the model in learning attention-related features. Furthermore, this paper introduces an attention-perception loss to enhance the model's understanding of student behavior. Experimental results demonstrate that the fine-tuned models achieve superior accuracy in describing student attentiveness compared to existing vision-language models.

Generation of Programming Learning Situation Feedback Reports Based on Code Analysis

CUI Can, GAO Zhizezhang, CUI Lei, FENG Jun, SUN Xia

Computer Science. 2026, 53 (6): 77-83. doi:10.11896/jsjkx.250600160

Abstract

To address the limitations of traditional programming feedback, which relies heavily on outcome-based metrics and lacks fine-grained guidance, and to tackle the challenges of generalized application and insufficient guidance of large language models (LLMs) in educational contexts, this paper constructs an automated system for generating student learning reports based on code data. The system integrates static code quality analysis with multiple code submission records, and specifically employs a multi-role agent collaborative model, optimized Chain-of-Thought (CoT) prompting strategies, and a hierarchical generation mechanism to provide students with precise and comprehensive learning feedback. An empirical study using real student programming data demonstrates that the system can effectively pinpoint specific issues in students' code, clearly revealing their problem-solving thought processes and knowledge gaps. Student evaluation feedback confirms that the generated learning reports perform well in terms of accuracy and practicality, showing significant application value and development potential in programming teaching practice.

Knowledge Tracing Model Based on Relational Learning Memory Network

XU Zhihong, YANG Xinlei, WANG Liqin, DONG Yongfeng, WANG Xu

Computer Science. 2026, 53 (6): 84-92. doi:10.11896/jsjkx.250600155

Abstract

Knowledge tracing, which models students' past response information to predict their mastery of various knowledge concepts and their future learning performance, has become a core component in building intelligent educational systems. With the development of deep learning, research methods for knowledge tracing have become increasingly diverse. However, the relationship between questions and knowledge points is complex and implicit, making it difficult for models to accurately uncover their underlying relational features in the absence of expert annotations. To address the limitations of existing methods in modeling the relationship between questions and knowledge points and their poor interpretability, this paper proposes a memory-augmented knowledge tracing model based on relational learning. First, the model employs a self-supervised relational learning module composed of multi-layer Transformers to effectively model the relational features between questions and knowledge points. It also incorporates dynamic multi-head attention to extract key information from question sequences, enhancing the model's ability to handle long-term dependencies in sequences. Then, a dual-matrix knowledge memory storage module is used to dynamically model students' mastery state of each knowledge point and predict their learning performance. Finally, a PGD-based adversarial training method is applied to generate adversarial samples for joint training, improving the model's generalization ability. Comparative experiments with seven representative models on three knowledge tracing datasets demonstrate that the proposed model (MKTRL) achieves improvements in both AUC and ACC metrics. Multi-dimensional experiments further validate the predictive effectiveness of the proposed model.

Personalized Course Recommendation System Based on Knowledge Graph

ZHAO Lei, YANG Yulu, YUAN Bo

Computer Science. 2026, 53 (6): 93-101. doi:10.11896/jsjkx.250600154

Abstract

As online learning platforms and course content multiply, users struggle to choose from a sea of information. Existing recommendation models fail to fully exploit user-course interaction information and therefore deviate from users' real needs, harming the learning experience and resource-matching efficiency. To address this, IKGCN (Interactive Knowledge Graph Convolutional Network), a graph neural network recommendation model based on enhanced information representation, is proposed. It builds a course knowledge graph and a user-course interaction graph. Using a gating mechanism, it identifies and integrates complementary information from these two graph structures, deeply fusing the two information dimensions. This enables effective capture of user behavior dynamics, improving the accuracy of course semantic representations and the quality of recommendations. Experiments show that IKGCN outperforms traditional baselines: recall and NDCG rise by 4.84% and 9.22% respectively, validating its effectiveness in optimizing online education recommendations.
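A gating mechanism of the kind this abstract describes can be sketched in a few lines. The abstract does not give IKGCN's exact formulation, so the form below (an element-wise sigmoid gate with diagonal weights) is an assumption, as are the names `gated_fusion`, `w_kg`, `w_inter`, and `bias`.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(kg_vec, inter_vec, w_kg, w_inter, bias):
    """Fuse two same-length embeddings with an element-wise gate.

    gate = sigmoid(w_kg * kg + w_inter * inter + bias)
    out  = gate * kg + (1 - gate) * inter

    kg_vec is taken to be the knowledge-graph embedding of a course and
    inter_vec its embedding from the user-course interaction graph.
    Diagonal (per-dimension) weights are used here purely for brevity.
    """
    fused = []
    for k, u, wk, wu, b in zip(kg_vec, inter_vec, w_kg, w_inter, bias):
        g = sigmoid(wk * k + wu * u + b)
        fused.append(g * k + (1.0 - g) * u)
    return fused
```

With all weights and biases at zero the gate is 0.5 in every dimension and the result is the plain average of the two embeddings; learned weights let the gate favor whichever source is more informative per dimension.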

Review on Parallel Training and Inference of Diffusion Models

ZHU Huming, LIU Huijie, DONG Ximiao, CHEN Zhipeng, GAO Tianqi, JIAO Licheng

Computer Science. 2026, 53 (6): 102-116. doi:10.11896/jsjkx.251000119

Abstract

Diffusion models (DMs) have demonstrated outstanding generation quality and controllability in image and video generation tasks. However, they still face significant system-level performance bottlenecks in large-scale training and inference scenarios. Starting from the fundamental principles and modeling paradigms of diffusion models, and considering the computational characteristics of denoising networks, this paper systematically analyzes the challenges encountered during both training and inference, and summarizes the parallel optimization strategies and distributed training frameworks adopted by mainstream open-source diffusion models. The analysis shows that the training stage is mainly constrained by high memory consumption, long training time, and insufficient computational efficiency, while the inference stage suffers from redundant computation along the time-step dimension and high inference latency. To address these bottlenecks, this paper reviews and compares various optimization methods, including data parallelism, tensor parallelism, sequence parallelism, pipeline parallelism, and time-step parallelism, and systematically analyzes their applicability and potential efficiency gains in terms of memory optimization, communication cost reduction, and computation-communication overlap. Based on open-source technical reports and experimental results, the study demonstrates that parallel optimization can significantly reduce memory overhead and improve inference speed. Furthermore, the parallel support characteristics of mainstream diffusion model inference frameworks are investigated, revealing potential future directions in multi-node inference, dynamic scheduling, and mixture-of-experts parallelism. This study provides a systematic reference for efficient training and inference of diffusion models and is of significant importance for performance optimization and distributed deployment of large-scale generative models.

GPU-based Implementation and Optimization of Banded Matrix LU Factorization

SUN Xiaoxue, JIA Haipeng, ZHANG Yunquan, YU Yue, QIN Pinle

Computer Science. 2026, 53 (6): 117-127. doi:10.11896/jsjkx.251000118

Abstract

Band matrices are a class of sparse matrices characterized by non-zero elements concentrated around the main diagonal, and they are widely used in scientific computing and engineering simulations. LU factorization of such matrices plays a vital role in solving large-scale linear systems, particularly in simulation workloads, physical modeling, and engineering optimization problems. However, traditional LU factorization algorithms often suffer from limited parallelism and inefficient memory access patterns, which restrict their performance on modern hardware platforms. To address these challenges, this paper proposes a high-efficiency implementation and optimization strategy for band matrix LU decomposition on domestic domain-specific computing units (DCUs). The proposed method introduces two key techniques: one is a right-looking structure combined with a block partitioning strategy to improve computational locality and parallel execution efficiency; the other is a high-performance panel factorization algorithm that reduces kernel launch frequency and global memory access overhead. Experimental results across different matrix sizes and bandwidths demonstrate that the proposed GPU-based implementation consistently outperforms CPU-based LAPACK and Intel MKL solutions in terms of runtime and resource utilization, while preserving numerical stability. Empirical results show that this approach exhibits strong scalability and practical engineering value, providing theoretical foundations and practical guidance for the efficient implementation of banded-matrix methods on heterogeneous parallel computing platforms.
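To make the right-looking structure concrete, here is a minimal, unblocked sketch of banded LU factorization in plain Python. It is not the paper's DCU kernel: it omits pivoting (so it assumes a matrix, such as a diagonally dominant one, for which pivoting is unnecessary), uses dense storage for readability, and the function names `banded_lu` and `banded_solve` are illustrative.

```python
def banded_lu(a, kl, ku):
    """In-place right-looking LU without pivoting for a banded matrix.

    a  : n x n matrix as a list of lists of floats
    kl : number of sub-diagonals, ku : number of super-diagonals
    Afterwards the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle holds U. Only entries inside the band are
    touched, since without pivoting no fill-in occurs outside it.
    """
    n = len(a)
    for k in range(n - 1):
        row_end = min(n, k + kl + 1)   # rows below the pivot inside the band
        col_end = min(n, k + ku + 1)   # columns right of the pivot inside the band
        for i in range(k + 1, row_end):
            a[i][k] /= a[k][k]                       # multiplier l_ik
            for j in range(k + 1, col_end):
                a[i][j] -= a[i][k] * a[k][j]         # trailing (right-looking) update
    return a

def banded_solve(lu, kl, ku, b):
    """Solve A x = b using the factors produced by banded_lu."""
    n = len(lu)
    x = list(b)
    for i in range(n):                               # forward substitution, L y = b
        for j in range(max(0, i - kl), i):
            x[i] -= lu[i][j] * x[j]
    for i in range(n - 1, -1, -1):                   # back substitution, U x = y
        for j in range(i + 1, min(n, i + ku + 1)):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x
```

The inner double loop is the trailing-matrix update that a blocked implementation would batch into matrix-matrix operations; restricting it to the band reduces the work per pivot from O(n^2) to O(kl * ku).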

Study on Compilation Technology of Neural Network Accelerator Based on RISC-V Instruction Extension

WANG Yipin, CAI Chenghuan, XU Jiabin, ZHOU Xuegong, ZHANG Fengzhe, CAO Wei, ZHANG Fan, YU Xinsheng

Computer Science. 2026, 53 (6): 128-136. doi:10.11896/jsjkx.250600137

Abstract

With the rapid advancement of artificial intelligence, neural network accelerators based on RISC-V instruction set extensions have become a research hotspot. Deep learning compilers are critical for efficiently deploying neural network models on hardware platforms. However, modifying the tuning rules of existing compilers based on hardware characteristics places extremely high demands on developers. At the same time, existing compilers lack compilation support for specialized transposition hardware units and cannot achieve efficient transposition through data flow reconstruction, so the performance potential of such hardware is not fully utilized. To address these issues, this paper first designs an MLIR-based compiler toolchain targeting RISC-V neural network accelerators, enabling end-to-end model deployment. Second, it introduces hardware-specific dialects within the MLIR framework. This allows the compiler to optimize tiled matrix multiplication by integrating accelerator buffer size, systolic array dimensions, double buffering strategies, and dataflow patterns. Furthermore, it implements efficient matrix transposition by adjusting data flow patterns according to the hardware architecture, eliminating the need for large-scale data exchange. The paper also implements convolution operator fusion and matrix multiplication-vector computation fusion based on systolic array features. Experimental results demonstrate an average speedup of 14.55× after transposition optimization, an average 41.59% speedup after operator fusion, and a comprehensive average speedup of 5.13% for BERT model execution.
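Tiled matrix multiplication, which the compiler described above tunes to buffer size and systolic array dimensions, can be illustrated with a small sketch. This is plain Python rather than MLIR or accelerator code; the `tile` parameter stands in for whatever tile shape the hardware would dictate and is an assumption, as is the function name.

```python
def tiled_matmul(a, b, tile=2):
    """Compute C = A x B by iterating over tile x tile blocks.

    a is n x k, b is k x m (lists of lists). Each (i0, j0, k0) step
    touches only one block of A, B, and C, which is the access pattern
    that lets a real accelerator keep the working set in on-chip buffers.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Multiply one block of A by one block of B and
                # accumulate into the matching block of C.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c
```

The `min(...)` bounds handle matrices whose dimensions are not multiples of the tile size; a hardware-aware compiler would instead pad or emit a separate remainder path.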

Kokkos-based Direct Solver and Its Implementation on Heterogeneous Platform

LI Zhenjia, WANG Wu

Computer Science. 2026, 53 (6): 137-144. doi:10.11896/jsjkx.251000114

Abstract

A high-performance parallel LU direct solver based on the Kokkos framework is developed for solving large complex dense linear systems arising from the method of moments (MoM) in electromagnetic simulations on accelerator-based heterogeneous systems. The impedance matrix and excitation vector are efficiently filled in parallel on the GPU using Kokkos::parallel_for. Based on the computed solution, the radar cross section (RCS) is obtained on the host by traversing observation angles and synthesizing the scattered field. The overall workflow is efficient and demonstrates good scalability. Performance evaluation is conducted on the "ORISE" supercomputing platform equipped with deep computing unit (DCU) accelerators. The impact of two-dimensional processor grid strategies on performance and communication overhead is analyzed. Under 16 processes, increasing the number of processors per row from 1 to 4 results in a 40.7% performance improvement and a reduction in communication overhead from 64.71% to 55.07%. Proper processor grid configuration effectively balances computation and communication, significantly reducing communication overhead and enhancing overall parallel efficiency. With 64 DCUs, the solver achieves a peak performance of 16 655.73 GFLOP/s, and when scaled to 2 048 DCUs, the performance increases to 58 338.90 GFLOP/s, showing good scalability. In the weak scalability test, the solver attains a parallel efficiency of 24.45% when scaling from 4 to 2 048 DCUs. These results indicate that the Kokkos-based direct solver delivers strong performance and is well-suited for large-scale electromagnetic simulation on heterogeneous high-performance computing platforms.

Research on Fortran Compiler Implementation Technology on CPU-DSP Heterogeneous Processor

ZHU Pengzhi, HUANG Chun, SHEN Jie, CHEN Cheng, XU Haoran, LONG Biao

Computer Science. 2026, 53 (6): 145-152. doi:10.11896/jsjkx.251000117

Abstract

The Fortran language is widely applied in scientific and engineering computing domains. However, general-purpose digital signal processors (GPDSPs) currently rely primarily on C or assembly languages for programming and lack Fortran support. Addressing this gap, this paper investigates the implementation technology of Fortran compilers targeting CPU-DSP heterogeneous processors. Based on the LLVM Flang compiler framework, it designs and implements a Fortran compiler prototype named mtFortran, completing the Flang frontend migration while focusing on resolving compilation and runtime support challenges for Fortran programs on heterogeneous architectures. Key issues addressed include program loading/execution, syntax compatibility, built-in function implementation, and input/output (I/O) system adaptation. Experimental results demonstrate that this Fortran compiler successfully supports the syntactic features of the GUIDE F90 test suite within the commercial U_F90_TS_LITE benchmark collection. Of the 176 test programs evaluated, all 102 programs with supported runtime libraries pass heterogeneous compilation and execution validation. The implementation rate for built-in functions across major categories reaches 79.38%, enabling support for typical high-performance computing applications (e.g., the NPB-EP benchmark program). This work establishes foundational runtime capabilities for Fortran programs on CPU-DSP heterogeneous processor architectures, laying the groundwork for subsequent enhancements in standard compliance, performance optimization, and parallelization extensions.

Machine Learning-based Parallel Parameter Optimization in High-performance Computing Applications

LI Jinyou, ZHANG Wenshuai, SHEN Yu, ZHANG Yundong, LI Huimin, LI Jing

Computer Science. 2026, 53 (6): 153-162. doi:10.11896/jsjkx.251000113

Abstract

In the field of high-performance computing (HPC), supercomputing platforms leverage large-scale computational resource pools to enable parallel processing of multiple tasks, significantly accelerating job execution. Their resource scheduling mechanisms require users to explicitly specify requested resources and their quantities when submitting jobs. However, current practice still heavily relies on manual experience-based parallel parameter configuration, often resulting in hardware resource utilization failing to approach theoretical optimal values, creating significant resource efficiency bottlenecks. For structurally complex HPC software, individual input parameters are tightly coupled with actual hardware resources; adjusting isolated parameters alone makes it difficult to construct a concise yet effective framework for thorough exploration, while exhaustive traversal of the entire parameter search space incurs prohibitively expensive configuration costs. Machine learning (ML)-based runtime optimization methods for parallel computing applications employ multiple efficient models to analyze and process extensive execution data, providing superior parameter configuration recommendations. These methods autonomously determine hardware parameter configurations for job requests without requiring users to specify resource amounts, thereby lowering the barrier to cluster software usage and improving work efficiency. This effectively addresses the problems of suboptimal parallel parameter configuration efficiency for supercomputing cluster users and the lack of high-quality job execution datasets for specific tasks in existing technologies. Experimental results demonstrate that compared to default or commonly used parameter configurations, ML-derived parameter configurations achieve consistent speedup and significantly reduced core-hour consumption across multiple typical VASP instances, exhibiting substantial practical value for widespread adoption.

Optimizing SPMM on ARM Architectures with JIT Instruction Generation

SHI Jun, WANG Qinglin, TIAN Feiyang, WANG Zhicheng, LI Runhua, LIU Jie

Computer Science. 2026, 53 (6): 163-170. doi:10.11896/jsjkx.251000116

Abstract

In recent years, with the rapid adoption of ARM processors in edge computing devices and cloud servers, and with sparse matrix multiplication (SPMM) playing an increasingly critical role in compute-intensive applications such as deep learning, sparse computing optimization for ARM platforms has become an active research topic. However, current SPMM solutions for ARM platforms primarily use the ahead-of-time (AOT) compilation model, in which all compilation is completed before program execution. AOT solutions for SPMM face three major limitations: unnecessary memory accesses, additional branch overhead, and redundant instructions. This paper proposes ASPJIT, a just-in-time (JIT) assembly code generation framework designed to accelerate SPMM on ARM multi-core CPUs. ASPJIT dynamically optimizes the computation sequence using a sparse judgment-based column-to-row algorithm, leveraging runtime analysis of sparsity features to significantly improve instruction-level parallelism (ILP). In addition, ASPJIT reduces memory access latency with a register allocation strategy that caches frequently accessed data, and it maximizes arithmetic throughput by carefully selecting SIMD instruction sets. ASPJIT is evaluated against two AOT baselines: existing SPMM implementations compiled with automatic vectorization by the ARM gcc compiler, and the optimized SPMM routines provided by ARM Eigen. The results show that ASPJIT delivers average speedups of 3.8x and 5.6x over these two baselines, respectively.
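
For readers unfamiliar with the kernel being optimized, the following is a minimal reference sketch of SPMM, assuming the sparse operand is stored in CSR format and multiplied by a dense matrix. It is not ASPJIT's generated code: the function name `csr_spmm` and the CSR layout are illustrative assumptions, and the generic loops below contain exactly the kind of index loads and loop bookkeeping that a specialized kernel would try to avoid.

```python
def csr_spmm(indptr, indices, data, dense, n_cols):
    """Multiply a CSR sparse matrix A (m x k) by a dense matrix B (k x n_cols).

    indptr, indices, data: standard CSR arrays describing A (assumed layout).
    dense: B as a list of k rows, each a list of n_cols floats.
    """
    m = len(indptr) - 1
    out = [[0.0] * n_cols for _ in range(m)]
    for row in range(m):
        out_row = out[row]
        # Visit only the stored non-zeros of this row of A.
        for pos in range(indptr[row], indptr[row + 1]):
            a = data[pos]
            b_row = dense[indices[pos]]
            for j in range(n_cols):
                out_row[j] += a * b_row[j]
    return out


# A = [[1, 0, 2],
#      [0, 0, 3]] in CSR form (hypothetical example data)
indptr = [0, 2, 3]
indices = [0, 2, 2]
data = [1.0, 2.0, 3.0]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
C = csr_spmm(indptr, indices, data, B, 2)
```

Because the sparsity pattern is only known at run time, an AOT-compiled version of this loop nest must keep the indirect loads through `indices` and the per-row bounds generic, which is the general motivation for runtime specialization.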

Workload Analysis and Modeling Method for High-performance Computing

WU Can, XIAO Haili, WANG Xiaoning, ZHAO Yining, LU Shasha, HE Rong

Computer Science. 2026, 53 (6): 171-184. doi:10.11896/jsjkx.250800064

Abstract

In high-performance computing (HPC), machine learning-driven scheduling algorithms are increasingly a research focus, and their performance depends heavily on the quality of workload data. Research on workload analysis and modeling methods for HPC environments is therefore important for improving the efficiency and adaptability of scheduling algorithms. This study conducts an in-depth analysis of HPC workloads based on publicly available operational logs from supercomputing centers and establishes a comprehensive methodology for workload analysis and modeling. The DBSCAN algorithm is first used to clean anomalous data, followed by a systematic analysis of job arrival patterns, processor core distribution, application distribution, runtime distribution, and correlations among different parameters. Based on these findings, a flexible HPC workload generator is developed that supports both default parameter configurations and automated analysis of SWF-format workload characteristics to meet diverse simulation needs. Experimental validation shows that the synthetic data generated by this tool matches real-world workload distributions more closely than randomly generated data. The proposed generator can provide high-quality training data for the development of novel scheduling algorithms, contributing to improved resource utilization in supercomputing centers.

Research on Multi-level Optimization of SP Applications for Domestic Phytium Multi-core NUMA Architecture Servers

REN Rongyao, MA Baiwei, DENG Guanghua, DU Qi, WANG Yueli, LI Shiyan

Computer Science. 2026, 53 (6): 185-192. doi:10.11896/jsjkx.251200067

Abstract

This paper addresses the application bottlenecks of domestic Phytium multi-core NUMA architecture server platforms in high-performance computing scenarios, conducting multi-level optimization research around the SP benchmark application. It proposes and implements optimization strategies at four levels: compilation, memory allocation, NUMA topology, and vectorized reduction. Experiments analyze execution time, parallel efficiency, and speedup for different dataset sizes running with 1 MPI process and 8 MPI processes over varying numbers of parallel cores. The analysis shows that the optimized computation time is significantly reduced. With 8 MPI processes and 128 parallel cores, performance on medium to large datasets improves by 3 to 5 times, and performance degradation under high concurrency is alleviated. After optimization, parallel efficiency scales more linearly: it improves several-fold for small datasets with 8 MPI processes, while medium to large datasets maintain approximately 90%, 85%, and 64% to 71% efficiency at 32, 64, and 128 parallel cores respectively, delaying the onset of performance saturation. In terms of speedup, 8 MPI processes show better performance gains than 1 MPI process under high concurrency. Experimental results demonstrate that the proposed multi-level optimization strategies effectively enhance the computational performance of SP applications on the target server architecture. In particular, as the number of cores increases and NUMA effects become more pronounced, the optimization scheme exhibits strong scalability advantages, providing an optimization path for numerical simulation and scientific computing on domestic high-performance computing platforms.
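
The two scaling metrics reported above can be written down compactly. The sketch below uses the textbook definitions (speedup as baseline time over parallel time, efficiency as speedup normalized by the growth in core count); the abstract does not state which baseline the authors use, so the function names, the baseline convention, and the timings are all illustrative assumptions.

```python
def speedup(t_base, t_parallel):
    """Classic speedup: baseline run time divided by parallel run time."""
    return t_base / t_parallel


def parallel_efficiency(t_base, t_parallel, base_cores, cores):
    """Speedup divided by the factor by which the core count grew."""
    return speedup(t_base, t_parallel) * base_cores / cores


# Hypothetical timings (not from the paper): 100 s on 1 core, 2 s on 64 cores.
s = speedup(100.0, 2.0)
e = parallel_efficiency(100.0, 2.0, base_cores=1, cores=64)
```

An efficiency near 1.0 means added cores are almost fully used; values that fall as the core count rises indicate the saturation effect the abstract describes.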

High-performance Image Preprocessing Operators for Cambricon MLU Accelerator Card

LI Fei, LIU Song, GUO Songjian, LIU Jiazheng, ZHANG Ying, HONG Longwei, ZHANG Boxuan

Computer Science. 2026, 53 (6): 193-202. doi:10.11896/jsjkx.251000093

Abstract

Image preprocessing is a critical component of machine learning tasks, and its computational efficiency directly affects the performance of model training and inference. Traditional CPU computation struggles to meet the real-time processing demands of high-resolution images and large-scale datasets, whereas high-performance Neural Processing Units (NPUs) are a promising platform for accelerating image preprocessing. However, the diversity and memory-access intensity of image preprocessing operations do not fully align with the matrix-operation optimization model of NPUs, which makes adaptation challenging. Based on a heterogeneous computing platform combining a Phytium CPU and a Cambricon MLU, this paper proposes an efficient acceleration method for image preprocessing operators. After a thorough analysis of the multi-core parallel architecture and storage system of the MLU, the computational logic and task partitioning of the operators are redesigned. Combined with optimization strategies such as multi-core parallel scheduling, vectorized computation, a three-level storage structure, and a double buffering mechanism, this significantly improves the computational performance of the operators and demonstrates the potential of NPUs for accelerating image preprocessing. Experimental results show that the ten optimized common image preprocessing operators achieve performance improvements of 33.22% to 234.46% over the native operators in PaddlePaddle. In deep learning and traditional machine learning tasks, end-to-end performance improves by 12.47% and 48.63%, respectively.

MMCache: High-performance Cluster Cache with Memory-mapped Mirroring

LIU Zhongyi, XIAO Wei, ZHANG Lei, YAN Songbai, HUANG Xiangping, LI Mengxiao

Computer Science. 2026, 53 (6): 203-213. doi:10.11896/jsjkx.251000115

Abstract

Distributed systems are increasingly widely applied in high-performance scenarios. However, their reliance on traditional caching frameworks leads to bottlenecks such as high overhead when transmitting complex objects, high cross-node synchronization latency, and inefficient data access and storage. To address these issues, this paper proposes MMCache, a high-performance cluster caching framework with three core innovations: a memory-mapped mirroring mechanism for data transfer and access, which uses virtual address fixation to achieve cross-process zero-copy access and efficient structure-level reads; a cooling pool-driven lock-free update mechanism for cache units, which ensures atomic visibility of runtime state and rapid reclamation without blocking business access; and a generic memory allocation management framework for the address space, which integrates TCMalloc with generic adapters to enable efficient allocation, reclamation, and fragmentation suppression for mirrored memory. Experimental results show that under multi-scenario workloads involving 3 million keys, MMCache keeps the cooling pool occupancy ratio under control, achieves a transaction success rate above 99%, and sustains stable response times. Compared with Redis and Memcached, MMCache shows significant advantages in high-frequency read/write operations, complex object storage, and cluster synchronization performance. The technology has been deployed in a back-end system for aviation fare transactions, where it delivers high performance, low latency, and strong consistency, and it has broad potential in high-performance data processing scenarios such as e-commerce and big data analytics.

Review of 3D Object Detection Based on LiDAR-camera Fusion

JI Wenyu, LI Yang, WANG Jiabao, FU Ruizhi, LIU Xiaoyu, MIAO Zhuang

Computer Science. 2026, 53 (6): 214-231. doi:10.11896/jsjkx.250400111

Abstract

Multi-modal 3D object detection, a fundamental task in autonomous driving and human-robot collaboration, has attracted significant attention in recent years. By integrating LiDAR and camera data, 3D object detection enables effective information transfer and feature consolidation, improving the understanding of complex environments and detection accuracy. However, as the number of fusion methods grows, traditional methods designed for dense scenarios encounter problems such as high computational cost and limited detection range, leaving them unable to meet real-world requirements for long-range detection. Consequently, emerging methods increasingly focus on novel fusion architectures that address detection challenges in sparse scenarios. This paper is the first Chinese-language review to classify multi-modal 3D object detection methods from the perspective of dense versus sparse scenarios; it provides a comprehensive analysis of the existing literature and summarizes the evolutionary trends in this field. The primary contributions are as follows: 1) A novel classification framework is proposed that distinguishes between dense-level and sparse-level fusion according to whether dense Bird's-Eye-View feature maps are incorporated, covering two major categories and five subcategories and clarifying the characteristics and application scenarios of the different fusion strategies; 2) A detailed review of 15 publicly available datasets is conducted, with a focus on the evaluation metrics of 3 mainstream datasets, together with a comparative analysis of experimental results from over 10 distinct methods on these benchmarks; 3) The limitations of representative fusion methods are critically examined, and future research directions are outlined.

MambaCS: Mamba-based Image Compressed Sensing Algorithm

LI Xiuying, CHEN Xuesong, LI Haoze, LIAO Hongwei, HAN Jiameng, DUAN Xiaoyi

Computer Science. 2026, 53 (6): 232-241. doi:10.11896/jsjkx.250400147

Abstract

Compressed sensing has been widely applied in image acquisition and reconstruction, with the core objective of reconstructing images accurately and with high quality from a significantly reduced number of measurements. Compared with conventional signal processing approaches, deep learning-based image compressed sensing achieves better reconstruction quality at lower computational cost. Nevertheless, improving reconstruction accuracy at extremely low sampling rates remains a critical challenge. To address this issue, this paper proposes a novel deep unfolding framework named MambaCS, which significantly improves the performance of image compressed sensing through several architectural components. In the sampling phase, a block-based measurement strategy is adopted to balance computational complexity and sampling efficiency. In the reconstruction phase, a residual state space module is introduced, leveraging the strength of Mamba in capturing long-range dependencies to better model complex spatial structures within images. Furthermore, to make full use of the latent information contained in the measurement data, a multi-channel reuse block is incorporated into the reconstruction process. This module integrates multi-scale feature fusion with measurement reuse, enhancing the network's ability to extract and represent key image features. Extensive experiments demonstrate that MambaCS consistently outperforms existing state-of-the-art methods in reconstruction accuracy and visual quality.

Primitive Dynamic Weighting for Multi-modal Salient Object Detection

LI Peng, ZHANG Zihao, HAN Yahong

Computer Science. 2026, 53 (6): 242-251. doi:10.11896/jsjkx.250400143

Abstract

Existing multimodal salient object detection methods are easily disrupted by low-quality auxiliary modalities, which leads to poor model robustness. To address this challenge, this paper proposes a multimodal salient object detection method based on dynamic weighting with saliency primitives. Specifically, the method captures the common features of different salient objects and clusters them to obtain saliency primitives, which provide a clear definition of salient regions. These saliency primitives are then used to dynamically adjust the weights of the different modalities during the fusion stage, ensuring that semantic information from high-quality modalities is fully exploited while potential interference from low-quality auxiliary modalities is suppressed. In addition, a primitive-guided feature alignment mechanism is introduced to reduce the semantic gap between the primary and auxiliary modalities, further enhancing detection performance. This mechanism also enables the model to capture cross-modal common features more accurately, improving the accuracy and stability of the detection results. To validate the proposed method, extensive qualitative and quantitative evaluations are conducted on six RGB-D datasets and three RGB-T datasets. Experimental results demonstrate that the proposed method remains stable and robust even in the presence of low-quality auxiliary modalities.

Power Object Detection Based on Spatial Interaction and Split Attention in Few-shots

WU Man, WANG Gaocai, LU Yuting, WEN Lili

Computer Science. 2026, 53 (6): 252-262. doi:10.11896/jsjkx.250400032

Abstract

Power line inspection is a core task for preventing power outages, ensuring the safe and stable operation of power grids, and promoting economic development. In practical applications, challenges such as a limited number of defect samples, diverse equipment shapes, target occlusion and adhesion, and sample imbalance are often encountered. To address these issues, this paper proposes a novel two-stage object detection network, AKS²-Net, which integrates improvements in attention and metric learning. The network combines multiple attention mechanisms for feature extraction and fusion and uses metric learning for secondary screening of candidate targets, enhancing the extraction and fusion of feature information for irregular, small or blurry, and occluded targets and reducing the impact of few samples and sample imbalance on network performance. Specifically: 1) It designs a group convolution feature extraction network, the AKS² block (AKConv, Spatial-shift and Split-attention Block), based on variable convolution and spatial shift, which enables spatial information interaction between image features and relationship learning between feature channels, thereby enhancing the network's ability to mine irregular and multi-scale feature information. 2) It proposes a novel multi-branch attention multi-scale feature fusion (MAFF) module, which fuses multi-channel and multi-level image detail features and spatial information through atrous spatial pyramid pooling (ASPP) and mixed skip connections (MSC), thereby improving segmentation accuracy and boundary localization in complex scenes. 3) It proposes a feature similarity calculation method based on metric learning, which re-screens candidate negative samples by calculating the similarity between the negative sample features selected by the region proposal network and all support category features, and uses a threshold to correct positive samples misjudged as negative, reducing interference with network training and lowering the missed detection rate for small and blurry targets. 4) It introduces the FocalLoss function into the classification loss to mitigate the impact of sample imbalance on detection results. 5) With AKS²-Net as the backbone, it constructs a two-stage fine-tuning-based object detection network suited to small-sample and imbalanced-sample conditions, providing a new option for small-sample object detection through the fine-tuning mechanism. Extensive experimental results show that the proposed method performs well on the power line object detection dataset, in particular improving detection of distant small or blurry objects and occluded objects, and has significant practical value. Additionally, under similar experimental conditions on various datasets, including small-sample datasets, the proposed model is more competitive than existing object detection networks such as ResNet 50, ResNet 101, Inception ResNet, ResNeXt 101, and ResNeSt 101.
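
Point 4 relies on focal loss to counter sample imbalance. The sketch below is the standard binary focal loss, written in plain Python for a single prediction; the default values of `gamma` and `alpha` are the commonly quoted ones and are assumptions here, since the abstract does not state the settings used in AKS²-Net.

```python
import math


def focal_loss(prob_pos, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    prob_pos: predicted probability of the positive class.
    label:    1 for positive, 0 for negative.
    gamma:    focusing parameter; larger values down-weight easy examples more.
    alpha:    class-balancing weight for the positive class.
    """
    p_t = prob_pos if label == 1 else 1.0 - prob_pos
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident, correct prediction
hard = focal_loss(0.1, 1)  # confident, wrong prediction
```

The factor `(1 - p_t) ** gamma` shrinks the loss of well-classified samples, so abundant easy examples contribute less to training than rare hard ones. With `gamma=0` the expression reduces to a class-weighted cross-entropy.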

Object Detection Method Based on Dynamic Feature Fusion

LIU Jikang, HUANG Lei, ZHANG Ke, NIE Jie, WEI Zhiqiang

Computer Science. 2026, 53 (6): 263-269. doi:10.11896/jsjkx.250700103

Abstract

Object detection is a foundational task in computer vision, with wide-ranging applications in autonomous driving, intelligent transportation, and medical diagnosis. DETR (DEtection TRansformer) pioneers an end-to-end Transformer-based detection framework that eliminates hand-crafted components. However, its decoder's strong inter-layer dependencies give rise to cascading negative optimization during decoding, impeding both training efficiency and accuracy. To address this limitation, this paper proposes DFF DETR (Dynamic Feature Fusion DETR), which dynamically fuses cross-layer features to enable effective propagation of object queries across decoder layers and reduce overreliance on preceding outputs. Additionally, it introduces an inter-layer supervisory signal during backpropagation to refine intermediate query representations and correct suboptimal early outputs. Extensive experiments on various DETR-based detectors demonstrate that integrating DFF DETR yields an average mAP improvement of approximately 1.0%, with particularly pronounced gains in small-object detection.

Pansharpening Method Based on Double-side Guided Filtering and Multi-feature Recalibration

MA Ning, CHANG Xia, YUAN Lingyu

Computer Science. 2026, 53 (6): 270-280. doi:10.11896/jsjkx.250400015

Abstract

Remote sensing satellites usually acquire high-resolution panchromatic images together with low-resolution multispectral images. To make full use of the advantages of both image types, this paper proposes a pansharpening method based on double-side guided filtering and multi-feature recalibration, aiming to generate high-resolution multispectral images that preserve both spatial and spectral information. The method uses a dual-stream U-Net encoder-decoder architecture. Multi-scale features of the multispectral and panchromatic images are extracted by parallel branches, which avoids the information loss caused by direct feature fusion. A double-side guided filtering module is proposed that enhances feature interaction through parallel feature-guided paths and an adaptive weight fusion mechanism. A multi-feature recalibration module is also developed; it combines a multi-directional edge detector with a multi-head feature extraction mechanism and improves the reconstruction of spatial details through a dynamic feature recalibration strategy, which avoids the artifacts caused by excessive enhancement. Experiments show that the proposed method adapts well to different band configurations: it effectively processes 4-band GeoEye1 and PLeiades1 data and also performs well on the 8-band WorldView2 and WorldView3 datasets. Quantitative evaluation shows that the proposed method achieves a CC value of 0.9573 on the WorldView3 dataset; on the GeoEye1 dataset, the PSNR reaches 38.0938 dB, with the ERGAS value decreasing by 94.9% and the Q4 value increasing by 2.3% compared with the best competing method. The method outperforms existing techniques in both subjective visual quality and objective evaluation metrics and provides an effective solution for pansharpening remote sensing images.

Survey of Recommendation Systems Based on Large Language Models

SHI Hongxu, LIU Yi, LIU Kun

Computer Science. 2026, 53 (6): 281-303. doi:10.11896/jsjkx.250900077

Abstract

With the proliferation of Internet applications and worsening information overload, recommendation systems play an increasingly important role in providing personalized services and enhancing user experience. Although traditional recommendation methods and deep learning techniques have made remarkable progress in modeling user-item interactions, they still suffer from limitations such as data sparsity, cold-start problems, and insufficient understanding of users' deeper intents. In recent years, large language models, with their strong capabilities in semantic understanding, knowledge transfer, generation, and logical reasoning, have opened up new directions for recommendation system research. This paper presents a comprehensive survey of research progress on recommendation systems based on large language models, focusing on how these models are integrated into the recommendation process and how the techniques have evolved. It covers key stages including feature engineering and data augmentation, feature encoding and semantic representation, recommendation result generation, system-user interaction, and coordinated control of the recommendation process, revealing the particular advantages of large language models in improving recommendation accuracy, interpretability, and dynamic adaptability. Furthermore, this paper discusses practical challenges such as computational efficiency, knowledge credibility, system fairness, and data security, and outlines future development directions. Through systematic organization and in-depth analysis, it aims to provide references and insights for theoretical innovation and practical applications in the field of recommendation systems.

Causal Intervention-based Mitigation of Spurious Correlations in Information Cascade Popularity Prediction

YU Liu, LI Shuo, KUANG Ping, ZHOU Fan, JIANG Tao

Computer Science. 2026, 53 (6): 304-314. doi:10.11896/jsjkx.250400079

Abstract

Information cascade popularity prediction is often affected by spurious correlations arising from internal cascade dynamics. Most existing methods assume that cascade data are independent and identically distributed, an assumption that does not hold in real-world scenarios with complex diffusion processes. This mismatch leads to significant performance degradation on out-of-distribution data. To address this challenge and improve model robustness and generalization under distributional shift, this paper proposes CCP (Causal Cascade Prediction), a dual-intervention framework based on causal inference. Specifically, intra-cascade intervention randomly prunes nodes to break the misleading correlation between observed cascade size and final popularity, while inter-cascade intervention incorporates information from structurally similar cascades to introduce data diversity. CCP decouples popularity from non-causal factors such as structure and observation time, enabling the model to capture the true causal drivers of information spread. Experimental results on the Weibo and APS datasets show that CCP outperforms the state-of-the-art CasCIFF method, achieving improvements of 2% to 5% in MSLE and 2% to 3% in MAPE, and demonstrates 5% to 7% better generalization performance under the same baseline, validating its effectiveness in handling distributional shifts.
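
The two error metrics named above can be sketched as follows. These are the generic definitions (natural logarithm of one plus the value for MSLE, relative absolute error for MAPE); cascade-prediction papers sometimes use a different logarithm base or apply MAPE to log-transformed counts, and the abstract does not say which variant is used, so treat the exact form as an assumption.

```python
import math


def msle(pred, true):
    """Mean squared logarithmic error using log(1 + x)."""
    return sum((math.log1p(p) - math.log1p(t)) ** 2
               for p, t in zip(pred, true)) / len(true)


def mape(pred, true):
    """Mean absolute percentage error, returned as a fraction (0.1 = 10%)."""
    return sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / len(true)


# Hypothetical predicted vs. actual final cascade sizes.
predicted = [110.0, 90.0]
actual = [100.0, 100.0]
err_msle = msle(predicted, actual)
err_mape = mape(predicted, actual)
```

Working in log space keeps a few very large cascades from dominating the error, which is why MSLE is a common choice for heavy-tailed popularity counts.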

Semi-supervised Learning Method Enhanced by Prototype Loss

NIU Jilong, GUAN Wenhui, ZONG Chenchen, HUANG Shengjun

Computer Science. 2026, 53 (6): 315-319. doi:10.11896/jsjkx.250600161

Abstract

Semi-supervised learning methods can significantly improve model generalization while reducing labeling costs by using a small amount of labeled data together with a large amount of unlabeled data. However, most existing methods rely on confidence thresholds to select unlabeled samples, so some samples remain underutilized in the later stages of training, which wastes information and limits further performance improvement. In addition, hard samples often have a negative impact on training, reducing model robustness. To address these issues, this paper proposes a semi-supervised learning method enhanced with a prototype loss. Through a dual-head training framework, the proposed method fully exploits the latent information in unlabeled data and mitigates the negative impact of hard samples on model performance. Specifically, it employs consistency regularization to enforce consistent predictions for unlabeled samples under different perturbations, and it introduces a prototype-based surrogate loss that measures the similarity between sample features and class prototypes to guide unlabeled samples toward the correct categories. Experimental results show that the proposed method achieves significant performance improvements on multiple datasets, validating its effectiveness and robustness. In particular, on the CIFAR-10 and CIFAR-100 datasets with only 4 labeled samples per class, the proposed method improves classification accuracy by 1.82 and 1.07 percentage points, respectively, over the best baseline methods.
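
One plausible form of a prototype-based loss is sketched below: class prototypes are taken as the mean feature of each class, and the loss is a softmax cross-entropy over cosine similarities between a sample's feature and the prototypes. The abstract does not give the paper's formulation, so the mean-feature prototypes, the cosine similarity, the `temperature` parameter, and all function names are assumptions for illustration.

```python
import math


def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class (assumes every class has a sample)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, y in zip(features, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += feat[d]
    return [[v / counts[c] for v in sums[c]] for c in range(num_classes)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def prototype_loss(feature, prototypes, target, temperature=0.1):
    """Cross-entropy over similarity-to-prototype logits."""
    logits = [cosine(feature, p) / temperature for p in prototypes]
    top = max(logits)  # subtract the maximum for numerical stability
    log_sum = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_sum - logits[target]


# Toy labeled features for two classes (hypothetical data).
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
protos = class_prototypes(feats, [0, 0, 1, 1], num_classes=2)
loss_near = prototype_loss([1.0, 0.0], protos, target=0)
loss_far = prototype_loss([1.0, 0.0], protos, target=1)
```

Unlike a confidence threshold, a loss of this kind assigns a training signal to every unlabeled sample, including those whose classifier confidence is low.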

Top-k Core-based Edge Structural Diversity Search in Temporal Graphs

ZHAN Chenting, ZHOU Junfeng, DU Ming

Computer Science. 2026, 53 (6): 320-331. doi:10.11896/jsjkx.250400076

Abstract

Temporal graphs provide an effective framework for modeling time-varying interactions in dynamic networks, yet computing edge structural diversity efficiently remains a significant challenge. This study investigates the top-k core-based edge structural diversity search problem on temporal graphs and develops a multi-level optimization strategy. Previous methods lack awareness of core structure and require full-graph traversal, which makes them inefficient. To address this, a t-core-based edge diversity metric is defined, which quantifies structural diversity by counting the number of t-core connected components in each edge's ego-network. Two upper-bound pruning algorithms are designed: LUBP uses the node degree distribution to estimate a loose upper bound for rapid candidate selection, while SUBP uses coreness information to build a tighter bound that further reduces the search space. Furthermore, the t-SUBP algorithm is proposed to restrict computation to high-coreness subgraphs (e.g., the (t+2)-core), avoiding unnecessary processing in low-coreness areas. Experiments on seven real-world datasets show that the proposed algorithms achieve up to an 8-fold improvement in efficiency over baseline methods while scaling well with increasing graph size.
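
The diversity metric can be illustrated on a single static snapshot. The sketch below assumes that the ego-network of an edge (u, v) is the subgraph induced by the common neighbors of u and v, and that t is the minimum-degree threshold of the core; both are assumptions drawn from the usual static-graph definitions, and the temporal dimension and the LUBP/SUBP pruning bounds are deliberately left out.

```python
def build_adj(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def k_core(adj, k):
    """Nodes that survive repeated removal of nodes with degree < k."""
    degree = {v: len(adj[v]) for v in adj}
    stack = [v for v in adj if degree[v] < k]
    removed = set()
    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        for w in adj[v]:
            if w not in removed:
                degree[w] -= 1
                if degree[w] < k:
                    stack.append(w)
    return set(adj) - removed


def edge_structural_diversity(adj, u, v, t):
    """Count connected components of the t-core of the edge ego-network."""
    common = adj[u] & adj[v]
    ego = {x: adj[x] & common for x in common}
    core = k_core(ego, t)
    seen, components = set(), 0
    for start in core:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        frontier = [start]
        while frontier:
            x = frontier.pop()
            for y in ego[x]:
                if y in core and y not in seen:
                    seen.add(y)
                    frontier.append(y)
    return components


# Hypothetical graph: u and v share two triangles and one isolated neighbor.
shared = ["a", "b", "c", "d", "e", "f", "g"]
edges = [("u", "v")] + [(h, x) for x in shared for h in ("u", "v")]
edges += [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f")]
graph = build_adj(edges)
score = edge_structural_diversity(graph, "u", "v", t=2)
```

Raising t filters out weakly connected contexts: in the example, the isolated common neighbor counts as a component only when t is 0.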

Fuzzy Three-way Clustering Based on Mean Shift

NI Yongting, QIAN Jin, YAN Shaowei, WU Yueyang

Computer Science. 2026, 53 (6): 332-338. doi:10.11896/jsjkx.250600056

Abstract

Mean Shift is a widely used density-based clustering method, but it is strongly affected by bandwidth selection, and its visit-count-based partitioning may lead to low clustering accuracy. To address these issues, this paper proposes a fuzzy three-way clustering algorithm based on Mean Shift (FTWMS). By introducing a dynamic drift point selection strategy and a division mechanism based on fuzzy membership, the proposed method optimizes the clustering process of the traditional Mean Shift algorithm. First, the algorithm enhances the drift process by dynamically selecting drift points, which helps identify more reliable cluster centers. Second, the fuzzy membership is calculated by integrating the distances from data points to different drift points, the visit frequencies of clusters, and the local densities of clusters. Based on this membership, each cluster is divided into a core region and a boundary region, yielding the final three-way clustering structure. Finally, experiments are conducted on 6 synthetic datasets and 6 UCI datasets. The results show that FTWMS outperforms the traditional Mean Shift, K-means, and three-way evidence theory-based density peak clustering (3W-PEDP) algorithms on the clustering evaluation metrics ACC, NMI, and ARI. The algorithm is also better at delineating cluster boundaries, offering better overall performance than the three comparison algorithms.
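
As background for the bandwidth sensitivity mentioned above, here is a minimal one-dimensional sketch of the traditional Mean Shift procedure with a flat kernel. It is the baseline algorithm only, not FTWMS: the dynamic drift point selection and fuzzy membership steps are not reproduced, and the function name and the merge rule for converged modes are illustrative assumptions.

```python
def mean_shift_1d(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift on 1-D data; returns (centers, labels)."""
    modes = []
    for start in points:
        x = start
        for _ in range(max_iter):
            # Shift to the mean of all points inside the current window.
            window = [p for p in points if abs(p - x) <= bandwidth]
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)

    # Merge converged positions that lie within one bandwidth of each other.
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if abs(m - c) <= bandwidth:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels


# Hypothetical data with two well-separated groups.
data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
centers, labels = mean_shift_1d(data, bandwidth=1.0)
wide_centers, _ = mean_shift_1d(data, bandwidth=10.0)
```

With a bandwidth of 1.0 the two groups are separated, whereas a bandwidth of 10.0 merges everything into one cluster, which shows how strongly the result depends on this single parameter.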

Data Price Prediction and Interpretability Analysis Based on GWO-XGBoost Model and SHAP Values

YANG Jian, CAO Nan, JIN Dayi, ZHANG Jiaqi, YANG Taotao

Computer Science. 2026, 53 (6): 339-349. doi:10.11896/jsjkx.250900068

Abstract

With the rapid development of data marketization, data pricing has become a critical issue. To address the opacity and limited interpretability of pricing mechanisms, this paper proposes a data pricing model based on XGBoost optimized by the grey wolf algorithm (GWO). First, a raw dataset is obtained from the Youyi Data platform and subjected to descriptive statistical analysis. The data are then preprocessed through outlier removal, one-hot encoding, logarithmic transformation, and normalization. Feature correlation is analyzed using the Spearman correlation coefficient. Finally, the GWO algorithm is used to optimize the XGBoost hyperparameters and improve the model's predictive performance. Experimental results indicate that the GWO-XGBoost model achieves a coefficient of determination (R²) of 0.971, significantly outperforming five baseline models. The GWO-XGBoost model also achieves significant improvements in mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE) compared with traditional hyperparameter optimization methods such as grid search and random search. Furthermore, the SHAP interpretability method is used to analyze the model's predictions in depth from both global and local perspectives, identifying the data update interval as the dominant factor, contributing 95.16% of the total prediction increment. This research not only provides a scientific and rational mechanism for data pricing but also gives a clear direction for subsequent model optimization, which is of great significance for promoting the healthy development of the data element market.
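
The hyperparameter search step can be sketched with a small grey wolf optimizer. The code below follows the commonly described GWO update rule (three leading wolves guide the pack while a control coefficient decays from 2 to 0) and minimizes a toy quadratic that stands in for a cross-validated error. It does not call XGBoost, and the population size, iteration count, and objective are assumptions for illustration.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=12, n_iter=60, seed=0):
    """Minimize `objective` over box `bounds` with a basic GWO loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds]
              for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")
    for it in range(n_iter):
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]  # alpha, beta, delta (copies)
        lead_val = objective(leaders[0])
        if lead_val < best_val:
            best_pos, best_val = list(leaders[0]), lead_val
        a = 2.0 * (1.0 - it / n_iter)  # exploration factor decays 2 -> 0
        for w in wolves:
            for d in range(dim):
                lo, hi = bounds[d]
                guided = 0.0
                for leader in leaders:
                    big_a = 2.0 * a * rng.random() - a
                    big_c = 2.0 * rng.random()
                    dist = abs(big_c * leader[d] - w[d])
                    guided += leader[d] - big_a * dist
                # Average the three leader-guided moves and clamp to bounds.
                w[d] = min(max(guided / 3.0, lo), hi)
    final = min(wolves, key=objective)
    if objective(final) < best_val:
        best_pos, best_val = list(final), objective(final)
    return best_pos, best_val


def toy_error(params):
    """Stand-in for a validation error with its minimum at (1, -2)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


search_space = [(-5.0, 5.0), (-5.0, 5.0)]
best_params, best_error = grey_wolf_optimize(toy_error, search_space)
```

In a real tuning run, each coordinate would map to one hyperparameter (for example a learning rate or tree depth, with integer parameters rounded), and the objective would train and validate a model, which makes every evaluation far more expensive than in this toy.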

Barrier-based Network Storage Ordering Method

HAN Lei, LI Mouxing, WU Zheng, FAN Weibei, QIAN Xiaoyan

Computer Science. 2026, 53 (6): 350-357. doi:10.11896/jsjkx.250300164

Abstract

In traditional data exchange models, network latency and operating system scheduling may lead to disorder in data transmission and processing, resulting in significant delays and degraded storage performance. To address this issue, this study proposes a Barrier mechanism-based sequential write control method for storage systems and an enhanced M/G/1 queue model incorporating network transmission characteristics. The method ensures write operation orderliness through dedicated Barrier commands and quantifies storage processes using the improved M/G/1 queue model from queuing theory, thereby enhancing system reliability and consistency. Additionally, a data packet ordering mechanism is introduced to guarantee transmission sequence. Experimental results demonstrate that, compared to conventional approaches (implemented in native Linux storage systems), the proposed method improves storage efficiency by an average of 25% under poor network conditions and by up to 8% in normal network environments, while maintaining data consistency and sequential integrity.
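
For readers unfamiliar with the M/G/1 model mentioned here, the sketch below computes the mean queueing delay of a textbook M/G/1 queue using the Pollaczek-Khinchine formula. It covers only the classical model; the network-aware enhancement described in the abstract is not specified there and is not reproduced. The function name and parameters are assumptions for illustration.

```python
def mg1_mean_wait(arrival_rate, mean_service, second_moment_service):
    """Mean time a request waits in queue for a classical M/G/1 queue.

    Pollaczek-Khinchine: W_q = lambda * E[S^2] / (2 * (1 - rho)),
    where rho = lambda * E[S] is the server utilization.
    """
    rho = arrival_rate * mean_service
    if not 0 <= rho < 1:
        raise ValueError("queue is unstable unless 0 <= rho < 1")
    return arrival_rate * second_moment_service / (2.0 * (1.0 - rho))


# Exponential service with mean 1 has E[S^2] = 2 (the M/M/1 special case).
wait_exponential = mg1_mean_wait(0.5, 1.0, 2.0)
# Deterministic service with mean 1 has E[S^2] = 1 (the M/D/1 special case).
wait_deterministic = mg1_mean_wait(0.5, 1.0, 1.0)
```

At the same utilization, the deterministic case waits half as long as the exponential case, which shows how service-time variability enters the model through the second moment.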

Fuzzy Clustering-based DTN Routing Algorithm for IoT

CHANG Yanan, SUN Yi, CUI Jianqun, YAN Xianglong, ZONG Chenglu

Computer Science. 2026, 53 (6): 358-366. doi:10.11896/jsjkx.250400094

Abstract

In IoT DTN networks, the sparse and highly dynamic distribution of nodes makes it difficult for source nodes to establish direct connections with destination nodes for data transmission. Designing an efficient and stable routing algorithm is therefore a critical challenge in IoT DTN networks. In recent years, the rapid development of machine learning has provided new insights into routing optimization, enabling algorithms to extract hidden patterns from complex network data and enhance decision-making efficiency and accuracy. This paper proposes FCMROP (an IoT DTN routing algorithm based on the fuzzy clustering model). The algorithm employs a multi-feature fusion-based dynamic clustering method that considers node buffer, activity level, delivery probability, and structural stability to improve cluster partitioning and routing decision accuracy. Additionally, a fuzzy membership-weighted distance metric is introduced to optimize cluster selection. Simulation results demonstrate that, compared to the GMMR, DBSCAN-R, KROP, and Prophet routing algorithms, FCMROP achieves superior performance in delivery rate, average delay, and message drop rate.

Research on Hierarchical Fountain Codes for Multi-Availability-Zone Cloud Storage

SUN Jing, WANG Yi, CHEN Haiyan

Computer Science. 2026, 53 (6): 367-375. doi:10.11896/jsjkx.260200107

Abstract

In modern multi-availability zone (Multi-AZ) cloud storage systems, failures exhibit significant hierarchical and correlated characteristics. Traditional erasure codes (e.g., RS codes) or flat Fountain code schemes often overlook the physical topology, leading to high cross-AZ repair bandwidth for frequent local failures and poor flexibility in heterogeneous networks. To address this, this paper proposes a hierarchical AZ-Fountain coding model tailored for Multi-AZ environments. The model explicitly couples the rateless property of Fountain codes with the Multi-AZ failure domain structure, introducing a two-layer coding architecture: an AZ-optimized Bi-modal distribution (AZ-OBMD) is used locally to achieve zero cross-AZ traffic for local repairs, and an AZ-Raptor-link distribution (AZ-RLD) is used globally to ensure efficient recovery during AZ-level disasters. Experimental results show that, compared with RS codes, LRC, and the recent AZ-Code, the proposed scheme maintains zero cross-AZ traffic for local block failures, similar to LRC, while significantly reducing the number of participating blocks and local repair latency. Furthermore, it significantly lowers repair latency and transmission overhead during AZ-level disaster recovery, achieving a systematic trade-off between reliability and repair cost.

Distributed Multi-tag UWB Registration Method with Hash-based Slot Allocation and Dynamic Frame Reuse

LI Tianliang, HUANG Baoqi, JIA Bing, HAO Lifei

Computer Science. 2026, 53 (6): 376-387. doi:10.11896/jsjkx.251100143

Abstract

Ultra-Wideband (UWB) technology has received widespread attention in both academia and industry due to its high-precision localization capability. However, existing large-scale multi-tag UWB localization systems commonly rely on ALOHA-based random access mechanisms for the registration process, which are prone to severe message collisions under high-concurrency conditions. These collisions significantly increase registration delay and reduce resource utilization efficiency. To address this bottleneck, a distributed multi-tag registration method using hash-based slot allocation and dynamic frame reuse is proposed. By designing a lightweight and reusable frame structure together with a distributed registration protocol without centralized scheduling, the method effectively reduces message collisions and substantially improves registration efficiency. In addition, an adaptive synchronization period adjustment algorithm is developed to mitigate synchronization deviations caused by slowly varying clock drift among nodes, enabling a stable and low-overhead time synchronization mechanism. Simulation results show that under dynamic tag arrival conditions, the achieved registration rate closely approaches the tag arrival rate, and the registration performance remains insensitive to variations in arrival rate and frame length. Compared with the typical Atlas FAST method, the proposed approach reduces registration delay by at least an order of magnitude, demonstrating its efficiency and feasibility.
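
The general idea of hash-based slot allocation can be sketched as follows. This is a simplified illustration under stated assumptions, not the protocol from the paper: the hash function (SHA-256), the use of the frame index as a salt, the retry rule, and the names `slot_for` and `register` are all hypothetical choices made for this example.

```python
import hashlib


def slot_for(tag_id, frame_index, num_slots):
    """Map a tag to a slot deterministically, without central scheduling.

    Salting with the frame index (an assumption of this sketch) lets tags
    that collided in one frame land in different slots in the next.
    """
    digest = hashlib.sha256(f"{tag_id}:{frame_index}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_slots


def register(tag_ids, num_slots, max_frames=50):
    """Simulate registration: a tag succeeds when it is alone in its slot."""
    pending = set(tag_ids)
    registered = {}
    for frame in range(max_frames):
        if not pending:
            break
        occupancy = {}
        for tag in pending:
            occupancy.setdefault(slot_for(tag, frame, num_slots), []).append(tag)
        for slot, tags in occupancy.items():
            if len(tags) == 1:  # no collision in this slot
                registered[tags[0]] = (frame, slot)
                pending.discard(tags[0])
    return registered, pending


tags = [f"tag-{i}" for i in range(20)]
registered, pending = register(tags, num_slots=32)
```

Because every tag computes its own slot from its identifier, no coordinator has to assign slots; collisions are resolved by retrying in a later frame with a different salt.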

Clock Analysis and/or Check in L2C Trusted Compiler and Investigation on Its Verification Framework

AN Yuanke, WANG Lei, WANG Shengyuan

Computer Science. 2026, 53 (6): 388-395. doi:10.11896/jsjkx.250400014

Abstract

Scade is a well-known commercial tool widely used in developing safety-critical embedded control software. Its modeling language is a synchronous language that extends Lustre, a synchronous data-flow language. The L2C trusted compiler is a long-term project on highly trusted compilation from a Scade-like modeling language to C, developed with formal verification. Compared to similar works, the source language features of the L2C trusted compiler are closer to those of the Scade modeling language. This paper addresses an important aspect of the front-end design of the L2C trusted compiler: the construction and verification of its clock analyzer or checker. It focuses on the clock analysis or checking rules adopted in the analyzer or checker, and briefly describes the investigation of its verification framework.

How to Filter Collaborative Development Projects from Open-source Communities: Exploratory Study on GitHub

DONG Guojun, CHENG Can, FANG Qing, YOU Lan, WANG Wei, PENG Qingxi

Computer Science. 2026, 53 (6): 396-407. doi:10.11896/jsjkx.250700114

Abstract

Collaborative development projects (CDPs) and engineering development projects (EDPs) represent two typical project models within the open-source software (OSS) ecosystem, reflecting a project's development status and engineering maturity. For developers, determining whether a project is a CDP is often challenging due to the lack of clear boundaries defining its collaborative nature. Conversely, identifying an EDP requires access to sufficient valuable development information. For researchers, neglecting CDP and EDP criteria during sample selection contaminates the sample pool with numerous projects that are not development-oriented or are unintentionally collaborative, thereby diminishing the validity of research findings. Current research lacks automated screening methods for these two project types. To address this gap, this study constructs a standardized dataset to validate the performance differences of 50 combinations of methods and features, along with 24 combinations of machine learning algorithms and feature sets, in project screening. This provides researchers with an efficient model for targeting relevant projects. The findings reveal: 1) For scenarios demanding high Precision, baseline methods excel, achieving Precision scores of 0.900 and 0.880 when screening CDPs and EDPs, respectively; 2) For scenarios prioritizing high F1-Score, machine learning methods perform best, yielding F1-Scores of 0.879 and 0.821 for CDP and EDP screening, respectively; 3) Large language model methods achieve Precision scores of 0.691 and 0.569 for CDP and EDP screening, respectively; 4) Integrating machine learning methods with existing screening approaches improves Precision by 4.56% to 42.8% for CDP screening and by 5.9% to 237.5% for EDP screening.

Characterizing Floating-point Errors

DUAN Mengyao, XU Jinchen, YANG Hongru, ZHANG Jianing, ZHANG Tao, LI Fei, ZHOU Bei

Computer Science. 2026, 53 (6): 408-415. doi:10.11896/jsjkx.250600214

Abstract

Floating-point errors represent an inherent challenge in computer science, particularly in precision-sensitive domains such as scientific computing and financial analysis, where such errors may lead to severe consequences. This study conducts a systematic analysis of floating-point error characteristics using four representative test suites comprising 109 benchmark programs, with particular focus on the impact of floating-point precision, exponents, and expression types, in order to identify potential error regularities. Experimental results demonstrate statistically significant patterns in error distributions: 1) 98.46% of floating-point expressions exhibit identical error distribution trends between single-precision (FP32) and double-precision (FP64) formats; 2) For dual-parameter expressions with multiplicative/divisive relationships, 86.67% of significant errors occur when the exponents of both parameters demonstrate linear dependence; 3) 90.91% of rational expressions manifest maximum errors within critical neighborhoods where denominators approach zero. These empirical regularities in floating-point error behavior not only advance fundamental understanding of numerical inaccuracies but also provide actionable insights for enhancing error detection tools, optimizing numerical algorithms, and formulating rigorous error control strategies.
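
The third observation, that rational expressions tend to show their largest errors where the denominator approaches zero, can be illustrated with a small experiment. The expression 1/(x² − 1) and the helper names below are chosen for this example and are not taken from the paper's benchmarks; exact rational arithmetic from Python's `fractions` module serves as the reference value.

```python
from fractions import Fraction


def f_float(x):
    """Evaluate 1 / (x^2 - 1) in ordinary double-precision arithmetic."""
    return 1.0 / (x * x - 1.0)


def rel_error(x):
    """Relative error of f_float at x, measured against exact arithmetic."""
    exact = 1 / (Fraction(x) ** 2 - 1)  # Fraction(x) represents the float exactly
    approx = Fraction(f_float(x))
    return float(abs(approx - exact) / abs(exact))


# Far from the pole at x = 1, the error stays near machine precision.
err_far = rel_error(1.7)
# Close to the pole, rounding in x * x is amplified by the cancellation in
# x * x - 1, and the relative error grows to roughly 4.7e-10.
err_near = rel_error(1.0 + 2.0 ** -30)
```

The rounding error committed when squaring x is tiny in absolute terms, but after subtracting 1 it becomes large relative to the small denominator, which is the mechanism behind errors concentrating near the pole.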

Automated Testing Method for Canvas Elements Based on Large Language Models

ZHANG Weifeng, WANG Xiangwei, XU Lei

Computer Science. 2026, 53 (6): 416-426. doi:10.11896/jsjkx.250900004

Abstract

As a core component of modern Web applications, HTML5 Canvas is widely used for dynamic interface rendering, data visualization, and similar tasks. However, since Canvas elements lack a DOM structure, existing Web testing tools struggle to test them effectively. To solve this problem, this paper proposes an automated testing method for Canvas elements based on large models, which combines the advantages of computer vision and large model technology. The YOLO object detection algorithm is used to extract the category and geometric attributes of the elements inside the Canvas interface, and the color, related text, and hierarchical relationships of these elements are further extracted to construct an enhanced DOM structure. Prompt strategies are designed to guide large models to make full use of Canvas images and the enhanced DOM information to generate high-coverage test cases. Experiments show that the proposed method significantly outperforms existing methods (such as VisionTasker), achieving improvements of 10.53% and 16.85% in element coverage and interaction coverage, respectively. Moreover, using only the enhanced DOM structure for test case generation achieves 99.18% of the element coverage and 98.22% of the interaction coverage of the full method with less resource consumption. Finally, this paper compares the performance of different large language models on the task, verifying the versatility and effectiveness of the proposed method.

CausalVulGNN: Framework for Software Vulnerability Explanation Based on Causal Inference and Graph Neural Networks

ZHANG Xin, CHEN Wen

Computer Science. 2026, 53 (6): 427-436. doi:10.11896/jsjkx.250800076

Abstract

Vulnerability explanation is a critical task in software vulnerability mining, aiming to uncover the root causes of vulnerability formation and improve the quality of vulnerability remediation. Existing approaches for vulnerability explanation can be broadly categorized into factual reasoning and counterfactual reasoning methods. Factual reasoning focuses on identifying code subgraphs that are highly correlated with the model's prediction, while counterfactual reasoning seeks to discover subgraphs responsible for vulnerabilities by applying minimal perturbations that change the model's output. However, traditional methods lack explicit modeling of the causal relationships between code and vulnerabilities, making it difficult to distinguish spurious variables (spurious subgraphs) that are only statistically correlated with the vulnerability from the true causal variables (causal subgraphs) that directly lead to vulnerability formation. As a result, vulnerability explanation can be misled by spurious correlations, producing inaccurate conclusions. To address this issue, this paper proposes CausalVulGNN, a causal explanation framework for vulnerability analysis based on graph neural networks (GNNs). The proposed method first employs multi-layer GNN aggregation to capture structural and semantic features by integrating local node and edge information within the code graph. Then, it introduces a sample reweighting mechanism and a causal structure model (CSM) based on the Hilbert-Schmidt independence criterion (HSIC) to model causal relationships among high-level GNN representations. Unlike traditional correlation-based approaches, CSM simulates perturbations to high-level variables and quantifies their direct impact on model predictions, enabling the identification of subgraphs that are truly causal to vulnerabilities. This helps eliminate spurious subgraphs that are only statistically correlated with the output, thereby improving the fidelity of explanations. Experimental results demonstrate that CausalVulGNN achieves significant performance improvements across multiple vulnerability detection models and explanation methods, validating its effectiveness from a causal inference perspective. Systematic experiments are conducted on the real-world large-scale vulnerability dataset Big-Vul, covering popular detection models such as GCN, GGNN, GIN, and GraphConv. The results show that integrating CausalVulGNN leads to substantial improvements in standard explanation metrics (Accuracy, Precision, Recall, and F1-score) for widely used explainers including CFExplainer, GNNExplainer, and GNN-LRP. In particular, CFExplainer achieves an average 19.6% increase in Accuracy and over 28% improvement in Recall, confirming the effectiveness of CausalVulGNN for vulnerability explanation and analysis.

Certificateless Anonymous Authentication Key Agreement Protocol Based on Lattice

ZHANG Qing, FENG Yan, ZHAO Hong, XIE Sijiang, FENG Yimeng

Computer Science. 2026, 53 (6): 437-445. doi:10.11896/jsjkx.250400029

Abstract

Authentication key agreement protocols are crucial mechanisms for ensuring the authenticity of communication entities and the confidentiality of session keys, playing a vital role in modern communication systems. However, most current authentication protocols rely on traditional mathematical problems such as discrete logarithm or integer factorization, which are vulnerable to quantum computing attacks. To address this challenge, a certificateless anonymous authentication key agreement protocol based on lattices is proposed, whose security can be reduced to the intractability of the ring learning with errors problem. The proposed protocol achieves mutual authentication within two rounds of message exchange, while ensuring user anonymity and forward security. Moreover, it eliminates the need to manage complex certificates, thereby mitigating the security risks associated with a key management center. Both formal and informal security analyses demonstrate that the protocol effectively resists various attacks and exhibits strong robustness. Performance comparison results indicate that the protocol significantly enhances security while effectively reducing computational overhead.

Edge Federated Learning Privacy Protection Scheme Based on Multi-key Homomorphic Encryption

LI Ruirui, GUO Rui, ZHANG Yinghui, LI Xuelei, LIU Guangjun

Computer Science. 2026, 53 (6): 446-459. doi:10.11896/jsjkx.250600089

Abstract

Federated learning allows users to collaboratively train a machine learning model by aggregating local model updates without sharing raw data. However, traditional federated learning faces issues such as reliance on a central server, which leads to single points of failure, privacy leakage, and communication bottlenecks. To address these problems, this paper proposes a decentralized multi-edge distributed federated learning privacy protection scheme. A local model aggregation mechanism based on Aggregation Multi-Key Cheon-Kim-Kim-Song (AMK-CKKS) is designed to protect the privacy of the raw data held by data owners. Additionally, a distributed global model aggregation framework is constructed using the RingAllreduce algorithm, where edge servers replace the central server for global model aggregation, effectively reducing communication load and eliminating dependence on central nodes. Furthermore, the introduction of blockchain and the SG-PBFT consensus mechanism ensures that model update parameters are auditable and allows nodes to reach consensus quickly while ensuring the security of honest nodes during operation. Security analysis indicates that this scheme not only ensures the privacy of the model update parameters but also resists collusion attacks involving up to k < (n - 2) participants. Moreover, compared to related schemes, the accuracy loss of the proposed model does not exceed 3%, and communication overhead is reduced by approximately 76%.
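
The ring all-reduce pattern referred to above can be sketched with a small single-process simulation. This shows only the generic algorithm on plaintext lists of numbers; in the scheme described in the abstract the exchanged values would be encrypted model updates, and the function name `ring_allreduce` is an assumption of this example.

```python
def ring_allreduce(node_data):
    """Simulate ring all-reduce (sum) over equally long vectors, one per node.

    Each node only ever sends one chunk to its right-hand neighbor per step,
    so no central aggregator is needed.
    """
    n = len(node_data)
    length = len(node_data[0])
    # Split every vector into n contiguous chunks (some may be empty).
    bounds = [(k * length // n, (k + 1) * length // n) for k in range(n)]
    data = [list(v) for v in node_data]

    # Phase 1, scatter-reduce: after n - 1 steps node i holds the full sum
    # of chunk (i + 1) mod n.
    for step in range(n - 1):
        messages = []
        for i in range(n):
            chunk = (i - step) % n
            lo, hi = bounds[chunk]
            messages.append(((i + 1) % n, chunk, data[i][lo:hi]))
        for dst, chunk, payload in messages:  # deliver all sends "simultaneously"
            lo, _ = bounds[chunk]
            for j, value in enumerate(payload):
                data[dst][lo + j] += value

    # Phase 2, all-gather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        messages = []
        for i in range(n):
            chunk = (i + 1 - step) % n
            lo, hi = bounds[chunk]
            messages.append(((i + 1) % n, chunk, data[i][lo:hi]))
        for dst, chunk, payload in messages:
            lo, hi = bounds[chunk]
            data[dst][lo:hi] = payload
    return data


updates = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
result = ring_allreduce(updates)
```

Each node transmits about 2(n − 1)/n times the vector length in total, independent of how many nodes participate, which is why the pattern relieves the bandwidth bottleneck of a single central server.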