Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 52 Issue 9, 15 September 2025
  
Intelligent Medical Engineering
Preface of Special Issue of Intelligent Medical Engineering
Computer Science. 2025, 52 (9): 1-3.  doi:10.11896/jsjkx.qy20250901
Research Progress of Machine Learning in Diagnosis and Treatment of Esophageal Cancer
WANG Yongquan, SU Mengqi, SHI Qinglei, MA Yining, SUN Yangfan, WANG Changmiao, WANG Guoyou, XI Xiaoming, YIN Yilong, WAN Xiang
Computer Science. 2025, 52 (9): 4-15.  doi:10.11896/jsjkx.250100065
Esophageal cancer (EC) is a highly lethal malignancy worldwide, particularly in China. Its low early diagnosis rate and poor prognosis present significant challenges to clinical management. In recent years, machine learning (ML) technology based on multi-modal data has shown great potential in early EC diagnosis and treatment. Traditional ML methods integrate radiomic features from EC imaging with clinical textual data, effectively improving the sensitivity of early lesion diagnosis and providing scientific support for the stratified management of high-risk patients. Convolutional neural networks (CNNs), with their efficient parameter-sharing mechanisms and excellent local feature extraction capabilities, further enhance the accuracy of early EC diagnosis and screening. Moreover, combining CNNs with Transformer models based on self-attention mechanisms significantly strengthens global feature modeling, demonstrating broad application potential in lesion segmentation, early diagnosis, treatment outcome prediction, and survival analysis of EC. However, the high heterogeneity of EC lesions and the class imbalance of imaging data continue to pose significant challenges to the clinical application of ML technologies. To advance intelligent diagnostic and therapeutic technologies for EC, this paper focuses on three critical areas: early screening and diagnosis, treatment outcome prediction and survival analysis, and image segmentation. It systematically reviews the current research status and challenges of traditional ML, CNNs, and emerging Transformer technologies in EC diagnosis and treatment, aiming to provide valuable insights and references for future research on intelligent EC diagnosis and treatment.
Deep Learning-based Kidney Segmentation in Ultrasound Imaging:Current Trends and Challenges
YIN Shi, SHI Zhenyang, WU Menglin, CAI Jinyan, YU De
Computer Science. 2025, 52 (9): 16-24.  doi:10.11896/jsjkx.250300159
Kidney ultrasound segmentation plays a pivotal role in clinical diagnosis and treatment planning. This paper systematically reviews key developments in renal segmentation techniques from 2017 to 2024, focusing on 2D/3D approaches and pathological tissue analysis. Current 2D methods fall into four categories: traditional texture-based techniques, U-Net variants, shape-prior integrated deep learning, and multimodal fusion approaches. The study comprehensively evaluates available datasets and standardized metrics, establishing critical benchmarks for the field. While significant progress has been made in 2D segmentation, persistent challenges include limited precision on fine structures, immature 3D techniques, inadequate pathological analysis, and data scarcity. Overcoming these limitations is crucial for clinical translation. Future directions emphasize refining structural segmentation, advancing 3D reconstruction, developing cross-modal learning, and creating comprehensive datasets. These efforts will enhance the clinical utility of renal ultrasound segmentation, bridging the gap between technical innovation and medical application.
Research Progress on Multi-domain Adaptation Problems in Clinical Data Modeling
CHEN Xiu, ZHANG Xinyun, CHENG Yuting, CHEN Wei, HUANG Zhengxing, LIU Zhenyu, ZHANG Yuanpeng
Computer Science. 2025, 52 (9): 25-36.  doi:10.11896/jsjkx.250600104
With the deep integration of artificial intelligence and healthcare, clinical data is undergoing a paradigm shift from “aiding decision-making” to “driving decision-making”. Clinical data encompasses both structured and unstructured information such as patient symptoms, diagnostic images, and treatment records, providing crucial support for medical decision-making. However, due to the prevalent “domain shift” phenomenon, the independent and identically distributed (i.i.d.) assumption on which clinical AI models rely for training and evaluation is invalidated, severely restricting the models’ cross-domain generalization ability. Domain adaptation and domain generalization techniques can effectively enhance the cross-domain performance of models: the former adjusts models using unlabeled target-domain data to adapt them to new environments, while the latter learns domain-invariant features from source-domain data to achieve generalization without any target-domain data. Regarding the application progress of these two types of techniques in clinical data modeling, this paper classifies them into shallow and deep methods, demonstrates their application scenarios across different data types, and summarizes the current differences among methods in terms of generalization performance, data dependency, and interpretability.
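As a concrete illustration of the shallow domain-adaptation methods this survey classifies, the sketch below implements CORAL (correlation alignment), a classical shallow technique that matches second-order feature statistics between domains. This is a generic example, not a method from the surveyed papers; all names are illustrative.

```python
import numpy as np

def sqrtm_psd(m):
    """Matrix square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(np.sqrt(np.clip(vals, 0, None))) @ vecs.T

def coral(source, target, eps=1e-6):
    """CORAL: align second-order statistics of source features to the target.

    source, target: (n_samples, n_features) arrays. Returns transformed
    source features whose covariance (approximately) matches the target's,
    using no target labels -- the defining property of domain adaptation.
    """
    d = source.shape[1]
    cs = np.cov(source, rowvar=False) + eps * np.eye(d)  # source covariance
    ct = np.cov(target, rowvar=False) + eps * np.eye(d)  # target covariance
    whiten = np.linalg.inv(sqrtm_psd(cs))   # remove source correlations
    color = sqrtm_psd(ct)                   # re-color with target correlations
    centered = source - source.mean(axis=0)
    return centered @ whiten @ color + target.mean(axis=0)
```

A classifier trained on the transformed source features then transfers better to the target domain, since both now share first- and second-order statistics.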
Multi-scale Multi-granularity Decoupled Distillation Fuzzy Classifier and Its Application in Epileptic EEG Signal Detection
JIANG Yunliang, JIN Senyang, ZHANG Xiongtao, LIU Kaining, SHEN Qing
Computer Science. 2025, 52 (9): 37-46.  doi:10.11896/jsjkx.250300096
In the task of epileptic EEG signal detection, deep learning methods exhibit outstanding feature representation capabilities but suffer from poor interpretability. In contrast, the Takagi-Sugeno-Kang (TSK) fuzzy classifier offers superior fuzzy-rule-based interpretability, yet is hampered by its limited modeling ability. To balance performance and interpretability when dealing with EEG signals, this paper proposes a Multi-scale Multi-granularity Decoupled Distillation TSK Fuzzy Classifier (MMDD-TSK-FC). Firstly, training one-dimensional convolutional neural networks with different kernel sizes as teacher models enables comprehensive extraction of EEG features at multiple scales. Next, soft labels are generated by softening the outputs of the teacher models, and the Kullback-Leibler divergence between these soft labels and the outputs of TSK fuzzy classifiers with varying rule numbers is minimized to transfer deep feature representation knowledge. Meanwhile, the cross-entropy loss between the student model’s output and the ground-truth labels is minimized. Finally, the outputs of multiple TSK fuzzy classifiers are integrated using a voting mechanism, while the multi-granularity TSK fuzzy classifiers generate multiple sets of IF-THEN rules with varying levels of complexity, providing interpretable reasoning to support the model’s detection decisions. Experimental results on the Bonn and New Delhi Hauz Khas epileptic EEG datasets thoroughly validate the superiority of MMDD-TSK-FC, which improves accuracy by approximately 5% compared to the classical TSK classifier and outperforms other deep knowledge distillation models by around 3%.
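The soft-label transfer described above follows the standard knowledge-distillation recipe: soften teacher outputs with a temperature, minimize KL divergence to the student, and mix in cross-entropy on the true labels. A minimal sketch follows; the temperature T and weight alpha are illustrative defaults, not values from the paper.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T yields softer label distributions."""
    e = np.exp((z - z.max(axis=-1, keepdims=True)) / T)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """KL(teacher_soft || student_soft) mixed with cross-entropy on hard labels.

    The T**2 factor keeps the gradient scale of the soft term comparable to
    the hard term, as in standard distillation.
    """
    p_t = softmax(teacher_logits, T)                      # softened teacher labels
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * np.log(p_t / p_s), axis=-1).mean()  # knowledge-transfer term
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]).mean()
    return alpha * (T ** 2) * kl + (1 - alpha) * ce
```

When student and teacher agree, the KL term vanishes and only the supervised cross-entropy remains.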
M2T-Net:Cross-task Transfer Learning Tongue Diagnosis Method Based on Multi-source Data
ZENG Lili, XIA Jianan, LI Shaowen, JING Maike, ZHAO Huihui, ZHOU Xuezhong
Computer Science. 2025, 52 (9): 47-53.  doi:10.11896/jsjkx.241000046
Coronary artery disease is a common cardiovascular disease, and coronary intervention is one of its common treatments. Diabetes mellitus is a risk factor for coronary artery disease, and their combination significantly increases treatment risk, so early diagnosis and corresponding measures are of great clinical significance for these patients. Clinical indicators are important references for the diagnosis and treatment of coronary heart disease and its comorbidities, but most of these indicators are acquired invasively. The tongue image, as an external manifestation of human health, not only reflects characteristics such as tongue color and coating color but also correlates with various physiological and pathological features of the heart. The development of deep learning enables objective and reproducible acquisition of tongue representations. However, existing tongue image classification methods are limited by the singularity of dataset labels, which leads to a lack of model generalization ability. To this end, a cross-task transfer learning tongue diagnosis method based on multi-source data, M2T-Net, is proposed. It consists of two phases: in the multi-source data pre-training phase, high-quality image encoders for different tasks are acquired; in the cross-task transfer phase, the feature representations from different tasks are fused for disease classification via a cross-attention mechanism. Experiments show that M2T-Net reaches a classification accuracy of 93% on the tasks of classifying coronary heart disease and coronary heart disease accompanied by diabetes mellitus, outperforming existing state-of-the-art methods with strong generalization ability and practicality. The cross-task acquisition of disease representations is also more in line with the holistic diagnostic idea of tongue diagnosis in traditional Chinese medicine, providing a more practical solution for the field of tongue image analysis.
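Cross-attention fusion of features from two tasks, as described in the transfer phase, follows the standard scaled dot-product pattern: queries come from one task's encoder, keys and values from the other's. The sketch below is a generic illustration with identity projections; shapes and names are not taken from the paper.

```python
import numpy as np

def cross_attention(query_feats, context_feats):
    """Fuse features from task A (queries) with features from task B (keys/values).

    query_feats: (n_q, d), context_feats: (n_kv, d). In a real model Q, K, V
    would come from learned linear projections; identity projections here.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)   # (n_q, n_kv) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # row-wise softmax
    return weights @ context_feats                        # (n_q, d) fused features
```

Each output row is a convex combination of context features, weighted by similarity to the corresponding query.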
DACSNet:Dual Attention Mechanism and Classification Supervision Network for Breast Lesion Detection in Ultrasound Images
LI Fang, WANG Jie
Computer Science. 2025, 52 (9): 54-61.  doi:10.11896/jsjkx.241200170
Ultrasound imaging is the most commonly used technology for breast lesion detection, and automated lesion detection in breast ultrasound images has attracted increasing attention from researchers. However, most studies fail to fully integrate image information to enhance features, and they overlook both increased model complexity and rising false-positive rates. Therefore, this paper improves the existing RetinaNet model by using VMamba as the backbone network and proposes a lesion detection network based on a dual attention mechanism and classification supervision (DACSNet) to improve the accuracy of lesion detection in breast ultrasound images and reduce the false-positive rate. Specifically, medical domain knowledge is incorporated into the attention module, where the dual attention module (DAM) effectively enhances feature representation in both the channel and spatial dimensions. The DAM involves only a small number of parameters yet effectively boosts the model’s detection performance. Furthermore, to reduce the false-positive rate of lesion detection, a classification supervision module (CSM) is added to the model to integrate lesion classification information, achieving secondary focus on suspected lesion areas. To verify the performance of DACSNet, breast lesion detection experiments are conducted on three publicly available breast ultrasound image datasets, and the results demonstrate the effectiveness of the method.
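Attending over both the channel and spatial dimensions, as the DAM does, can be illustrated with a CBAM-style two-stage gating: squeeze spatial dimensions to gate channels, then squeeze channels to gate locations. This is a generic sketch; the paper's DAM additionally encodes medical domain knowledge, which is omitted here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_attention(feat):
    """CBAM-style channel-then-spatial attention over a (C, H, W) feature map.

    Real modules would insert small learned layers before each sigmoid;
    plain average pooling is used here to keep the sketch parameter-free.
    """
    channel_gate = sigmoid(feat.mean(axis=(1, 2)))   # (C,): gate each channel
    gated = feat * channel_gate[:, None, None]
    spatial_gate = sigmoid(gated.mean(axis=0))       # (H, W): gate each location
    return gated * spatial_gate[None, :, :]
```

Both gates lie in (0, 1), so the module re-weights features without changing their shape.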
Application of End-to-End Convolutional Kolmogorov-Arnold Networks in Atrial Fibrillation Heart Sound Recognition
DENG Hong, CHEN Yan, YANG Hongbo, ZHAO Feng, JIANG Yongzhuo, GUO Tao, WANG Weilian
Computer Science. 2025, 52 (9): 62-70.  doi:10.11896/jsjkx.250100102
Atrial fibrillation (AF) is a severe cardiac arrhythmia that requires early diagnosis. Traditional diagnostic methods typically involve cardiologists using electrocardiograms (ECG) and echocardiograms to reach diagnostic conclusions. To address issues such as high costs, excessive reliance on clinical expertise, and limited accessibility, this study proposes an innovative application of Kolmogorov-Arnold Networks (KAN) to AF heart sound analysis. It explores the application of convolutional KAN (CKAN) in AF heart sound recognition, proposing an end-to-end AF identification framework based on the CKAN architecture, which incorporates flexible activation functions and exhibits superior parameter efficiency. To enhance the usability of heart sound signals, the method first applies preprocessing, including signal segmentation, quality assessment, and data cleansing. Subsequently, the model autonomously learns discriminative features through KAN-based convolutional and pooling layers. Finally, a CKAN-based classifier is employed for classification. During the feature extraction phase, self-attention mechanisms and focal modulation are incorporated into CKAN to efficiently extract signal features. In the classification phase, CKAN’s bottleneck structure and regularization techniques are explored to improve recognition performance. The proposed model is evaluated on a heart sound dataset from The First Medical Center of Chinese PLA General Hospital, achieving an accuracy of 97.86%, sensitivity of 98.18%, specificity of 97.43%, and an Fβ score of 98.06%. The results indicate that the CKAN model provides significant advantages in aiding the diagnosis of AF from heart sound signals.
Graph-based Compound-Protein Interaction Prediction with Drug Substructures and Protein 3D Information
LI Yaru, WANG Qianqian, CHE Chao, ZHU Deheng
Computer Science. 2025, 52 (9): 71-79.  doi:10.11896/jsjkx.250100116
Drugs exert therapeutic effects by interacting with proteins to inhibit or activate the functions of specific proteins. In recent years, deep learning methods have made significant progress in predicting compound-protein interactions (CPI). However, most existing studies still focus on extracting overall features from drugs and proteins, neglecting drug target information, the three-dimensional spatial information of protein structures, and the role of key drug substructures in CPI prediction. To address this issue, a new model is proposed that combines the functional groups of drugs, the overall structural graphs of drugs, and the sequence and three-dimensional spatial graph information of proteins. By fusing graph neural networks and attention mechanisms, it conducts efficient feature learning and prediction. Experimental results on the public Human and C.elegans datasets show that the proposed model performs excellently in CPI prediction, with improvements of more than 1% in the ACC, AUROC, and AUPR metrics, and demonstrates a stable performance advantage on imbalanced datasets.
Selective Ensemble Learning Method for Optimal Similarity Based on LLaMa3 and Choquet Integrals
FU Chao, YU Liangju, CHANG Wenjun
Computer Science. 2025, 52 (9): 80-87.  doi:10.11896/jsjkx.250100150
To screen and weight correlated base classifiers during the ensemble process of multiple classifiers, this paper proposes a selective ensemble learning method for optimal similarity based on LLaMa3 and Choquet integrals (LCOS-SELM). Leveraging the open-source LLaMa3, the method efficiently extracts key features from unstructured text through prompt learning, using only a small amount of labeled data. Subsequently, it integrates the prediction results of correlated classifiers using the Choquet integral, evaluating their correlation to optimize classifier selection. Finally, it employs an optimal similarity strategy to learn classifier weights, ensuring sample consistency while enhancing the performance of the ensemble method. LCOS-SELM is applied to the auxiliary diagnosis of Crohn’s disease, using 297 examination reports from a tertiary hospital in Hefei. Experiments comparing it against endoscopic examination reports validate the effectiveness of the proposed method. Under the same experimental conditions, LCOS-SELM demonstrates an approximate 8% improvement in Accuracy (Acc), F1-score, and AUC compared to a single classifier, and an approximate 2% improvement in all three metrics compared to traditional ensemble models, further validating its performance advantage.
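The Choquet integral aggregates classifier scores with respect to a fuzzy measure, so that interactions (correlations) between classifiers influence the result, unlike a plain weighted average. A minimal sketch follows; the scores and measure values in the usage example are illustrative, not from the paper.

```python
def choquet_integral(scores, measure):
    """Choquet integral of classifier scores w.r.t. a fuzzy measure.

    scores: dict classifier -> score in [0, 1].
    measure: dict frozenset-of-classifiers -> capacity in [0, 1], monotone,
             with measure[frozenset()] = 0 and measure of the full set = 1.
    Sorts scores ascending and weights each increment by the capacity of
    the coalition of classifiers scoring at least that much.
    """
    items = sorted(scores.items(), key=lambda kv: kv[1])  # ascending scores
    names = [k for k, _ in items]
    total, prev = 0.0, 0.0
    for i, (_, s) in enumerate(items):
        coalition = frozenset(names[i:])       # classifiers scoring >= s
        total += (s - prev) * measure[coalition]
        prev = s
    return total
```

When the measure is additive, this reduces to an ordinary weighted mean; a sub-additive measure discounts redundant (correlated) classifiers.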
DHMP:Dynamic Hypergraph-enhanced Medication-aware Model for Temporal Health Event Prediction
WU Hanyu, LIU Tianci, JIAO Tuocheng, CHE Chao
Computer Science. 2025, 52 (9): 88-95.  doi:10.11896/jsjkx.250300012
Temporal health event prediction remains a fundamental challenge in medical AI. To address the critical problem of modeling complex medication-diagnosis relationships in EHR data, this paper proposes the DHMP model. Firstly, a dynamic subgraph learning mechanism captures local disease progression patterns. Secondly, a novel multi-hypergraph fusion architecture jointly models drug interactions and diagnosis associations. Finally, a temporal attention algorithm deciphers long-term dependencies in clinical records. Extensive experiments on the MIMIC-III and MIMIC-IV datasets demonstrate that DHMP achieves state-of-the-art performance, with 26.68% w-F1 in diagnosis prediction and 90.65% AUC in risk prediction. Clinical evaluation shows 89% consistency between model predictions and medical expertise, proving its reliability for decision support.
Drug Combination Recommendation Model Based on Dynamic Disease Modeling
HU Hailong, XU Xiangwei, LI Yaqian
Computer Science. 2025, 52 (9): 96-105.  doi:10.11896/jsjkx.250300033
Addressing critical gaps in existing research regarding dynamic prescription adaptation to evolving patient conditions and drug-drug interactions, this study proposes MRNET (Medical Recommendation Network), a novel dynamic disease modeling framework for optimized drug combination recommendation. The model uses a graph convolutional network for pre-training, associating related entities to mine latent association information between them and providing data support for subsequent dynamic disease modeling and drug combination recommendation. Subsequently, MRNET employs Transformer-based temporal modeling to capture longitudinal disease progression patterns, effectively characterizing dynamic clinical state transitions. At the same time, by horizontally comparing the similarity of diagnoses and procedures, the applicability of and differences between prescriptions under similar conditions and diagnoses can be considered. The combination of horizontal comparison and longitudinal disease dynamics enables the model to evaluate the rationality and applicability of drug combinations more comprehensively during recommendation. Finally, the introduction of drug side-effect information screens out safer and more effective drug combinations, improving the accuracy and safety of drug recommendations. Experimental validation demonstrates MRNET’s superior performance, achieving improvements of 2.07%, 1.96%, and 1.72% in Jaccard similarity, F1-score, and PRAUC respectively over state-of-the-art baselines, fully demonstrating its superiority in drug combination recommendation.
Machine Learning Based Interventional Glucose Sensor Fault Monitoring Model
LIU Sixing, XU Shuoyang, XU He, JI Yimu
Computer Science. 2025, 52 (9): 106-118.  doi:10.11896/jsjkx.250300037
With the advancement of sensor technology, blood glucose monitoring has evolved from traditional single-point sampling to continuous glucose monitoring (CGM), enabling real-time monitoring of interstitial fluid glucose concentration through interventional glucose sensors. The operational status of glucose sensors is crucial for monitoring accuracy, but sensor fault detection faces the challenge of class imbalance, which degrades the performance of machine learning models. Accordingly, this paper proposes an optimization strategy that combines data preprocessing, feature engineering, and model ensembling. Firstly, the completeness and reliability of the data are improved through missing-value imputation and noise reduction. Secondly, SMOTE is used to oversample minority-class samples, alleviating the class imbalance problem. Finally, a two-layer model is constructed using the Stacking ensemble learning method, which combines base classifiers of XGBoost optimized with Focal Loss and CatBoost with a logistic regression (LR) meta-classifier, further enhancing the accuracy of fault monitoring. To demonstrate the effectiveness of the proposed model, its prediction results are compared with other models, including a single XGBoost model based on Focal Loss and ensemble models constructed with SVM, KNN, and LightGBM as base classifiers. The results show that the proposed combination of Focal-Loss-optimized XGBoost and CatBoost performs well in the sensor fault classification task, with both PR and ROC curves outperforming the other models, achieving precision and recall of 0.9250 and 0.9238, respectively.
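The Focal Loss used here to counter class imbalance has a simple closed form: it down-weights easy, well-classified examples so training concentrates on the rare fault class. A binary sketch follows; gamma and alpha are the common defaults, not necessarily the values used in the paper.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: FL = -alpha_t * (1 - p_t)^gamma * log(p_t).

    p: predicted probability of the positive class; y: 0/1 labels.
    Reduces to alpha-weighted cross-entropy when gamma = 0.
    """
    p = np.clip(p, 1e-7, 1 - 1e-7)
    p_t = np.where(y == 1, p, 1 - p)             # probability of the true class
    a_t = np.where(y == 1, alpha, 1 - alpha)     # class re-weighting factor
    return -(a_t * (1 - p_t) ** gamma * np.log(p_t)).mean()
```

With gamma = 2, a confidently correct prediction (p_t = 0.9) contributes about 100x less loss than under plain cross-entropy, while hard examples keep nearly full weight.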
Delayed PET Reconstruction for Early Tumor Diagnosis:Multimodal PET/CT Kernel Matrix-constrained Delayed Imaging Algorithm
SONG Zhichao, ZHANG Jianping, ZHANG Qiyang, FANG Xi, XIE Liang, SONG Shaoli, HU Zhanli
Computer Science. 2025, 52 (9): 119-127.  doi:10.11896/jsjkx.250400037
Positron emission tomography (PET) delayed imaging is of great significance in tumor heterogeneity analysis and treatment evaluation, but its clinical application is limited by low resolution, high noise, and inaccurate quantification. Computed tomography (CT) can provide high-resolution anatomical information but lacks functional information for tumor assessment, making it difficult to distinguish benign from malignant lesions and to evaluate metabolic activity. Although dynamic PET/CT fusion can improve image quality, multiple CT scans increase patient radiation exposure and are not conducive to long-term follow-up. To address these problems, a super-resolution enhanced PET/CT multimodal kernel matrix constraint algorithm (SR-PET/CT-KMC) is proposed. The algorithm super-resolves the initial-scan PET image using Stable Diffusion, combines it with the anatomical prior information of the initial-scan CT image, and establishes an expectation maximization (EM) iterative framework with multimodal PET/CT kernel matrix constraints. Stable Diffusion is used to improve the resolution of the initial-scan PET, while the multimodal PET/CT prior information suppresses noise and artifacts. By utilizing the structural information of the initial-scan CT, the need for CT scans in delayed imaging is reduced, thereby lowering the patient’s cumulative radiation exposure. Experimental results show that compared with PET-KEM, SR-PET/CT-KMC improves the peak signal-to-noise ratio (PSNR) by 6.23%, improves the structural similarity index (SSIM) by 9.64%, reduces the normalized root mean square error (NRMSE) by 33.3%, and reduces the mean square error (MSE) by 13.92%. Compared with CT-KEM, PSNR is improved by 4.05%, SSIM by 1.11%, NRMSE is reduced by 33.3%, and MSE by 8.11%. These results show that the method has advantages in improving the resolution and quantitative accuracy of delayed-scan PET images, providing a new imaging paradigm for tumor metabolism tracking and improving the clinical feasibility of delayed PET imaging.
High Performance Computing
Node Failure and Anomaly Prediction Method for Supercomputing Systems
ZHAO Yining, WANG Xiaoning, NIU Tie, ZHAO Yi, XIAO Haili
Computer Science. 2025, 52 (9): 128-136.  doi:10.11896/jsjkx.240700171
As the scale of supercomputing systems continues to expand, the probability of computing node failures and anomalies also increases, seriously affecting system stability. Traditional fault response methods mainly apply post-event response and remediation policies, which can only partially recover the loss. Predicting node failures and anomalies in advance provides more time for response and handling, and has therefore become a research hotspot. This paper proposes a node failure and anomaly prediction method for improving the stability of supercomputing systems and reducing the waste of computing resources. The method analyzes the historical running data of the system and marks anomalies through unsupervised methods plus a small amount of manual assistance. These anomalies are then used to discover correlated precursor features in the original dataset, and prediction models are established using machine learning methods. Through cross-validation on the original dataset, this prediction method achieves precision over 78% and recall around 90%, while also ensuring sufficient lead time. The dataset used in the evaluation comes from the raw running data of a real supercomputing system, demonstrating the applicability of the proposed method.
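The unsupervised labeling stage can be illustrated with a robust z-score detector over a node metric series: points far from the median (in units of the median absolute deviation) are flagged for the manual-review step. This is a generic stand-in, not the paper's actual method; the threshold is the common Iglewicz-Hoaglin default.

```python
import numpy as np

def mark_anomalies(series, z_thresh=3.5):
    """Unsupervised anomaly marking via the modified (median/MAD) z-score.

    Returns a boolean mask of points whose robust z-score exceeds z_thresh,
    which a human can then confirm before the labels feed a supervised model.
    """
    x = np.asarray(series, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med)) or 1e-12   # guard against zero MAD
    z = 0.6745 * (x - med) / mad                # Iglewicz-Hoaglin modified z-score
    return np.abs(z) > z_thresh
```

Median and MAD are used instead of mean and standard deviation so that the anomalies themselves do not inflate the baseline they are measured against.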
Research on Parallel Scheduling Strategy Optimization Technology Based on Sunway Compiler
XU Jinlong, WANG Gengwu, HAN Lin, NIE Kai, LI Haoran, CHEN Mengyao, LIU Haohao
Computer Science. 2025, 52 (9): 137-143.  doi:10.11896/jsjkx.241200072
Scheduling strategies are an important part of compiler parallelization, ensuring load balancing on multi-core processors. However, the default static scheduling used by the Sunway GCC compiler divides loop iterations statically, causing load imbalance in irregular loop structures and degrading the performance of parallel programs on the Sunway platform. To address this problem, the proposed method introduces a trapezoid scheduling strategy, balancing scheduling overhead against load balance, to improve the existing scheduling strategy of Sunway GCC. The strategy is tested on the SW3231 processor using 844 parallel test cases from the GCC compiler test suite, with performance evaluated on the SPEC OMP 2012 benchmark and four typical loop types, and shows speedups of up to 1.10 and 4.54 compared to the three standard scheduling strategies in Sunway GCC. This method enhances thread-level parallelism in scientific computing programs, providing valuable insights for parallel compilation on the Sunway processor platform.
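Trapezoid self-scheduling (the Tzen-Ni scheme the strategy builds on) hands out linearly decreasing chunks: large early chunks keep scheduling overhead low, small final chunks restore load balance. A sketch of the chunk-size sequence, with illustrative parameters and a leftover-iteration policy of my own choosing:

```python
def trapezoid_chunks(n_iters, n_threads, last=1):
    """Chunk sizes for trapezoid self-scheduling (Tzen & Ni style).

    The first chunk is N/(2p) and sizes decrease linearly toward `last`;
    any leftover iterations are handed out at the minimum size.
    """
    first = max(n_iters // (2 * n_threads), last)
    n_chunks = max(1, (2 * n_iters) // (first + last))   # trapezoid "area"
    step = (first - last) / max(n_chunks - 1, 1)
    chunks, remaining, size = [], n_iters, float(first)
    while remaining > 0:
        c = min(max(int(round(size)), last), remaining)  # clamp to [last, remaining]
        chunks.append(c)
        remaining -= c
        size -= step                                     # linear decrease
    return chunks
```

For 1000 iterations on 4 threads this yields a first chunk of 125 shrinking toward 1, in contrast to static scheduling's four fixed 250-iteration blocks.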
Partial Differential Equation Solving Method Based on Locally Enhanced Fourier Neural Operators
LUO Chi, LU Lingyun, LIU Fei
Computer Science. 2025, 52 (9): 144-151.  doi:10.11896/jsjkx.240700122
Partial differential equations (PDEs) are crucial mathematical tools for describing real-world systems, and solving them is key to predicting and analyzing system behavior. Analytical solutions for PDEs are often difficult to obtain, and numerical methods are typically used for approximate solutions; however, numerical solution of parameterized PDEs can be inefficient. In recent years, deep learning has shown its advantages in addressing these issues, and the Fourier Neural Operator (FNO) in particular has proven effective. However, FNO captures only global information through convolution in the frequency domain and struggles with the multi-scale information of PDEs. To address this challenge, a locally enhanced FNO model is proposed, incorporating a parallel multi-size convolution module in the Fourier layer to enhance the model’s capability to capture local multi-scale information. After the linear layer, a multi-branch feature fusion module is introduced, enhancing the model’s ability to integrate multi-channel information by fusing data across different channels. Experimental results demonstrate that the model reduces errors by 30.9% in solving the Burgers’ equation, 18.5% for the Darcy flow equation, and 5.5% for the Navier-Stokes equations.
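The core of an FNO Fourier layer is a spectral convolution: transform to frequency space, multiply a truncated set of low modes by learned weights, and transform back. The 1D sketch below makes the limitation the paper targets concrete: discarding high modes means the layer captures mainly global, smooth structure. Shapes and weights are illustrative.

```python
import numpy as np

def spectral_conv_1d(u, weights, n_modes):
    """1D spectral convolution as used in a Fourier layer.

    u: (n_points,) real signal; weights: (n_modes,) complex multipliers
    applied to the lowest Fourier modes; modes above n_modes are zeroed,
    which acts as a global low-pass filter.
    """
    u_hat = np.fft.rfft(u)                      # to frequency domain
    out_hat = np.zeros_like(u_hat)
    out_hat[:n_modes] = u_hat[:n_modes] * weights
    return np.fft.irfft(out_hat, n=len(u))      # back to physical space
```

The locally enhanced variant runs ordinary small-kernel convolutions in parallel with this spectral path, so local high-frequency detail is not lost to the mode truncation.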
Maximum Error Parallel Detection Method Based on Locality Principle
JI Liguang, YANG Hongru, ZHOU Yuchang, CUI Mengqi, HE Haotian, XU Jinchen
Computer Science. 2025, 52 (9): 152-159.  doi:10.11896/jsjkx.240800018
Floating-point numbers use a finite number of digits to represent infinitely many real numbers, so floating-point computation is inherently inexact, and the inexactness can be measured by the maximum error. Traditional floating-point maximum-error detection algorithms use serial computation combined with classical search algorithms. When the number of sampling points is small, a local maximum is easily mistaken for the global maximum, so the true maximum error is missed; if the number of sampling points is increased on a large scale, the running time of the detection program grows greatly and performance drops. In this paper, parallel computing is used to increase the number of sampling points exponentially, and a floating-point dynamic sampling strategy, combined with the principles of synchronization and locality, is used to close in on error hot spots, greatly improving the accuracy of the detection results. This method maximizes the computing power of parallel hardware: it not only improves the accuracy of maximum-error detection for floating-point computation but also reduces the execution time of the detection program, achieving a speedup of up to 1136.3. The maximum error values detected are better than those of current mainstream detection tools, providing a new detection method for measuring floating-point computation quality.
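A toy version of the idea: sweep the input range in large vectorized (parallel-friendly) batches to locate an error hot spot, then densify sampling around the current worst point, exploiting the locality of floating-point error. The float32-vs-float64 error model and the test expression are illustrative assumptions, not the paper's tool.

```python
import numpy as np

def max_rel_error(f32, f64, xs):
    """Relative error of a float32 evaluation against a float64 reference."""
    ref = f64(xs.astype(np.float64))
    approx = f32(xs.astype(np.float32)).astype(np.float64)
    return np.abs(approx - ref) / np.maximum(np.abs(ref), 1e-300)

def detect_max_error(f32, f64, lo, hi, coarse=100_000, refine_rounds=3):
    """Coarse vectorized sweep, then progressively narrower local sweeps
    around the current worst point (the locality principle)."""
    xs = np.linspace(lo, hi, coarse)
    errs = max_rel_error(f32, f64, xs)
    best_x, best_e = xs[np.argmax(errs)], errs.max()
    width = (hi - lo) / coarse
    for _ in range(refine_rounds):
        xs = np.linspace(best_x - width, best_x + width, coarse)
        errs = max_rel_error(f32, f64, xs)
        if errs.max() > best_e:
            best_x, best_e = xs[np.argmax(errs)], errs.max()
        width /= coarse ** 0.5   # shrink the search window each round
    return best_x, best_e
```

On a cancellation-prone expression such as 1 - cos(x) near zero, the coarse sweep alone already exposes relative errors near 1 in float32, and the refinement homes in on the hot spot.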
Efficient Hardware Implementation of Pipelined NTT with Dynamic Twiddle Factor Generation
HAN Yingmei, LI Bin, LI Kun, ZHOU Qinglei, YU Shiliang
Computer Science. 2025, 52 (9): 160-169.  doi:10.11896/jsjkx.250200083
In the field of fully homomorphic encryption, the high computational complexity of polynomial multiplication has always been a bottleneck for performance improvement. To accelerate this process, the Number Theoretic Transform (NTT) has been widely used, but traditional NTT architectures based on multipath delay commutation have several drawbacks, such as insufficient hardware utilization and large twiddle-factor overhead. To address these issues, this paper proposes a novel NTT hardware accelerator design based on a pipelined architecture. This design optimizes the multipath delay commutation architecture through a unified NTT/INTT (Inverse NTT) framework, combined with pipelining techniques to achieve highly parallel computation. The optimized integrated butterfly units introduce a pipeline-friendly improved K-RED algorithm, which can flexibly adapt to various computational strategies and accelerate large-scale data processing. Moreover, the design employs dynamic twiddle factor generation, with a Twiddle Factor Generation (TFG) unit closely collaborating with the butterfly units. In the initial small-parameter stage, pre-stored twiddle factors are read from ROM to save logic resources; when parameters are larger, the TFG dynamically generates twiddle factors and assigns them directly to the corresponding butterfly units, achieving efficient resource balancing and effectively reducing the storage requirements for twiddle factors. Field-Programmable Gate Array (FPGA) implementation results show that, compared with previous studies, the proposed solution improves the execution speed of NTT on fully homomorphic encryption parameter sets by 1.14 to 40.81 times, while reducing the area-time product metric by 23% to 82%.
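For reference, the NTT is a DFT over Z_p with a primitive n-th root of unity in place of the complex exponential, and it turns polynomial multiplication into cheap pointwise products. The naive O(n^2) sketch below uses toy parameters (p = 17, n = 8, root 9) chosen for illustration; hardware designs like the one above replace it with O(n log n) butterfly pipelines and fast modular reduction such as K-RED.

```python
def ntt(a, p, root):
    """Naive O(n^2) Number Theoretic Transform over Z_p.

    a: coefficient list of length n; root: primitive n-th root of unity mod p
    (requires n | p - 1).
    """
    n = len(a)
    return [sum(a[j] * pow(root, i * j, p) for j in range(n)) % p
            for i in range(n)]

def intt(a, p, root):
    """Inverse NTT: transform with root^-1, then scale by n^-1 mod p."""
    n = len(a)
    inv_root = pow(root, p - 2, p)   # Fermat inverse of the root (p prime)
    inv_n = pow(n, p - 2, p)
    return [(x * inv_n) % p for x in ntt(a, p, inv_root)]
```

Multiplying two transformed coefficient vectors pointwise and applying the inverse transform yields the cyclic convolution of the inputs mod p, which is exactly the polynomial product needed in homomorphic encryption schemes.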
Joint Function Deployment Optimization Method for WebAssembly and Containers Based on Longest Latency-Weighted Bandwidth
CHEN Ranzhao, LI Zhexiong, GU Lin, ZHONG Liang, ZENG Deze
Computer Science. 2025, 52 (9): 170-177.  doi:10.11896/jsjkx.250300031
Abstract PDF(2772KB) ( 10 )   
References | Related Articles | Metrics
Container technology has been widely used in the serverless computing platforms of edge servers due to advantages such as its lightweight nature,easy deployment and high availability.However,with the increasing demand for low latency in applications,the high-latency problem caused by the cold start of containers has gradually become a bottleneck for system performance.WebAssembly(Wasm),with its lightweight sandbox and millisecond-level startup capability,has become an important complement to container technology in certain scenarios.However,its computational performance is inferior to that of containers.Especially when dealing with complex interdependencies between functions,the inherent advantages and disadvantages of Wasm and containers make the decision on function deployment methods and locations extremely difficult.To address this issue,this paper constructs a serverless computing model for servers based on function dependencies,transforming the problem of mixed deployment of Wasm and containers into a non-linear integer programming problem,which is then proven to be NP-hard.Therefore,this paper designs the Long-Latency-Sensitive Weighted Bandwidth Greedy Scheduling Algorithm(LLS-WBG).Based on function dependencies and the longest completion time of predecessor functions,it weights the server bandwidth to optimize resource utilization and reduce the tail latency of tasks.Experimental results based on real-world data show that,in the edge computing scenario,the proposed algorithm reduces the application completion time by 44.45% compared with state-of-the-art algorithms.
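The flavor of such a greedy deployment decision can be sketched with a hypothetical cost model: a cold-start penalty for containers, an execution slowdown factor for Wasm, and each function starting only after its slowest predecessor finishes (the "longest predecessor completion time"). This illustrates the trade-off only; it is not the paper's LLS-WBG bandwidth weighting:

```python
def greedy_deploy(funcs, deps, cold_start=100.0, wasm_slowdown=1.5):
    """Greedily pick a runtime per function, minimizing its completion time.
    funcs: list of (name, exec_ms) in dependency (topological) order.
    deps: {name: [predecessor names]}. All costs are hypothetical."""
    finish, choice = {}, {}
    for name, exec_ms in funcs:
        # a function is ready when its slowest predecessor has finished
        ready = max((finish[p] for p in deps.get(name, ())), default=0.0)
        t_container = ready + cold_start + exec_ms        # cold start, fast exec
        t_wasm = ready + exec_ms * wasm_slowdown          # instant start, slow exec
        if t_wasm <= t_container:
            choice[name], finish[name] = "wasm", t_wasm
        else:
            choice[name], finish[name] = "container", t_container
    return choice, finish
```

With these toy costs, a short function favors Wasm (cold start dominates) while a long dependent function favors a container (execution time dominates).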
Prediction of Resource Usage on High-performance Computing Platforms Based on ARIMA and LSTM
LI Siqi, YU Kun, CHEN Yuhao
Computer Science. 2025, 52 (9): 178-185.  doi:10.11896/jsjkx.241100174
Abstract PDF(3565KB) ( 11 )   
References | Related Articles | Metrics
The rapid increase in data scale and experimental complexity in scientific research and engineering simulations has significantly heightened the demand for high-performance computing(HPC) resources.However,limited availability of such resources necessitates efficient utilization strategies.This paper analyzes over 400 000 job records collected from the HPC cluster at East China Normal University between January 2022 and November 2023.After preprocessing the data,daily job counts and CPU utilization rates are extracted to represent the cluster’s resource usage patterns.Autoregressive Integrated Moving Average(ARIMA),Long Short-Term Memory with Dual Dropouts(2DLSTM),and a hybrid ARIMA-2DLSTM are applied to fit the historical data and forecast short-term and long-term resource usage.Model performance is evaluated by mean absolute error(MAE) and mean squared error(MSE) metrics.Results indicate that the ARIMA-2DLSTM hybrid model achieves superior predictive accuracy compared to standalone ARIMA and 2DLSTM models.The hybrid model effectively captures trend changes of the cluster’s resource usage patterns and accurately predicts peak and trough timings,providing critical insights for optimizing resource allocation in HPC environments.
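The hybrid decomposition idea, a linear model for the trend plus a second model for its residuals, can be sketched as follows. A least-squares AR(1) fit stands in for ARIMA and a trivial residual mean stands in for the 2DLSTM, so this is a conceptual toy, not the paper's models:

```python
def ar1_plus_residual_forecast(y, k=3):
    """Hybrid-style one-step forecast: linear AR(1) component plus a
    residual correction (here the mean of the last k residuals, a stand-in
    for a learned nonlinear model such as an LSTM)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    phi = num / den                                        # least-squares AR(1) coefficient
    residuals = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
    correction = sum(residuals[-k:]) / min(k, len(residuals))
    return phi * y[-1] + correction
```

For a purely linear series the residual term vanishes and the AR part alone gives the forecast; on real cluster workloads the residual model carries the nonlinear fluctuations the linear part misses.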
SLP Vectorization Across Basic Blocks Based on Region Partitioning
HAN Lin, DING Yongqiang, CUI Pingfei, LIU Haohao, LI Haoran, CHEN Mengyao
Computer Science. 2025, 52 (9): 186-194.  doi:10.11896/jsjkx.241100130
Abstract PDF(2358KB) ( 8 )   
References | Related Articles | Metrics
Automatic vectorization is a key technique in mainstream compilers for uncovering data-level parallelism and enhancing program performance.Traditional SLP vectorization struggles with cross-basic-block statement vectorization,particularly when consecutive vectorizable instructions are split by basic block boundaries,limiting its ability to detect potential vectorization opportunities.To address this,this paper proposes a region-based cross-basic-block SLP vectorization method that extends the analysis scope to multiple basic blocks within dominance relations,effectively breaking basic block boundaries and uncovering more vectorization opportunities.Implemented in the GCC 10.3.0 compiler,the proposed method is evaluated using relevant program segments from the SPEC CPU2006 benchmark.Experimental results demonstrate that the proposed method achieves up to a 12% speedup in SPEC CPU2006,an average speedup of 8% for related test programs,and a 3% average speedup in the Polybench benchmark compared to traditional SLP methods,validating its effectiveness.This work provides a technical reference for improving SLP vectorization efficiency in GCC compilers.
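The core SLP grouping step can be illustrated schematically: statements from a region (e.g., several basic blocks in dominance order) are scanned as one list, and consecutive isomorphic statements are collected into fixed-width packs. This is a sketch of the idea only, not GCC's implementation; real SLP must also check data dependences and alignment:

```python
def build_packs(stmts, width=4):
    """Group consecutive isomorphic statements (same opcode) into
    fixed-width packs, the basic grouping step of SLP vectorization.
    stmts: list of (opcode, destination, operand-tuple) across the region."""
    packs, cur = [], []
    for op, dst, srcs in stmts:
        if cur and cur[-1][0] != op:
            cur = []                      # opcode changed: start a new candidate group
        cur.append((op, dst, srcs))
        if len(cur) == width:
            packs.append(tuple(s[1] for s in cur))
            cur = []                      # pack full: emit and restart
    return packs
```

Scanning one flattened region list is what lets packs form across what used to be basic-block boundaries.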
Database & Big Data & Data Science
Survey of Data Classification and Grading Studies
LIU Leyuan, CHEN Gege, WU Wei, WANG Yong, ZHOU Fan
Computer Science. 2025, 52 (9): 195-211.  doi:10.11896/jsjkx.240800149
Abstract PDF(2089KB) ( 11 )   
References | Related Articles | Metrics
In recent years,the continuous development of various information systems and the Internet of Things has led to their increasingly close integration with daily human activities.The massive data generated as a result has become a new type of productive asset in today’s socio-economic context,and even a national strategic resource,making data governance a growing focus of attention for governments,enterprises and research institutions.Accurate and reasonable data classification and grading,as the most fundamental step in data governance,has a significant impact on subsequent data ownership determination,sharing and security protection.This paper first defines the task of data classification and grading and introduces traditional classification and grading methods.It then provides an overview and comparison of recent data classification and grading technologies based on artificial intelligence,especially large language models.Given the relevance of data classification and grading to specific industries,this paper also presents its applications in key industries and domains.Finally,the paper looks forward to the development of data classification and grading technologies,discussing the new challenges they face and potential future directions.
Source-free Domain Adaptation Method Based on Pseudo Label Uncertainty Estimation
HUANG Chao, CHENG Chunling, WANG Youkang
Computer Science. 2025, 52 (9): 212-219.  doi:10.11896/jsjkx.240700159
Abstract PDF(2109KB) ( 8 )   
References | Related Articles | Metrics
Most source-free domain adaptation methods rely on pseudo-label-based self-supervised learning to cope with the absence of source data.However,during pseudo-label generation these methods overlook the impact of the clustering structure of target sample feature distributions,and of the uncertainty of samples near the decision boundary,on pseudo-label noise,which reduces model performance.Therefore,this paper proposes a source-free domain adaptation method based on pseudo-label uncertainty estimation.Firstly,multiple perturbations are introduced into the model’s feature extractor parameters to simulate variations in source knowledge resulting from data fine-tuning.The similarity of target sample feature distributions under different perturbed models is used to assess the generalization uncertainty of the source knowledge.Moreover,an extreme value entropy is proposed to quantify the latent information uncertainty within the target domain,where distinct entropy calculation methods are applied based on the difference between the highest and second-highest prediction probabilities.Secondly,the target samples are divided into reliable and unreliable samples according to the two kinds of uncertainty.For reliable samples,self-supervised learning is employed,using their prediction probabilities as weights to update sample features into class prototypes;historical class prototypes are further incorporated to improve the stability of the class prototypes.Contrastive learning is applied to unreliable samples to bring them closer to similar class prototypes.Compared with several baseline models,the classification accuracy is improved on three public benchmark datasets:Office-31,Office-Home and VisDA-C,verifying the effectiveness of the method.
Synthetic Oversampling Method Based on Noiseless Gradient Distribution
HU Libin, ZHANG Yunfeng, LIU Peide
Computer Science. 2025, 52 (9): 220-231.  doi:10.11896/jsjkx.241000010
Abstract PDF(3491KB) ( 12 )   
References | Related Articles | Metrics
Synthetic oversampling is an important means of solving imbalanced classification problems,but current oversampling methods still face many difficulties on high-dimensional imbalanced data.A synthetic oversampling method based on noiseless gradient distribution is proposed to address three issues in current synthetic oversampling methods:error accumulation caused by noise samples,excessive dependence on sample-space distance,and reduced recognition accuracy for negative-class samples.Firstly,the gradient contribution attribute of a sample is used as the metric of its label confidence,and noise-labeled samples in the dataset are filtered out,avoiding the error accumulation caused by taking noise samples as root samples.Secondly,the positive samples are assigned to different gradient intervals according to the gradient contribution metric and a safe gradient threshold;samples in safe gradient intervals are selected as root samples,and the gradient right nearest neighbor of each root sample serves as the auxiliary sample,which both removes the dependence on spatial distance measurement and ensures that the decision boundary moves continuously toward the negative-class samples.Finally,a safe gradient distribution approximation strategy based on cosine similarity is designed to calculate the number of samples to be generated in each safe gradient interval;the resulting synthetic sample distribution moves the decision boundary toward the negative-class samples in a safe way,so the recognition accuracy of the negative-class samples is not significantly sacrificed.Experiments on datasets from the KEEL,UCI and Kaggle platforms show that the proposed algorithm not only improves the Recall value of the classifier,but also obtains satisfactory F1-Score,G-Mean and MCC values.
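The synthesis step itself is linear interpolation between a root sample and its auxiliary sample, as in SMOTE-style methods. The sketch below shows only that step, with a fixed interpolation coefficient, and omits the paper's gradient-contribution filtering and interval assignment:

```python
def synthesize(root, aux, alpha=0.5):
    """Generate one synthetic minority sample on the segment between a root
    sample and its auxiliary sample. alpha in [0, 1] sets the position;
    in practice it would typically be drawn at random per sample."""
    return [r + alpha * (a - r) for r, a in zip(root, aux)]
```

With alpha = 0.5 the synthetic point is the midpoint of the two parents, so it inherits feature values bounded by them in each dimension.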
Personalized Federated Learning Framework for Long-tailed Heterogeneous Data
WU Jiagao, YI Jing, ZHOU Zehui, LIU Linfeng
Computer Science. 2025, 52 (9): 232-240.  doi:10.11896/jsjkx.240700116
Abstract PDF(2139KB) ( 8 )   
References | Related Articles | Metrics
Aiming at the problem of model performance degradation in federated learning caused by the long-tailed distribution and heterogeneity of data,a novel personalized federated learning framework called Balanced Personalized Federated Learning(BPFed) is proposed,in which the whole federated learning process is divided into two stages:representation learning based on personalized federated learning and personalized classifier retraining based on global feature augmentation.In the first stage,the Mixup strategy is first adopted for data augmentation,and then a feature extractor training method based on personalized federated learning with parameter decoupling is proposed to optimize the performance of the feature extractor while reducing the communication cost.In the second stage,a new class-level feature augmentation method based on the global covariance matrix is first proposed,and then the classifiers of clients are retrained individually in a balanced way with a proposed label-aware smoothing loss function based on sample weights,to correct the overconfidence for head classes and boost the generalization ability for tail classes.Extensive experimental results show that the model accuracy of BPFed is significantly improved compared with other representative related algorithms under different settings of data long-tailed distribution and heterogeneity.Moreover,the effectiveness of the proposed methods and optimization strategies is further verified by ablation and hyperparameter experiments.
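The Mixup augmentation used in the first stage is a convex combination of two samples and their (one-hot) labels. A minimal sketch, with a fixed mixing coefficient lam that in practice is drawn from a Beta distribution:

```python
def mixup(x1, y1, x2, y2, lam):
    """Mixup data augmentation: blend two feature vectors and their one-hot
    label vectors with the same coefficient lam in [0, 1]."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y
```

The soft label produced this way is what discourages the overconfident, head-class-dominated decisions that long-tailed data otherwise induces.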
Trajectory Prediction Method Based on Multi-stage Pedestrian Feature Mining
DENG Jiayan, TIAN Shirui, LIU Xiangli, OUYANG Hongwei, JIAO Yunjia, DUAN Mingxing
Computer Science. 2025, 52 (9): 241-248.  doi:10.11896/jsjkx.250700138
Abstract PDF(2339KB) ( 9 )   
References | Related Articles | Metrics
Current pedestrian trajectory prediction faces two major challenges:1)the difficulty of modeling the interrelationships among multiple pedestrians and the impact of complex environmental states;2)the growing model scale,which hinders deployment in resource-constrained scenarios such as autonomous vehicles.To address these challenges more effectively,this study proposes a Multi-stage Pedestrian Trajectory Prediction framework,abbreviated as MSPP-Net.The framework comprises three components:a student module,a teacher module,and a social interaction module.Firstly,the student module constructs a prediction model based on wavelet transforms,decomposing pedestrian trajectories into high-frequency and low-frequency features to accurately extract motion details and global trends.Meanwhile,the teacher model is trained on multimodal data including trajectories,poses and text,and the student model enhances its prediction performance by learning from the teacher model through knowledge distillation.Secondly,a social interaction module based on dynamic differential equations is developed to capture the dynamic characteristics of pedestrian movements,further improving the rationality of predictions and forming the final MSPP-Net prediction model.Finally,extensive experiments on the ETH/UCY and SDD datasets demonstrate that MSPP-Net achieves improvements of 12.50%/2.63% and 19.30%/10.34% in ADE and FDE metrics,respectively,outperforming mainstream methods,while reducing the parameter count by 64.47% compared to the teacher model.
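The idea of splitting a trajectory into a low-frequency trend and high-frequency details can be reduced to a one-level wavelet decomposition; the Haar wavelet below is used purely for illustration, since the abstract does not specify the model's wavelet choice:

```python
def haar_decompose(seq):
    """One-level Haar split of an even-length coordinate sequence into
    low-frequency (pairwise averages, the trend) and high-frequency
    (pairwise half-differences, the details) components."""
    low = [(a + b) / 2 for a, b in zip(seq[0::2], seq[1::2])]
    high = [(a - b) / 2 for a, b in zip(seq[0::2], seq[1::2])]
    return low, high

def haar_reconstruct(low, high):
    """Exact inverse: each pair is (trend + detail, trend - detail)."""
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out
```

Because the split is exactly invertible, no motion information is lost; the two streams simply let separate subnetworks focus on global trends versus fine motion details.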
Computer Graphics & Multimedia
Panoramic Image Quality Assessment Method Integrating Salient Viewport Extraction and Cross-layer Attention
LIN Heng, JI Qingge
Computer Science. 2025, 52 (9): 249-258.  doi:10.11896/jsjkx.241000108
Abstract PDF(3369KB) ( 12 )   
References | Related Articles | Metrics
Panoramic images,as an important content form for immersive multimedia,provide a 360-degree horizontal and 180-degree vertical field of view,directly influencing the user’s sense of immersion in VR.To address the challenges of insufficient handling of projection distortion and inadequate utilization of multi-scale features in panoramic image quality assessment,this paper proposes a Salient Viewport Attention Network(SVA-Net).The network is composed of a saliency-guided viewport extraction module,a cross-layer attention dependency module,and a multi-channel fusion regression module.It aims to alleviate projection distortion,efficiently extract multi-scale features,and enhance feature representation.Experimental results demonstrate that SVA-Net significantly improves the accuracy of image quality prediction compared to existing methods across two public datasets and shows strong generalization ability.By combining salient viewport sampling and cross-layer attention mechanisms,this method enhances feature representation and improves the accuracy of panoramic image quality assessment,making the prediction results more aligned with human subjective evaluations.
Multimodal Air-writing Gesture Recognition Based on Radar-Vision Fusion
LIU Wei, XU Yong, FANG Juan, LI Cheng, ZHU Yujun, FANG Qun, HE Xin
Computer Science. 2025, 52 (9): 259-268.  doi:10.11896/jsjkx.240400143
Abstract PDF(5117KB) ( 9 )   
References | Related Articles | Metrics
Air-writing gesture recognition is a promising technology for human-computer interaction.Extracting gesture features with a single sensor,such as mmWave radar,camera,or Wi-Fi,fails to capture the complete gesture characteristics.A flexible Two-Stream Fusion Networks(TFNet) model is designed,capable of fusing Air-writing Energy Images(AEIs) and Point Cloud Temporal Feature Maps(PTFMs),as well as operating with unimodal data input.A robust and reliable multimodal air-writing gesture recognition system is constructed.This system utilizes a hard trigger to start and end multi-sensor data acquisition,processing image and point cloud data within the same time sequence to generate AEIs and PTFMs,achieving temporal alignment of multimodal data.Branch networks are employed to extract features of gesture appearance and fine-grained motion information.Adaptive weighted fusion of the dual-stream decision results is used,avoiding the complex interactions of intermediate multimodal features and effectively reducing model loss.Data of ten air-writing gestures representing digits 0-9 are collected from multiple participants to evaluate the model.The results indicate that the proposed model outperforms other baseline models in recognition accuracy and demonstrates strong robustness.The model shows significant advantages in air-writing gesture recognition tasks,making it an effective tool for multi-sensor air-writing gesture recognition.
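The adaptive weighted fusion of the two decision streams can be sketched as confidence-weighted averaging of the branch softmax outputs. The max-probability confidence weight used below is an assumed, illustrative choice, not necessarily TFNet's learned weighting:

```python
def fuse(p_radar, p_vision):
    """Decision-level fusion of two class-probability vectors: weight each
    branch by its confidence (here, its max probability), then renormalize
    so the fused vector is again a distribution."""
    w_r, w_v = max(p_radar), max(p_vision)
    s = w_r + w_v
    return [(w_r * a + w_v * b) / s for a, b in zip(p_radar, p_vision)]
```

Fusing at the decision level, rather than mixing intermediate features, is what lets the model degrade gracefully to unimodal input: with one branch absent, its weight is simply dropped.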
Visual Storytelling Based on Planning Learning
WANG Yuanlong, ZHANG Ningqian, ZHANG Hu
Computer Science. 2025, 52 (9): 269-275.  doi:10.11896/jsjkx.240700136
Abstract PDF(2498KB) ( 6 )   
References | Related Articles | Metrics
Visual storytelling is a growing area of interest for scholars in computer vision and natural language processing.Current models concentrate on enhancing image representation,for example by using external knowledge and scene graphs.Although some advancements have been made,these models still suffer from content repetition and a lack of detailed description.To address these issues,this paper proposes a visual story generation model that incorporates planning learning.It poses questions across six key dimensions(theme,object,action,place,reasoning and prediction) and uses a pretrained visual question answering language model to generate detailed answers.This approach guides the planning and design process,leading to more nuanced visual story generation.The model is divided into four stages.The first stage extracts visual information from pictures.The second stage extracts and selects relevant concepts through a concept generator.The third stage uses pre-trained language models to guide the generation of planning information.The fourth stage integrates the visual,conceptual and planning information generated in the preceding three stages to complete the visual story generation task.The model’s effectiveness is validated on the VIST dataset,outperforming the COVS model with improvements in BLEU-1,BLEU-2,ROUGE_L,Distinct-3,Distinct-4 and TTR scores of 1.58,2.7,0.4,2.2,3.6 and 5.6 percentage points respectively.
Text-Dynamic Image Cross-modal Retrieval Algorithm Based on Progressive Prototype Matching
PENG Jiao, HE Yue, SHANG Xiaoran, HU Saier, ZHANG Bo, CHANG Yongjuan, OU Zhonghong, LU Yanyan, JIANG dan, LIU Yaduo
Computer Science. 2025, 52 (9): 276-281.  doi:10.11896/jsjkx.241200204
Abstract PDF(1668KB) ( 11 )   
References | Related Articles | Metrics
In social and chat scenarios,users are no longer limited to text or simple emoji,but increasingly communicate with static or dynamic images that carry richer semantics.Although existing text-dynamic image retrieval algorithms have made progress,problems remain,such as the lack of fine-grained intra-modal and inter-modal interaction and the lack of global guidance in the prototype generation process.To solve these problems,this paper proposes a Global-aware Progressive Prototype Matching Model(GaPPMM) for text-dynamic image cross-modal retrieval.A three-stage progressive prototype matching method is used to achieve cross-modal fine-grained interaction.In addition,a globally sensitive temporal prototype generation method is proposed,which uses the preview features generated by the global branch as the query of the attention mechanism to guide the local branch to attend to the most relevant local features,thus realizing fine-grained feature extraction from dynamic images.Experimental results demonstrate that the proposed model surpasses state-of-the-art methods in terms of recall rate on a publicly available dataset.
Artificial Intelligence
Knowledge Graph Completion Model Using Semantically Enhanced Prompts and Structural Information
CAI Qihang, XU Bin, DONG Xiaodi
Computer Science. 2025, 52 (9): 282-293.  doi:10.11896/jsjkx.240700201
Abstract PDF(2453KB) ( 14 )   
References | Related Articles | Metrics
Knowledge graph completion aims to infer new facts based on existing facts,enhance the comprehensiveness and reliability of the knowledge graph,and thus improve its practical value.To solve the problems that existing methods based on pre-trained language models show large differences between the prediction performance for head and tail entities,fluctuate considerably during training due to the random initialization of continuous prompts,and under-utilize the structural information of the knowledge graph,this paper proposes the knowledge graph completion model using semantically enhanced prompts and structural information(SEPS-KGC).The model follows a multi-task learning framework that unites the knowledge graph completion task with the entity prediction task.Firstly,an example-guided relationship template generation method is designed,which uses a large language model to generate two more targeted relationship prompt templates for the distinct tasks of predicting head entities and predicting tail entities,incorporating semantic auxiliary information so that the model can better understand the semantic associations between entities.Secondly,a prompt learning method based on effective initialization is designed,using pre-trained embeddings of relation labels for initialization.Finally,a structural information extraction module is designed to extract knowledge graph structural information using convolution and pooling operations,improving the stability and relation understanding of the model.The effectiveness of SEPS-KGC is demonstrated on two public datasets.
Collaboration of Large and Small Language Models with Iterative Reflection Framework for Clinical Note Summarization
ZHONG Boyang, RUAN Tong, ZHANG Weiyan, LIU Jingping
Computer Science. 2025, 52 (9): 294-302.  doi:10.11896/jsjkx.241000114
Abstract PDF(3351KB) ( 10 )   
References | Related Articles | Metrics
Generating clinical notes from doctor-patient dialogues is a critical task in medical artificial intelligence.Existing methods typically rely on large language models(LLMs) with few-shot demonstrations but often struggle to integrate sufficient domain-specific knowledge,leading to suboptimal and less professional outputs.To address this problem,a novel iterative reflection framework is proposed,which integrates Error2Correct example learning and domain-model supervision,aiming to improve the summary quality of electronic medical records(EMRs).Specifically,a large-scale language model integrating the Error2Correct example learning mechanism is designed for the initial generation and continuous optimization of EMRs,injecting medical domain knowledge into the pre-generation stage.Then,a lightweight medical pre-trained language model,fine-tuned with domain data,is used to evaluate the refined content,integrating domain knowledge post-generation.Finally,an iterative scheduler is introduced,which effectively guides the model to optimize through a continuous process of reflection and improvement.Experimental results on two public datasets demonstrate that the proposed method achieves state-of-the-art performance.Compared with fine-tuned large language models,the proposed method improves overall performance by 3.68% and 7.75% on the IMCS-V2-MRG and ACI-BENCH datasets.
Event Causality Identification Model Based on Prompt Learning and Hypergraph
CHENG Zhangtao, HUANG Haoran, XUE He, LIU Leyuan, ZHONG Ting, ZHOU Fan
Computer Science. 2025, 52 (9): 303-312.  doi:10.11896/jsjkx.240800121
Abstract PDF(3266KB) ( 11 )   
References | Related Articles | Metrics
Event causality identification(ECI) is a crucial research direction in the field of natural language processing,with the objective of accurately identifying whether a causal relation exists between two specific events.Current mainstream methods often utilize pre-trained language models to extract limited contextual semantic information from text to judge causal relationships.However,such methods tend to simplify the understanding of key event structures and their contextual semantics,failing to fully leverage the capabilities of pre-trained language models.Additionally,they overlook the significant role of historical events and relevant labels in constructing analogical reasoning to establish causal relations between target events.To address these challenges,a prompt learning and hypergraph enhanced model(PLHGE) is proposed.The proposed model effectively captures global interaction patterns among events and the structural-semantic connections between current and historical events.By integrating descriptive knowledge with textual semantics,the model generates a hierarchical event structure.Additionally,PLHGE constructs a knowledge-based hypergraph to incorporate fine-grained and document-level semantic information,thereby enhancing its identification ability.Furthermore,a relationship-based knowledge prompt learning module is introduced to exploit latent causal knowledge within pre-trained language models to improve event causal relationship recognition.Finally,extensive experiments are conducted on two public benchmark datasets,and the results demonstrate that the PLHGE model outperforms existing baselines on the ECI task.
Sentiment Classification Method Based on Stepwise Cooperative Fusion Representation
GAO Long, LI Yang, WANG Suge
Computer Science. 2025, 52 (9): 313-319.  doi:10.11896/jsjkx.240700161
Abstract PDF(2219KB) ( 10 )   
References | Related Articles | Metrics
The goal of multimodal sentiment analysis is to perceive and understand human emotions through various heterogeneous modalities,such as language,video and audio.However,there are complex correlations between different modalities.Most existing methods directly fuse multiple modality features,overlooking the fact that asynchronous modality fusion representations contribute differently to sentiment analysis.To address these issues,this paper proposes a sentiment classification method based on stepwise collaborative fusion representation.Firstly,a denoising bottleneck model is used to filter out noise and redundancy in the audio and video,and the two modalities are fused through a Transformer,establishing a low-level feature representation of the audio-video fusion.Then,a cross-modal attention mechanism is utilized to enhance the audio-video modalities with the text modality,constructing a high-level feature representation of the audio-video fusion.Secondly,a novel multimodal fusion layer is designed to incorporate the multi-level feature representations into the pre-trained T5 model,establishing a text-centric multimodal fusion representation.Finally,the low-level feature representation,high-level feature representation and text-centric fusion representation are combined to achieve sentiment classification of multimodal data.Experimental results on two public datasets,CMU-MOSI and CMU-MOSEI,indicate that the proposed method improves the Acc-7 metric by 0.1 and 0.17 compared to the existing baseline model ALMT,demonstrating that stepwise collaborative fusion representation can enhance multimodal sentiment classification performance.
Robot Indoor Navigation System for Narrow Environments
DONG Min, TAN Haoyu, BI Sheng
Computer Science. 2025, 52 (9): 320-329.  doi:10.11896/jsjkx.240700167
Abstract PDF(3990KB) ( 11 )   
References | Related Articles | Metrics
In the field of robotics,safe passage through narrow environments is one of the keys for robots to perform navigation tasks autonomously and reliably.To solve the problem that robots cannot safely pass through narrow environments due to multiple error sources,this paper proposes a robot indoor navigation system for narrow environments.In the course of navigation,the system marks narrow environments according to the geometric relationship between the obstacles and the global path in the map and generates suitable traffic poses.When the robot enters and exits a marked narrow environment,it automatically switches to the corresponding navigation strategy to adapt to the environment.In the narrow-environment navigation strategy,the global cost map is inflated to plan a safer global path,and the robot plans the global path in segments according to the suitable traffic poses,aiming to adjust its pose in advance and reduce the turning demand in the narrow environment.The MPC path tracking method is optimized by converting the optimal control problem into a least squares problem;it replaces the local path planning method for calculating the trajectory and prevents navigation failures caused by misjudging local trajectory collisions.The simulation and real-environment experimental results show that the system can effectively improve the passing rate of the robot in narrow environments,so that the robot can perform navigation tasks more safely and stably.
Graph Attention-based Grouped Multi-agent Reinforcement Learning Method
ZHU Shihao, PENG Kexing, MA Tinghuai
Computer Science. 2025, 52 (9): 330-336.  doi:10.11896/jsjkx.240700107
Abstract PDF(2869KB) ( 9 )   
References | Related Articles | Metrics
Currently,multi-agent reinforcement learning is widely applied in various cooperative tasks.In real environments,agents often have access to only partial observations,leading to inefficient exploration of cooperative strategies.Moreover,sharing reward values among agents makes it challenging to accurately assess individual contributions.To address these issues,a novel graph attention-based grouped multi-agent reinforcement learning framework is proposed,which improves cooperation efficiency and enhances the evaluation of individual contributions.Firstly,a multi-agent system with a graph structure is constructed,which learns the relationships between individual agents and their neighbors for information sharing.This approach expands individual agents’ perceptual fields to mitigate the constraints of partial observability and to assess individual contributions.Secondly,an action reference module is designed to provide joint action reference information for individual action selection,enabling agents to explore more efficiently and diversely.Experimental results in two multi-agent control scenarios of different scales demonstrate significant advantages over baseline methods.Detailed ablation studies further verify the effectiveness of the graph attention grouping approach and the communication settings.
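Graph-attention aggregation over neighbors can be sketched as a dot-product score followed by a softmax and a weighted mean. This is a minimal single-head version without the learned projection matrices a real GAT layer would include:

```python
import math

def attention_aggregate(h_self, neighbors):
    """Score each neighbor embedding against the agent's own embedding
    (dot product), softmax the scores, and return the attention-weighted
    mean of neighbor features together with the weights themselves."""
    scores = [sum(a * b for a, b in zip(h_self, h)) for h in neighbors]
    m = max(scores)                                  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(h_self)
    agg = [sum(w * h[i] for w, h in zip(weights, neighbors)) for i in range(dim)]
    return agg, weights
```

Aggregating neighbor information this way is what widens an agent's effective perceptual field under partial observability, while the per-neighbor weights give a handle on individual contributions.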
Dynamic Pricing and Energy Scheduling Strategy for Photovoltaic Storage Charging Stations Based on Multi-agent Deep Reinforcement Learning
CHEN Jintao, LIN Bing, LIN Song, CHEN Jing, CHEN Xing
Computer Science. 2025, 52 (9): 337-345.  doi:10.11896/jsjkx.240700197
Abstract PDF(3850KB) ( 11 )   
References | Related Articles | Metrics
The improvement in the operational profits of photovoltaic storage charging stations(PSCSs) can enable charging station operators to increase their investment and deployment of PSCSs infrastructure,thereby alleviating the load pressure on the grid caused by the growing penetration of electric vehicles(EVs).To address the issue of improving PSCSs operational profits,a dynamic pricing and energy scheduling strategy based on multi-agent deep reinforcement learning(MADRL) is proposed to enhance the overall operational profits of PSCSs under a fully cooperative relationship.Firstly,aiming to maximize the total operational profits of all PSCSs,multiple PSCSs and EVs under a single PSCS operator are modeled as a Markov game.Secondly,the multi-agent twin delayed deep deterministic policy gradient(MATD3) algorithm is used to solve the model,setting the selling price of charging services and the charging and discharging strategies of the energy storage system(ESS) to achieve profit maximization.The cosine annealing method is employed to adjust the learning rate of the algorithm,improving its convergence rate and threshold.Finally,to prevent potential price monopolies under a fully cooperative relationship among multiple stations,an inverse demand function is introduced to constrain the selling price of charging services.Experimental results show that the proposed strategy improves the operational profits of charging stations by 4.17% to 66.67% compared to benchmark methods,and using the inverse demand function effectively prevents price monopolies among multiple stations.
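The cosine annealing schedule used to adjust the learning rate has a standard closed form; a minimal sketch, with assumed bounds lr_max and lr_min:

```python
import math

def cosine_annealing_lr(t, T, lr_max=1e-3, lr_min=1e-5):
    """Cosine annealing: the learning rate decays smoothly from lr_max at
    step t = 0 to lr_min at step t = T along a half cosine."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / T))
```

The smooth, slowly flattening decay (large steps early, tiny steps late) is what improves both the convergence speed and the final value the algorithm settles at.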
Computer Software
Survey on Formal Modelling and Quantitative Analysis Methods for Complex Systems
WANG Huiqiang, LIN Yang, LYU Hongwu
Computer Science. 2025, 52 (9): 346-359.  doi:10.11896/jsjkx.240600022
Abstract PDF(2627KB) ( 8 )   
References | Related Articles | Metrics
Formal modeling is an important fundamental method of system verification and performance analysis.It can be used to evaluate the feasibility and performance boundaries of a system as early as the design phase and is widely applied for abstract simulation and theoretical analysis of various complex systems.As system interactions gradually shift toward diversification and dynamism,complexity and uncertainty are exacerbated.Starting from the common evaluation criteria of formal modeling,this paper summarizes the advantages and disadvantages of common formal languages and their analysis methods,providing technical references for formal modeling of complex systems.Firstly,this paper proposes a framework of evaluation metrics for formal modeling and solution methods.Secondly,it classifies the existing formal methods and discusses the implementation principles,advantages,limitations,and application scenarios of different methods.Thirdly,it compares the current typical solving methods for the state-space explosion problem in the model-solving process and analyzes their performance in different scenarios based on the selected metrics.On this basis,this paper examines two typical application scenarios based on process algebra technology.Finally,this paper summarizes the research hotspots in formal modeling and quantitative analysis and provides a preliminary outlook on research trends.
Bayesian Verification Scheme for Software Reliability Based on Staged Growth Test Information
WANG Yuzhuo, LIU Haitao, YUAN Haojie, ZHAI Yali, ZHANG Zhihua
Computer Science. 2025, 52 (9): 360-367.  doi:10.11896/jsjkx.240600086
Abstract PDF(1718KB) ( 7 )   
References | Related Articles | Metrics
The prior-distribution determination methods of existing Bayesian schemes are rather conservative and idealized when processing staged-growth testing information for software reliability.In this article,a marginal distribution model for the software success probability at the final stage of growth testing is constructed using the identity between the regularized incomplete beta function and the cumulative sum of binomial probabilities.On this basis,a method for determining the prior distribution of software success probability under sequential constraints is proposed,and a Bayesian verification scheme based on average posterior risk is designed to protect the interests of users.An example and simulations show that the proposed prior-distribution determination method handles staged-growth reliability information more reasonably,and that the designed Bayesian scheme significantly reduces the number of reliability verification test cases,easing the testing burden while ensuring the reliability of the scheme,and thus has practical economic value.
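The identity the construction relies on, namely that the regularized incomplete beta function equals a binomial tail sum, I_p(a, n-a+1) = sum_{k=a}^{n} C(n,k) p^k (1-p)^{n-k}, can be checked numerically. The following stdlib-only Python sketch illustrates it; the helper names are ours, and Simpson's rule stands in for a proper special-function library.

```python
import math

def binom_tail(n, a, p):
    """P(X >= a) for X ~ Binomial(n, p): the cumulative binomial sum."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(a, n + 1))

def reg_inc_beta(p, a, b, steps=10000):
    """Regularized incomplete beta I_p(a, b) for a, b >= 1,
    via Simpson's rule on t^(a-1) (1-t)^(b-1) over [0, p]."""
    f = lambda t: t**(a - 1) * (1 - t)**(b - 1)
    h = p / steps  # steps must be even for Simpson's rule
    s = f(0.0) + f(p)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(i * h)
    integral = s * h / 3
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return integral / beta
```

For example, with n = 10, a = 3, p = 0.4, the binomial tail agrees with I_0.4(3, 8) to numerical precision, which is exactly the relationship used to build the marginal distribution of the final-stage success probability.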
Information Security
Vulnerability Detection Method Based on Deep Fusion of Multi-dimensional Features from Heterogeneous Contract Graphs
ZHOU Tao, DU Yongping, XIE Runfeng, HAN Honggui
Computer Science. 2025, 52 (9): 368-375.  doi:10.11896/jsjkx.241000007
Abstract PDF(2378KB) ( 8 )   
References | Related Articles | Metrics
Smart contracts are pieces of code that execute automatically on the blockchain,and their security is critical because of their irreversibility and close links to financial transactions.However,current smart contract vulnerability detection techniques still suffer from inefficient feature extraction,low detection accuracy,and over-reliance on expert rules.To solve these problems,this paper proposes a vulnerability detection method based on deep fusion of multi-dimensional features from heterogeneous contract graphs.Firstly,the smart contract code is denoised,the dataset is expanded by a code-function-exchange data augmentation method,and the contracts are represented as heterogeneous contract graphs.Secondly,high-dimensional semantic representations of the nodes in the contract graph are obtained efficiently by combining graph embedding with code pre-training.Finally,a dual heterogeneous graph attention network is designed to deeply fuse the node features learned in the two dimensions,achieving more accurate vulnerability detection.Experimental results on different types of vulnerabilities show that the overall performance of the proposed method is improved,with an average F1 score above 77.72%.For denial-of-service vulnerability detection,the F1 score reaches 84.88%,an improvement of 10.62% and 22.34% over a traditional deep learning method and a graph-topology detection method,respectively.The proposed method not only improves detection efficiency and accuracy,but also reduces dependence on expert rules by learning node features,providing a more reliable guarantee for the security of smart contracts.
Gradient-guided Perturbed Substructure Optimization for Community Hiding
YU Shanqing, SONG Yidan, ZHOU Jintao, ZHOU Meng, LI Jiaxiang, WANG Zeyu, XUAN Qi
Computer Science. 2025, 52 (9): 376-387.  doi:10.11896/jsjkx.240800107
Abstract PDF(5140KB) ( 8 )   
References | Related Articles | Metrics
Community detection is a technique for revealing network clustering behavior;it can accurately identify the community structure within a network and thus helps to better understand the internal organization and functions of complex networks.However,with the rapid development of these algorithms,concerns have arisen over information leakage and privacy invasion.In response,community hiding algorithms have been widely studied;they reduce the effectiveness of community detection algorithms and achieve privacy protection by constructing perturbed substructures that blur the community structure of the network.Among current methods for optimizing perturbed substructures,genetic-algorithm-based approaches perform better.However,these methods often lack directional guidance in the search for solutions,leaving room for improvement in both the effectiveness and the efficiency of constructing perturbed substructures.By incorporating gradient-guided information into the genetic-algorithm search,the construction of perturbed substructures can be optimized,enhancing the effectiveness and efficiency of community hiding.Experimental results demonstrate that integrating gradient guidance into the genetic-algorithm search for perturbed substructures significantly outperforms other baseline community hiding methods,proving its effectiveness.
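One simple way to add directional guidance to a genetic search, as described above, is to bias mutation toward edges with large gradient magnitude. The toy sketch below is our own illustration, not the paper's operator: the per-edge gradient scores are assumed to come from some differentiable relaxation of the community-detection objective.

```python
import random

def gradient_guided_mutation(individual, edge_gradients, rate=0.2, rng=None):
    """Flip perturbation-edge flags, preferring edges with large gradients.

    individual: list of 0/1 flags over candidate edges (1 = edge perturbed)
    edge_gradients: hypothetical per-edge gradient magnitudes from a
        differentiable relaxation of the detection objective
    """
    rng = rng or random.Random()
    total = sum(edge_gradients)
    child = individual[:]
    n_flips = max(1, int(rate * len(individual)))
    for _ in range(n_flips):
        # roulette-wheel choice weighted by gradient magnitude:
        # edges that most affect the detection objective are flipped first
        r = rng.uniform(0, total)
        acc = 0.0
        for i, g in enumerate(edge_gradients):
            acc += g
            if acc >= r:
                child[i] ^= 1
                break
    return child
```

Compared with uniform mutation, this concentrates the search on the few edge rewirings that most blur the community structure, which is the intuition behind combining gradients with the genetic search.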
Multi-authority Revocable Ciphertext-policy Attribute-based Encryption Data Sharing Scheme
LI Li, CHEN Jie, ZHU Jiangwen
Computer Science. 2025, 52 (9): 388-395.  doi:10.11896/jsjkx.240700066
Abstract PDF(1958KB) ( 9 )   
References | Related Articles | Metrics
In the field of data security protection and sharing,ciphertext-policy attribute-based encryption(CP-ABE) is widely recognized as a method that ensures the confidentiality of data while allowing authorized users to access and share the data.However,users' attributes are not static,so data access permissions may change.A common approach is for data owners to re-encrypt the ciphertext and upload it to the server so that revoked users can no longer access the data,but this imposes a significant burden on the server.To address this issue,a CP-ABE scheme supporting user-level and attribute-level revocation without updating the cloud ciphertext is proposed:a proxy server re-encrypts and pre-decrypts the ciphertext and manages a pre-decryption key for each user,and only the pre-decryption keys are updated during revocation.Experimental analysis demonstrates that,under multiple attribute authorities,the scheme achieves user-level and attribute-level revocation with forward security without updating the cloud ciphertext,at lower computational and key-storage overhead than similar schemes.Security proofs under the q-BDHE hardness assumption show that the scheme is indistinguishable against chosen-plaintext attacks.
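The key-management flow just described, where revocation rotates only the proxy-held pre-decryption keys while the cloud ciphertext stays fixed, can be sketched at the architecture level. The class below is a deliberately simplified toy of ours: it models which artifacts change on revocation, not the underlying pairing-based cryptography.

```python
class RevocableStore:
    """Toy model of revocation without ciphertext update.

    The ciphertext is uploaded once and never rewritten; each user holds
    a versioned pre-decryption key managed by the proxy. Revocation only
    bumps the key version, invalidating the revoked user's key. All names
    here are illustrative; no real CP-ABE operations are performed.
    """
    def __init__(self, ciphertext):
        self.ciphertext = ciphertext   # stored once, never updated
        self.predec_keys = {}          # user -> (key_version, attributes)
        self.version = 0

    def grant(self, user, attrs):
        self.predec_keys[user] = (self.version, set(attrs))

    def revoke_user(self, user):
        self.predec_keys.pop(user, None)
        self.version += 1              # rotate keys of remaining users only
        for u, (_, attrs) in list(self.predec_keys.items()):
            self.predec_keys[u] = (self.version, attrs)

    def can_predecrypt(self, user, required_attrs):
        entry = self.predec_keys.get(user)
        return (entry is not None and entry[0] == self.version
                and set(required_attrs) <= entry[1])
```

The point of the design is visible in the toy: the server-side ciphertext object is untouched by `revoke_user`, so the heavy re-encryption burden described above never arises.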
Identity-based Linkable Ring Signcryption on NTRU Lattice
TANG Jiayi, HUANG Xiaofang, WANG Licheng, ODOOM J
Computer Science. 2025, 52 (9): 396-404.  doi:10.11896/jsjkx.240700126
Abstract PDF(1663KB) ( 11 )   
References | Related Articles | Metrics
Although current lattice-based ring signcryption schemes resist quantum attacks,they suffer from large key storage and high encryption/decryption time.A linkable lattice ring signcryption scheme not only protects signer anonymity but also makes it possible to determine whether two signatures were generated by the same signer.Accordingly,based on a compact Gaussian sampling algorithm and the rejection sampling technique,an identity-based linkable ring signcryption scheme is constructed on the NTRU lattice.Firstly,the system master key is generated using the trapdoor generation algorithm on the NTRU lattice.Then,the private keys of the ring members are obtained with the compact Gaussian sampling algorithm.Finally,the user signature is generated using rejection sampling,and a key encapsulation mechanism encrypts the signature.A security proof in the random oracle model(ROM),based on the hardness of the DRLWE and NTRU small-integer-solution problems,establishes confidentiality,unforgeability,unconditional anonymity,and linkability.Performance analysis shows that,compared with lattice-based ring signcryption schemes and linkable ring signature schemes on NTRU,the proposed scheme has a smaller public key and lower encryption/decryption cost,greatly improving efficiency.
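The rejection sampling technique used in signature generation can be illustrated in its simplest one-dimensional form: sampling from a discrete Gaussian over the integers by proposing uniformly and accepting with probability proportional to the Gaussian weight. This sketch is ours and only conveys the principle; the scheme itself samples over NTRU lattices, not plain integers.

```python
import math
import random

def discrete_gaussian(sigma, tail=6, rng=None):
    """Sample from a discrete Gaussian over Z (center 0, width sigma)
    by rejection sampling from a uniform proposal on [-tail*sigma, tail*sigma]."""
    rng = rng or random.Random()
    bound = int(math.ceil(tail * sigma))
    while True:
        z = rng.randint(-bound, bound)  # uniform proposal
        # accept z with probability exp(-z^2 / (2 sigma^2)); the
        # acceptance ratio hides any dependence on secret data,
        # which is why signatures based on it leak no key information
        if rng.random() <= math.exp(-z * z / (2 * sigma * sigma)):
            return z
```

In the lattice setting, the same accept/reject step is what makes the signature distribution independent of the signer's private key.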