Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 52 Issue 7, 15 July 2025
  
Computer Software
Survey on Research of Compatibility Issues in Operating System for Software Ecology Evolution
HONG Xinran, MA Jun, WANG Jing, ZHANG Chuang, YU Jie, LI Xiaoling, ZHANG Xueyan, YANG Yajing
Computer Science. 2025, 52 (7): 1-12.  doi:10.11896/jsjkx.240900097
With the rapid development of hardware and software technologies, the software ecosystem has become a key driver for innovation in the information industry. However, due to the vast scale and complexity of the software ecosystem, diverse application scenarios, and intricate dependencies and supply chain relationships, compatibility issues arising from the rapid evolution of the software ecosystem have become increasingly prominent, exposing the limitations of traditional compatibility analysis methods. As the foundational infrastructure supporting the entire software ecosystem, operating system compatibility directly impacts software stability, usability, security, and the overall health of the software ecosystem. Analyzing compatibility from the perspective of operating systems allows a more comprehensive understanding of the hierarchical and dependency relationships within the software ecosystem. By leveraging big data and intelligent approaches, we can analyze large-scale relationships and complex evolution patterns, thereby more efficiently identifying and resolving compatibility issues and enhancing the adaptability and user experience of operating systems. This paper aims to provide a comprehensive analysis of the concepts and models of operating system compatibility from various dimensions, including architectural hierarchy, relationship networks, and evolutionary processes. From the perspective of software ecosystem evolution within operating systems, this paper systematically explores the current solutions and research achievements for compatibility issues through analysis, assessment, detection, and resolution, integrating advanced technologies such as complex relationship networks, artificial intelligence, and knowledge graphs. It also summarizes the research challenges and future directions in this field.
Survey on Fuzzing of Embedded Software
SUN Qiming, HOU Gang, JIN Wenjie, HUANG Chen, KONG Weiqiang
Computer Science. 2025, 52 (7): 13-25.  doi:10.11896/jsjkx.240800068
Embedded software is now widely used in various safety-critical systems, such as national defense, aerospace, and IoT communications, which face increasingly severe security challenges. Therefore, it is crucial to quickly identify and fix security vulnerabilities in embedded software. Fuzz testing, as an efficient software testing technique, can automatically generate a large amount of random data to test the reliability of software systems, and has gradually been applied to the discovery of vulnerabilities in various embedded software. This paper first introduces the concepts of fuzz testing, embedded systems, and their firmware devices. Then, it provides an overview of the fuzz testing process for embedded software and analyzes the differences from traditional software fuzz testing and the challenges faced. Following that, it systematically introduces the current research status and main methods of fuzz testing for embedded software, including direct fuzz testing and simulation-based fuzz testing. Finally, this paper discusses optimization methods that can improve the effectiveness of embedded software fuzz testing and looks ahead to potential future technological directions.
Sub-community Detection and Evaluation in Open Source Projects: An Example of Apache IoTDB
WANG Weiwei, LE Yang, WANG Yankai
Computer Science. 2025, 52 (7): 26-36.  doi:10.11896/jsjkx.250200108
As open-source collaboration has become a widely adopted paradigm in software development, the scale and structure of open-source projects have grown increasingly complex. Within the open-source collaboration model, ensuring software quality in large and intricate software systems has emerged as a critical issue. In the existing operational models of open-source communities, a project's community is often treated as a single entity, which contradicts the modular design principles of complex software. This study focuses on the phenomenon of sub-communities within open-source projects. Based on the analysis of code commit records and file change histories, a graph structure is constructed to model the relationships between developers and code files, upon which a sub-community detection algorithm is proposed that leverages developer activity and code modification records. By introducing intra-community and inter-community participation coefficients, this paper establishes a core developer identification model, providing project managers with a quantitative tool for assessing developer contributions and collaboration significance. Additionally, it designs a sub-community scoring method that comprehensively considers both modular concentration and dispersion to evaluate the quality performance of different sub-communities in the module development process. An empirical analysis is conducted using the Apache IoTDB project as a case study. By mining 11 523 commit records from 282 developers, it constructs a collaboration network and identifies four distinct sub-communities with significant characteristics. The experimental results indicate that the core developer identification outcomes and the code quality evaluation scores of each sub-community align with actual development conditions, validating the effectiveness of the proposed models and methods in open-source projects.
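As a point of reference for the graph construction described above, the sketch below builds a weighted developer-file graph from commit records and extracts communities by modularity maximization. It is a minimal illustration, not the paper's algorithm: the commit-record fields ("author", "files") are hypothetical, and modularity-based detection stands in for the proposed activity-based method.

```python
# Minimal sketch: developer-file graph from commit records, communities by
# modularity. Field names are hypothetical; this is not the paper's algorithm.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

commits = [
    {"author": "alice", "files": ["server/engine.java", "server/wal.java"]},
    {"author": "bob",   "files": ["client/session.java"]},
    {"author": "carol", "files": ["server/wal.java", "client/session.java"]},
]

G = nx.Graph()
for c in commits:
    for f in c["files"]:
        # Edge weight accumulates one unit per modification of file f by the author.
        w = G.get_edge_data(c["author"], f, {"weight": 0})["weight"]
        G.add_edge(c["author"], f, weight=w + 1)

# Each community mixes developers with the files they co-modify,
# approximating one project sub-community.
for i, com in enumerate(greedy_modularity_communities(G, weight="weight")):
    print(f"sub-community {i}: {sorted(com)}")
```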
Analysis of the Code Quality of Code Automatic Generation Tool Github Copilot
WANG Dongyu, MO Ran, ZHAN Wenjing, JIANG Yingjie
Computer Science. 2025, 52 (7): 37-49.  doi:10.11896/jsjkx.240600076
GitHub Copilot is a generative AI-based code auto-generation tool launched by GitHub and OpenAI in 2022. One of its core functions is to generate implementation code from natural language comments describing the desired functionality. This expansion of AI into the field of programming has attracted heated discussion and attention in recent years. At this stage, attention is mainly focused on the comparison between AI programming and human programming, such as the programming efficiency and code performance of AI versus human programmers. However, there is currently limited research on the characteristics of Copilot-generated code itself, particularly regarding code quality issues, such as defects in the AI-generated code, whether these defects might lead to program errors, and the understandability of the code. Code quality directly determines the lifespan and durability of a software project, and analyzing and summarizing the quality characteristics of such code helps to better use and improve AI code tools. This paper utilizes tools to extract all open-source problems from LeetCode (2,033 in total) as data samples to test Copilot, generating code suggestions in three programming languages (Java, JavaScript, and Python), submitting them, and recording the execution results of the generated code. By statically analyzing the code suggestions with SonarQube and integrating their execution results, this paper evaluates Copilot's code quality in terms of reliability, maintainability, and complexity. The results reveal that: 1) Copilot-generated code is relatively reliable. For Java, JavaScript, and Python, 7, 5, and 9 types of bugs are identified respectively. The proportion of code suggestions involving bugs does not exceed 3% across all three languages, but over 50% of bug-related code suggestions fail test cases. 2) Copilot's code suggestions exhibit poor maintainability. For Java, JavaScript, and Python, 47, 23, and 20 types of code smells are detected respectively. Over 40% of code suggestions in all three languages contain code smells, and more than 50% of smell-related suggestions fail test cases. 3) Copilot-generated code is easy to understand. The complexity of most code suggestions does not exceed predefined thresholds, with less than 6% of suggestions flagged for excessive complexity. Finally, based on the experimental findings, practical recommendations for improving Copilot are proposed, and potential future research directions for such tools are discussed.
Dynamic Library Debloating Enhanced System Call Restriction of Programs
ZHANG Linmao, SUN Cong, RAO Xue
Computer Science. 2025, 52 (7): 50-57.  doi:10.11896/jsjkx.240700026
The development and execution of applications rely extensively on dynamic libraries. Because a dynamic library is commonly shared by multiple programs, it contains far more library functions than any specific application requires. An application typically uses only a few library functions, yet the entire library is loaded at run time. Loading redundant library code broadens the attack surface of the program. Application-specific debloating of dynamic libraries helps reduce this attack surface. Meanwhile, state-of-the-art system-call restriction frameworks have yet to consider the extra restriction space of system calls brought by dynamic library debloating, and therefore cannot strictly restrict the system calls of a specific application. This paper proposes a dynamic-library-debloating enhanced system-call restriction framework based on intermediate representation. Binary debloating of applications is used to reduce the impact of redundant code on dynamic library debloating and system-call restriction. An improved pointer analysis is implemented on the intermediate representation of the dynamic library, which obtains the application-specific library function call graph. Then, the redundant library functions are trimmed to generate the debloated dynamic library. On the intermediate representation of the dynamic library, the system calls corresponding to the preserved functions are extracted to determine the allowed set of system calls. Based on this allowed set, binary rewriting is applied to the debloated binary application to filter out system calls outside the allowed set. The experimental results demonstrate that the proposed framework achieves a higher degree of library-function debloating and stricter system-call restriction than the state-of-the-art framework, and that its pointer analysis is more accurate than SVF. In typical applications, the proposed approach can reduce the attack surface of code-reuse attacks and avoid typical known vulnerabilities.
Method for Coupling Analysis of Requirements Models Based on Variable Dependency Relationships
YIN Wei, DOU Lin, GAO Zhongjie, WANG Lisong, SUN Qian
Computer Science. 2025, 52 (7): 58-68.  doi:10.11896/jsjkx.241000092
Airborne software is typical safety-critical software, and its development and verification process is strictly regulated in the aviation industry. The complexity and diversity of airborne software make requirements analysis a crucial area of research. Particularly during the requirements verification phase, it is essential to focus on the interaction patterns between system components and on whether the dependencies between variables meet expectations. Therefore, this paper proposes a coupling analysis method for airborne software requirements based on the VRM model, which defines the dependencies between variables in the requirements and measures the data coupling and control coupling between system components through metrics. To address the deficiencies of requirement-level coupling analysis techniques, this paper uses the VRM model as a formal requirement model to model and analyze system requirements at the system requirement level, which effectively supports the requirements of DO-178C for data coupling, control coupling, and software components. It then proposes a coupling measurement method based on the hierarchical dependency relationships between variables, defines the relationships between variables as an n-ary dependency tree structure, and uses a series of algorithms to weight the variables and measure the coupling degree by constructing variable matrices, requirement variable dependency trees, and so on, forming a prototype for analyzing data and control coupling based on variable dependency relationships. This work advances requirement-level coupling analysis technology, aids the design of complex systems, and improves the quality and reliability of airborne software development.
Multi-view Clustering Based on Bipartite Graph Cross-view Graph Diffusion
WANG Jinfu, WANG Siwei, LIANG Weixuan, YU Shengju, ZHU En
Computer Science. 2025, 52 (7): 69-74.  doi:10.11896/jsjkx.240500097
Multi-view clustering is a research hotspot in the field of unsupervised learning. Recently, methods based on cross-view graph diffusion have used the complementary information among multiple views to obtain a unified graph for clustering, on the basis of learning an improved graph for each view, and have achieved good results. However, their high time and space complexity limits their application to large-scale datasets. This paper proposes a multi-view clustering method based on cross-view bipartite graph diffusion to address this problem. It reduces the complexity to linear, making it suitable for large-scale clustering tasks. The specific method involves using a bipartite graph instead of a complete graph for cross-view graph diffusion and modifying the complete-graph cross-view diffusion formula to accommodate the bipartite graph input. Experimental results on six benchmark datasets demonstrate that the proposed method outperforms most existing multi-view clustering methods in terms of clustering accuracy and computational efficiency. On small-scale datasets, accuracy and other metrics are generally more than 5% higher than those of comparison algorithms. On large-scale datasets, the advantage is even more pronounced, with indicators such as ACC and NMI being 15%~30% higher than those of the comparison algorithms.
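The abstract does not reproduce the modified diffusion formula. For orientation, a classical cross-view diffusion update over $V$ full similarity graphs $S^{(v)}\in\mathbb{R}^{n\times n}$ takes the form

$$S^{(v)}_{t+1}=A^{(v)}\left(\frac{1}{V-1}\sum_{u\neq v}S^{(u)}_{t}\right)\big(A^{(v)}\big)^{\top},\qquad v=1,\dots,V,$$

where $A^{(v)}$ is the normalized similarity graph of view $v$ and $S^{(v)}_{0}=A^{(v)}$. Reading the abstract, the contribution amounts to replacing the $n\times n$ graphs with $n\times m$ bipartite (anchor) graphs with $m\ll n$, so that each diffusion step scales linearly in the number of samples $n$ rather than cubically.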
TSK Fuzzy System Enhanced by TSVR with Cooperative Parameter Optimization
WANG Wei, ZHAO Yunlong, PENG Xiaoyu, PAN Xiaodong
Computer Science. 2025, 52 (7): 75-81.  doi:10.11896/jsjkx.240500086
As a special nonlinear regression system, the Takagi-Sugeno-Kang (TSK) fuzzy system can solve machine learning tasks, but its effect on high-dimensional problems is not ideal, and its rules are difficult to determine and adjust. To optimize the system, the fuzzy “IF-THEN” rule is followed. First, a fuzzy clustering algorithm is used to partition the dataset, and the data points are mapped to a space representing the membership degree from each point to the fuzzy cluster centers. Second, the twin support vector regression machine (TSVR) is used to determine two regression planes to obtain the regression values. Considering that different datasets suit different key parameters such as the number of clusters, a genetic algorithm (GA) is used to optimize multiple parameters simultaneously, which simplifies the prior setting of domain knowledge. The whole process is called the TSVR-GA-TSK fuzzy system (TG-TSK). Experimental results show that, compared with classical regression algorithms and typical TSK fuzzy systems, the TG-TSK fuzzy system has good regression accuracy and robustness, and shows a significant advantage in the pairwise comparisons of the Nemenyi test.
Degree Distribution Inference Method for Complex Networks Based on Controllable Preferential Sampling
CHI Yiyan, QI Mingze, HUANGPENG Qizi, DUAN Xiaojun
Computer Science. 2025, 52 (7): 82-91.  doi:10.11896/jsjkx.241200098
With the advent of the big data era, the scale and complexity of complex networks have grown exponentially. Due to constraints such as network size, dynamics, and privacy protection, obtaining complete information about these networks is impractical. Network sampling serves as an effective solution, allowing estimation of global properties and characteristics from local network information. However, existing network sampling methods often exhibit systematic degree bias during node acquisition, leading to considerable discrepancies between the degree distributions of sampled networks and the ground truth. To address this issue, this paper proposes a degree distribution inference framework based on controllable preferential sampling. The framework incorporates a directed edge sampling method that achieves precise control over sampling preference by introducing preference parameters into neighbor node selection probabilities. Additionally, it develops an iterative inference mechanism for preferentially sampled data based on the expectation-maximization algorithm, which accurately estimates the original network's degree distribution by correcting for sampling preferences. Experimental results on artificial and real networks demonstrate that the proposed method can accurately infer the original network's degree distribution, providing a reliable analytical tool for applications such as network robustness analysis and spreading dynamics research under limited observational information.
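The abstract does not spell out the preference parameterization. One natural reading of "introducing preference parameters into neighbor node selection probabilities" is a degree-biased neighbor choice with exponent $\alpha$ (an assumption for illustration, not the paper's exact formula):

$$P(v_j\mid v_i)=\frac{d_j^{\alpha}}{\sum_{k\in N(v_i)}d_k^{\alpha}},\qquad j\in N(v_i),$$

so that $\alpha=0$ recovers uniform neighbor selection, $\alpha>0$ prefers high-degree nodes, and $\alpha<0$ prefers low-degree ones; knowing $\alpha$ then lets the expectation-maximization step re-weight observed degrees to undo the sampling bias.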
Pedestrian Trajectory Prediction Based on Motion Patterns and Time-Frequency Domain Fusion
LIU Yajun, JI Qingge
Computer Science. 2025, 52 (7): 92-102.  doi:10.11896/jsjkx.250200011
Due to the uncertainty of human behaviour and the inherent multi-modality of predicting the future, discerning the significance and likelihood of predicted pedestrian trajectories becomes an inevitable issue. Pedestrian motion patterns can serve as the benchmark characteristics for this differentiation. In addition, most existing studies have examined pedestrian trajectories only in the temporal dimension, ignoring the potentially great help of the frequency dimension of trajectories. This paper proposes a pedestrian trajectory prediction model based on motion patterns and time-frequency domain fusion, called MPTF. The probabilistic prediction sub-network of MPTF extracts time- and frequency-domain feature embeddings of trajectories. By combining pedestrian motion patterns, it predicts the occurrence probability of future trajectories through a classification task. The time-domain branch of the regression prediction sub-network mines the social relationships among pedestrians, while the frequency-domain branch focuses on the contributions of different frequency components of the trajectories to the prediction. A gated fusion network fuses the features from these two dimensions and conducts regression inference to obtain multi-modal future trajectories. Experimental results on multiple public datasets show that the model achieves a level equivalent to that of the latest studies in terms of the evaluation metrics of Average Displacement Error (ADE) and Final Displacement Error (FDE). On the UNIV dataset, the model obtains optimal results of 0.22 and 0.40 for ADE and FDE respectively. Moreover, on the ETH dataset, the FDE is improved by 8.5%. This validates the effectiveness of combining time-frequency domain trajectory features.
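As a toy illustration of the frequency-domain view of a trajectory alluded to above (not the MPTF architecture itself), a discrete Fourier transform separates a track's slow trend from its high-frequency jitter; array sizes below are illustrative.

```python
# Toy sketch: frequency-domain features of a 2-D pedestrian track.
import numpy as np

T = 8                                                    # observed time steps
traj = np.cumsum(np.random.randn(T, 2) * 0.1, axis=0)   # toy (x, y) trajectory

spec = np.fft.rfft(traj, axis=0)           # complex spectrum per coordinate
freq_feat = np.concatenate([spec.real, spec.imag], axis=1)  # real-valued input
print(freq_feat.shape)                     # (T//2 + 1, 4) -> (5, 4)
```

Low-frequency components carry the overall heading, while high-frequency components capture oscillations such as gait; a frequency-domain branch can weight these contributions separately.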
Knowledge-aware Graph Refinement Network for Recommendation
LUO Xuyang, TAN Zhiyi
Computer Science. 2025, 52 (7): 103-109.  doi:10.11896/jsjkx.240600120
Knowledge graph-based recommendation models achieve accurate user preference modeling by capturing entity associations of interaction items on the knowledge graph, thereby enhancing recommendation accuracy. However, existing research ignores the noise and sparsity issues in the interaction graph, which limits the model's ability to capture entity associations and leads to biased preference modeling, ultimately producing suboptimal results. To address these issues, this paper proposes a model named knowledge-aware graph refinement network (KGRN). Specifically, a graph pruning module is designed that utilizes semantic information from the knowledge graph to dynamically prune noisy interactions in the interaction graph. Additionally, a graph construction module is developed to mitigate data sparsity in the interaction graph, enhance the model's capability to identify user preference entities, and improve user preference modeling. Comparative experiments are conducted on three benchmark datasets to evaluate the effectiveness of KGRN. Compared to existing models, KGRN achieves performance improvements of 2.97% on MovieLens-1M, 1.69% on Amazon-Book, and 2.22% on BookCrossing, demonstrating the effectiveness of the proposed model.
Research on Node Learning of Graph Neural Networks Fusing Positional and Structural Information
HAO Jiahui, WAN Yuan, ZHANG Yuhang
Computer Science. 2025, 52 (7): 110-118.  doi:10.11896/jsjkx.240400093
Graph neural networks are powerful models for learning graph-structured data, representing them through node information embedding and graph convolution operations. In graph data, the structural information and positional information of nodes are crucial for extracting graph features. However, existing GNNs have limited expressive ability in simultaneously capturing positional and structural information. This paper proposes a novel graph neural network, named Positional and Structural Information with Graph Neural Networks (PSI-GNN). The core idea of PSI-GNN lies in utilizing an encoder to capture the positional and structural information of nodes and embedding these information features into the network. By updating and propagating these two types of information within the network, PSI-GNN effectively integrates and utilizes positional and structural information, providing an effective solution to the aforementioned problem. Additionally, to accommodate different types of graph learning tasks, PSI-GNN assigns varying weights to positional and structural information based on the specific downstream tasks. To validate the effectiveness of PSI-GNN, experiments are conducted on multiple benchmark graph datasets. The experimental results demonstrate that PSI-GNN achieves a maximum improvement of approximately 14% on node-level tasks and approximately 35% on graph-level tasks. These results confirm the effectiveness of PSI-GNN in simultaneously capturing positional and structural information.
Hierarchical Classification with Multi-path Selection Based on Calculation of Correlation Degree of Granularity Categories in the Same Level
ZHANG Yuekang, SHE Yanhong
Computer Science. 2025, 52 (7): 119-126.  doi:10.11896/jsjkx.240600043
Hierarchical classification is an important branch of data mining, which organizes data into a hierarchical structure by mining the information between data. However, inter-level error propagation is an inevitable problem in hierarchical classification. This paper proposes a hierarchical classification method with multi-path selection based on the association relationships between categories in the same level, which can effectively alleviate the problem of error propagation between levels. First, the correlation matrix between categories is constructed from the distributions of predicted categories and true categories. Then, inspired by pointwise mutual information (PMI), a measure RPMI of the degree of correlation between categories in the same level is designed, and the correlation degrees between same-level categories are calculated based on RPMI. Next, logistic regression is used recursively from top to bottom in the hierarchical structure to select prediction categories at each level, and multiple candidate categories at the current level are determined by selecting categories that are more closely related to the predicted category. Finally, random forest is used to select the best prediction category from the results of multi-path prediction. The proposed method is evaluated on five datasets, demonstrating good classification performance.
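RPMI itself is the paper's contribution and is not given in the abstract; the pointwise mutual information it builds on is the standard quantity

$$\mathrm{PMI}(x,y)=\log\frac{p(x,y)}{p(x)\,p(y)},$$

which is positive when categories $x$ and $y$ co-occur more often than independence would predict, and in this setting measures how strongly two same-level categories are associated in the predicted-versus-true category distributions.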
Two-way Feature Augmentation Graph Convolution Networks Algorithm
LI Mengxi, GAO Xindan, LI Xue
Computer Science. 2025, 52 (7): 127-134.  doi:10.11896/jsjkx.240600090
Graph convolutional neural network algorithms play a crucial role in processing graph-structured data. The mainstream mode of existing graph convolutional networks is weighted summation of node features using Laplacian matrices, with greater emphasis on optimizing the convolutional aggregation method and model structure, while ignoring the prior information of the graph data itself. To fully explore the rich attribute and structural information hidden behind graph data, and to effectively reduce the proportion of noise in graph data, a two-way feature augmentation graph convolutional network algorithm is proposed. The algorithm enhances the topological- and attribute-space features of graph data through node degree and similarity calculations, and the two enhanced graph feature representations are then propagated simultaneously in both topological and attribute spaces. An attention mechanism is used to adaptively fuse the learned embeddings. In addition, to address the over-smoothing issue in deep graph convolutional neural networks, a multi-input residual structure is proposed, which combines initial residuals and high-order neighborhood residuals to achieve balanced extraction of initial and high-order neighborhood features in any convolutional layer. Experiments are conducted on three public datasets, and the results show that the proposed network achieves better classification performance than existing networks.
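The paper's exact residual formulation is not given in the abstract. For orientation, a widely used initial-residual propagation rule (GCNII-style, which the description above resembles) is

$$H^{(l+1)}=\sigma\Big(\big((1-\alpha)\hat{A}H^{(l)}+\alpha H^{(0)}\big)W^{(l)}\Big),$$

where $\hat{A}$ is the normalized adjacency matrix with self-loops, $H^{(0)}$ the initial node features, and $\alpha$ a mixing weight; a high-order neighborhood residual additionally mixes in terms built from powers $\hat{A}^{k}H^{(l)}$.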
Computer Graphics & Multimedia
Computing 2D Skeleton Using Novel Potential Model
WAN Zhaolin, MA Guangzhe, MI Le, LI Zhiyang, FAN Xiaopeng
Computer Science. 2025, 52 (7): 135-141.  doi:10.11896/jsjkx.240600124
As a concise representation of shapes, skeletons preserve both the geometric characteristics and the complete topological structure of shapes. Among skeleton computation methods, the potential field-based approach holds that skeletons are located in the region of singular points on the potential field surface and can provide a topologically correct and continuous representation of skeletons. However, the skeletons derived from this method still have some limitations, such as high sensitivity to noise and isometric transformations. To address these issues, this paper assumes that charges are evenly distributed on the shape boundary and defines a novel potential field inside the shape for the computation of 2D skeletons. Unlike traditional potential fields that use the Euclidean distance, this paper utilizes heat kernel functions to approximate the geodesic distance within the shape, and then calculates the novel electric potential distribution within the shape. Owing to the smoothness of the heat kernel geodesic distance, the method exhibits stronger robustness against shape noise and isometric transformations. Furthermore, based on the Nyström distance interpolation technique, a fast computation method for the defined potential field is proposed. Extensive experiments are conducted on two shape datasets, and the parameters of the method are thoroughly analyzed, demonstrating that the proposed method can generate stable and concise shape skeletons, outperforming state-of-the-art competitors in robustness to noise.
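The heat-kernel approximation of geodesic distance used above is commonly grounded in Varadhan's formula, which relates the short-time heat kernel $k_t(x,y)$ to the geodesic distance $d(x,y)$:

$$d(x,y)=\lim_{t\to 0^{+}}\sqrt{-4t\log k_t(x,y)}.$$

Keeping $t$ small but positive yields a smoothed distance, which is the source of the robustness to boundary noise and isometric transformations claimed in the abstract.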
Weakly-aligned RGBT Salient Object Detection Based on Multi-modal Feature Alignment
LIU Chengzhuang, ZHAI Sulan, LIU Haiqing, WANG Kunpeng
Computer Science. 2025, 52 (7): 142-150.  doi:10.11896/jsjkx.240600033
Visible and thermal (RGBT) salient object detection (SOD) aims to identify common salient objects from RGB and thermal infrared images. However, existing methods are predominantly trained on well-aligned image pairs, overlooking the “weak alignment” issue caused by sensor discrepancies during actual imaging: the same object, while structurally related across modalities, exhibits differences in position and scale. Training models with weakly-aligned RGBT images without alignment processing therefore leads to a significant reduction in detection performance. To address this challenge, a multi-modal alignment and fusion network (AFNet) is designed specifically for weakly-aligned RGBT SOD. The network comprises three main modules: the distribution alignment module (DAM), the attention-guided deformable convolution alignment module (AGDCM), and the cross-attention fusion module (CAM). DAM, based on optimal transport theory, aims to bring the distributions of thermal infrared and RGB features as close as possible, achieving initial feature alignment. AGDCM utilizes deformable convolution and incorporates attention weights in the process of learning feature offsets, allowing different regions to learn offsets suitable for themselves and thereby achieving precise multi-modal feature alignment. CAM employs a cross-attention mechanism to fuse the aligned features, enhancing the discriminative capability of the fused features and improving computational efficiency. Extensive experiments on both aligned and weakly-aligned datasets demonstrate the effectiveness of the proposed method.
EFormer: Efficient Transformer for Medical Image Registration Based on Frequency Division and Broad Attention
HUANG Xingyu, WANG Lihui, TANG Kun, CHENG Xinyu, ZHANG Jian, YE Chen
Computer Science. 2025, 52 (7): 151-160.  doi:10.11896/jsjkx.240400159
Medical image registration is essential for many post-processing tasks. Even though existing single-stream or dual-stream network structures based on convolution and Transformer can achieve promising results, it is still difficult to strike a compromise between registration performance and computational efficiency. To deal with this issue, this paper proposes an efficient registration network, EFormer, which mainly consists of the frequency division module (FDM) and the broad attention module (BAM). Specifically, stacking several FDMs in the encoder and decoder to mimic the role of a dual-branch network in extracting both local and global information significantly improves computational efficiency, while using BAM to enhance the transmission of local information across multiple FDMs preserves significant semantic features and promotes registration performance. Qualitative and quantitative comparisons with state-of-the-art methods on three datasets demonstrate that the Dice score, ASSD, HD95, and the ratio of negative Jacobian determinant of the proposed EFormer are improved by at least 1.3%, 2.6%, 0.6%, and 95% respectively. In addition, with EFormer-tiny, the computational cost (FLOPs) is improved by 14%, showing that the proposed EFormer achieves the best registration results among attention-based networks with the fastest computation speed.
SCF U2-Net: Lightweight U2-Net Improved Method for Breast Ultrasound Lesion Segmentation Combined with Fuzzy Logic
ZHUANG Jianjun, WAN Li
Computer Science. 2025, 52 (7): 161-169.  doi:10.11896/jsjkx.240500134
The boundaries of lesions in breast ultrasound images are uncertain and their shapes vary, and the original U2-Net suffers from a large number of parameters. In response, this paper proposes a lightweight improved U2-Net breast ultrasound lesion segmentation method combined with fuzzy logic, called SCF U2-Net. It utilizes fuzzy logic to blur the feature map pixels and calculate uncertainty values, then multiplies them with the input feature map to reduce image blurriness, effectively addressing the uncertainty issue at boundaries. It also improves the Residual U-blocks (RSU) by combining depthwise separable convolution and dilated convolution to reduce the model parameter count and enhance segmentation efficiency. To address the varied morphology of breast lesions, coordinate attention mechanisms are embedded in the decoding stage to strengthen information extraction for regions of interest, thereby improving segmentation accuracy. In tests on the BUSI dataset, the proposed method achieves Dice and IoU scores of 0.8975 and 0.8328 respectively, while reducing the parameter count by 90% and increasing inference speed by 1.9 times compared to the original model. Furthermore, compared with three mainstream semantic segmentation models, the proposed algorithm demonstrates superior performance, with a significant segmentation performance advantage at a comparable parameter count, and has good clinical application value.
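A minimal sketch of the depthwise-separable plus dilated convolution combination described above for the improved RSU blocks; channel sizes and the dilation rate are illustrative, not the paper's configuration.

```python
# Sketch: depthwise-separable 3x3 convolution with dilation (PyTorch).
# Weights: 9*cin + cin*cout, versus 9*cin*cout for a standard 3x3 conv.
import torch
import torch.nn as nn

class DWSepDilatedConv(nn.Module):
    def __init__(self, cin, cout, dilation=2):
        super().__init__()
        self.depthwise = nn.Conv2d(cin, cin, kernel_size=3, padding=dilation,
                                   dilation=dilation, groups=cin)  # per-channel filter
        self.pointwise = nn.Conv2d(cin, cout, kernel_size=1)       # channel mixing

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

y = DWSepDilatedConv(16, 32)(torch.randn(1, 16, 64, 64))
print(y.shape)  # torch.Size([1, 32, 64, 64]); dilation widens the receptive field
```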
Bio-inspired Neural Network with Visual Invariant Response to Moving Pedestrian
YU Shihai, HU Bin
Computer Science. 2025, 52 (7): 170-188.  doi:10.11896/jsjkx.240400209
Visual invariance is a cardinal neural tuning response for cognitive function in biological vision-brain systems, but no computational model has been reported for this issue in moving-pedestrian visual perception. To fill this gap, a bio-inspired artificial visual neural network (mpvirNN) with visual invariant response to moving pedestrians is investigated, based on current findings from biological studies, including the structural properties of the mammalian retina, the spiking response mechanism of neurons in the medial temporal lobe (MTL) of the human brain, and the kinetic properties of human motion. The proposed neural network consists of two counterparts: a presynaptic network and a postsynaptic network. The presynaptic network captures low-order visual motion information of pedestrian objects in the field of view by means of the visual information processing mechanism of the mammalian retina. The postsynaptic network extracts visual cues of pedestrian motion frequency properties and integrates them to generate neural membrane potential clusters in response to the object in the field of view. Systematic experimental studies show that mpvirNN can effectively perceive moving pedestrians in different visual scenes and tune its neural spike response with visual invariance properties. This work contributes to the processing of visual dynamic information inspired by biological vision-brain systems, offering new ideas and methods for pedestrian detection and cognitive recognition research in artificial intelligence.
Object Detection Algorithm Based on YOLOv8 Enhancement and Its Application Norms
XU Yongwei, REN Haopan, WANG Pengfei
Computer Science. 2025, 52 (7): 189-200.  doi:10.11896/jsjkx.250100108
Object detection is one of the pivotal technologies in the field of computer vision. Its objective is to pinpoint the locations of objects and recognize their classes within images or videos, with extensive applications in domains such as intelligent transportation, security monitoring, and industrial inspection. The YOLOv8 object detection approach has attained remarkable achievements in both detection precision and real-time responsiveness. Nevertheless, it encounters formidable challenges when dealing with complex background interference, small object detection, and occlusion, often resulting in false positives or missed detections. To augment the accuracy of object detection, an object detection algorithm based on YOLOv8 enhancement is proposed, and the corresponding application norms are discussed. On the technical front, a spatial attention mechanism is first incorporated into the backbone network, bolstering the feature extraction capability for key objects. Second, an adaptive feature fusion module is devised to enhance the integration of multi-scale feature maps. Subsequently, data augmentation techniques and transfer learning strategies are employed to effectively tackle the problems of sample imbalance and restricted object quantities in the dataset. Then, via a dynamic weight adjustment mechanism for the bounding box regression loss and classification loss, the predictive accuracy is further elevated. Ultimately, extensive experiments conducted on five datasets, namely COCO, PASCAL VOC, Cityscapes, KITTI, and VisDrone, validate that the proposed method outperforms other SOTA methods in terms of detection accuracy and operational speed. Notably, in complex scenarios, small object detection, and occlusion circumstances, the robustness and accuracy of the model are conspicuously boosted. At the application norms level, to mitigate the security risks to personal image privacy arising from large-scale deployment of object detection algorithms, it is imperative to formulate comprehensive application norms in aspects such as law, ethics, and technology, so that technological progress aligns closely with the needs of social development.
Emotion Recognition Based on Brain Network Connectivity and EEG Microstates
FANG Chunying, HE Yuankun, WU Anxin
Computer Science. 2025, 52 (7): 201-209.  doi:10.11896/jsjkx.240500087
As neuroscience and computational methods continue to advance, researchers have become increasingly interested in the relationship between emotion and brain activity. In this field, the connectivity of complex networks and EEG microstates have become hot research topics. Brain network connectivity reveals the degree of information transmission and coordination between different brain areas, which has an important impact on the emotion regulation process. Microstates are stable activity patterns of the brain over short periods in the resting state, and their changes reflect transformations of the brain's functional state. To further study the relationship between emotions and brain regions and improve the accuracy of emotion recognition, this paper proposes an emotion recognition method based on brain network module connectivity and EEG microstates. This method uses network module connectivity analysis to modularize complex systems and reveal the relationship between the whole and its parts under different emotions. At the same time, microstate analysis is introduced to explore the correspondence between brain areas and emotions, and the duration, occurrence frequency, coverage ratio, and transition probability of each microstate are extracted as features for emotion recognition. It is found that emotions are more active in the right hemisphere. Finally, to obtain more comprehensive feature information, the two kinds of features are concatenated and fused for emotion recognition. Extensive experiments are conducted on the SEED dataset, and the results show that the highest average accuracy is obtained by the module connectivity feature in the gamma band with 94.07%, the microstate features achieve 87.23%, and the fused features reach 95.34%, an increase of 1.27% and 8.11% respectively over the two single-feature methods.
Artificial Intelligence
Cross-modal Hypergraph Optimisation Learning for Multimodal Sentiment Analysis
JIANG Kun, ZHAO Zhengpeng, PU Yuanyuan, HUANG Jian, GU Jinjing, XU Dan
Computer Science. 2025, 52 (7): 210-217.  doi:10.11896/jsjkx.240600127
Sentiment expression is multimodal, and more accurate emotions can be derived from multiple modalities such as verbal, audio, and visual signals. Studying the interactions among modalities can effectively improve the accuracy of multimodal sentiment analysis. Previous studies have used graph models to capture rich interactions across modalities and time to obtain highly expressive and fine-grained sequence representations, but there is a greater need to tap the higher-order information in multimodal data: graph neural networks can only connect nodes on a one-to-one basis, which restricts the utilisation of higher-order interactions. This paper explores the application of hypergraph neural networks to multimodal sentiment analysis, where the hypergraph structure can connect two or more nodes, making full use of intra- and inter-modal higher-order information and achieving higher-order interactions between data. Furthermore, this paper proposes a hypergraph adaptive module to optimise the structure of the original hypergraph. The hypergraph adaptive network detects potentially hidden information by means of point-edge cross-attention, hyperedge sampling, and event node sampling, discovering potential implicit connections and pruning redundant hyperedges as well as irrelevant event nodes to update and optimise the hypergraph structure; the updated hypergraph structure represents the higher-order correlations of the data more accurately and completely than the initial structure. Extensive experiments on two publicly available datasets, CMU-MOSI and CMU-MOSEI, show that the proposed framework improves several performance metrics by 1% to 6% over other state-of-the-art algorithms.
Aspect-based Sentiment Analysis Based on Syntax,Semantics and Affective Knowledge
ZHENG Cheng, YANG Nan
Computer Science. 2025, 52 (7): 218-225.  doi:10.11896/jsjkx.240500124
The goal of aspect-based sentiment analysis is to identify the emotional polarity of specific aspect words in a sentence. In recent years, many studies have utilized syntactic dependency relationships and self-attention mechanisms to obtain syntactic and semantic knowledge respectively, and have updated representations by fusing these two types of information through graph convolutional networks. However, syntactic dependency relationships and self-attention mechanisms are not tools specific to sentiment analysis, and cannot directly and effectively capture the emotional expression of aspect words, which is the key to aspect-based sentiment analysis. To pay more attention to the emotional expression of aspect words, this paper constructs a network integrating syntax, semantics, and affective knowledge. Specifically, the syntactic knowledge in the syntactic dependency tree is used to construct a syntactic graph, into which external affective knowledge is integrated. At the same time, a self-attention mechanism is adopted to obtain semantic knowledge of each word in the sentence, and an aspect-aware attention mechanism is used to make the semantic graph focus on information related to aspect words. In addition, a bidirectional message propagation mechanism is used to learn the information in the two graphs simultaneously and update node representations. The experimental results on three benchmark datasets validate the effectiveness of the proposed model.
Multimodal Sentiment Analysis Model Based on Cross-modal Unidirectional Weighting
WANG Youkang, CHENG Chunling
Computer Science. 2025, 52 (7): 226-232.  doi:10.11896/jsjkx.240600066
Most multimodal sentiment analysis models utilize cross-modal attention mechanisms to handle multimodal features. These approaches are prone not only to overlooking the unique and effective information within each modality, but also to interference from redundant information shared across modalities, resulting in decreased classification accuracy. To address this issue, this paper proposes a multimodal sentiment analysis model based on cross-modal unidirectional weighting. The model leverages a unidirectional weighting module to extract both shared and unique information within different modalities, and uses similar structures for interaction between multimodal data. To prevent excessive extraction of repetitive information, it employs a KL divergence loss function for contrastive learning of identical modality information. Additionally, it introduces a gated temporal convolutional network with a filtering function to extract features from unimodal data, thereby enhancing the expressive power of unimodal feature information. Evaluations on two public datasets, CMU-MOSI and CMU-MOSEI, against 13 baseline models show significant advantages in classification accuracy, F1 score, and other metrics, validating the effectiveness of the proposed method.
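The KL divergence loss mentioned above penalizes one distribution for drifting from another; for discrete distributions $P$ and $Q$ over the same support it is

$$D_{\mathrm{KL}}(P\,\|\,Q)=\sum_{i}P(i)\log\frac{P(i)}{Q(i)},$$

used here, per the abstract, to contrast representations of the same modality so that the shared and unique branches do not extract repeated information.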
Study on Opinion Summarization Incorporating Evaluation Object Information
KONG Yinling, WANG Zhongqing, WANG Hongling
Computer Science. 2025, 52 (7): 233-240.  doi:10.11896/jsjkx.240600144
A review is a written form of consumer evaluation of and feedback on a product. Opinion summarization refers to extracting and compressing a review to create a concise text that summarizes the information it contains. Currently, most opinion summarization tasks focus solely on the review text, without considering the evaluation object information within the review, such as aspects, opinion phrases, and sentiment polarities. Therefore, this paper proposes an opinion summarization method based on the T5 model (Text-to-Text Transfer Transformer) that incorporates evaluation object information. The method first utilizes the T5 model to represent the opinion summarization task. It learns contextual information from the review using the attention mechanism and generates a summary that encapsulates the core semantics. Then, it extracts the evaluation object information from the summary as an auxiliary task of opinion summarization. Finally, the model parameters are fine-tuned using a limited sample of data, which further enhances the summary generation process, resulting in high-quality summaries. Experimental results on both the hotel review dataset SPACE and the product review dataset OPOSUM+ show that the proposed method achieves a significant improvement in ROUGE evaluation metrics compared to the baseline models.
Confidence-guided Prompt Learning for Multimodal Aspect-level Sentiment Analysis
LI Maolin, LIN Jiajie, YANG Zhenguo
Computer Science. 2025, 52 (7): 241-247.  doi:10.11896/jsjkx.240600126
With the increasing volume of data from social media platforms, multimodal aspect-level sentiment analysis is crucial for understanding the underlying emotions of users. Existing research primarily focuses on sentiment analysis tasks by fusing image and text modalities, but these methods fail to effectively capture the implicit emotions in images and text. Furthermore, traditional approaches are often constrained by the black-box nature of the models, which lack interpretability. To address these issues, this paper proposes a confidence-guided prompt learning (CPL) based multimodal aspect-level sentiment analysis model, which consists of four key components: a multimodal feature processing module (MF), a confidence-based gating module (CG), a prompt construction module (PC), and a multimodal classification module (MC). The multimodal feature processing module extracts features from multimodal data. The confidence-guided gating module evaluates the classification difficulty of samples using confidence assessment through a self-attention network and adaptively processes samples based on their difficulty. The prompt construction module generates adaptive prompt templates for samples of different difficulty levels to guide the T5 large language model in generating auxiliary sentiment cues. Finally, the multimodal classification module performs the final sentiment prediction. Experimental results on the public datasets Twitter-2015 and Twitter-2017 show that, compared to existing baseline methods, the proposed multimodal aspect-level sentiment classification model achieves significant performance improvements, with accuracy increases of 0.48% and 1.06%, respectively.
Research on Automatic Vectorization Benefit Evaluation Model Based on Particle Swarm Algorithm
LIU Mengzhen, ZHOU Qinglei, HAN Lin, NIE Kai, LI Haoran, CHEN Mengyao, LIU Haohao
Computer Science. 2025, 52 (7): 248-254.  doi:10.11896/jsjkx.241000181
Automatic vectorization leverages SIMD components to accelerate program execution, easing the programmers' workload and serving as a key optimization in the GCC compiler. However, the current benefit evaluation model in GCC lacks precision, affecting vectorization decisions. To improve vectorization efficiency on the Sunway platform, this study introduces a new benefit evaluation model within GCC. The model designs cost metrics specific to the Sunway processor's backend instruction set and applies a particle swarm algorithm to optimize these costs, enhancing evaluation accuracy and vectorization performance. Experimental results on SPEC2006 and SPEC2017 benchmarks show that the proposed model delivers up to 7.6% and 5.75% performance gains, respectively, over the default GCC model. These outcomes confirm the model's effectiveness in refining automatic vectorization and improving the usability of the Sunway compilation system. With more accurate evaluations, the model supports better optimization decisions, resulting in enhanced platform performance.
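The particle swarm step referred to above is standard PSO; each particle encodes a candidate vector of backend instruction costs $x_i$ and is updated as

$$v_i^{t+1}=w\,v_i^{t}+c_1 r_1\big(p_i-x_i^{t}\big)+c_2 r_2\big(g-x_i^{t}\big),\qquad x_i^{t+1}=x_i^{t}+v_i^{t+1},$$

where $p_i$ is particle $i$'s best position so far, $g$ the swarm-wide best, $w$ the inertia weight, $c_1,c_2$ acceleration coefficients, and $r_1,r_2\sim U(0,1)$. How fitness is evaluated is not stated in the abstract; measured benchmark performance of the resulting vectorization decisions would be the natural candidate.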
Multi-UAV Task Assignment Based on Hybrid Particle Swarms Algorithm with Game Theory
WANG Rongjie, ZHANG Liang
Computer Science. 2025, 52 (7): 255-261.  doi:10.11896/jsjkx.240400079
Considering maximum UAV load, track cost, task time deviation, and task benefit, a task allocation model is constructed, and this paper proposes an improved particle swarm optimization algorithm based on game theory to solve the multi-UAV cooperative task assignment problem (MTAP). The algorithm decodes particles into feasible task sequences through real-number encoding and deadlock repair, establishing a mapping between particle vectors and task sequences. By incorporating the evolutionarily stable strategy of evolutionary game theory into particle swarm optimization, the game equilibrium point is obtained through game operations and used to adaptively adjust the control parameters of the standard particle swarm, balancing the global and local search capabilities of the algorithm. This paper also proposes a strategy for avoiding local convergence, which improves the individual optimal position vectors of particles to enhance social cognition. Simulation analysis and comparison with existing algorithms show that the proposed algorithm is efficient for the multi-UAV task allocation problem.
Lifelong Multi-agent Task Allocation Based on Graph Coloring Hybrid Evolutionary Algorithm
SHI Xiaoyan, YUAN Peiyan, ZHANG Junna, HUANG Ting, GONG Yuejiao
Computer Science. 2025, 52 (7): 262-270.  doi:10.11896/jsjkx.240600016
The multi-agent task allocation problem is a fundamental issue in the field of intelligent warehousing. It involves continuously assigning incoming tasks to available agents so as to minimize the average cycle time of all tasks. For this lifelong multi-agent task allocation problem, the problem is first mathematically modeled as a graph coloring problem, using a graph that encodes conflict relationships to characterize the correlation between tasks and agents. Based on this problem model, and in order to minimize the average cycle time of all tasks, this paper proposes a graph coloring hybrid evolutionary algorithm (GCHEA) that combines a heuristic algorithm, tabu search, and a genetic algorithm: the heuristic algorithm generates initial solutions to effectively guide the search process; tabu lists are introduced to keep candidate solutions from falling into local optima during optimization; and the selection, crossover, and replacement operations of the genetic algorithm enhance population diversity, with the global optimal solution obtained through iterative optimization. Finally, GCHEA yields a graph coloring scheme that is further decoded into a specific task-agent allocation solution. Tested on a simulation system, the experimental results show that, compared with existing task allocation algorithms, GCHEA achieves significant improvements in average cycle time and total system delay time; specifically, GCHEA reduces the average task cycle time by approximately 49% and the average total system delay time by 50%.
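A minimal sketch of the greedy initial-coloring step follows (the heuristic component only; the tabu list and genetic operators are not shown), assuming the task-conflict graph is given as an adjacency mapping.

```python
# Sketch: greedy coloring of a task-conflict graph; each color class can be
# read as one batch of mutually non-conflicting tasks.
def greedy_coloring(conflicts):  # conflicts: {task: set of conflicting tasks}
    color = {}
    for task in sorted(conflicts, key=lambda t: -len(conflicts[t])):  # hardest first
        used = {color[n] for n in conflicts[task] if n in color}
        c = 0
        while c in used:          # smallest color unused by any conflicting task
            c += 1
        color[task] = c
    return color

print(greedy_coloring({1: {2, 3}, 2: {1}, 3: {1}}))  # {1: 0, 2: 1, 3: 1}
```

An initial solution of this kind seeds the tabu search and genetic phases, which then iteratively refine the coloring toward a lower average cycle time.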
Research on Multi-machine Conflict Resolution Based on Deep Reinforcement Learning
HUO Dan, YU Fuping, SHEN Di, HAN Xueyan
Computer Science. 2025, 52 (7): 271-278.  doi:10.11896/jsjkx.240800133
With the increase in military, civilian, and general aviation flight activities, conflicts over airspace use have become prominent, and it is now normal for multiple aircraft to fly simultaneously in the same airspace. It is therefore an urgent problem to provide assistance in avoiding flight collisions through technical means. To tackle the challenge of resolving conflicts between multiple aircraft in flight, this paper introduces a Graph Convolutional Deep Reinforcement Learning (GDQN) algorithm, which combines multi-agent deep reinforcement learning with a graph convolutional neural network framework. Initially, it constructs a message-passing function to develop a multi-agent flight conflict model, which can navigate multiple aircraft through three-dimensional, unstructured airspace while avoiding conflicts and collisions. Subsequently, it employs a deep self-learning method based on graph convolutional networks to offer intelligent conflict avoidance solutions for airport scheduling, creating a multi-agent system (MAS) for managing multi-aircraft conflict scenarios. The effectiveness of the algorithm is validated through simulations using extensive training datasets in a controlled environment. The results indicate that the optimized algorithm is effective, achieving a conflict resolution success rate of over 90%, with resolution decision times of less than 3 seconds. Additionally, it significantly reduces the number of air traffic control (ATC) commands issued and improves overall operational efficiency.
Computer Network
Specific Emitter Identification Based on Progressive Self-training Open Set Domain Adaptation
ZHANG Taotao, XIE Jun, QIAO Pingjuan
Computer Science. 2025, 52 (7): 279-286.  doi:10.11896/jsjkx.240600073
Aiming at the problem that a specific emitter identification model trained in a closed-set scene suffers degraded known-class recognition performance and erroneous new-class recognition when deployed in environments containing new classes, this paper proposes a specific emitter identification method based on open set domain adaptation in noise-varying scenes. Maximum and minimum thresholds are used to distinguish known classes from unknown classes, and a target classifier is trained by a progressive self-training method for the test scene. Fitting the feature distributions of multiple unknown classes with a single unknown class of the target classifier may cause boundary confusion between the learned known and unknown feature distributions. Based on this, a multi-center loss is proposed to increase intra-class compactness and inter-class distinguishability for the target classes, which improves the accuracy of the target classifier. At the same time, to reduce the fingerprint feature offset caused by noise differences between the source domain and the target domain, prototype-to-prototype contrastive learning is used to learn domain-invariant features. Six groups of experiments are carried out on a public dataset. The HOS index of the proposed method in five groups is better than that of other methods, and the HOS reaches 93.8% in the 10dB-8dB task. The experimental results show the effectiveness of the proposed method.
Service Function Chain Deployment Method Based on VNF Divided Backup Mechanisms
ZHAO Jihong, MA Jian, LI Qianwen, NING Lijuan
Computer Science. 2025, 52 (7): 287-294.  doi:10.11896/jsjkx.240400142
Abstract PDF(3990KB) ( 22 )   
References | Related Articles | Metrics
Service Function Chain (SFC) deployment is a key technology for realizing flexible and diverse network services, and SFC reliability is an important index in SFC deployment. Existing methods improve SFC reliability but waste network resources. To balance SFC reliability and network resource consumption, a VNF diversity backup (VDB) mechanism is designed that improves the SFC with limited network resources by splitting each low-reliability VNF instance into two replica instances and cross-backing up the replicas. In the SFC deployment phase, a VDB-based service function chain deployment method is proposed: a multi-stage graph is constructed from the VDB improvement results and network topology attributes, and a Viterbi-based dynamic programming algorithm is used to search for the optimal deployment path in the multi-stage graph. In addition, an evaluation metric, the backup price-performance ratio, is introduced to measure how well SFC reliability and network resource consumption are balanced. Simulation results show that the proposed method effectively balances reliability and network resource consumption, and transmission delay is also optimized.
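The abstract describes a Viterbi-style dynamic programming search over a multi-stage graph; a minimal, generic sketch of that idea follows. The cost model, stage layout, and node names are assumptions for illustration, not the paper's.

# Generic Viterbi-style dynamic programming over a multi-stage graph:
# stages[k] lists candidate placements for the k-th VNF; cost(u, v) is
# the edge cost between placements in adjacent stages.
def viterbi_path(stages, cost):
    best = {node: (0.0, [node]) for node in stages[0]}     # (cost, path so far)
    for k in range(1, len(stages)):
        nxt = {}
        for v in stages[k]:
            # extend the cheapest path from any node in the previous stage
            c, path = min(
                (best[u][0] + cost(u, v), best[u][1]) for u in stages[k - 1]
            )
            nxt[v] = (c, path + [v])
        best = nxt
    return min(best.values())                              # (total cost, best path)

stages = [["a1", "a2"], ["b1", "b2", "b3"], ["c1"]]        # hypothetical placements
edge_cost = {("a1","b1"):2, ("a1","b2"):5, ("a1","b3"):4,
             ("a2","b1"):3, ("a2","b2"):1, ("a2","b3"):6,
             ("b1","c1"):2, ("b2","c1"):2, ("b3","c1"):1}
total, path = viterbi_path(stages, lambda u, v: edge_cost[(u, v)])
print(total, path)   # 3.0 ['a2', 'b2', 'c1']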
Component Reliability Analysis of Interconnected Networks Based on Star Graph
LIU Wenfei, LIU Jiafei, WANG Qi, WU Jingli, LI Gaoshi
Computer Science. 2025, 52 (7): 295-306.  doi:10.11896/jsjkx.240400170
Abstract PDF(3020KB) ( 25 )   
References | Related Articles | Metrics
With the rapid development of data centers, supercomputing, cloud computing, and other technologies, interconnection networks, as one of their foundations, are expanding in scale. As network scale increases, server failures become inevitable, and once the interconnection network is paralyzed by failures, normal work and daily life are seriously affected. How to minimize the negative impact of faulty units on the entire network topology is therefore a meaningful topic. Usually, the largest connected component of the remaining network is regarded as the functional subsystem, which quantifies the communication capability and efficiency between processors in the faulty network; this kind of quantitative reliability study helps us better understand and manage the stability of interconnection networks. Starting from the star-graph-based interconnection network, this paper determines that every small component H in Sn-F satisfies |V(H)|≤4 when the faulty node set satisfies |F|≤5n-15, and that every small component H in Sn-F satisfies |V(H)|≤5 when |F|≤6n-19. In addition, it focuses on the remaining components of a star network Sn (n≥6) when up to 6n-19 faulty vertices are removed from the network. Finally, an approximation algorithm for the minimum number of neighbours of small components in a faulty network is proposed, and simulation experiments show that the star network possesses good robustness and fault tolerance. These results are significant for understanding and designing highly reliable interconnection networks.
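For readers unfamiliar with the structure, the following small sketch builds the n-dimensional star graph Sn (vertices are permutations of 1..n; an edge swaps the first symbol with the i-th) and reports component sizes after deleting a fault set. It is purely illustrative background and not the paper's approximation algorithm; the fault set chosen is arbitrary.

# Illustrative construction of the star graph S_n and component sizes
# after deleting a fault set F (DFS over fault-free vertices).
from itertools import permutations

def star_graph(n):
    adj = {}
    for p in permutations(range(1, n + 1)):
        nbrs = []
        for i in range(1, n):                       # swap position 0 with position i
            q = list(p)
            q[0], q[i] = q[i], q[0]
            nbrs.append(tuple(q))
        adj[p] = nbrs
    return adj

def component_sizes(adj, faults):
    seen, sizes = set(faults), []
    for v in adj:
        if v in seen:
            continue
        stack, size = [v], 0
        seen.add(v)
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)

adj = star_graph(4)                                  # S4 has 4! = 24 vertices
faults = set(list(adj)[:5])                          # arbitrary fault set, for illustration
print(component_sizes(adj, faults))                  # largest component first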
Research on Multi-user Task Offloading and Service Caching Strategies
WANG Xiang, HAN Qinghai, LIANG Jiarui, YU Xiaoli, WU Qi, QING Li
Computer Science. 2025, 52 (7): 307-314.  doi:10.11896/jsjkx.240500036
Abstract PDF(2561KB) ( 18 )   
References | Related Articles | Metrics
As a novel platform providing multidimensional resources such as storage and computing, mobile edge computing (MEC) brings the capabilities of cloud computing to the network edge, offering low-latency, low-power services to nearby users. However, because the computing resources of MEC servers are limited, selecting effective task execution strategies for massive user data is crucial. This paper investigates task offloading and service caching from the perspectives of latency optimization and terminal energy conservation. Focusing on a multi-user MEC network scenario, it formulates a joint optimization problem for task offloading and service caching that aims to minimize task latency and terminal energy consumption, and then proposes a deep Q-network-based solution to this problem. Simulation results demonstrate that, compared with other benchmark schemes, the proposed approach significantly improves the service caching hit rate, reduces terminal energy consumption, and decreases task latency.
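The abstract names the objective (minimize latency and terminal energy) but not its exact form; as one plausible reading, a DQN agent could score joint (offload target, cache action) pairs against a weighted latency-plus-energy reward. The weights, action layout, and values below are assumptions for illustration.

# Hypothetical reward shaping and action selection for joint
# offloading/caching with a DQN; dynamics and weights are illustrative.
import random

def reward(latency_s, energy_j, w_lat=0.5, w_eng=0.5):
    return -(w_lat * latency_s + w_eng * energy_j)   # minimize both objectives

def epsilon_greedy(q_values, epsilon=0.1):
    # q_values: list of Q estimates, one per (offload, cache) action pair
    if random.random() < epsilon:
        return random.randrange(len(q_values))       # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit

q_values = [-1.2, -0.4, -2.0, -0.7]                  # e.g. 2 targets x 2 cache actions
a = epsilon_greedy(q_values)
print(a, reward(latency_s=0.05, energy_j=0.8))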
Information Security
Survey of Security Research on Multimodal Large Language Models
CHEN Jinyin, XI Changkun, ZHENG Haibin, GAO Ming, ZHANG Tianxin
Computer Science. 2025, 52 (7): 315-341.  doi:10.11896/jsjkx.241100141
Abstract PDF(2653KB) ( 25 )   
References | Related Articles | Metrics
With the rapid development of large language models, multimodal large language models have garnered attention for their outstanding performance across modalities such as language and images. These models have not only become valuable assistants in daily tasks but are also gradually penetrating major application areas such as autonomous driving and medical diagnosis. Compared with traditional large language models, multimodal large language models carry both enormous potential and challenges, owing to their closer alignment with real-world applications involving multiple resources and the complexity of multimodal processing. However, research on the vulnerabilities of multimodal large language models is relatively limited, and these models face numerous security challenges in practical applications. This paper provides a comprehensive survey of the security of multimodal large language models, particularly large vision-language models. Firstly, the basic structure and development history of multimodal large language models are summarized. Then, the causes of security risks throughout the full lifecycle of these models are discussed, and the correlations between model structure and security risks are analyzed. Next, the paper systematically summarizes current efforts in evaluating the security of multimodal large language models in terms of image and text security, including model hallucination, privacy, bias, and robustness. Attacks on multimodal large language models are divided into jailbreak attacks, adversarial attacks, backdoor attacks, and poisoning attacks. Furthermore, the paper provides a comprehensive overview of trustworthiness enhancement methods addressing threats such as hallucination, privacy leakage, and bias in multimodal large language models, as well as defense mechanisms against malicious attacks on the models. Finally, the main opportunities and challenges in the security research of multimodal large language models are discussed, and guidance and recommendations are provided for researchers working on the complex applications of multimodal large language models.
Lightweight Authentication and Key Agreement Protocol for Cloud-assisted Smart Home Communication
LI Jiangxu, CHEN Zemao, ZHANG Liqiang
Computer Science. 2025, 52 (7): 342-352.  doi:10.11896/jsjkx.250100098
Abstract PDF(2887KB) ( 26 )   
References | Related Articles | Metrics
With the widespread adoption of smart home devices, the resource-constrained nature of these devices and the diverse array of potential attack threats present significant challenges to traditional security protocols. In particular, popular cloud-assisted smart home Internet of Things (IoT) technologies, while enhancing the intelligence and management efficiency of household devices, have introduced more complex control models than earlier systems: users can set control rules on the cloud platform for automated device management, or remotely control household devices via Apps provided by smart home manufacturers. In both control modes, if the identity of the remote controller is not authenticated and a secure session key is not established, attackers may send malicious commands to household devices and endanger home security. Existing security solutions, however, do not cover these two mainstream control models and struggle to balance computational overhead, communication efficiency, and security, which highlights the need for a lightweight and efficient authentication and key agreement protocol. To address the security risks in these two control scenarios, this paper proposes a lightweight mutual authentication and key agreement scheme based on elliptic curve cryptography between the cloud platform and smart devices, as well as a mutual authentication and key agreement scheme between users and smart devices, enabling efficient and secure authentication between remote controllers and household devices. The security of the proposed schemes is analyzed using the formal verification tool ProVerif and heuristic methods. A comparison with similar solutions in terms of both security and performance demonstrates that the proposed scheme offers more security features while meeting lightweight performance requirements.
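The abstract does not specify the protocol messages; as background only, here is a minimal sketch of an elliptic-curve Diffie-Hellman exchange with key derivation using the pyca/cryptography library. The curve choice and info label are assumptions, and a real protocol would add the authentication steps the paper describes.

# Minimal ECDH key agreement sketch (background illustration, not the
# paper's protocol): both parties derive the same session key from an
# elliptic-curve exchange followed by HKDF.
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes

def derive_session_key(own_private, peer_public):
    shared = own_private.exchange(ec.ECDH(), peer_public)
    return HKDF(algorithm=hashes.SHA256(), length=32,
                salt=None, info=b"smart-home-session").derive(shared)

device_priv = ec.generate_private_key(ec.SECP256R1())   # smart device key pair
cloud_priv = ec.generate_private_key(ec.SECP256R1())    # cloud platform key pair

k_device = derive_session_key(device_priv, cloud_priv.public_key())
k_cloud = derive_session_key(cloud_priv, device_priv.public_key())
assert k_device == k_cloud                               # both sides agree on the key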
Face Forgery Algorithm Recognition Model Based on Multi-classification Dataset
DING Bowen, LU Tianliang, PENG Shufan, GENG Haoqi, YANG Gang
Computer Science. 2025, 52 (7): 353-362.  doi:10.11896/jsjkx.240800079
Abstract PDF(3649KB) ( 25 )   
References | Related Articles | Metrics
Current face forgery detection methods focus mainly on judging the authenticity of faces; studies on recognizing the forgery algorithm itself are few, and they suffer from poor robustness to image perturbations and large resource consumption. Moreover, public face forgery datasets are updated slowly and cover few forgery types. To address these problems, this paper proposes a face forgery algorithm recognition model, Indentiformer, which takes the vision Transformer as its backbone. It decomposes the position-encoding fusion block and then extracts global features with a Fast Fourier Transform improved by the Khatri-Rao product. At the same time, a parallel convolutional structure supplements local feature information, and a multi-head attention mechanism fuses the two to enhance the model's representation ability. Finally, overfitting is reduced by an improved regularization-based multilayer perceptron, realizing recognition of face forgery algorithms. In addition, this paper constructs a multi-class fake-face dataset covering 18 forgery methods, including diffusion models, large models, and fusion techniques, with more than 410 000 face images in total, offering better diversity of mixed real and fake data. Experimental results show that Indentiformer achieves 99.57% and 99.73% AUC on the multi-class algorithm recognition task and the binary real/fake discrimination task, respectively, without increasing resource overhead. In robustness experiments, its AUC decreases by only 4.62% on average, demonstrating high recognition ability and strong resistance to interference.
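The Khatri-Rao-modified transform is not reproducible from the abstract alone; as a generic analogy only, the sketch below extracts frequency-domain global features with a plain 2-D FFT, since forgery artifacts often show up in the amplitude spectrum. The block size and preprocessing are assumptions.

# Generic frequency-domain feature sketch (plain 2-D FFT, not the
# paper's improved transform).
import numpy as np

def fft_amplitude_features(image, keep=16):
    # image: (H, W) grayscale array
    spectrum = np.fft.fftshift(np.fft.fft2(image))        # center low frequencies
    amplitude = np.log1p(np.abs(spectrum))                # compress dynamic range
    h, w = amplitude.shape
    cy, cx = h // 2, w // 2
    block = amplitude[cy - keep // 2: cy + keep // 2,
                      cx - keep // 2: cx + keep // 2]     # low-frequency block
    return block.ravel()                                  # flat feature vector

img = np.random.rand(224, 224)                            # stand-in for a face crop
feats = fft_amplitude_features(img)
print(feats.shape)                                        # (256,)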
Tor Multipath Selection Based on Threat Awareness
CHEN Shangyu, HU Hongchao, ZHANG Shuai, ZHOU Dacheng, YANG Xiaohan
Computer Science. 2025, 52 (7): 363-371.  doi:10.11896/jsjkx.240900102
Abstract PDF(3221KB) ( 22 )   
References | Related Articles | Metrics
With the development and application of machine learning and deep learning, attackers can perform traffic analysis at malicious nodes and malicious ASes on Tor users' links and thereby mount de-anonymization attacks on Tor users. At present, one common class of defenses against traffic analysis inserts dummy packets or delays real packets to alter traffic characteristics, which introduces bandwidth and latency costs. The other class splits user traffic and transmits it over multiple paths, but it lacks awareness of malicious nodes and malicious ASes on the circuit: once an attacker collects a complete traffic trace, it is still difficult to resist traffic-analysis-based de-anonymization of Tor users. To make up for the lack of threat awareness in the path selection of multipath defenses, this paper proposes a threat-aware multipath selection algorithm that integrates malicious node awareness and malicious AS awareness. First, an improved node distance measure is proposed, and nodes are clustered with the K-Medoids algorithm using this measure, which improves malicious node detection. Then, the AS-aware algorithm is improved to satisfy the anonymity requirement. Finally, a threat-aware multipath selection algorithm is obtained by combining the malicious node detection and AS-aware algorithms. The experimental results show that the proposed algorithm can not only resist a variety of traffic analysis attacks but also guarantee certain performance requirements of Tor circuits.
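For reference, here is a compact K-Medoids (PAM-style) clustering sketch over a precomputed distance matrix. The paper's improved node distance measure is its own contribution and is not reproduced, so the matrix below is a random placeholder.

# Compact K-Medoids sketch: alternate nearest-medoid assignment and
# medoid update (most central member of each cluster) until stable.
import numpy as np

def k_medoids(D, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(max_iter):
        labels = np.argmin(D[:, medoids], axis=1)          # nearest medoid
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if len(members) == 0:
                continue
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(costs)]     # most central member
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, labels

D = np.random.rand(50, 50)
D = (D + D.T) / 2
np.fill_diagonal(D, 0)                                     # symmetric placeholder distances
medoids, labels = k_medoids(D, k=3)
print(medoids, np.bincount(labels))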
Happiness Prediction Approach via Adaptive Privacy Budget Allocation
LUO Yanjie, LI Lin, WU Xiaohua, LIU Jia
Computer Science. 2025, 52 (7): 372-378.  doi:10.11896/jsjkx.240700128
Abstract PDF(2461KB) ( 21 )   
References | Related Articles | Metrics
Happiness prediction aims to forecast individuals' life satisfaction and happiness indices by analyzing data. Online happiness prediction platforms hold vast amounts of data, which carries the risk of privacy breaches. Existing differentially private machine learning methods overlook the differing privacy needs of different attributes, and the common practice of splitting the privacy budget evenly injects excessive noise into the model, degrading its performance. To address these issues, this paper proposes Adaptive Privacy Budget Allocation for Happiness Prediction (APBA-DP). First, attributes are graded according to users' privacy preferences, and privacy budgets are allocated using information entropy. The happiness prediction model then establishes an attribute mapping layer to provide personalized privacy protection. Experimental results on the ESS and CGSS datasets show that, at a given privacy protection level, the accuracy of APBA-DP improves by 2.3%~4.4% over traditional differential privacy algorithms, while the success rate of membership inference attacks is reduced by 14.7% and 12.5% on average compared with models without differential privacy protection.
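The abstract only names the mechanism; one plausible reading is that attributes with higher information entropy receive a larger share of the total budget ε. A toy sketch under that assumption (not the paper's exact formula):

# Toy entropy-proportional privacy budget split.
import numpy as np

def attribute_entropy(column):
    _, counts = np.unique(column, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def allocate_budget(data, total_epsilon=1.0):
    # data: dict of attribute name -> 1-D array of categorical values
    ents = {a: attribute_entropy(v) for a, v in data.items()}
    z = sum(ents.values())
    return {a: total_epsilon * e / z for a, e in ents.items()}

data = {"income": np.random.randint(0, 10, 1000),   # higher-entropy attribute
        "gender": np.random.randint(0, 2, 1000)}    # lower-entropy attribute
print(allocate_budget(data, total_epsilon=1.0))     # income gets the larger share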
Accelerating Firmware Vulnerability Discovery Through Precise Localization of Intermediate Taint Sources and Dangerous Functions
ZHANG Guanghua, CHEN Fang, CHANG Jiyou, HU Boning, WANG He
Computer Science. 2025, 52 (7): 379-387.  doi:10.11896/jsjkx.240800052
Abstract PDF(2181KB) ( 22 )   
References | Related Articles | Metrics
Existing methods aim to identify the starting points of taint analysis accurately by recognizing intermediate taint sources, and in some cases filter out safe command hijacking points to streamline sink analysis, thereby reducing the number of paths to be analyzed and shortening vulnerability mining time. However, these methods spend excessive time identifying intermediate taint sources and fail to fully filter out safe dangerous-function call points, prolonging overall vulnerability mining time. The ALTSDF scheme addresses these issues by accurately locating intermediate taint sources and dangerous functions. To identify intermediate taint sources quickly and accurately as starting points for taint analysis, it collects the parameter strings used at the different call sites of each function to form the function's parameter string set, computes the proportion of this set that overlaps with a shared keyword set, and ranks functions in descending order of this proportion: the higher the proportion, the more likely the function is an intermediate taint source. When filtering dangerous-function call points, it statically back-traces parameter types to exclude call points whose parameter sources are constants, thereby eliminating safe command hijacking and buffer overflow points. Together, these steps reduce the time spent identifying intermediate taint sources, shrink the set of taint propagation paths to dangerous function calls, and shorten analysis time, speeding up vulnerability discovery. Tests on embedded Web programs from 21 real device firmware images show that ALTSDF significantly reduces the time spent on intermediate taint source inference compared with the FITS tool, reduces taint analysis paths by 8% compared with CINDY, and ultimately reduces vulnerability mining time by 32% compared with the combined solution of SaTC with FITS and CINDY. These results demonstrate that ALTSDF accelerates the identification of vulnerabilities in firmware embedded Web programs.
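The overlap-ratio ranking described in the abstract is straightforward to illustrate; in the sketch below, the data structures, function names, and keyword set are hypothetical stand-ins, not extracted from any real firmware.

# Sketch of ranking candidate intermediate taint sources by the overlap
# between their call-site parameter strings and a shared keyword set.
def rank_taint_source_candidates(call_site_params, shared_keywords):
    # call_site_params: dict of function name -> list of parameter strings
    scores = {}
    for func, params in call_site_params.items():
        param_set = set(params)
        if not param_set:
            continue
        overlap = param_set & shared_keywords
        scores[func] = len(overlap) / len(param_set)   # overlap proportion
    # higher proportion => more likely an intermediate taint source
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

shared_keywords = {"username", "password", "ip", "port"}   # e.g. from front-end files
call_site_params = {
    "nvram_get": ["username", "password", "ip", "lan_mtu"],
    "strcpy_wrap": ["buf", "tmp"],
}
for func, score in rank_taint_source_candidates(call_site_params, shared_keywords):
    print(f"{func}: {score:.2f}")    # nvram_get: 0.75, strcpy_wrap: 0.00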
Improved SVM Model for Industrial Control Anomaly Detection Based on Interference Sample Distribution Optimization
GU Zhaojun, YANG Xueying, SUI He
Computer Science. 2025, 52 (7): 388-398.  doi:10.11896/jsjkx.240500100
Abstract PDF(3031KB) ( 26 )   
References | Related Articles | Metrics
Most existing anomaly detection and classification methods for industrial control systems cannot effectively handle class imbalance and overlapping coupling. This paper proposes an improved SVM model based on adaptive differential evolution with hypersphere coverage (SJADE_SVM), which combines a hypersphere-coverage-based adaptive differential evolution oversampling technique with a support vector machine. First, the hypersphere covering algorithm is improved and a probability formula is constructed to identify and eliminate interference samples. Then, the synthetic minority oversampling technique is improved to relieve class imbalance and overlapping coupling by sampling safe samples. Finally, the adaptive differential evolution algorithm optimizes the locations and properties of the samples, and an SVM classifies them. On six real industrial control datasets and four UCI public datasets, three groups of experiments are designed: a performance comparison with anomaly detection classifiers such as logistic regression and Gaussian naive Bayes, an experimental comparison with methods that improve interference sample distribution, and a comparison of algorithm running times. The experimental results show that the model improves F-score and G-mean by up to 38.29% and 10.54%, respectively, ranks in the top three in classification results, and also shows significant advantages in statistical tests such as the non-parametric two-sided Wilcoxon signed-rank test and the Friedman test at α=0.05.
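As background for the oversampling step only, the sketch below shows generic SMOTE-style interpolation between minority samples followed by SVM training; the paper's hypersphere filtering and differential evolution refinements are not reproduced, and the synthetic data is illustrative.

# Minimal SMOTE-style oversampling plus SVM sketch (generic technique).
import numpy as np
from sklearn.svm import SVC

def smote_like(X_min, n_new, k=5, seed=0):
    rng = np.random.default_rng(seed)
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]                 # k nearest minority neighbors
        j = rng.choice(nbrs)
        lam = rng.random()
        synth.append(X_min[i] + lam * (X_min[j] - X_min[i]))  # interpolate
    return np.array(synth)

rng = np.random.default_rng(0)
X_maj = rng.normal(0, 1, (200, 4))
X_min = rng.normal(2, 1, (20, 4))
X_new = smote_like(X_min, n_new=180)                  # rebalance the minority class
X = np.vstack([X_maj, X_min, X_new])
y = np.array([0] * 200 + [1] * (20 + 180))
clf = SVC(kernel="rbf").fit(X, y)
print(clf.score(X, y))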