Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
CODEN JKIEBK
Featured Articles
Comprehensive Survey of LLM-based Agent Operating Systems
郭陆祥, 王越余, 李芊玥, 李莎莎, 刘晓东, 纪斌, 余杰. 大语言模型智能体操作系统研究综述[J]. 计算机科学, 2026, 53(1): 1-11.
GUO Luxiang, WANG Yueyu, LI Qianyue, LI Shasha, LIU Xiaodong, JI Bin, YU Jie. Comprehensive Survey of LLM-based Agent Operating Systems[J]. Computer Science, 2026, 53(1): 1-11. doi:10.11896/jsjkx.250500002
Large language model-based agent operating systems (Agent OS), as core platforms for integrating large models, tool resources, and multi-agent collaboration, are gradually becoming a key research direction for advancing general artificial intelligence. This paper systematically reviews research progress in the field of Agent OS. It begins with foundational theories, reviewing the evolution of large language models and progress in agent technology and traditional operating systems. It then elaborates on how the hierarchical architectures and modular designs of Agent OS achieve resource management and intelligent scheduling, focusing on typical architectures such as AIOS. Furthermore, it identifies technical bottlenecks in scalability, context integration, and security in current systems, and proposes future directions, including lightweight designs, self-supervised learning mechanisms, and dynamic scheduling algorithms to optimize multi-agent cooperation efficiency. The main contributions of this paper are integrating fragmented research into a clearer technical framework and highlighting the current limitations of Agent OS in covering emerging applications and industry-specific customizations. Future work should focus on enhancing the self-evolution capability of cross-domain Agent OS and accelerating their deployment across diverse fields.
Research and Application of Large Language Model Technology
Efficient Inference Techniques of Large Models in Real-world Applications:A Comprehensive Survey
刘利龙, 刘国明, 齐保元, 邓雪杉, 薛迪展, 钱胜胜. 实际应用场景中的大模型高效推理技术综述[J]. 计算机科学, 2026, 53(1): 12-28.
LIU Lilong, LIU Guoming, QI Baoyuan, DENG Xueshan, XUE Dizhan, QIAN Shengsheng. Efficient Inference Techniques of Large Models in Real-world Applications:A Comprehensive Survey[J]. Computer Science, 2026, 53(1): 12-28. doi:10.11896/jsjkx.250300030
In recent years, LLM technologies have developed rapidly, and their applications across industries have grown vigorously. From natural language processing to intelligent recommendation, and from information retrieval to automated writing, LLMs are becoming indispensable tools in many fields. However, as application scenarios diversify and demands increase, the efficiency of LLM inference has become an increasingly prominent problem. In practical applications, rapid and accurate inference is crucial for responding to user queries, handling large-scale data, and making real-time decisions. To address this challenge, academia has carried out extensive research on improving the inference efficiency of LLMs. This paper comprehensively surveys the literature on efficient LLM inference in practical application scenarios. Firstly, it introduces the principles of LLMs and analyzes how to improve LLM inference efficiency in practical application scenarios. Secondly, it proposes a taxonomy tailored for real-world applications, which consists of three main levels: algorithm optimization, parameter optimization, and system optimization, and uses it to summarize and categorize related work. Finally, it discusses potential future research directions.
-
LLM-based Business Process Adaptation Method to Respond Long-tailed Changes
邵欣怡, 朱经纬, 张亮. 基于大语言模型的业务流程长尾变化应变方法[J]. 计算机科学, 2026, 53(1): 29-38.
SHAO Xinyi, ZHU Jingwei, ZHANG Liang. LLM-based Business Process Adaptation Method to Respond Long-tailed Changes[J]. Computer Science, 2026, 53(1): 29-38. doi:10.11896/jsjkx.250100001
Business process adaptation is one of the fundamental and enduring tasks in business process management. It aims to enhance flexibility and achieve business objectives by adjusting process models and instances in response to an ever-changing environment. Long-tailed changes (LTCs), stemming from residual uncertainty during modeling, are inevitable and pose a significant challenge to business resilience. The most effective approach currently available is a tripartite collaboration framework, consisting of frontend business operators who perceive LTCs and carry out adaptation using domain-specific languages (DSL), a technical backend and managerial team that provides the service repository and compliance requirements, and an enabling tool that assists the adaptation. However, the diversity, complexity, and urgency of LTCs in varying spatiotemporal scenarios may exceed the frontend's ability to grasp the situation, formulate appropriate solutions, and express them in DSL. To address this limitation and further extend the framework, LLM-Adapt, a long-tailed change adaptation method based on large language models (LLMs), is proposed. By leveraging the generalization ability, content generation power, and embedded knowledge of events and countermeasures in LLMs, LLM-Adapt provides a more efficient and applicable adaptation mechanism. Firstly, a prompt engineering strategy tailored to the characteristics of LTCs is developed to enable the frontend to interact with LLMs in natural language and obtain adaptation solutions. Secondly, the adaptation solutions are functionally validated against the business baseline constraints set by the backend process owners. Furthermore, a new algorithm, SSDT-Lane, based on process structural similarity, is proposed to screen adaptation candidates against the current organizational and resource configurations. Case studies and experiments on both synthetic and real-world datasets demonstrate that LLM-Adapt outperforms existing methods in terms of accuracy, efficiency, and applicability.
-
Research on Architecture and Technology Pathways for Empowering Tactical Adversarial Simulation Experiments with LLMs
刘大勇, 董志明, 郭齐胜, 高昂, 邱雪欢. 大模型赋能战术对抗仿真实验体系架构及技术路径研究[J]. 计算机科学, 2026, 53(1): 39-50.
LIU Dayong, DONG Zhiming, GUO Qisheng, GAO Ang, QIU Xuehuan. Research on Architecture and Technology Pathways for Empowering Tactical Adversarial Simulation Experiments with LLMs[J]. Computer Science, 2026, 53(1): 39-50. doi:10.11896/jsjkx.250400064
Tactical confrontation simulation experiments are the core means of operational analysis, simulation training, and simulation-based equipment activities, and their levels of intelligence and automation directly affect the effectiveness of experiments and the generation of combat capabilities. To address the low efficiency of experimental design, model construction, scenario control, and human-computer interaction in traditional simulation experiments, a system architecture for empowering tactical confrontation simulation experiments with large language models is proposed, with reference to the MCP protocol. This architecture consists of five layers: the foundation layer, tool resource layer, AI agent layer, empowerment path layer, and application layer. The five-layer architecture is guided top-down and integrated bottom-up layer by layer, enabling the coupling and aggregation of large models with data resources and traditional small models, and empowering various simulation-based military activities. On this basis, the specific paths of large model empowerment in tactical confrontation simulation are discussed in detail: simulation experiment design, decision-making model construction, and scenario control. Finally, the challenges and countermeasures are analyzed.
-
Pre-training World Models from Videos with Generated Actions by Multi-modal Large Models
万盛华, 徐兴业, 甘乐, 詹德川. 基于多模态大模型辅助视频动作生成的预训练世界模型[J]. 计算机科学, 2026, 53(1): 51-57.
WAN Shenghua, XU Xingye, GAN Le, ZHAN Dechuan. Pre-training World Models from Videos with Generated Actions by Multi-modal Large Models[J]. Computer Science, 2026, 53(1): 51-57. doi:10.11896/jsjkx.250800033
Pre-training of world models is key to improving the sample efficiency of reinforcement learning. However, existing methods struggle to capture the causal mechanisms of state transitions because video data lacks explicit action labels. This paper presents MAPO (Multimodal-large-model-generated Action-based pre-training from videOs for world models), a novel pre-training framework. It combines the semantic understanding of vision-language models with the needs of kinematic modeling, overcoming the limitations of traditional pre-training methods in the absence of action semantics. During pre-training, MAPO uses a multimodal large model (QWEN2_5-VL-7B) to analyze video frame sequences and generate fine-grained semantic action descriptions, establishing action-state associations with causal explanations. It also designs a context quantization encoding mechanism to separate static scene features from dynamic control factors, improving cross-modal representation. During fine-tuning, MAPO uses a dual-network collaborative architecture to align the pre-trained kinematic features with real-environment actions. Experiments show that MAPO steadily improves average returns over baselines in 8 tasks on DeepMind Control Suite and Meta-World, especially in long-horizon tasks. This study offers a new cross-modal approach to world model training and highlights the importance of semantic action generation in causal reasoning.
-
Review of Graph Embedding Learning Research:From Simple Graph to Complex Graph
黄苗苗, 王慧颖, 王梅霞, 王业江, 赵宇海. 图嵌入学习研究综述:从简单图到复杂图[J]. 计算机科学, 2026, 53(1): 58-76.
HUANG Miaomiao, WANG Huiying, WANG Meixia, WANG Yejiang, ZHAO Yuhai. Review of Graph Embedding Learning Research:From Simple Graph to Complex Graph[J]. Computer Science, 2026, 53(1): 58-76. doi:10.11896/jsjkx.250300081
Graph data is highly expressive, but its complex structure makes it difficult to model efficiently, and effectively capturing its intrinsic information has become a challenging problem. Graph embedding methods have received increasing attention because they map high-dimensional sparse graphs into low-dimensional dense feature vectors while preserving the structural information of graphs. However, existing reviews do not summarize graph embedding methods comprehensively enough. In particular, they pay little attention to complex graph embedding, and therefore fail to systematically sort out the current state of research on graph embedding for diverse graph data. Therefore, this paper presents a systematic review of graph embedding learning methods from simple to complex graphs. Firstly, it gives the common definitions of various types of graphs and of graph embedding. Secondly, it systematically summarizes embedding methods on simple graphs, including shallow and deep embedding methods. Then, it summarizes embedding methods on complex graphs by graph type, focusing on the application of deep embedding techniques to complex graph structures such as dynamic graphs, heterogeneous graphs, multiplex graphs, and hypergraphs, to fill the gap left by the insufficient coverage of complex graph structures in the existing literature. Finally, it discusses practical application scenarios of graph embedding techniques and looks ahead to future development directions.
Database & Big Data & Data Science
Survey on Optimization B+ Tree Index for Persistent Memory
卢超, 杨朝树, 姚政竹, 刘颖, 张润宇. 基于持久内存的B+树索引优化综述[J]. 计算机科学, 2026, 53(1): 77-88.
LU Chao, YANG Chaoshu, YAO Zhengzhu, LIU Ying, ZHANG Runyu. Survey on Optimization B+ Tree Index for Persistent Memory[J]. Computer Science, 2026, 53(1): 77-88. doi:10.11896/jsjkx.250200109
The advent of persistent memory (PM) introduces new perspectives for index structure design while presenting challenges in data consistency, persistence overhead, and concurrency control. As a widely adopted index structure in storage systems, the B+ tree requires tailored adaptations to harness PM's unique features, including byte-addressability, non-volatility, and low latency. This paper focuses on the optimization of B+ tree indexes for persistent memory, beginning with an analysis of the challenges in designing PM-based B+ trees. Then, optimization strategies are reviewed from two perspectives: PM-only architectures and DRAM-PM hybrid architectures. For PM-only architectures, advancements in data consistency mechanisms, concurrency control optimizations, and innovative leaf node designs are summarized, with an emphasis on enhancing write operation efficiency while ensuring instant recovery. For DRAM-PM hybrid architectures, strategies based on leaf node structure optimization and auxiliary structure enhancement are examined, highlighting approaches to improve indexing performance through selective persistence. Finally, a detailed analysis of the design characteristics, advantages, and limitations of optimization schemes under both architectures is presented, offering insights into future research directions for B+ tree index optimization in these contexts.
-
KAN-based Unsupervised Multivariate Time Series Anomaly Detection Network
王成, 金城. 基于KAN的无监督多元时间序列异常检测网络[J]. 计算机科学, 2026, 53(1): 89-96.
WANG Cheng, JIN Cheng. KAN-based Unsupervised Multivariate Time Series Anomaly Detection Network[J]. Computer Science, 2026, 53(1): 89-96. doi:10.11896/jsjkx.241200190
Time series data is widely present in fields such as finance, healthcare, industry, and transportation. Time series anomaly detection (TSAD) is crucial for ensuring system stability and safety. Most current TSAD methods are unsupervised because anomaly samples are difficult to collect. However, these methods commonly suffer from over-generalization: the model reconstructs not only normal samples but also anomalous samples well, which leads to poor anomaly detection performance. Therefore, this paper proposes a time series anomaly detection method based on the Kolmogorov-Arnold representation theorem, called TS-KAN. TS-KAN leverages the parameter efficiency and local plasticity of KAN to better fit normal samples and alleviate the over-generalization problem. Additionally, this paper introduces a local feature enhancement layer, namely Local-KAN, to enhance the representation of temporal features and improve contextual anomaly detection capability. Experiments on five mainstream time series anomaly detection datasets demonstrate that TS-KAN significantly outperforms existing methods in anomaly detection capability.
-
Pedestrian Trajectory Prediction Method Based on Graph Attention Interaction
刘宏鉴, 邹丹平, 李萍. 基于图注意力交互的行人轨迹预测方法[J]. 计算机科学, 2026, 53(1): 97-103.
LIU Hongjian, ZOU Danping, LI Ping. Pedestrian Trajectory Prediction Method Based on Graph Attention Interaction[J]. Computer Science, 2026, 53(1): 97-103. doi:10.11896/jsjkx.250300132
Pedestrian trajectory prediction has made significant research progress in the fields of autonomous driving and intelligent transportation. Because of the dual influence of individual and environmental factors, pedestrian trajectories exhibit uncertainty and complexity, and accurately generating multimodal trajectories from the interactive features of trajectory data remains a challenge. One of the primary difficulties is accurately modeling the spatial-temporal interactions among pedestrians. To address the complexity of these interactions, this paper proposes a spatial-temporal graph neural network based on graph attention. The proposed method quantitatively represents the spatial interactions between pedestrians, focusing on key interactions, and represents pedestrian trajectory information as a directed spatial-temporal graph. The spatial position features and interaction features are extracted using a graph attention mechanism, while the temporal features are obtained using a self-attention mechanism. By integrating spatial-temporal feature information, the model generates multimodal future trajectories based on historical trajectory data and interaction information. Experiments on the publicly available ETH-UCY dataset demonstrate that the proposed method outperforms the baseline models, achieving improvements of 3.4% and 2.1% in ADE and FDE, respectively. Additionally, the proposed model has a shorter inference time, ensuring real-time inference responses. Visualization results further indicate that the predicted pedestrian trajectories are plausible and socially acceptable, demonstrating promising prospects for engineering applications.
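The ADE and FDE figures quoted in this abstract are the usual average and final displacement errors between a predicted and a ground-truth trajectory. The following is a minimal sketch of how these two metrics are commonly computed; the function names and the best-of-K convention for multimodal predictions are assumptions for illustration, since the abstract does not state the evaluation protocol used by the authors.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def displacement_errors(pred: List[Point], truth: List[Point]) -> Tuple[float, float]:
    """Return (ADE, FDE) for one predicted trajectory against the ground truth."""
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at every time step.
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # average over all predicted time steps
    fde = dists[-1]                # error at the final time step only
    return ade, fde


def best_of_k(preds: List[List[Point]], truth: List[Point]) -> Tuple[float, float]:
    """Hypothetical multimodal protocol: minimum ADE and minimum FDE over K samples."""
    errors = [displacement_errors(p, truth) for p in preds]
    return min(e[0] for e in errors), min(e[1] for e in errors)


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    print(displacement_errors(straight, truth))
    print(best_of_k([straight, truth], truth))
```

Lower values are better for both metrics, so an "improvement" in ADE or FDE means a reduction relative to the baseline.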
-
Joint Spectrum Embedding Clustering Algorithm Based on Multi-view Diversity Learning
李顺勇, 郑孟蛟, 李嘉茗, 赵兴旺. 基于多视图多样性学习的联合谱嵌入聚类算法[J]. 计算机科学, 2026, 53(1): 104-114.
LI Shunyong, ZHENG Mengjiao, LI Jiaming, ZHAO Xingwang. Joint Spectrum Embedding Clustering Algorithm Based on Multi-view Diversity Learning[J]. Computer Science, 2026, 53(1): 104-114. doi:10.11896/jsjkx.241100070
Most existing multi-view clustering algorithms rely only on low-order similarity information between views, fail to capture high-order structural features in the data effectively, and pay insufficient attention to the diversity features of multi-view data, which limits the accuracy and robustness of the clustering results. To solve these problems, JSEC, a joint spectral embedding clustering algorithm based on multi-view diversity learning, is proposed. Firstly, view diversity learning preserves multiple features of the data and effectively removes noise in the views. Then, a method for mining higher-order view information is proposed that makes the diversity features of the views as close as possible to the hybrid similarity graph, so that information from different views is integrated efficiently and the diversity and complementarity of the views are both exploited. Finally, in the spectral embedding module, the diversity feature matrix of each view is fused into the joint spectral embedding matrix, and graph clustering is performed by spectral clustering. In addition, an alternating iteration method is designed to optimize the objective function. Compared with the latest multi-view clustering algorithms, JSEC shows superior performance on 3 indicators across 5 small and medium-scale real datasets, as well as on 2 large-scale datasets. Compared with the second-best algorithm, the ARI index improves by 1.27% and 2.57% on datasets of different scales. The superiority of the algorithm is demonstrated both theoretically and experimentally.
-
Attribute Grouping-based Categorical Outlier Detection Using Isolation Forest Ensemble Strategy
宋亦静, 张继福. 基于隔离森林集成策略的分类型属性分组离群检测[J]. 计算机科学, 2026, 53(1): 115-127.
SONG Yijing, ZHANG Jifu. Attribute Grouping-based Categorical Outlier Detection Using Isolation Forest Ensemble Strategy[J]. Computer Science, 2026, 53(1): 115-127. doi:10.11896/jsjkx.241000163
Attribute grouping is an effective step in high-dimensional outlier detection. However, current ensemble strategies in attribute grouping-based outlier detection consider only the local outlier information within each attribute group and ignore the global outlier information across all attribute groups, which can bias the ensemble of attribute group outlier information. This paper proposes an attribute grouping outlier detection approach based on an isolation forest ensemble strategy that uses both the local and global outlier information of attribute groups. Firstly, attributes are automatically divided into several attribute groups based on the local and global correlation among attributes, and the outlier information of data objects is obtained within each attribute group. Secondly, from the perspective of attribute grouping, the ensemble bias of current outlier information ensemble strategies is theoretically analyzed, and the ensemble deviation coefficient is defined as an evaluation index for outlier information ensemble strategies. Then, an attribute grouping-based isolation forest ensemble strategy for categorical outlier detection is proposed. This strategy effectively captures the local and global outlier information of attribute groups and lowers the ensemble bias of attribute group outlier detection. Finally, experimental results on UCI datasets validate that the ensemble strategy effectively alleviates the ensemble bias and improves outlier detection performance. Compared with competing methods, the algorithm improves the AUC index and the detection efficiency by 7.83% and 48.43% on average, respectively.
-
Review of Retinal Image Analysis Methods for OCT/OCTA Based on Deep Learning
薛静艳, 夏佳楠, 霍蕊莉, 刘杰, 周雪忠. 基于深度学习的OCT/OCTA视网膜图像分析方法综述[J]. 计算机科学, 2026, 53(1): 128-140.
XUE Jingyan, XIA Jianan, HUO Ruili, LIU Jie, ZHOU Xuezhong. Review of Retinal Image Analysis Methods for OCT/OCTA Based on Deep Learning[J]. Computer Science, 2026, 53(1): 128-140. doi:10.11896/jsjkx.241100047
Deep learning is a branch of artificial intelligence that relies on deep neural networks for data processing and analysis. In recent years, deep learning has made significant breakthroughs in the field of medical imaging, especially in image classification, segmentation, and efficacy evaluation. In ophthalmology, there is an increasing need to apply deep learning techniques for efficient and accurate analysis of OCT and OCTA images. Compared with traditional manual methods, deep learning methods show higher accuracy and a higher degree of automation in dealing with complex fundus structures and pathological changes. However, most previous reviews focus on a single imaging modality or a single task, and often ignore the correlation between different imaging technologies, as well as the acceptability and correlation between tasks. This paper summarizes the commonly used datasets, systematically reviews segmentation methods for retina-related disease biomarkers based on different OCT and OCTA devices, and summarizes typical classification methods for retina-related diseases from the perspective of different disease characteristics. Finally, it looks ahead to future research directions from the perspectives of data privacy and security, model interpretability, and model universality, providing a valuable reference for subsequent research.
Computer Graphics & Multimedia
PKHOI:Enhancing Human-Object Interaction Detection Algorithms with Prior Knowledge
赵文豪, 梅萌, 王小平, 罗航宇. PKHOI:利用先验知识增强人-物交互检测算法[J]. 计算机科学, 2026, 53(1): 141-152.
ZHAO Wenhao, MEI Meng, WANG Xiaoping, LUO Hangyu. PKHOI:Enhancing Human-Object Interaction Detection Algorithms with Prior Knowledge[J]. Computer Science, 2026, 53(1): 141-152. doi:10.11896/jsjkx.250100086
Human-object interaction (HOI) detection plays a crucial role in visual scene understanding. With the advancement of deep learning technologies, vision-based interaction detection models have achieved promising performance. However, most existing methods make little use of prior logical knowledge, which sometimes leads to unreasonable predictions. Additionally, while some methods employ spatial information and human pose information for reasoning, they only construct losses between inference results and annotations, which prevents decoders from learning accurate implicit relationships. Therefore, this paper proposes the PKHOI method, which enhances existing HOI detection algorithms by leveraging prior knowledge and effectively improves their accuracy. Specifically, it constructs a logical rule table from the training set, encompassing object functionality, spatial relationships, human poses, and verb co-occurrence. These rules are transformed into first-order logic and mapped to continuous space. The prior logical rules are then incorporated into neural networks through loss functions during training and matrix multiplication during inference, enhancing model accuracy. Furthermore, this paper proposes a method to generate human-object pair queries by fusing multimodal information (spatial, semantic, and human pose information). Combined with logical loss functions, this approach guides the decoder to learn more implicit knowledge. The proposed method is applied to two mainstream HOI detection algorithms, UPT and PViC, and evaluated on the V-COCO, HICO-DET, and Flickr30k datasets. Experimental results demonstrate that the proposed method can effectively improve the performance of existing approaches.
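To illustrate what "mapping first-order logic rules to continuous space" and using them as a loss can look like, the sketch below relaxes a rule of the form (A and B) implies C with Łukasiewicz fuzzy operators and penalizes its violation. This is only one common relaxation, chosen here as an assumption; the abstract does not say which relaxation PKHOI uses, and the rule and scores in the usage lines are hypothetical.

```python
def conjunction_truth(a: float, b: float) -> float:
    """Lukasiewicz t-norm: truth degree of (A and B) for degrees in [0, 1]."""
    return max(0.0, a + b - 1.0)


def implication_truth(antecedent: float, consequent: float) -> float:
    """Lukasiewicz implication: truth degree of (A -> B) for degrees in [0, 1]."""
    return min(1.0, 1.0 - antecedent + consequent)


def rule_loss(antecedent: float, consequent: float) -> float:
    """Penalty that is 0 when the rule holds and grows as it is violated."""
    return 1.0 - implication_truth(antecedent, consequent)


if __name__ == "__main__":
    # Hypothetical rule: (object is a cup AND hand is near mouth) -> verb "drink".
    is_cup, near_mouth, predicts_drink = 0.95, 0.90, 0.20
    premise = conjunction_truth(is_cup, near_mouth)
    # Strong premise, weak conclusion: the rule is violated, so the loss is large.
    print(rule_loss(premise, predicts_drink))
```

Because both operators are piecewise linear in the model's scores, a penalty of this kind can be added to an ordinary training loss, which is the general idea behind incorporating rules "through loss functions during training".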
-
EvR-DETR:Event-RGB Fusion for Lightweight End-to-End Object Detection
周秉泉, 蒋杰, 陈江民, 詹礼新. EvR-DETR:融合事件与RGB图像的轻量级端到端目标检测[J]. 计算机科学, 2026, 53(1): 153-162.
ZHOU Bingquan, JIANG Jie, CHEN Jiangmin, ZHAN Lixin. EvR-DETR:Event-RGB Fusion for Lightweight End-to-End Object Detection[J]. Computer Science, 2026, 53(1): 153-162. doi:10.11896/jsjkx.250300021
Event cameras based on neuromorphic spike signals can provide information about illumination changes, compensating for the performance degradation of traditional RGB cameras in object detection under adverse environments. However, existing methods that fuse event cameras with conventional cameras suffer from large model parameter counts and non-end-to-end training, which restrict the effectiveness of modality fusion. To address this, this paper proposes a lightweight end-to-end object detection framework that integrates event and RGB information through multi-granularity fusion of multi-scale features across different network levels. By implementing lightweight fusion modules with reparameterized convolutions and enabling end-to-end training, the proposed framework enhances the model's capability to extract complementary information from both modalities, overcoming challenging conditions in autonomous driving. Evaluated on the large-scale PKU-SOD dataset, which contains vehicular visual data under low-light, high-speed motion blur, and normal illumination scenarios, the proposed method significantly reduces model parameters compared with state-of-the-art multimodal approaches while improving detection accuracy and inference speed.
-
Co-salient Object Detection Guided by Category Labels
李芳芳, 孔雨秋, 刘洋, 李朋玥. 基于类别标签引导的协同显著性目标检测方法[J]. 计算机科学, 2026, 53(1): 163-172.
LI Fangfang, KONG Yuqiu, LIU Yang, LI Pengyue. Co-salient Object Detection Guided by Category Labels[J]. Computer Science, 2026, 53(1): 163-172. doi:10.11896/jsjkx.250100071
Acquiring pixel-level labels is laborious and time-consuming, whereas image-level labels can be obtained much more easily. However, the use of image-level labels for co-salient object detection (CoSOD) remains underexplored. This paper presents a two-stage approach for weakly supervised CoSOD that relies solely on image-level labels (class labels) for model training. By utilizing the semantic information of class labels, this approach enables the localization and segmentation of co-salient objects. In the first stage, a pseudo-label generation network, supervised by class labels, is proposed to generate saliency maps for input images. In the second stage, a co-salient object segmentation network is trained using these saliency maps as pseudo-labels. A self-corrective learning strategy is also incorporated to enhance model performance. This paper is the first to propose a training approach for CoSOD based on image-level labels. Experiments on three representative datasets demonstrate the effectiveness and feasibility of the proposed method.
-
Camouflaged Object Detection for Aerial Images Based on Bidirectional Cross-attention Cross-domain Fusion
李昂, 章杰元, 刘逊韵. 基于双向交叉注意力跨域融合的航拍图像伪装目标识别方法[J]. 计算机科学, 2026, 53(1): 173-179.
LI Ang, ZHANG Jieyuan, LIU Xunyun. Camouflaged Object Detection for Aerial Images Based on Bidirectional Cross-attention Cross-domain Fusion[J]. Computer Science, 2026, 53(1): 173-179. doi:10.11896/jsjkx.250300009
To address the challenges that camouflaged objects in aerial images blend closely into the environment and that real-time performance is in high demand, this paper proposes a camouflaged object detection model for aerial images using bidirectional cross-attention cross-domain fusion. Firstly, a two-branch feature extraction network is constructed to extract features from both the RGB domain and the frequency domain. Meanwhile, frequency features and RGB features are cross-fused at multiple scales using bidirectional cross-attention fusion modules, effectively improving the network's representational capacity. Experimental results show that the proposed model achieves a better balance between target recognition accuracy and real-time performance than other representative models.
-
Multi-task Speech Emotion Recognition Incorporating Gender Information
姚佳, 李冬冬, 王喆. 结合性别信息的多任务语音情感识别[J]. 计算机科学, 2026, 53(1): 180-186.
YAO Jia, LI Dongdong, WANG Zhe. Multi-task Speech Emotion Recognition Incorporating Gender Information[J]. Computer Science, 2026, 53(1): 180-186. doi:10.11896/jsjkx.241200006
Existing methods for speech emotion recognition usually rely on deep learning models to extract acoustic features, but most of them focus only on modeling generic features and fail to fully exploit prior knowledge in the data that is closely related to emotion. To this end, this paper proposes an end-to-end multi-task learning framework that uses the self-supervised pre-trained model WavLM to extract speech features rich in emotional information, and introduces gender recognition as an auxiliary task to account for the influence of gender differences on emotion recognition. To address the learning imbalance caused by fixed weight calculation in traditional multi-task learning frameworks, this paper proposes a Temperature-aware Dynamic Weight Averaging (TA-DWA) method. This method balances the learning speeds of different tasks by dynamically adjusting the temperature coefficient, and achieves more reasonable weight allocation by incorporating the rate of change in task losses. Experimental results on the IEMOCAP and EMODB datasets demonstrate that the proposed approach significantly improves emotion recognition accuracy. These findings validate the effectiveness of using gender recognition as an auxiliary task and highlight the advantages of the dynamic weighting strategy in multi-task learning.
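For context, the sketch below shows the standard Dynamic Weight Averaging rule that TA-DWA builds on: each task's weight is a softmax over the ratio of its two most recent losses, divided by a temperature. The temperature is a fixed argument here; how TA-DWA adjusts it dynamically is not described in the abstract, so that part is deliberately left out. The function name, the default temperature, and the loss values are assumptions for illustration.

```python
import math
from typing import List


def dwa_weights(prev_losses: List[float],
                prev_prev_losses: List[float],
                temperature: float = 2.0) -> List[float]:
    """Standard DWA: tasks whose loss is falling more slowly get larger weights."""
    if not prev_losses or len(prev_losses) != len(prev_prev_losses):
        raise ValueError("loss lists must be non-empty and of equal length")
    if temperature <= 0 or any(loss <= 0 for loss in prev_prev_losses):
        raise ValueError("temperature and earlier losses must be positive")
    # Relative descent rate r_k = L_k(t-1) / L_k(t-2); a value near 1 means slow progress.
    rates = [a / b for a, b in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in rates]
    total = sum(exps)
    num_tasks = len(rates)
    # Softmax over the rates, scaled so that the weights sum to the number of tasks.
    return [num_tasks * e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical losses for (emotion recognition, gender recognition).
    weights = dwa_weights(prev_losses=[0.9, 0.5], prev_prev_losses=[1.0, 1.0])
    print(weights)  # the emotion task, which improved less, gets the larger weight
```

A larger temperature flattens the softmax and pushes the weights toward equal values, which is why the choice of temperature directly affects how aggressively the slower task is favored.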
-
Multimodal Sentiment Analysis for Interactive Fusion of Dual Perspectives Under Cross-modal Inconsistent Perception
卜韵阳, 齐彬廷, 卜凡亮. 跨模态不一致感知下双视角交互融合的多模态情感分析[J]. 计算机科学, 2026, 53(1): 187-194.
BU Yunyang, QI Binting, BU Fanliang. Multimodal Sentiment Analysis for Interactive Fusion of Dual Perspectives Under Cross-modal Inconsistent Perception[J]. Computer Science, 2026, 53(1): 187-194. - BU Yunyang, QI Binting, BU Fanliang
- Computer Science. 2026, 53 (1): 187-194. doi:10.11896/jsjkx.241100029
In social media, people’s comments usually describe a certain sentiment region in the corresponding image, so there is correspondence information between image and text. Most previous multimodal sentiment analysis methods explore the interactions between images and text from a single perspective only, capturing the correspondence between image regions and text words, which leads to suboptimal results. In addition, data on social media is strongly personal and subjective, and the sentiment in the data is multidimensional and complex, which gives rise to data with weak image-text sentiment consistency. To address these two problems, a multimodal sentiment analysis model with interactive fusion of two perspectives under cross-modal inconsistency perception is proposed. On the one hand, cross-modal interaction of image and text features from both global and local perspectives provides a more comprehensive and accurate sentiment analysis, which improves the performance and applicability of the model. On the other hand, image-text inconsistency scores are calculated to represent the degree of image-text inconsistency and are used to dynamically regulate the weights of the unimodal and multimodal representations in the final sentiment features, thus improving the robustness of the model. Extensive experiments are conducted on two public datasets, MVSA-Single and MVSA-Multiple, and the results demonstrate the validity and superiority of the proposed model over existing baseline models, with F1 values increasing by 0.59 percentage points and 0.39 percentage points, respectively.
-
Facial Expression Recognition with Channel Attention Guided Global-Local Semantic Cooperation
吕景刚, 高硕, 李玉芝, 周金. 通道注意力指导全局-局部语义协同的表情识别[J]. 计算机科学, 2026, 53(1): 195-205.
LYU Jinggang, GAO Shuo, LI Yuzhi, ZHOU Jin. Facial Expression Recognition with Channel Attention Guided Global-Local Semantic Cooperation[J]. Computer Science, 2026, 53(1): 195-205. - LYU Jinggang, GAO Shuo, LI Yuzhi, ZHOU Jin
- Computer Science. 2026, 53 (1): 195-205. doi:10.11896/jsjkx.250900051
In facial emotion recognition, noisy data caused by poor image quality often degrades recognition accuracy, while limited sample sizes hinder the ability of conventional deep learning models to distinguish noisy from clean facial features. To address these challenges, this paper proposes a novel framework, CAFSC, which integrates an adaptive channel attention strategy with a local-global collaborative mechanism to enhance recognition performance. A noise-robust data augmentation strategy is first introduced, combining Gaussian blur, perspective transformation, and color perturbation with image stitching, flipping, and rotation. This not only preserves subtle facial expression cues but also improves image clarity, dataset diversity, and model robustness. The paper further designs a Channel Attention Module with Adaptive Channel Reordering (CAM-ACR) that reorders channel features, followed by grouped convolution and concatenation, to capture multi-dimensional local semantics. A local-global feature enhancement mechanism is then employed, in which local features guide global feature extraction to strengthen the representation of complex emotional patterns and contextual information. Finally, an improved cross-attention fusion module achieves bidirectional interaction and collaborative enhancement between global and local features. Experimental results show that CAFSC achieves accuracies of 91.21% on RAF-DB, 98.31% on CK+, 74.54% on FER2013, and 86.74% on FER2013PLUS, demonstrating superior learning efficiency and convergence stability compared to existing methods.
-
Method for Symbol Detection in Substation Layout Diagrams Based on Text-Image Multimodal Fusion
范家斌, 王宝会, 陈继轩. 基于文本-图像多模态融合的变电所布局图纸图符检测方法[J]. 计算机科学, 2026, 53(1): 206-215.
FAN Jiabin, WANG Baohui, CHEN Jixuan. Method for Symbol Detection in Substation Layout Diagrams Based on Text-Image Multimodal Fusion[J]. Computer Science, 2026, 53(1): 206-215. - FAN Jiabin, WANG Baohui, CHEN Jixuan
- Computer Science. 2026, 53 (1): 206-215. doi:10.11896/jsjkx.250200090
To address the issues of inconvenient operation, low efficiency, and difficulty in managing recognition data during the manual identification of substation layout drawings, this paper proposes a morphology-based segmentation method for large-size drawings and a text-image multimodal fusion method for drawing symbol detection. Combined with post-processing methods for symbol detection, these form a detectable and adaptable approach to symbol detection in large-size layout drawings that can be generalized to other fields. The text-image multimodal fusion symbol detection model builds on the open-set object detection model YOLO-World by introducing the CTCM, SOFEM, and CJFFM. These enhancements significantly improve the model's performance in symbol recognition. Using the proposed methods, symbols are detected on a dataset of actual general layout drawings of high-speed railway traction substations. Compared to the original model, the improved model maintains a similar level of complexity while reaching an average precision of 97.5% for symbol recognition, with mAP@50:95 and mAP@90 increasing by 1.1% and 3.0%, respectively.
-
Visual Floorplan Localization Based on BEV Perception
陈集伟, 陈泽彬, 谭光. 基于BEV感知的视觉平面图定位[J]. 计算机科学, 2026, 53(1): 216-223.
CHEN Jiwei, CHEN Zebin, TAN Guang. Visual Floorplan Localization Based on BEV Perception[J]. Computer Science, 2026, 53(1): 216-223. - CHEN Jiwei, CHEN Zebin, TAN Guang
- Computer Science. 2026, 53 (1): 216-223. doi:10.11896/jsjkx.250300045
The visual floorplan localization task achieves pose estimation by matching visual observations with a floorplan representation of the scene. In practical applications, effectively integrating the geometric and semantic correlations between observation and floorplan during matching is particularly important for improving localization accuracy. However, existing methods have two main limitations. First, they fail to fully utilize the semantic information within the camera’s field of view. Second, they lack a joint matching mechanism for geometric and semantic cues. To address these issues, this study proposes a visual floorplan localization framework based on BEV perception, which includes three core components. First, the BEV semantic mapping module constructs a BEV semantic representation of local scenes through multimodal image projection transformation, achieving a structured representation of observation data. Second, the expected observation generation module builds an expected observation database within the floorplan space and generates observation data rapidly through a differentiable rendering method. Finally, the multi-level matching and localizing module proposes a geometric-semantic joint matching mechanism, which integrates the geometric layout and semantic category information from BEV observations through a hierarchical matching strategy to achieve accurate matching with the floorplan. The experimental results show that the framework improves localization recall from 0.32% and 4.82% to 3.12% and 58.77% on the publicly available dataset Structured3D and the self-built simulation environment dataset IndoorEnv, respectively, which is significantly better than the existing baseline methods Laser and F3Loc. This validates the effectiveness and robustness of the proposed method in complex indoor scenes.
-
Data Augmentation Methods for Tibetan-Chinese Machine Translation Based on Long-tail Words
格桑加措, 尼玛扎西, 群诺, 嘎玛扎西, 道吉扎西, 罗桑益西, 拉毛吉, 钱木吉. 基于长尾词分布的藏汉机器翻译数据增强方法[J]. 计算机科学, 2026, 53(1): 224-230.
KALZANG Gyatso, NYIMA Tashi, QUN Nuo, GAMA Tashi, DORJE Tashi, LOBSANG Yeshi, LHAMO Kyi, ZOM Kyi. Data Augmentation Methods for Tibetan-Chinese Machine Translation Based on Long-tail Words[J]. Computer Science, 2026, 53(1): 224-230. - KALZANG Gyatso, NYIMA Tashi, QUN Nuo, GAMA Tashi, DORJE Tashi, LOBSANG Yeshi, LHAMO Kyi, ZOM Kyi
- Computer Science. 2026, 53 (1): 224-230. doi:10.11896/jsjkx.241200147
Existing Tibetan-Chinese machine translation corpora exhibit significant domain data imbalance, resulting in inconsistent translation performance of trained models across different domains. Back-translation, a common data augmentation method, enhances model performance by generating diverse pseudo-parallel data. However, traditional back-translation approaches struggle to fully account for the domain imbalance in the data distribution, leading to limited improvements in translation performance for resource-scarce domains even as overall performance increases. This paper proposes a strategy that analyzes in depth the distribution of long-tail words in existing corpora and uses these long-tail words for targeted selection of monolingual data from the existing Tibetan-Chinese bilingual corpora. Pseudo-data is then generated through back-translation to perform data augmentation. This strategy aims to improve the overall performance of Tibetan-Chinese machine translation models while enhancing translation performance in data-scarce domains. Experimental results demonstrate that, by fully considering domain data imbalance and incorporating long-tail word data augmentation, the translation performance of machine translation models in resource-scarce domains can be effectively improved, providing a targeted approach to the issue of domain data imbalance.
Artificial Intelligence
-
Reinforcement Learning Method for Solving Flexible Job Shop Scheduling Problem Based on Double Layer Attention Network
王皓焱, 李崇寿, 李天瑞. 基于双层注意力网络的强化学习方法求解柔性作业车间调度问题[J]. 计算机科学, 2026, 53(1): 231-240.
WANG Haoyan, LI Chongshou, LI Tianrui. Reinforcement Learning Method for Solving Flexible Job Shop Scheduling Problem Based on Double Layer Attention Network[J]. Computer Science, 2026, 53(1): 231-240. - WANG Haoyan, LI Chongshou, LI Tianrui
- Computer Science. 2026, 53 (1): 231-240. doi:10.11896/jsjkx.250100088
The flexible job shop scheduling problem (FJSP), a variant of the job shop scheduling problem, has become an important research topic in the intelligent transformation of modern manufacturing because of its wide applicability. In recent years, deep reinforcement learning (DRL) has been applied to solve flexible job shop scheduling problems. However, the fact that operations can be assigned to multiple compatible machines with different processing times adds complexity to decision making and state representation. This paper proposes an end-to-end deep reinforcement learning framework based on an improved attention mechanism and the proximal policy optimization algorithm to solve the FJSP. Considering the characteristics of the heterogeneous disjunctive graph structure, it designs a double-layer attention network based on hierarchical attention, including node-level attention layers and type-level attention layers, to fully extract the complex information between operations and machines and support high-quality scheduling decisions. Experimental results on synthetic and public datasets show that the proposed method outperforms traditional priority dispatching rules and current state-of-the-art DRL methods in both performance and generalization ability while maintaining high efficiency.
-
Energy-efficient Task Scheduling on Heterogeneous Multicore Real-time Systems with Synchronization
赵小松, 黄超, 李鉴, 康玉龙. 基于任务同步的异构多核实时系统节能调度算法[J]. 计算机科学, 2026, 53(1): 241-251.
ZHAO Xiaosong, HUANG Chao, LI Jian, KANG Yulong. Energy-efficient Task Scheduling on Heterogeneous Multicore Real-time Systems with Synchronization[J]. Computer Science, 2026, 53(1): 241-251. - ZHAO Xiaosong, HUANG Chao, LI Jian, KANG Yulong
- Computer Science. 2026, 53 (1): 241-251. doi:10.11896/jsjkx.250300148
Research on energy-efficient scheduling of synchronous tasks in multi-core real-time systems mainly focuses on homogeneous multi-core processor platforms. Heterogeneous multi-core processor architectures can exploit system performance more effectively. If existing research is applied directly to heterogeneous multi-core systems, guaranteeing schedulability may lead to higher energy consumption. Using DVFS technology, this paper studies the energy-efficient scheduling problem based on task synchronization in heterogeneous multi-core real-time systems and proposes an algorithm named SA-LESF (Synchronization Aware-Largest Energy Saved First). The algorithm iteratively optimizes the speed configuration of all tasks until all tasks reach their maximum energy-saving speed configuration. In addition, SA-LESF-DR (Synchronization Aware-Largest Energy Saved First with Dynamic Reclamation), which reuses dynamic slack time, is further proposed. While ensuring that real-time tasks remain schedulable, the algorithm applies corresponding reuse strategies to further reduce system energy consumption. The simulation results show that the SA-LESF and SA-LESF-DR algorithms have advantages in energy consumption: under the same task set, they can save up to 30% more energy than other algorithms.
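To illustrate why lowering task speed under DVFS saves energy, the sketch below uses a common simplified textbook model in which dynamic power grows with the cube of the normalized speed, so the energy for a fixed amount of work grows with the square of the speed. This is not the SA-LESF algorithm; the model, the function names, and the numbers are assumptions, and synchronization and heterogeneity are ignored.

```python
def energy(cycles, speed, coeff=1.0):
    """Energy under the assumed model: power = coeff * speed**3 and
    execution time = cycles / speed, so energy = coeff * cycles * speed**2."""
    return coeff * cycles * speed ** 2

def slowest_feasible_speed(cycles, deadline, speeds):
    """Return the lowest available speed that still meets the deadline,
    or None if no speed level is fast enough."""
    feasible = [s for s in speeds if cycles / s <= deadline]
    return min(feasible) if feasible else None

# Hypothetical task: 4 units of work, deadline 10, four normalized speed levels.
speed_levels = [0.25, 0.5, 0.75, 1.0]
chosen = slowest_feasible_speed(4.0, 10.0, speed_levels)
saved = energy(4.0, 1.0) - energy(4.0, chosen)
```

In this toy setting the task runs at half speed and still meets its deadline; a real scheduler would also have to account for blocking on shared resources and for per-core power characteristics.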
-
Collaborative Semantics Fusion for Multi-agent Behavior Decision-making
段鹏婷, 温超, 王保平, 王珍妮. 基于协作语义融合的多智能体行为决策方法[J]. 计算机科学, 2026, 53(1): 252-261.
DUAN Pengting, WEN Chao, WANG Baoping, WANG Zhenni. Collaborative Semantics Fusion for Multi-agent Behavior Decision-making[J]. Computer Science, 2026, 53(1): 252-261. - DUAN Pengting, WEN Chao, WANG Baoping, WANG Zhenni
- Computer Science. 2026, 53 (1): 252-261. doi:10.11896/jsjkx.250300145
Multi-agent decision-making has extensive engineering applications, particularly in cooperative control tasks. Policy gradient-based reinforcement learning methods, which directly model policy distributions, are more conducive to exploring diverse strategies in complex reward scenarios. These methods also demonstrate consistently high empirical efficiency across both discrete and continuous action spaces. Although parameter-sharing mechanisms are widely adopted in policy gradient frameworks to improve convergence efficiency for collaborative tasks, the lack of attention to action semantic modeling introduces critical limitations, especially in mitigating action homogenization among agents. To solve this issue, this paper proposes the CSF method from a graph-based modeling perspective. The CSF framework employs a graph autoencoder to learn correlation-aware semantic embeddings within the action space, and then achieves information fusion through dynamic integration of agent-specific behavioral features with semantic embeddings. This fusion mechanism aggregates collaborative behavioral information into agent-specific latent representations, enabling interdependent policy space exploration across agents. Comprehensive experiments conducted on diverse complex task scenarios within the StarCraft and Google Research Football environments demonstrate that CSF achieves superior performance over state-of-the-art algorithms, thus validating its effectiveness in facilitating inter-agent collaboration.
-
Research on User Data-driven App Fading Functions
贾经冬, 侯鑫, 王哲, 黄坚. 用户数据驱动的App消退功能研究[J]. 计算机科学, 2026, 53(1): 262-270.
JIA Jingdong, HOU Xin, WANG Zhe, HUANG Jian. Research on User Data-driven App Fading Functions[J]. Computer Science, 2026, 53(1): 262-270. - JIA Jingdong, HOU Xin, WANG Zhe, HUANG Jian
- Computer Science. 2026, 53 (1): 262-270. doi:10.11896/jsjkx.250100070
To promote App function iteration, most existing studies mine user requirements from user reviews in order to improve existing functions or add new ones for version updates, while neglecting to identify functions that should be eliminated. To address this issue, a user data-driven analysis method for App fading functions is proposed. User reviews are collected from the App market. Keyword templates are built to filter out reviews that mention fading functions. From these reviews, function phrases are mined using syntax paradigms. A classifier is trained on these phrases to identify the fading functions to be studied, and a dataset of fading functions is built. The lifecycle of a fading function is determined by backtracking through version update logs and user reviews. User reviews over the lifecycle of fading functions are then analyzed. A word weight threshold method based on text sentiment analysis is proposed to detect and correct false ratings. The BERT algorithm is used to classify the review data. A BERTopic-Corex topic model is proposed to generate theme words for reviews. Key user reviews are identified based on the preceding analysis results and the word count of reviews. In this way, fading functions can be effectively identified and analyzed from user reviews. Experimental results and examples demonstrate the feasibility and effectiveness of the proposed method.
-
Cross-language Knowledge Graph Entity Alignment Based on Meta-learning
陈壮壮, 邓怡辰, 余敦辉, 肖奎. 基于元学习的跨语言知识图谱实体对齐框架[J]. 计算机科学, 2026, 53(1): 271-277.
CHEN Zhuangzhuang, DENG Yichen, YU Dunhui, XIAO Kui. Cross-language Knowledge Graph Entity Alignment Based on Meta-learning[J]. Computer Science, 2026, 53(1): 271-277. - CHEN Zhuangzhuang, DENG Yichen, YU Dunhui, XIAO Kui
- Computer Science. 2026, 53 (1): 271-277. doi:10.11896/jsjkx.241100069
Cross-language knowledge graph entity alignment is a key step in connecting knowledge graphs of different languages, and it plays an important role in tasks such as multilingual information retrieval and data fusion. However, existing entity alignment methods rely on a variety of information in the knowledge graph, so they cannot handle entity alignment on sparse knowledge graphs well and adapt poorly to new languages. To solve this problem, a cross-language entity alignment framework based on meta-learning is proposed. The framework is divided into two stages, the outer loop and the inner loop. In the outer loop stage, multiple tasks are selected through a sampling method based on task similarity, and the model is then jointly trained on these tasks to construct the teacher model. In the inner loop stage, the teacher model trained in the outer loop stage guides the student model in training and in the entity alignment task, in order to improve the entity alignment performance and generalization of the student model. Experimental results on the SRPRS and WK31-60K datasets show that the proposed framework improves Hits@1 by 3.5%, Hits@10 by 4.0%, and MRR by 6.3% on average in the entity alignment problem.
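The Hits@k and MRR figures reported above are standard ranking metrics. As a reminder of what they measure, the sketch below computes both from the rank at which each test entity's true counterpart appears in the candidate list; the rank values are made-up illustration data, not results from the paper.

```python
def hits_at_k(ranks, k):
    """Fraction of test entities whose true counterpart is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test entities (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Hypothetical ranks of the correct target entity for five source entities.
ranks = [1, 3, 2, 15, 1]
hits1 = hits_at_k(ranks, 1)
hits10 = hits_at_k(ranks, 10)
mrr = mean_reciprocal_rank(ranks)
```

Hits@1 rewards only exact top-ranked matches, while MRR also gives partial credit to near misses, which is why the two can move by different amounts.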
-
Bidirectional Prompt-Tuning for Event Argument Extraction with Topic and Entity Embeddings
陈千, 成凯璇, 郭鑫, 张晓霞, 王素格, 李艳红. 融合主题和实体嵌入的双向提示调优事件论元抽取[J]. 计算机科学, 2026, 53(1): 278-284.
CHEN Qian, CHENG Kaixuan, GUO Xin, ZHANG Xiaoxia, WANG Suge, LI Yanhong. Bidirectional Prompt-Tuning for Event Argument Extraction with Topic and Entity Embeddings[J]. Computer Science, 2026, 53(1): 278-284. - CHEN Qian, CHENG Kaixuan, GUO Xin, ZHANG Xiaoxia, WANG Suge, LI Yanhong
- Computer Science. 2026, 53 (1): 278-284. doi:10.11896/jsjkx.250100046
In recent years, prompt learning has been widely applied in the field of natural language processing. Research shows that argument roles are highly semantically related to the topics in a text, yet existing prompt tuning methods overlook entity information and the interactions between arguments. Therefore, this paper proposes a bidirectional prompt tuning event argument extraction model (TEPEAE) that integrates topic and entity embeddings. First, topic features are extracted using a topic model and embedded into a topic representation. Second, prompt templates are constructed based on trigger words, arguments, and entity information, with topic embeddings incorporated into the template. Third, a masked language model (MLM) is used to predict the role label for each entity. Finally, labels are mapped from the label word space to the argument role space. Experiments on the ACE2005-EN and ERE-EN datasets show that TEPEAE outperforms baseline models and achieves F1 scores of 79.53% and 78.60%, respectively, which demonstrates its effectiveness. Moreover, it continues to perform well in low-resource scenarios, further demonstrating its robustness.
-
Survey on Security of Android SDKs
许腾, 刘路遥, 姜灏宇, 罗畅, 李珩, 袁巍. Android SDK安全性研究综述[J]. 计算机科学, 2026, 53(1): 285-297.
XU Teng, LIU Luyao, JIANG Haoyu, LUO Chang, LI Heng, YUAN Wei. Survey on Security of Android SDKs[J]. Computer Science, 2026, 53(1): 285-297. - XU Teng, LIU Luyao, JIANG Haoyu, LUO Chang, LI Heng, YUAN Wei
- Computer Science. 2026, 53 (1): 285-297. doi:10.11896/jsjkx.250500023
The Android SDK is a software toolkit used for Android application development. Since a single Android SDK can be integrated into multiple applications, its security implications propagate in a chain through the installation ecosystem, exposing the Android ecosystem to wide-ranging threats from SDKs. In recent years, a series of security issues related to Android SDKs, such as cross-library harvesting of private data and SDK library resource merging and overlay, have attracted close attention from both industry and academia. However, there remains a lack of comprehensive reviews on the security of Android SDKs. This paper systematically organizes existing related work, focusing on two key dimensions: the security of internal component code in Android SDKs and the security of runtime data interaction. For the former, it compiles research findings for both system SDKs and third-party SDKs. For the latter, it summarizes studies on SDK self-violations and external intrusions into SDKs. Additionally, this paper analyzes recent advancements in Android SDK security research, introduces performance metrics for horizontal comparison, and traces the field's development context and evolutionary process. Finally, it discusses future research directions for combining this field with emerging technologies such as AI large language models.
Information Security
-
Privacy-preserving Computation in Edge Service Scenario of Internet of Vehicles: A Review of Technical Basis and Research Progress
李佳惠, 李英龙, 陈铁明. 车联网边缘服务场景下的隐私保护计算:技术基础与研究进展综述[J]. 计算机科学, 2026, 53(1): 298-322.
LI Jiahui, LI Yinglong, CHEN Tieming. Privacy-preserving Computation in Edge Service Scenario of Internet of Vehicles: A Review of Technical Basis and Research Progress[J]. Computer Science, 2026, 53(1): 298-322. - LI Jiahui, LI Yinglong, CHEN Tieming
- Computer Science. 2026, 53 (1): 298-322. doi:10.11896/jsjkx.250200113
With the deep integration of intelligent vehicles, edge computing, and wireless communication technologies, the “vehicle-road-cloud” collaborative intelligent Internet of Vehicles (IoV) edge service system is developing rapidly, optimizing traffic efficiency and driving safety through real-time data processing. However, the interaction and computation of massive vehicle perception data (such as location trajectories and driving behaviors) in an open edge network environment face privacy leakage risks such as eavesdropping attacks and inference attacks. Although existing privacy protection schemes have gradually strengthened privacy protection, the dynamic topology and resource constraints of the IoV edge environment create a conflict between the strength of privacy protection and service performance. Privacy-preserving computation, as an effective means of privacy protection, is of significant importance for safeguarding users’ personal rights and promoting the sustainable development of the IoV industry, and has become one of the key research areas for securing IoV services. This paper first outlines the edge service architecture of the IoV and analyzes the potential privacy leakage risks within it. Then, based on the different mechanisms of privacy-preserving computation technologies, it categorizes and discusses privacy-preserving computation methods for the IoV based on data transformation, secure multi-party computation, federated learning, and trusted execution environment technologies. On this basis, these methods are systematically analyzed and compared along four key evaluation dimensions: privacy leakage risk, data utility, overhead, and scalability, together with corresponding optimization strategies. Finally, the challenges faced by privacy-preserving computation technologies for IoV edge services and future research directions are discussed.
-
Section Sparse Attack:A More Powerful Sparse Attack Method
温泽瑞, 姜天, 黄子健, 崔晓晖. 分区稀疏攻击:一种更高效的黑盒稀疏对抗攻击[J]. 计算机科学, 2026, 53(1): 323-330.
WEN Zerui, JIANG Tian, HUANG Zijian, CUI Xiaohui. Section Sparse Attack:A More Powerful Sparse Attack Method[J]. Computer Science, 2026, 53(1): 323-330. - WEN Zerui, JIANG Tian, HUANG Zijian, CUI Xiaohui
- Computer Science. 2026, 53 (1): 323-330. doi:10.11896/jsjkx.241200002
Deep neural networks (DNNs) have long been threatened by adversarial attacks, particularly sparse attacks in black-box settings. These attacks rely on the target model’s output to guide the generation of adversarial examples and typically deceive image classifiers by perturbing only a few pixels. However, existing sparse attack methods suffer from inefficiencies due to fixed step-size strategies and poor initialization approaches, which fail to fully exploit perturbations. To address these issues, Section Sparse Attack (SSA) is proposed. Unlike other methods that use fixed step sizes, SSA adapts the step size based on historical search information, thus accelerating the discovery of adversarial examples. Additionally, recognizing that sparse attacks in black-box settings tend to perturb high-importance pixels, SSA uses an initialization strategy based on CAM, an interpretability method, to quickly identify and initialize populations of high-importance pixels. Finally, to confine perturbations within critical sections and maximize their effectiveness during the search process, SSA adopts a section search strategy to reduce the search space. Experimental results demonstrate that SSA outperforms state-of-the-art (SOTA) methods in attacking traditional convolutional networks and Vision Transformer (ViT) models. Specifically, SSA achieves a 2%~8% improvement in attack success rates and approximately a 30% improvement in efficiency.
-
PBFT Consensus Algorithm Based on Bayesian Theory
潘彦炀, 杨槟豪, 纪庆革. 基于贝叶斯理论的PBFT共识算法[J]. 计算机科学, 2026, 53(1): 331-340.
PAN Yanyang, YANG Binhao, JI Qingge. PBFT Consensus Algorithm Based on Bayesian Theory[J]. Computer Science, 2026, 53(1): 331-340. - PAN Yanyang, YANG Binhao, JI Qingge
- Computer Science. 2026, 53 (1): 331-340. doi:10.11896/jsjkx.241100053
A consensus algorithm, such as PoW or PoS, is a method for ensuring that all nodes in a blockchain network reach agreement. The consensus mechanism affects the performance of the blockchain system. To address the low throughput and long delay of existing blockchain consensus algorithms, this paper improves the PBFT algorithm by introducing a dynamic trust model based on Bayesian theory, which updates the trust of nodes in each consensus round through a node trust mechanism and groups nodes according to their trust degree. While preserving the stability of PBFT, the scheme also refines the mechanism for nodes joining and leaving the network, which improves scalability. Experimental tests show that, compared with the traditional PBFT algorithm, the improved algorithm increases throughput by 25%, and its delay is only half that of PBFT when the number of nodes reaches 50. On these two indicators it also achieves a 20%~30% improvement over the HotStuff and Paxos algorithms.
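One common way to build a Bayesian trust score is to treat each node's honest and faulty behaviour counts as observations under a Beta prior and use the posterior mean as the trust value. The sketch below shows that idea together with a simple threshold grouping. The abstract does not specify the paper's actual trust model or grouping rule, so the formula, the threshold, and all names here are assumptions.

```python
def beta_trust(successes, failures, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean of a Beta(prior_alpha, prior_beta) prior after observing
    `successes` honest rounds and `failures` faulty rounds."""
    return (successes + prior_alpha) / (successes + failures + prior_alpha + prior_beta)

def group_by_trust(history, threshold=0.6):
    """Split nodes into a consensus group and a candidate group by trust score.
    The threshold is a hypothetical parameter, not a value from the paper."""
    consensus, candidates = [], []
    for node_id, (ok, bad) in history.items():
        target = consensus if beta_trust(ok, bad) >= threshold else candidates
        target.append(node_id)
    return consensus, candidates

# Hypothetical per-node history: (honest rounds, faulty rounds).
history = {"n1": (9, 1), "n2": (2, 8), "n3": (0, 0)}
consensus_group, candidate_group = group_by_trust(history)
```

With a uniform prior, a node that has no history starts at a neutral score of 0.5, so newly joined nodes are neither trusted nor penalized until they have taken part in some rounds.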
-
Research on Safety Analysis of Mode Transition of Flight Guidance System Based on STPA
左辰翠, 黄志球, 胡军, 谢健, 徐恒, 石帆. 基于STPA的飞行导引系统模式转换的安全性分析研究[J]. 计算机科学, 2026, 53(1): 341-352.
ZUO Chencui, HUANG Zhiqiu, HU Jun, XIE Jian, XU Heng, SHI Fan. Research on Safety Analysis of Mode Transition of Flight Guidance System Based on STPA[J]. Computer Science, 2026, 53(1): 341-352. - ZUO Chencui, HUANG Zhiqiu, HU Jun, XIE Jian, XU Heng, SHI Fan
- Computer Science. 2026, 53 (1): 341-352. doi:10.11896/jsjkx.241000156
During automatic flight of civil aircraft, mode transitions of the flight guidance system are an important factor affecting safety and should be subject to a comprehensive safety analysis. Traditional safety analysis methods mainly focus on the failure of individual components, ignoring the safety issues arising from nonlinear interactions between components. For this reason, this paper adopts the System-Theoretic Process Analysis (STPA) method to conduct a systematic and complete analysis of the mode transitions of the flight guidance system. Meanwhile, because parts of the STPA method require manual analysis, the formal model checking tool UPPAAL, which is based on the theory of timed automata, is introduced to model and verify the system, so as to ensure the correctness of the control structure diagram and identify the genuine Unsafe Control Actions (UCAs), thus avoiding wasted resources. Finally, a standardized causal factor analysis framework is proposed to analyze the verified UCAs one by one. A case study shows that the proposed method is effective for the safety analysis of complex aviation systems.
-
Variational Quantum Algorithm for Solving Discrete Logarithms
张兴兰, 容潇军. 基于变分量子的离散对数求解算法[J]. 计算机科学, 2026, 53(1): 353-362.
ZHANG Xinglan, RONG Xiaojun. Variational Quantum Algorithm for Solving Discrete Logarithms[J]. Computer Science, 2026, 53(1): 353-362. - ZHANG Xinglan, RONG Xiaojun
- Computer Science. 2026, 53 (1): 353-362. doi:10.11896/jsjkx.241100181
The discrete logarithm problem is a significant challenge in number theory, and classical computers lack efficient algorithms for solving it. As a result, the discrete logarithm problem is widely used in public key cryptosystems, and solving it efficiently would pose a direct threat to the security of these systems. With the advent of quantum computing, researchers have begun exploring quantum computers as a potential way to solve the discrete logarithm problem. Currently, most quantum algorithms for the discrete logarithm problem are based on Shor’s algorithm. However, due to the inherent limitations of Shor’s algorithm, these algorithms often face issues such as large quantum circuit depth, high qubit usage, and complex post-processing steps, which makes them difficult to implement on NISQ computers. To address these issues, this paper proposes a variational quantum algorithm for solving the discrete logarithm problem. The algorithm leverages the parallelism of quantum computing to compute the modular exponentiation of parameterized quantum states. It also designs a marked solution circuit that maps valid solutions of the discrete logarithm problem onto auxiliary qubits. A classical optimizer is then used to iteratively adjust the parameters of the parameterized quantum circuit, continuously reducing the value of a designed loss function. Finally, the optimized parameters from the classical optimizer are fed into the measurement circuit, where the solution to the discrete logarithm problem can be obtained with high probability. Compared to Shor’s algorithm, the proposed method significantly reduces the required number of qubits and nearly halves the quantum circuit depth. Furthermore, this paper provides a detailed design of the quantum circuit and verifies the correctness of the proposed algorithm using the Qiskit package in Python.
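For readers unfamiliar with the problem, it asks for an exponent x such that g^x ≡ h (mod p). The sketch below is a purely classical brute-force search that makes the problem statement concrete and shows how a candidate solution can be checked; it has nothing to do with the variational quantum circuit in the paper, and the function name and the small numbers are illustrative assumptions.

```python
def discrete_log_bruteforce(g, h, p):
    """Try exponents x = 0, 1, ..., p - 2 and return the first x with
    g**x % p == h % p, or None if none exists.
    The running time is linear in p, i.e. exponential in the bit length of p,
    which is why this approach is infeasible at cryptographic sizes."""
    value = 1  # holds g**x % p for the current x
    for x in range(p - 1):
        if value == h % p:
            return x
        value = (value * g) % p
    return None

# Toy instance: find x with 3**x congruent to 13 modulo 17.
x = discrete_log_bruteforce(3, 13, 17)
# Verifying a candidate is cheap with Python's built-in modular exponentiation.
is_valid = x is not None and pow(3, x, 17) == 13
```

The asymmetry visible here, where checking a solution is fast but finding one is slow, is what makes the problem useful for public key cryptography.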
-
Software-defined Perimeter Anonymous Authentication Scheme Based on Verifiable Credentials
SI Xuege, JIA Hongyong, LI Weixian, ZENG Junjie, MEN Ruirui. Software-defined Perimeter Anonymous Authentication Scheme Based on Verifiable Credentials[J]. Computer Science, 2026, 53(1): 363-370. doi:10.11896/jsjkx.250100080
-
The standard software-defined perimeter (SDP) architecture employs identity-based authentication and authorization strategies to monitor and audit access activities in real time. However, users must fully disclose their identity information to obtain access, potentially exposing sensitive data unrelated to the requested service and introducing privacy risks. To address the ineffective protection of user privacy and the vulnerability of access records to malicious linkage in the current SDP architecture, this paper proposes an anonymous authentication scheme for SDP based on verifiable credentials (VCs). The scheme constructs a VC verification algorithm using bilinear pairing and CL-signature, and integrates the VC system with the standard SDP architecture to enable anonymous user access without altering the original single-packet authorization and TLS secure connection authentication model. Theoretical analysis demonstrates that the proposed scheme resists common network attacks, including knock amplification and identity impersonation. Experimental results show that it achieves shorter authentication latency in multi-node network environments.
-
Attack Graph-assisted Deep Reinforcement Learning-based Service Function Chain Attack Recovery Method
ZHOU Deqiang, JI Xinsheng, YOU Wei, QIU Hang, YANG Jie. Attack Graph-assisted Deep Reinforcement Learning-based Service Function Chain Attack Recovery Method[J]. Computer Science, 2026, 53(1): 371-381. doi:10.11896/jsjkx.250300076
-
Service function chains (SFC) can provide customized services for the six scenarios of 6G through on-demand orchestration and flexible networking, and 6G networks in turn place higher requirements on SFC. Resilience is receiving attention for the first time in 6G networks, requiring SFC to ensure stable and continuous provision of fundamental functions, with resilience recovery being a key stage. Existing recovery methods are often based on backup mechanisms, which waste resources, and they ignore the impact of network attack characteristics on recovery, which makes the recovery effect difficult to guarantee. Considering the characteristics of network attacks, this paper uses an SFC attack graph to determine a customized attack recovery scheme for SFC, including the VNF recovery range and the required attack recovery level. To find a placement that conforms to the customized attack recovery scheme, a deep reinforcement learning-based SFC attack recovery method (DRL-SFCAR) is proposed. Extensive simulation results show that DRL-SFCAR outperforms the three comparison methods in terms of delay and recovery cost while ensuring the recovery success rate. DRL-SFCAR can meet the attack recovery level requirement and minimize the long-term recovery cost, achieving customized recovery for SFC in network attack scenarios.
-
Composite Trigger Backdoor Attack Combining Visual and Textual Features
HUANG Rong, TANG Yingchun, ZHOU Shubo, JIANG Xueqin. Composite Trigger Backdoor Attack Combining Visual and Textual Features[J]. Computer Science, 2026, 53(1): 382-394. doi:10.11896/jsjkx.241200105
-
A backdoor attack covertly poisons the dataset and subtly induces the victim model to associate the poisoned data with a target label, thereby posing a threat to the trustworthiness and security of artificial intelligence technologies. Existing backdoor attack methods generally face a trade-off between effectiveness and stealthiness: triggers with high effectiveness tend to lack stealthiness, while those with good stealthiness tend to have weak effectiveness. To address this issue, this paper proposes a composite trigger for clean-label backdoor attacks that combines visual and textual features. The composite trigger is composed of two learnable triggers: a universal part and an individual part. During the design and optimization of the composite trigger, pixel values within patches are constrained to follow a congruence rule. This constraint aims to induce the victim model to capture the congruence, thereby establishing an association between the trigger and the target label and forming a backdoor. The universal trigger ensures that pixel values within patches of poisoned images are congruent modulo 2, maintaining a fixed signal pattern across all poisoned images. The individual trigger, on the other hand, ensures that edge pixel values of poisoned images are congruent with respect to the weight of the LoSB, rendering its signal specific to the edge positions of each image. The two parts of the trigger are integrated to balance effectiveness and stealthiness. Building on this, this paper introduces the CLIP model, which combines visual and textual features, to construct the supervisory signal for training the composite trigger. The pre-trained CLIP model has strong generalization capabilities, enabling the composite trigger to absorb disparate textual features, which helps diminish the image content features and further enhances the trigger’s effectiveness. Experiments are conducted on three datasets: CIFAR-10, ImageNet, and GTSRB. The results show that the proposed method can evade detection by backdoor defense techniques and outperforms the second-best method by an average of 2.48 percentage points in terms of attack success rate. Additionally, it surpasses the second-best method by an average of 10.61%, 0.31%, 68.44%, and 46.38% in peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), gradient magnitude similarity deviation (GMSD), and learned perceptual image patch similarity (LPIPS), respectively. The results of the ablation experiments demonstrate the advantage of combining visual and textual features in guiding the training of the composite trigger. These results also validate the roles of both the universal and individual triggers in enhancing the effectiveness and stealthiness of the backdoor attack.
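To make the "congruent modulo 2" constraint concrete, here is a minimal sketch that forces every pixel of a square patch in a grayscale image to share one parity while changing each pixel by at most one gray level. It is an assumption-laden illustration of the congruence rule only: the paper's triggers are learned, and the function name `enforce_parity`, the patch layout, and the plain nested-list image format are all hypothetical.

```python
def enforce_parity(image, top, left, size, parity=0):
    """Return a copy of `image` whose patch pixels are congruent to `parity` mod 2.

    `image` is a list of rows of integers in [0, 255]. The patch is the
    size x size square whose upper-left corner is (top, left).
    """
    out = [row[:] for row in image]  # copy so the input is not modified
    for r in range(top, top + size):
        for c in range(left, left + size):
            v = out[r][c]
            if v % 2 != parity:
                # Shift by one gray level while staying inside [0, 255].
                out[r][c] = v - 1 if v == 255 else v + 1
    return out


clean = [[10, 11, 200],
         [255, 0, 201],
         [7, 8, 9]]
poisoned = enforce_parity(clean, top=0, left=0, size=2, parity=0)
print(poisoned)  # pixels outside the 2x2 patch are unchanged
```

Because each pixel moves by at most one gray level, the change is visually negligible, which is the intuition behind using a congruence pattern as a stealthy signal.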
-
Tor Multipath Selection Based on Threaten Awareness
CHEN Shangyu, HU Hongchao, ZHANG Shuai, ZHOU Dacheng, YANG Xiaohan. Tor Multipath Selection Based on Threaten Awareness[J]. Computer Science, 2026, 53(1): 395-403. doi:10.11896/jsjkx.241200118
-
With the development and application of machine learning and deep learning, attackers can conduct traffic analysis through malicious nodes and malicious autonomous systems (AS) on Tor user links, and thereby carry out de-anonymization attacks on Tor users. At present, one common class of defenses against traffic analysis attacks inserts dummy packets or delays real packets to change traffic characteristics, which introduces bandwidth and delay costs. The other class defends by splitting user traffic and transmitting it over multiple paths. This class lacks awareness of malicious nodes and malicious AS on the circuit, so when an attacker collects a complete traffic trace, it still struggles to resist de-anonymization of Tor users by traffic analysis. To make up for the lack of threat awareness in the path selection of multi-path defense methods, a multi-path selection algorithm based on threat awareness is proposed, which integrates malicious node awareness and malicious AS awareness. First, an improved node distance measure is proposed and used to cluster nodes with the K-Medoids algorithm, which improves the detection of malicious nodes. Next, the AS-awareness algorithm is improved to better meet anonymity requirements. Finally, a multi-path selection algorithm based on threat awareness is obtained by combining malicious node detection with the AS-awareness algorithm. The experimental results show that the proposed algorithm can not only resist a variety of traffic analysis attacks, but also meet certain performance requirements of Tor circuits.
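The clustering step mentioned above can be sketched as a plain K-Medoids loop that accepts any pairwise distance function. This is a minimal illustration under stated assumptions: the paper's improved node distance is not given in the abstract, so `dist` is a placeholder supplied by the caller, the initialization with the first k points is a simplification, and the name `k_medoids` is hypothetical.

```python
import math


def k_medoids(points, k, dist, max_iter=100):
    """Cluster `points` into k groups around medoids (actual data points).

    Returns (medoid_indices, labels). `dist` is any symmetric distance
    function; the first k points serve as initial medoids (an assumption).
    """
    medoids = list(range(k))
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)

        # Update step: the new medoid minimizes total distance in its cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # keep the old medoid if its cluster is empty
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(dist(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)

        if sorted(new_medoids) == sorted(medoids):
            break  # converged: medoids no longer change
        medoids = new_medoids

    labels = [
        min(range(len(medoids)), key=lambda t: dist(p, points[medoids[t]]))
        for p in points
    ]
    return medoids, labels


# Toy relay features (hypothetical), clustered with Euclidean distance.
relays = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
medoids, labels = k_medoids(relays, k=2, dist=math.dist)
print(medoids, labels)
```

Unlike K-Means, the cluster centers here are always real data points, so the method works with any distance measure, including one that has no meaningful average.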
-
Adaptive Box-constraint Optimization Method for Adversarial Attacks
ZHOU Qiang, LI Zhe, TAO Wei, TAO Qing. Adaptive Box-constraint Optimization Method for Adversarial Attacks[J]. Computer Science, 2026, 53(1): 404-412. doi:10.11896/jsjkx.250600144
-
Deep neural networks are vulnerable to adversarial example attacks. Existing transfer-based attack optimization methods commonly employ fixed constraint upper bounds to represent imperceptibility intensity, focusing primarily on improving attack success rates. However, such approaches overlook inter-sample sensitivity variations, resulting in suboptimal imperceptibility (measured by Fréchet Inception Distance, FID). Inspired by adaptive gradient methods, this paper proposes an adversarial attack optimization method with adaptive constraint upper bounds, aiming to enhance imperceptibility. First, a sensitivity metric based on gradient magnitudes is established to quantify sensitivity differences across samples. Building on this, adaptive constraint upper bounds are determined to enable differentiated perturbation handling: low-intensity perturbations are applied to sensitive samples and high-intensity perturbations to non-sensitive ones. Furthermore, by replacing the projection operator and step size, the adaptive constraint mechanism is seamlessly integrated into existing attack methods. Experiments on the ImageNet-Compatible dataset demonstrate that, under equivalent black-box attack success rates, the proposed method reduces FID by 2.68% to 3.49% compared with traditional fixed-constraint methods. Additionally, the MI-LA attack algorithm based on this approach achieves 6.32% to 26.35% lower FID than five state-of-the-art adversarial attack methods.
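The idea of a per-sample constraint bound can be sketched as follows. This is only one plausible reading of the abstract, not the paper's formula: it assumes that a larger gradient magnitude means a more sensitive sample, maps magnitudes linearly onto a range of bounds, and then projects a perturbed sample onto the resulting box. The names `adaptive_eps` and `project`, the linear mapping, and the range endpoints are all assumptions made for illustration.

```python
def adaptive_eps(grad_magnitudes, eps_min, eps_max):
    """Map per-sample gradient magnitudes to per-sample upper bounds.

    Assumption: higher magnitude = more sensitive sample = smaller bound.
    The linear interpolation is illustrative, not the paper's rule.
    """
    lo, hi = min(grad_magnitudes), max(grad_magnitudes)
    if hi == lo:  # all samples equally sensitive: use the midpoint
        return [(eps_min + eps_max) / 2 for _ in grad_magnitudes]
    return [
        eps_max - (g - lo) / (hi - lo) * (eps_max - eps_min)
        for g in grad_magnitudes
    ]


def project(x_adv, x, eps):
    """Clip x_adv into the L-infinity ball of radius eps around x and into [0, 1]."""
    return [
        min(max(a, max(o - eps, 0.0)), min(o + eps, 1.0))
        for a, o in zip(x_adv, x)
    ]


# Three samples with hypothetical mean gradient magnitudes.
bounds = adaptive_eps([1.0, 3.0, 2.0], eps_min=4 / 255, eps_max=16 / 255)
print(bounds)  # the most sensitive sample (3.0) receives the smallest bound
print(project([0.9, 0.1], [0.5, 0.05], eps=0.1))
```

In an iterative attack, `project` would replace the fixed-radius projection applied after each gradient step, with `eps` taken from the bound computed for that sample.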
-
Screen-shooting Resilient Watermarking Method for Document Image Based on Attention Mechanism
ZHANG Xiaomin, ZHAO Junzhi, HE Hongjie. Screen-shooting Resilient Watermarking Method for Document Image Based on Attention Mechanism[J]. Computer Science, 2026, 53(1): 413-422. doi:10.11896/jsjkx.241100040
-
Screen-shooting resilient watermarking algorithms are important in fields such as copyright protection and traceability. Existing screen-shooting resilient watermarking algorithms mostly focus on natural images and neglect document images. Document carriers inherently contain less redundant information, making it challenging to balance the robustness and imperceptibility of the watermark. To address this issue, a screen-shooting resilient watermarking method for document images based on an attention mechanism is proposed. To enhance the imperceptibility of the watermark, an attention feature fusion module is introduced in the encoder network to adaptively aggregate shallow and deep features. To improve the robustness of watermark extraction, an adaptive channel-spatial attention module is designed in the decoder network to emphasize features that are particularly important in both the channel and spatial dimensions. Additionally, a Moiré distortion layer is designed for screen-shooting noise simulation to enhance the algorithm’s robustness against real Moiré distortions. Experimental results demonstrate that the proposed method achieves an average PSNR of 33.4 dB, SSIM of 0.9885, RMSE of 5.48, and an average extraction accuracy of 99.49% in various screen-shooting scenarios. In terms of image quality and watermark robustness, the proposed method outperforms existing similar methods.
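For reference, the PSNR and RMSE figures quoted above are standard image-quality metrics that can be computed as in the sketch below. It uses the usual definitions, PSNR = 10·log10(MAX² / MSE) and RMSE = √MSE, on 8-bit grayscale images stored as nested lists; the function names and the image format are assumptions for illustration, and the paper's exact evaluation protocol is not described in the abstract.

```python
import math


def mse(original, distorted):
    """Mean squared error between two equally sized grayscale images."""
    diffs = [
        (a - b) ** 2
        for row_o, row_d in zip(original, distorted)
        for a, b in zip(row_o, row_d)
    ]
    return sum(diffs) / len(diffs)


def rmse(original, distorted):
    """Root mean squared error, in gray levels."""
    return math.sqrt(mse(original, distorted))


def psnr(original, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, distorted)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)


cover = [[100, 120], [140, 160]]
marked = [[101, 119], [142, 160]]
print(rmse(cover, marked), psnr(cover, marked))
```

Higher PSNR and lower RMSE both indicate that the watermarked image is closer to the original, which is how these metrics serve as proxies for imperceptibility.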
-
Deep Learning Model Protection Method Based on Robust Partitioned Watermarking
LYU Zhenghao, XIAN Hequn. Deep Learning Model Protection Method Based on Robust Partitioned Watermarking[J]. Computer Science, 2026, 53(1): 423-429. doi:10.11896/jsjkx.241200005
-
Machine learning often involves high costs for data collection and model training, which raises concerns among model owners about unauthorized replication or misuse that could infringe on their intellectual property (IP). Consequently, protecting the intellectual property of machine learning models has become a pressing issue. In response, researchers have introduced the concept of model watermarking. Similar to how digital watermarking embeds identifiable marks into images, model watermarking embeds unique identifiers into machine learning models to facilitate copyright verification. However, existing watermarking techniques face several limitations in practical applications. First, embedding watermarks inevitably affects model performance to some degree. Second, watermarks can be removed through techniques such as model fine-tuning. To address these challenges, this paper proposes a novel neural network watermarking scheme that employs a regional and staged embedding approach. This method aims both to minimize the impact on model performance and to enhance the robustness of the watermark itself. Experiments conducted on the MNIST, CIFAR-10, and CIFAR-100 datasets validate the effectiveness of the proposed scheme. The results demonstrate that this watermarking approach maintains a high watermark retention rate while having minimal impact on model performance. Compared with existing baseline watermarking schemes, this method achieves performance improvements of up to 18 percentage points. Additionally, the proposed scheme exhibits strong robustness against attacks such as fine-tuning and remains unaffected by model pruning operations. Even if adversaries attempt to completely remove the watermark, they would have to significantly degrade the model’s performance as a trade-off.
-
Robust Hash Learning Method Based on Dual-teacher Self-supervised Distillation
MIAO Zhuang, WANG Ya-peng, LI Yang, WANG Jia-bao, ZHANG Rui, ZHAO Xin-xin
Computer Science. 2022, No.10: 159-168. Abstract (3657) PDF (4472KB) (15842)
-
Data-free Model Evaluation Method Based on Feature Chirality
MIAO Zhuang, JI Shipeng, WU Bo, FU Ruizhi, CUI Haoran, LI Yang
Computer Science. 2024, No.7: 337-344. Abstract (2284) PDF (3883KB) (15435)
-
Review of Time Series Prediction Methods
YANG Hai-min, PAN Zhi-song, BAI Wei
Computer Science. 2019, No.1: 21-28. Abstract (4376) PDF (1294KB) (12754)
-
Polynomial Time Algorithm for Hamilton Circuit Problem
JIANG Xin-wen
Computer Science. 2020, No.7: 8-20. Abstract (6299) PDF (1760KB) (12575)
-
Web Application Page Element Recognition and Visual Script Generation Based on Machine Vision
LI Zi-dong, YAO Yi-fei, WANG Wei-wei, ZHAO Rui-lian
Computer Science. 2022, No.11: 65-75. Abstract (885) PDF (2624KB) (12479)
-
Optimization Method of Streaming Storage Based on GCC Compiler
GAO Xiu-wu, HUANG Liang-ming, JIANG Jun
Computer Science. 2022, No.11: 76-82. Abstract (1375) PDF (2713KB) (12174)
-
Research and Progress on Bug Report-oriented Bug Localization Techniques
NI Zhen, LI Bin, SUN Xiao-bing, LI Bi-xin, ZHU Cheng
Computer Science. 2022, No.11: 8-23. Abstract (1043) PDF (2280KB) (11582)
-
Patch Validation Approach Combining Doc2Vec and BERT Embedding Technologies
HUANG Ying, JIANG Shu-juan, JIANG Ting-ting
Computer Science. 2022, No.11: 83-89. Abstract (618) PDF (2492KB) (11571)
-
Semantic Restoration and Automatic Transplant for ROP Exploit Script
SHI Rui-heng, ZHU Yun-cong, ZHAO Yi-ru, ZHAO Lei
Computer Science. 2022, No.11: 49-54. Abstract (635) PDF (2661KB) (11256)
-
Decision Tree Algorithm-based API Misuse Detection
LI Kang-le, REN Zhi-lei, ZHOU Zhi-de, JIANG He
Computer Science. 2022, No.11: 30-38. Abstract (1030) PDF (3144KB) (11191)
-
Study on Effectiveness of Quality Objectives and Non-quality Objectives for Automated Software Refactoring
GUO Ya-lin, LI Xiao-chen, REN Zhi-lei, JIANG He
Computer Science. 2022, No.11: 55-64. Abstract (597) PDF (3409KB) (11179)
-
AutoUnit: Automatic Test Generation Based on Active Learning and Prediction Guidance
ZHANG Da-lin, ZHANG Zhe-wei, WANG Nan, LIU Ji-qiang
Computer Science. 2022, No.11: 39-48. Abstract (1180) PDF (2609KB) (11063)
-
Study on Integration Test Order Generation Algorithm for SOA
ZHANG Bing-qing, FEI Qi, WANG Yi-chen, YANG Zhao
Computer Science. 2022, No.11: 24-29. Abstract (885) PDF (1866KB) (10492)
-
Studies on Community Question Answering-A Survey
ZHANG Zhong-feng, LI Qiu-dan
Computer Science. 2010, No.11: 19-23. Abstract (456) PDF (551KB) (10463)
-
Research Progress and Challenge of Programming by Examples
YAN Qian-yu, LI Yi, PENG Xin
Computer Science. 2022, No.11: 1-7. Abstract (815) PDF (1921KB) (9875)
-
Survey of Cloud-edge Collaboration
CHEN Yu-ping, LIU Bo, LIN Wei-wei, CHENG Hui-wen
Computer Science. 2021, No.3: 259-268. Abstract (2259) PDF (1593KB) (9094)
-
Survey of Distributed Machine Learning Platforms and Algorithms
SHU Na, LIU Bo, LIN Wei-wei, LI Peng-fei
Computer Science. 2019, No.3: 9-18. Abstract (2079) PDF (1744KB) (8858)
-
Survey of Fuzz Testing Technology
ZHANG Xiong, LI Zhou-jun
Computer Science. 2016, No.5: 1-8. Abstract (1607) PDF (833KB) (8123)
-
Multisource Information Fusion: Key Issues, Research Progress and New Trends
CHEN Ke-wen, ZHANG Zu-ping, LONG Jun
Computer Science. 2013, No.8: 6-13. Abstract (383) PDF (746KB) (8037)
-
Physics-informed Neural Networks: Recent Advances and Prospects
LI Ye, CHEN Song-can
Computer Science. 2022, No.4: 254-262. Abstract (5978) PDF (2620KB) (7794)