
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
CODEN JKIEBK


CONTENTS
Computer Science, 2025, 52(1)
Survey of Research on Knowledge Graph Based on Pre-trained Language Models
ZENG Zefan, HU Xingchen, CHENG Qing, SI Yuehang, LIU Zhong. Survey of Research on Knowledge Graph Based on Pre-trained Language Models[J]. Computer Science, 2025, 52(1): 1-33. doi:10.11896/jsjkx.240100109
In the era of large language models (LLMs), knowledge graphs (KGs), as a structured representation of knowledge, play an irreplaceable role in enhancing the reliability, security, and interpretability of artificial intelligence. With their superior performance in semantic understanding, pre-trained language models (PLMs) have become the main approach in knowledge graph research in recent years. This paper systematically reviews research on PLM-based knowledge graphs, covering knowledge graph construction, representation learning, reasoning, and question answering. The core ideas of the relevant models and methods are introduced, and a classification system is established based on their technological approaches. A comparative analysis of the advantages and disadvantages of the different categories of methods is provided. In addition, the application status of pre-trained language models in two new types of knowledge graphs, event knowledge graphs and multimodal knowledge graphs, is reviewed. Finally, the challenges faced by current research on knowledge graphs based on pre-trained language models are summarized, and future research directions are discussed.
-
Survey on Large Model Red Teaming
BAO Zepeng, QIAN Tieyun. Survey on Large Model Red Teaming[J]. Computer Science, 2025, 52(1): 34-41. doi:10.11896/jsjkx.240400190
Large model red teaming is an emerging frontier in the field of large language models (LLMs). It subjects an LLM to adversarial testing that induces the model to produce harmful outputs, so as to find vulnerabilities in the model and improve its robustness. In recent years, large model red teaming has gained widespread attention from both academia and industry; numerous solutions have been proposed and some progress has been made in model alignment. However, due to the scarcity of red teaming data and the lack of clear evaluation standards, most existing research has been limited to specific scenarios. In this paper, starting from the definition of large model security, we discuss the various risks associated with it. We then discuss the importance of large model red teaming and its main categories, providing a comprehensive overview and analysis of the development of related red team techniques. Additionally, we introduce existing datasets and evaluation metrics. Finally, future research trends in large model red teaming are summarized and discussed.
-
Survey on Transmission Optimization Technologies for Federated Large Language Model Training
DUN Jingbo, LI Zhuo. Survey on Transmission Optimization Technologies for Federated Large Language Model Training[J]. Computer Science, 2025, 52(1): 42-55. doi:10.11896/jsjkx.240500095
With the rapid development of artificial intelligence technology, various types of large language models are emerging. Most users and datasets participating in domain-specific large language models have strict privacy and security requirements, so data security and privacy issues urgently need to be solved; federated large language models have thus emerged and gained increasing attention. Due to the huge data volume of large language models and the distributed architecture of federated learning, the large number of model exchanges between the many participating nodes and cloud servers results in high communication costs. To improve the model convergence rate, researchers have investigated transmission optimization techniques for federated large language model training. This paper analyzes the challenges of federated large language models and reviews transmission optimization methods based on model fine-tuning, model structure compression, and distributed parallel processing. It then introduces existing open-source federated large language models and the transmission optimization techniques they use, and gives an outlook on future research directions.
-
Survey of Chain-of-Thought Generation and Enhancement Methods in Prompt Learning
ZHENG Mingqi, CHEN Xiaohui, LIU Bing, ZHANG Bing, ZHANG Ran. Survey of Chain-of-Thought Generation and Enhancement Methods in Prompt Learning[J]. Computer Science, 2025, 52(1): 56-64. doi:10.11896/jsjkx.240700172
Large language models have made breakthroughs in several domains thanks to their superior language understanding and text generation capabilities. However, their performance on complex reasoning tasks remains unsatisfactory, and their accuracy needs to be improved. As a result, researchers have proposed chain of thought (CoT), an innovative approach that aims to enhance the reasoning performance of models by having them generate explicit reasoning processes. In this paper, by comprehensively surveying and deeply analyzing the existing research on CoT, we not only summarize its concepts and structural framework but also examine CoT generation and enhancement methods in detail. The application of CoT in different task scenarios is further explored, demonstrating its potential to enhance model performance. At the same time, this paper critically analyzes the limitations of CoT. Finally, it offers a prospective outlook on the future development of the chain-of-thought strategy, aiming to guide future CoT research and to provide valuable references and insights for researchers.
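The core mechanism surveyed above — prompting a model to emit intermediate reasoning before its final answer — can be shown with a minimal sketch. The `build_prompt` helper and the example question are hypothetical conveniences; the trigger phrase itself is the widely cited zero-shot CoT prompt.

```python
def build_prompt(question: str, use_cot: bool = False) -> str:
    """Assemble a prompt; appending a CoT trigger phrase asks the
    model to emit intermediate reasoning before its final answer."""
    prompt = f"Q: {question}\nA:"
    if use_cot:
        # The standard zero-shot CoT trigger phrase.
        prompt += " Let's think step by step."
    return prompt

q = "If a train travels 60 km in 40 minutes, what is its speed in km/h?"
plain = build_prompt(q)              # direct-answer prompt
cot = build_prompt(q, use_cot=True)  # reasoning-eliciting prompt
```

Few-shot CoT variants instead prepend worked examples that already contain reasoning chains; the trigger phrase here is the simplest zero-shot form.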
-
Large Language Models Driven Framework for Multi-agent Military Requirement Generation
LI Jiahui, ZHANG Mengmeng, CHEN Honghui. Large Language Models Driven Framework for Multi-agent Military Requirement Generation[J]. Computer Science, 2025, 52(1): 65-71. doi:10.11896/jsjkx.240800022
Military requirement generation in joint operations involves many participants and a heavy workload. The process relies on individual experience and multiple sources of documents, which leads to problems such as low efficiency in requirement generation and difficulty in supporting the design of joint operation systems. With the development of large language models (LLMs), LLM-driven agents have shown excellent performance in various fields, and multi-agent systems can efficiently handle complex tasks by leveraging group intelligence through distributed decision-making. To address the low efficiency of military requirement generation, a framework for military requirement generation with an LLM-driven multi-agent system is proposed. The framework includes a multi-modal information acquisition agent, military expert agents, a moderator, and other components. The multi-modal information acquisition agent can rapidly process multi-modal information, extract military requirements, and provide the user with a question-and-answer function. Military expert agents simulate human experts discussing the generation of requirements through natural language dialogues. Driven by LLMs, these agents can perceive the environment and autonomously use tools such as arXiv, search engines, and other resources to support the dialogues. The moderator receives instructions from the human user, refines their content using LLMs, and generates dialogue prompts and problem background descriptions. Using the Russia-Ukraine conflict as an experimental case, military requirements are generated from relevant multi-modal information. The experimental results show that when the multi-modal information volume is within the maximum processing capacity of the LLMs, the framework significantly reduces the time needed for military requirement generation, with time savings of 80% to 85% for video resources and 90% to 95% for audio resources.
-
SWARM-LLM:An Unmanned Swarm Task Planning System Based on Large Language Models
LI Tingting, WANG Qi, WANG Jiakang, XU Yongjun. SWARM-LLM: An Unmanned Swarm Task Planning System Based on Large Language Models[J]. Computer Science, 2025, 52(1): 72-79. doi:10.11896/jsjkx.241000038
To address the insufficient autonomous intelligence of unmanned swarm systems, the low collaborative efficiency of heterogeneous unmanned swarms, and unbalanced task allocation, this paper proposes a new unmanned swarm task planning framework based on large language models (SWARM-LLM), meeting the needs of unmanned swarm systems for autonomous planning, efficient collaboration, and intelligent decision-making. The framework leverages large language models to transform high-level task instructions into concrete swarm task planning solutions, achieving collaborative swarm tasks through multiple stages such as task decomposition, task allocation, and task execution. Furthermore, this paper designs a prompt engineering method specifically suited to unmanned swarm planning, called the planning chain (PC), to guide and optimize the implementation of these stages. Finally, tasks of various categories and complexities are constructed in an unmanned swarm simulation environment (AirSim) for evaluation experiments. Compared with other algorithms based on optimization and machine learning, the experimental results demonstrate the effectiveness of the SWARM-LLM framework, showing a significant advantage in task success rate, with an average performance improvement of 47.8%.
-
COA Generation Based on Pre-trained Large Language Models
YAN Yusong, ZHOU Yuan, WANG Cong, KONG Shengqi, WANG Quan, LI Minne, WANG Zhiyuan. COA Generation Based on Pre-trained Large Language Models[J]. Computer Science, 2025, 52(1): 80-86. doi:10.11896/jsjkx.240900075
Focusing on empowering the command and control (C2) process with generative AI, we analyze the challenges of course of action (COA) generation in C2 and the potential of pre-trained large language models (LLMs). A COA generation method based on pre-trained LLMs, COA-Gen, is then proposed. First, a multi-round generation framework is designed to align the generated plans with objectives. Second, multi-factor prompt templates are constructed to integrate vast amounts of multi-source information. Last, knowledge-augmented generation technology is introduced to improve generation quality in the few-shot military domain. To validate the effectiveness of the generated plans, an emulation environment based on the StarCraft II engine and the "Tiger Claw" scenario is established. The results show the robustness of the method and its alignment with the commander's intent, verifying the feasibility of using LLMs for COA generation. Additionally, different pre-trained models exhibit varying performance on the same task, indicating that the choice of model in real-world applications can lead to action plans with different styles, thereby affecting the ultimate outcomes.
-
Retrieval-augmented Generative Intelligence Question Answering Technology Based on Knowledge Graph
CHENG Zhiyu, CHEN Xinglin, WANG Jing, ZHOU Zhongyuan, ZHANG Zhizheng. Retrieval-augmented Generative Intelligence Question Answering Technology Based on Knowledge Graph[J]. Computer Science, 2025, 52(1): 87-93. doi:10.11896/jsjkx.240900064
A knowledge graph-based retrieval-augmented generation framework is proposed for military intelligence question answering. The framework acquires background knowledge through question classification, entity recognition, entity linking, and knowledge retrieval. Considering the multi-constraint characteristics of intelligence questions, answer set programming is used to prune the retrieved knowledge through constraints or to obtain answers directly. Finally, a large language model solves the questions based on the refined knowledge, minimizing attribute recognition and linking issues during question understanding. Experiments on the MilRE dataset demonstrate that the framework provides enhanced knowledge retrieval capabilities based on knowledge graphs and offers superior performance in answering military intelligence questions.
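The general shape of such a pipeline — recognize and link entities, retrieve connected triples, and assemble them into a prompt for the language model — can be sketched as a toy example. The `KG` triples, the string-matching entity recognizer, and the prompt format below are all invented for illustration and are not the paper's implementation.

```python
# Toy knowledge graph: (head, relation, tail) triples.
KG = [
    ("UnitA", "stationed_at", "BaseX"),
    ("UnitA", "equipped_with", "RadarY"),
    ("BaseX", "located_in", "RegionZ"),
]

def recognize_entities(question, entities):
    # Placeholder for trained NER + entity linking:
    # here we simply match known entity names in the question text.
    return [e for e in entities if e in question]

def retrieve(entities, hops=1):
    # Collect triples whose head or tail is a linked entity,
    # expanding the frontier one hop at a time.
    facts, frontier = [], set(entities)
    for _ in range(hops):
        new = [t for t in KG if t[0] in frontier or t[2] in frontier]
        facts.extend(t for t in new if t not in facts)
        frontier |= {t[0] for t in new} | {t[2] for t in new}
    return facts

def build_qa_prompt(question):
    all_names = {h for h, _, _ in KG} | {t for _, _, t in KG}
    ents = recognize_entities(question, all_names)
    facts = "\n".join(f"{h} {r} {t}" for h, r, t in retrieve(ents))
    return f"Background knowledge:\n{facts}\n\nQuestion: {question}\nAnswer:"

prompt = build_qa_prompt("Where is UnitA stationed?")
```

In the paper's framework, an answer set programming step would additionally filter the retrieved facts against the question's constraints before the prompt reaches the language model.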
-
Large Language Model Driven Multi-relational Knowledge Graph Completion Method
LIU Changcheng, SANG Lei, LI Wei, ZHANG Yiwen. Large Language Model Driven Multi-relational Knowledge Graph Completion Method[J]. Computer Science, 2025, 52(1): 94-101. doi:10.11896/jsjkx.240600170
Knowledge graphs transform complex Internet information into an easily understandable structured format, significantly enhancing the accessibility of information. Knowledge graph completion (KGC) techniques further improve the completeness of knowledge graphs, markedly improving the performance and user experience of general-domain applications such as intelligent question answering and recommendation systems. However, most existing KGC methods focus on triple instances in scenarios with few relation types and simple semantics, failing to fully leverage the potential of knowledge graphs for handling multi-relational and semantically complex data. To address this issue, we propose a multi-relational knowledge graph completion method driven by a large language model (LLM). By combining the deep language understanding capabilities of LLMs with the structural characteristics of knowledge graphs, the method effectively captures complex semantic scenarios and comprehends multiple relations. Additionally, we introduce a chain-of-thought based prompt engineering strategy aimed at enhancing the accuracy of the completion task. Experimental results on two public knowledge graph datasets demonstrate significant performance improvements of the proposed method.
-
Survey on Cross-city Human Mobility Prediction
ZHANG Yusong, XU Shuai, YAN Xingyu, GUAN Donghai, XU Jianqiu. Survey on Cross-city Human Mobility Prediction[J]. Computer Science, 2025, 52(1): 102-119. doi:10.11896/jsjkx.240100032
The advancement of urbanization has accumulated massive spatio-temporal data recording human mobility, providing a favorable data foundation for human mobility modeling and prediction. In the context of smart city construction, cross-city human mobility prediction is an inevitable requirement for achieving collaborative urban management and governance. In this setting, problems such as data scarcity and imbalanced data distribution often arise, and traditional machine learning methods find it difficult to achieve ideal performance. It is therefore crucial to transfer knowledge about human mobility from data-rich source cities to data-scarce target cities. This paper first provides an overview of the datasets and commonly used evaluation metrics in existing studies, then discusses the cross-city mobility prediction problem at the individual level and the group level respectively, and categorizes the applicable research methods. For individual-level human mobility prediction, the application of four types of models, i.e., collaborative filtering, matrix factorization, statistical learning, and deep learning, is analyzed. For group-level human mobility prediction, two types of few-sample machine learning methods, i.e., knowledge transfer and meta learning, are specifically analyzed. Finally, important open issues in the field of cross-city human mobility prediction are discussed.
-
Research Progress on Optimization Techniques of Tiered Storage Based on Deduplication
YAO Zilu, FU Yinjin, XIAO Nong. Research Progress on Optimization Techniques of Tiered Storage Based on Deduplication[J]. Computer Science, 2025, 52(1): 120-130. doi:10.11896/jsjkx.231200011
With the explosive growth of global data volume and the increasing diversity of data, storage systems with a single media layer are gradually unable to meet the diverse application demands of users. Tiered storage classifies and stores data into tiers with different access latency, storage capacity, and fault tolerance based on characteristics of the data such as importance, access frequency, and security requirements, and has been widely applied in various fields. Deduplication is a big data reduction technique that can efficiently remove duplicate data from storage systems and maximize storage space utilization. Unlike single-tier scenarios, applying deduplication to tiered storage can not only reduce cross-tier data redundancy, further saving storage space and reducing storage costs, but also improve data I/O performance and storage device durability. After briefly analyzing the principle, process, and classification of deduplication-based tiered storage, this paper examines three key steps: storage location selection, duplicate content identification, and data migration. It summarizes the research progress of many optimization methods and explores the potential technical challenges of deduplication-based tiered storage. Finally, future development trends of deduplication-based tiered storage are discussed.
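The duplicate-content-identification step common to such systems can be sketched with fixed-size chunking and SHA-256 fingerprints. Production systems typically use content-defined chunking and more elaborate index structures; this minimal stdlib version is illustrative only.

```python
import hashlib

def dedup_store(data: bytes, store: dict, chunk_size: int = 8):
    """Split data into fixed-size chunks; store each unique chunk once,
    keyed by its SHA-256 fingerprint, and return the recipe of keys
    needed to reassemble the original data."""
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)   # duplicate chunks are not stored again
        recipe.append(key)
    return recipe

def restore(recipe, store):
    # Reassemble the data from its chunk recipe.
    return b"".join(store[k] for k in recipe)

store = {}
data = b"ABCDEFGH" * 4 + b"12345678"   # four copies of one chunk, plus one unique chunk
recipe = dedup_store(data, store)       # 5 logical chunks, only 2 stored
```

In a tiered setting, the same fingerprint index also lets the system detect when a chunk being migrated to another tier already exists there, avoiding cross-tier redundancy.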
-
Study on Collaborative Data Persistence in NewSQL Databases
ZUO Shun, LI Yongkun, XU Yinlong. Study on Collaborative Data Persistence in NewSQL Databases[J]. Computer Science, 2025, 52(1): 131-141. doi:10.11896/jsjkx.231200079
To ensure high data availability, modern NewSQL databases often create several copies of data so that the data can still be accessed from other copies if one copy becomes unavailable. With multiple copies, data consistency between them must be considered: the results should be the same when different clients read the same data at a given moment. Therefore, a transaction processing mechanism is introduced. In an interactive transaction with multiple write operations, each write must be performed on both the primary and backup copies of the data. However, the primary and backup replicas are typically located on different machines, which increases the latency of writing to remote replicas and can ultimately increase the processing latency of the entire transaction. In this paper, we present a collaborative data persistence scheme in which the client caches the transaction write logs locally. When the transaction is finally committed, the client first persists the write logs and then sends them to the coordinator node of the transaction, which distributes the log data, so that the two cooperate in persisting the transaction data. Experimental results show that, compared with the synchronous persistence scheme, the cooperative persistence scheme not only reduces the latency of interactive transaction processing but also improves peak system throughput by roughly 38%.
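A rough sketch of the client-side idea — buffering write logs locally and shipping them to the coordinator in one batch at commit time, instead of one remote round trip per write — is given below. The class names and the round-trip counter are illustrative inventions, not the paper's implementation.

```python
class Coordinator:
    """Stand-in for the transaction coordinator that fans logs out to replicas."""
    def __init__(self):
        self.replica_log = []

    def distribute(self, logs):
        # In the real system this ships logs to primary/backup replicas.
        self.replica_log.extend(logs)

class Client:
    """Client side of the collaborative persistence idea: writes are
    cached locally; only commit triggers a (single, batched) remote send."""
    def __init__(self, coordinator):
        self.coordinator = coordinator
        self.log_cache = []
        self.local_log = []
        self.round_trips = 0

    def write(self, key, value):
        self.log_cache.append((key, value))   # no remote I/O here

    def commit(self):
        self.local_log = list(self.log_cache)            # client persists first
        self.coordinator.distribute(list(self.log_cache))  # one batched send
        self.round_trips += 1
        self.log_cache.clear()

coord = Coordinator()
client = Client(coord)
for i in range(5):
    client.write(f"k{i}", i)   # five writes, zero remote round trips so far
client.commit()                # one round trip for the whole transaction
```

Under a synchronous per-write scheme the same transaction would incur five remote round trips; here latency is paid once, at commit.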
-
Sequential Tag Recommendation
LIU Bing, XU Pengyu, LU Sijin, WANG Shijing, SUN Hongjian, JING Liping, YU Jian. Sequential Tag Recommendation[J]. Computer Science, 2025, 52(1): 142-150. doi:10.11896/jsjkx.240700186
With the development of Internet technology and the expansion of social networks, online platforms have become a significant avenue for people to access information. The introduction of tags has facilitated the categorization and retrieval of information, and the advent of tag recommendation systems not only makes it easier for users to input tags but also improves tag quality. Traditional tag recommendation algorithms typically consider only tags and items, overlooking the crucial role of personal intent when users choose tags. Since tags in a recommendation system are ultimately determined by users, user preferences play a key role in tag recommendation. Therefore, we introduce the user as a subject and, by incorporating the chronological order of users' historical posts, model tag recommendation as a sequential tag recommendation task that is more aligned with real-world scenarios. To address this task, this paper proposes MLP for sequential tag recommendation (MLP4STR), which explicitly models user preferences to guide the overall tag recommendation. MLP4STR employs a cross-feature-alignment MLP framework for sequence feature extraction, aligning the features of text and tags to capture the dynamic interests implicit in users' historical posts and tag information. Finally, it recommends tags by combining post content and user preferences. Experimental results on four real-world datasets show that MLP4STR can effectively learn from users' historical behavior sequences, with the evaluation metric F1@5 showing a significant improvement over the best baseline algorithms.
-
Application-aware Disaggregated Storage Design for Remote Memory Graph Database
LI Chunyu, DENG Long, LI Yongkun, XU Yinlong. Application-aware Disaggregated Storage Design for Remote Memory Graph Database[J]. Computer Science, 2025, 52(1): 151-159. doi:10.11896/jsjkx.231200073
Graph data is becoming widespread across various applications because it can represent multiple entity types and rich relationships between them. For users of graph databases, efficient graph query service is crucial to system performance. As data volumes grow, single-machine graph databases can hardly store all data in memory, while distributed graph databases, which spread data across the memory of multiple machines, face challenges in scalability and resource utilization. A new solution to these challenges is the introduction of RDMA-based remote memory systems, which separate the computation and storage of graph data and offer a more flexible way to use memory. However, a major challenge for current solutions is ensuring graph query performance when using remote memory. This study closely examines the challenges that arise when building remote memory graph databases on general-purpose far-memory platforms, which use remote memory transparently. It proposes an application-aware approach in which the design of the remote memory graph database is aware of how the data is used. This design method creates a storage model that understands the different types of property graph data and how they are accessed; specifically, the study explains how to ensure that data is laid out and accessed in the best way. Experimental results show that when local memory is limited, the application-aware method outperforms the transparent approach, giving a 12x improvement in graph query performance.
-
Path-masked Autoencoder Guiding Unsupervised Attribute Graph Node Clustering
DING Xinyu, KONG Bing, CHEN Hongmei, BAO Chongming, ZHOU Lihua. Path-masked Autoencoder Guiding Unsupervised Attribute Graph Node Clustering[J]. Computer Science, 2025, 52(1): 160-169. doi:10.11896/jsjkx.231100117
The purpose of graph clustering is to discover the community structure of a network. Current clustering methods cannot adequately capture the deep latent community information of the network, nor can they integrate features appropriately, resulting in unclear node community semantics. To address these problems, a path-masked autoencoder guiding unsupervised attribute graph node clustering (PAUGC) model is proposed. The model utilizes an autoencoder that randomly masks network paths to deeply mine the network topology, thereby obtaining rich global structural semantic information. It integrates features with a normalization method so that node features better characterize class information. In addition, the model combines modularity maximization to capture the underlying community cluster information of the whole graph, fusing it more reasonably into the low-dimensional node features. Finally, the model iteratively optimizes and updates the clustering representation through self-training clustering to obtain the final node features. Extensive experiments and comparisons with 11 classical methods on 8 benchmark datasets show that PAUGC is effective compared with current mainstream methods.
-
Multi-granularity Time Series Contrastive Learning Method Incorporating Time-Frequency Features
YE Lishuo, HE Zhixue. Multi-granularity Time Series Contrastive Learning Method Incorporating Time-Frequency Features[J]. Computer Science, 2025, 52(1): 170-182. doi:10.11896/jsjkx.231100171
Existing time series contrastive learning methods suffer from several problems: augmented-sample construction relies too heavily on manual experience and generalizes poorly, positive samples are not defined in a sufficiently general way, and contrastive measures operate on coarse-grained representations, resulting in weak overall time series representations. Therefore, a multi-granularity time series contrastive learning method based on time-frequency features (TSDC) is proposed. A seasonal-trend generation network generates temporal augmentation samples with stable variations in the time domain, and a multi-band fusion perturbation operation generates augmentation samples with non-stable variations in the frequency domain. The two kinds of augmentation samples are learned through coarse-grained contrast at the instance level and fine-grained contrast at the dimension level, so that the model adapts better to different types of downstream time series tasks while obtaining better representations. Experiments on classification, prediction, and anomaly detection on multiple public time series datasets show that the representations obtained by TSDC outperform typical baseline models on downstream tasks.
-
Review of Federated Learning in Medical Image Processing
LIU Yuming, DAI Yu, CHEN Gongping. Review of Federated Learning in Medical Image Processing[J]. Computer Science, 2025, 52(1): 183-193. doi:10.11896/jsjkx.231200057
In the medical field, patient privacy concerns make it difficult to collect and label images, which greatly complicates the training and deployment of deep learning models. As a distributed learning framework that effectively protects data privacy, federated learning enables joint modeling without participants sharing data, technically breaking down data silos. With these advantages, it has been widely used in many industries. Because it closely matches the needs of medical image processing, many federated learning studies applied to medical image processing have emerged in recent years. However, most of the new methods have not been summarized and analyzed, which hinders further exploration. This paper gives a brief introduction to federated learning, lists some of its applications in medical image processing, and classifies and summarizes existing research according to the direction of improvement. Finally, the problems and challenges of federated learning in medical imaging are discussed and future research directions are suggested, hoping to provide some help for subsequent research.
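The aggregation step underlying most federated learning methods is FedAvg: a weighted average of client model parameters, weighted by local dataset size, so that no raw images ever leave the participating sites. A minimal sketch (the toy two-client weights and sizes are invented for illustration):

```python
def fed_avg(client_weights, client_sizes):
    """FedAvg aggregation: average client model parameters,
    weighted by each client's local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(n_params)
    ]

# Two hospitals train locally and share only parameters, not images;
# the larger hospital (300 samples) pulls the average toward its weights.
global_w = fed_avg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[100, 300])
# global_w == [2.5, 3.5]
```

The server broadcasts `global_w` back to the clients for the next local training round; the medical-imaging variants surveyed above mostly modify what is averaged, how often, and with what privacy protections.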
-
Survey of Vision Transformers(ViT)
LI Yujie, MA Zihang, WANG Yifu, WANG Xinghe, TAN Benying. Survey of Vision Transformers(ViT)[J]. Computer Science, 2025, 52(1): 194-209. doi:10.11896/jsjkx.240600135
The Vision Transformer (ViT), an application of the Transformer architecture with an encoder-decoder structure, has achieved remarkable success in the field of computer vision. Over the past few years, research centered on ViT has surged and consistently exhibited exceptional performance, so work rooted in this model has become a pivotal research direction in computer vision. This paper therefore provides a comprehensive survey of recent advancements in ViT. It first briefly revisits the fundamental principles of the Transformer and its adaptation into ViT, analyzing the structural characteristics and advantages of the ViT model. It then categorizes and synthesizes the directions of improvement for ViT backbone networks and their representative models based on the distinguishing features of each ViT variant: enhancements in locality, structural modifications, self-supervised improvements, and lightweight and efficient designs, which are thoroughly examined and compared. Lastly, this paper discusses the remaining shortcomings of current ViT models and their enhancements, and offers a prospective view of future research directions for ViT. This analysis serves as a valuable reference for researchers choosing deep learning methodologies for work on ViT backbone networks.
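ViT's defining first step — cutting an image into non-overlapping patches that become the input token sequence — can be sketched without any deep learning library. The nested-list "image" and the helper name are illustrative; a real implementation would follow this with a learned linear projection, a class token, and position embeddings.

```python
def image_to_patches(image, patch):
    """Split an H x W image (nested lists) into non-overlapping
    patch x patch blocks, flattening each block into one token vector,
    as in ViT's patch-embedding step (before the linear projection)."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    tokens = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tokens.append([image[r + i][c + j]
                           for i in range(patch)
                           for j in range(patch)])
    return tokens

img = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
tokens = image_to_patches(img, patch=2)   # 4 tokens of length 4
```

For a 224x224 image with 16x16 patches, the same procedure yields the 196-token sequence that standard ViT feeds to its Transformer encoder.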
-
Contrastive Representation Learning for Industrial Defect Detection
罗航宇, 王小平, 梅萌, 赵文豪, 刘思纯. 面向工业品缺陷检测的对比表示学习[J]. 计算机科学, 2025, 52(1): 210-220.
LUO Hangyu, WANG Xiaoping, MEI Meng, ZHAO Wenhao, LIU Sichun. Contrastive Representation Learning for Industrial Defect Detection[J]. Computer Science, 2025, 52(1): 210-220.
- Computer Science. 2025, 52 (1): 210-220. doi:10.11896/jsjkx.240100202
-
Abstract
PDF(3186KB) ( 146 )
- References | Related Articles | Metrics
-
Defect detection in large-scale manufacturing aims to find defective components, such as damaged or misaligned components and components with printing errors. Due to unknown defect types and the shortage of defect samples, industrial defect detection faces great challenges. To overcome these difficulties, some methods use common visual representations learned from natural image datasets to extract generalized features for defect detection. However, there are distribution differences between the extracted pre-trained features and the target data, and using these features directly leads to poor detection performance. Therefore, ConPatch, a method based on contrastive representation learning, is proposed. It employs contrastive representation learning to pull similar features together and push dissimilar features apart, resulting in goal-oriented feature representations. To address the lack of defect annotations, two similarity measures over data representations, pairwise similarity and global similarity, are used as pseudo labels. In addition, the method uses a lightweight memory bank that stores only the feature centers of the normal (defect-free) samples, reducing the space complexity and the size of the memory bank. Finally, normal features are drawn close to a hypersphere while defect features fall outside it, so that normal features cluster together. Experimental results show that the I-AUROC and P-AUROC of the ConPatch model based on Wide-ResNet50 reach 99.35% and 98.26%, respectively, on the industrial defect detection dataset MVTec AD. On the VisA dataset, I-AUROC and P-AUROC reach 95.50% and 98.21%, respectively. These results verify the effectiveness of the proposed model.
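The abstract's core scoring idea, a lightweight memory bank holding only the centers of defect-free features, with anomaly scores measured against those centers, can be illustrated with a minimal sketch. This is not the ConPatch implementation; the toy 2-D features and the cosine-based score are assumptions for illustration only.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def feature_center(features):
    # Lightweight "memory bank": store only the center of the normal
    # (defect-free) features instead of every sample.
    d = len(features[0])
    return [sum(f[i] for f in features) / len(features) for i in range(d)]

def anomaly_score(feature, center):
    # Normal features lie near the center; defects score high because
    # they fall away from it (outside the hypersphere).
    return 1.0 - cosine(feature, center)

normal = [[1.0, 0.1], [0.9, 0.0], [1.1, -0.1]]
center = feature_center(normal)
print(anomaly_score([1.0, 0.05], center))   # near 0: normal-looking
print(anomaly_score([-0.2, 1.0], center))   # well above 0: defect-like
```

A real detector would extract the features with a Wide-ResNet50 backbone and refine them contrastively; only the geometry of center-based scoring is shown here.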
-
Contact-free IR-UWB Human Motion Recognition Based on Dual-stream Fusion Network
张传宗, 王冬子, 郭政鑫, 桂林卿, 肖甫. 基于双流融合网络的非接触式IR-UWB人体动作识别方法[J]. 计算机科学, 2025, 52(1): 221-231.
ZHANG Chuanzong, WANG Dongzi, GUO Zhengxin, GUI Linqing, XIAO Fu. Contact-free IR-UWB Human Motion Recognition Based on Dual-stream Fusion Network[J]. Computer Science, 2025, 52(1): 221-231.
- Computer Science. 2025, 52 (1): 221-231. doi:10.11896/jsjkx.240400108
-
Abstract
PDF(4935KB) ( 168 )
- References | Related Articles | Metrics
-
With the rapid development of intelligent sensing technology, the field of human-computer interaction (HCI) has entered a new era. Traditional HCI methods, predominantly reliant on wearable devices and cameras to collect user behavior data, have significant limitations despite their precise recognition capabilities. Wearable devices impose an additional burden on users, whereas camera-based solutions are susceptible to ambient lighting conditions and pose significant privacy concerns. These challenges considerably restrict their applicability in daily life. To address them, we exploit the exceptional sensitivity and spatial resolution of impulse radio ultra-wideband (IR-UWB) radio frequency (RF) signals and propose a novel contact-free human motion recognition method based on a dual-stream fusion network. The method captures the temporal signal variations caused by target movements and extracts the corresponding frequency-domain features by analyzing Doppler frequency shift (DFS) changes in the time-domain signals. A dual-stream network model integrating multi-dimensional convolutional neural networks (CNNs) and GoogLeNet modules is then developed for precise action recognition. Extensive experiments show that the proposed method achieves an average accuracy of 94.89% on eight common daily human actions and maintains an accuracy of over 90% under varying test conditions, validating its robustness.
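The frequency-domain feature extraction described above rests on a standard radar idea: the phase of echoes in a range bin rotates across slow-time frames at the Doppler rate. The sketch below (hypothetical frame count and Doppler bin, not the paper's processing pipeline) shows how a naive DFT along slow time recovers that rate.

```python
import cmath
import math

def doppler_bin(slow_time_samples):
    # Naive DFT along slow time; the peak bin reflects the Doppler
    # shift induced by target motion in this range bin.
    n = len(slow_time_samples)
    spectrum = []
    for k in range(n):
        acc = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                  for t, s in enumerate(slow_time_samples))
        spectrum.append(abs(acc))
    return max(range(n), key=lambda k: spectrum[k])

# Simulated echo whose phase rotates 5 full cycles across 64 slow-time
# frames, mimicking a target moving at constant radial speed.
frames = [cmath.exp(2j * math.pi * 5 * t / 64) for t in range(64)]
print(doppler_bin(frames))  # peak at bin 5
```

A static reflector (constant phase) would peak at bin 0, which is how motion is separated from background clutter.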
-
Feature Construction for Effort-aware Just-In-Time Software Defect Prediction Based on Multi-objective Optimization
赵晨阳, 刘磊, 江贺. 基于多目标优化的工作量感知即时软件缺陷预测特征构建方法[J]. 计算机科学, 2025, 52(1): 232-241.
ZHAO Chenyang, LIU Lei, JIANG He. Feature Construction for Effort-aware Just-In-Time Software Defect Prediction Based on Multi-objective Optimization[J]. Computer Science, 2025, 52(1): 232-241.
- Computer Science. 2025, 52 (1): 232-241. doi:10.11896/jsjkx.240100198
-
Abstract
PDF(1966KB) ( 159 )
- References | Related Articles | Metrics
-
Just-in-time software defect prediction (JIT-SDP) is a defect prediction technology for code changes, which has the advantages of fine granularity, immediacy, and traceability. Effort-aware JIT-SDP further considers the cost of code inspection and aims to detect more defective code changes with limited testing resources. Although many effort-aware JIT-SDP methods have been proposed, most of them only optimize the model algorithm. To improve the performance and generalizability of effort-aware JIT-SDP, an effort-aware evolutionary feature construction method, EEF, is proposed for the first time from the perspective of feature engineering. EEF represents features as genetic programming trees and, considering both classification performance and effort-aware performance, obtains a new feature transformation through evolutionary feature construction based on multi-objective optimization. A new feature set is then constructed from the obtained transformation, and the classification model is trained and tested on it. To verify the effectiveness of EEF, experiments are conducted under three different evaluation schemes on six open-source datasets. The results show that EEF improves the performance of the classification model in effort-aware scenarios and outperforms other feature engineering methods. Moreover, while ensuring the diversity of feature selection, EEF based on a single model can also improve the performance of other models.
-
Just-In-Time Software Defect Prediction Approach Based on Fine-grained Code Representation and Feature Fusion
朱晓燕, 王文格, 王嘉寅, 张选平. 基于细粒度代码表示和特征融合的即时软件缺陷预测方法[J]. 计算机科学, 2025, 52(1): 242-249.
ZHU Xiaoyan, WANG Wenge, WANG Jiayin, ZHANG Xuanping. Just-In-Time Software Defect Prediction Approach Based on Fine-grained Code Representation and Feature Fusion[J]. Computer Science, 2025, 52(1): 242-249.
- Computer Science. 2025, 52 (1): 242-249. doi:10.11896/jsjkx.240200046
-
Abstract
PDF(2019KB) ( 146 )
- References | Related Articles | Metrics
-
Just-in-time software defect prediction (JIT-SDP) aims to predict the defect proneness of software changes at the time they are first committed. Such predictions are made on a single program change rather than at a coarse granularity, and JIT-SDP has been widely used in fields such as continuous testing due to its immediacy and traceability. Existing JIT-SDP studies extract features from code changes at a coarse granularity, merely marking the changed lines without fine-grained tagging. Moreover, studies based on commit content simply concatenate features extracted from commit messages and code changes, lacking deep alignment in the feature space, so predictions tend to be disturbed by noise when the quality of the commit message cannot be guaranteed. Existing methods also fail to simultaneously exploit handcrafted features designed by domain experts and the semantic and syntactic structure information in commit content, thus not fully leveraging the available features. To address these problems, a JIT-SDP approach based on fine-grained code changes and feature fusion is proposed. The method introduces new change embeddings to represent code changes at a fine granularity. A feature alignment module reduces the impact of noise in low-quality commit messages on performance. Meanwhile, neural networks learn domain-specific knowledge from handcrafted features so that existing features are fully utilized. Experimental results show that, compared with existing methods, the approach improves significantly on three performance metrics.
-
Patch Correctness Verification Method Based on CodeBERT and Stacking Ensemble Learning
韩威, 姜淑娟, 周伟. 基于CodeBERT和Stacking集成学习的补丁正确性验证方法[J]. 计算机科学, 2025, 52(1): 250-258.
HAN Wei, JIANG Shujuan, ZHOU Wei. Patch Correctness Verification Method Based on CodeBERT and Stacking Ensemble Learning[J]. Computer Science, 2025, 52(1): 250-258.
- Computer Science. 2025, 52 (1): 250-258. doi:10.11896/jsjkx.240100019
-
Abstract
PDF(2540KB) ( 201 )
- References | Related Articles | Metrics
-
In recent years, automatic program repair has become an important research topic in software engineering. However, most existing automatic repair techniques are based on patch generation and testing, and the patch verification process consumes a significant amount of time and cost. In addition, because test suites are incomplete, many candidate patches pass the tests even though they are not actually correct, which leads to the patch overfitting problem. To improve the efficiency of patch verification and alleviate patch overfitting, a static patch verification method is proposed. The method first uses the large pre-trained model CodeBERT to automatically extract semantic features of defective code fragments and patched code fragments, and then uses historical defect-repair patch data to train a Stacking ensemble learning model. The trained model can effectively verify new defect-repair patches. The verification ability of the proposed method is evaluated on 1 000 patches related to the Defects4J defect dataset. Experimental results show that the static patch verification method can effectively verify the correctness of patches, thereby improving the efficiency of patch verification.
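Stacking, as used above, means the outputs of several base classifiers become the input features of a meta-learner. The sketch below illustrates only that wiring: the base models are hypothetical stand-ins (in the paper they would be learners over CodeBERT features), and the logistic meta-learner and its weights are assumptions, not the trained model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Level-0 base classifiers: each maps a patch's feature vector to a
# correctness score. Hypothetical stand-ins for trained models.
base_models = [
    lambda x: sigmoid(x[0] - x[1]),   # stand-in model A
    lambda x: sigmoid(0.5 * x[0]),    # stand-in model B
    lambda x: sigmoid(x[1] - 0.2),    # stand-in model C
]

def stack_features(x):
    # The base models' predictions become the meta-learner's features.
    return [m(x) for m in base_models]

def meta_predict(x, weights=(1.0, 1.0, 1.0), bias=-1.5):
    # Level-1 meta-learner: a simple logistic model over base outputs.
    feats = stack_features(x)
    return sigmoid(bias + sum(w * f for w, f in zip(weights, feats)))

print(meta_predict([2.0, 0.1]))     # higher score: likely-correct patch
print(meta_predict([-2.0, -1.0]))   # lower score: likely-overfitting patch
```

In practice the meta-learner is trained on held-out base-model predictions so it learns how much to trust each base classifier.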
-
Review of Pre-training Methods for Visually-rich Document Understanding
张剑, 李晖, 张晟铭, 吴杰, 彭滢. 视觉富文档理解预训练综述[J]. 计算机科学, 2025, 52(1): 259-276.
ZHANG Jian, LI Hui, ZHANG Shengming, WU Jie, PENG Ying. Review of Pre-training Methods for Visually-rich Document Understanding[J]. Computer Science, 2025, 52(1): 259-276.
- Computer Science. 2025, 52 (1): 259-276. doi:10.11896/jsjkx.240300028
-
Abstract
PDF(4850KB) ( 136 )
- References | Related Articles | Metrics
-
A visually-rich document (VrD) is a document whose semantic structure is determined not only by its textual content but also by visual elements such as typesetting formats and table structures. Numerous application scenarios, such as receipt understanding and card recognition, require automatically reading, analyzing, and processing VrDs (e.g., forms, invoices, and resumes). This process is called visually-rich document understanding (VrDU), which lies at the intersection of natural language processing (NLP) and computer vision (CV). Recently, self-supervised pre-training techniques for VrDU have made significant progress in breaking down the training barriers between downstream tasks and improving model performance. However, a comprehensive summary and in-depth analysis of VrDU pre-training models is still lacking. To this end, we conduct an in-depth investigation and comprehensive summary of VrDU pre-training techniques. We first introduce the data processing stage, including traditional pre-training datasets and optical character recognition (OCR) engines. We then discuss three key technique modules in the model pre-training stage, namely single-modality representation learning, multi-modal feature fusion, and pre-training tasks, and elaborate the similarities and differences between pre-training models on the basis of these three modules. In addition, we briefly introduce the multi-modal large models applied to VrDU and analyze the experimental results of pre-training models on three representative downstream tasks. Finally, the challenges and future research directions of pre-training models are pointed out.
-
Option Discovery Method Based on Symbolic Knowledge
王麒迪, 沈立炜, 吴天一. 基于符号知识的选项发现方法[J]. 计算机科学, 2025, 52(1): 277-288.
WANG Qidi, SHEN Liwei, WU Tianyi. Option Discovery Method Based on Symbolic Knowledge[J]. Computer Science, 2025, 52(1): 277-288.
- Computer Science. 2025, 52 (1): 277-288. doi:10.11896/jsjkx.240100221
-
Abstract
PDF(2547KB) ( 115 )
- References | Related Articles | Metrics
-
Hierarchical policy learning based on options is a prominent approach in hierarchical reinforcement learning. Options represent temporal abstractions of action sequences, and a set of options can be combined hierarchically to tackle complex reinforcement learning tasks. For option discovery, existing research has focused on discovering meaningful options from unstructured demonstration trajectories using supervised or unsupervised methods. However, supervised option discovery requires manual task decomposition and option policy definition, imposing a heavy additional burden, while options discovered by unsupervised methods often lack rich semantics, limiting their subsequent reuse. This paper therefore proposes a symbolic-knowledge-based option discovery method that only requires modeling the symbolic knowledge of the environment. The acquired knowledge guides option discovery for various tasks in the environment and assigns symbolic semantics to the discovered options, enabling their reuse in new tasks. The method decomposes option discovery into two stages: trajectory segmentation and behavior cloning. Trajectory segmentation extracts semantically meaningful segments from demonstration trajectories; to this end, a segmentation model is trained on the demonstration trajectories, with symbolic knowledge incorporated into the reinforcement learning reward to evaluate segmentation accuracy. Behavior cloning then trains the options on the segmented data in a supervised manner so that they imitate the segmented behaviors. The proposed method is evaluated with option discovery and option reuse experiments in multiple domain environments, covering both discrete and continuous spaces. In the option discovery experiments, the trajectory segmentation results show that the proposed method achieves higher segmentation accuracy than the baseline method, with an improvement of several percentage points in both discrete and continuous environments; in complex tasks the segmentation accuracy improves by a further 20%. The option reuse experiments demonstrate that the options enriched with symbolic semantics adapt to new tasks faster than the baseline method, and converge well even on complex tasks that the baseline method fails to accomplish.
-
Active Learning Based on Maximum Influence Set
李雅和, 谢志鹏. 基于最大影响力集合的主动学习方法[J]. 计算机科学, 2025, 52(1): 289-297.
LI Yahe, XIE Zhipeng. Active Learning Based on Maximum Influence Set[J]. Computer Science, 2025, 52(1): 289-297.
- Computer Science. 2025, 52 (1): 289-297. doi:10.11896/jsjkx.231100075
-
Abstract
PDF(2868KB) ( 135 )
- References | Related Articles | Metrics
-
With the continuous progress of deep learning, it has been widely applied in numerous fields. However, training deep models requires a large amount of labeled data, at a high cost in time and resources. How to maximize model performance with the least labeled data has therefore become an important research topic. Active learning addresses this issue by selecting the most valuable samples for annotation and using them for model training. Traditional active learning approaches usually concentrate on uncertainty or diversity, aiming to query the most difficult or most representative samples. Nevertheless, these methods typically consider only one of these effects and overlook the interaction between labeled and unlabeled data. Another line of active learning methods uses auxiliary networks for sample selection, but usually at a higher computational cost. This paper proposes a novel active learning approach that optimizes the model's total performance gain by taking sample-to-sample interactions into account and comprehensively measuring both the local uncertainty of candidate samples and their influence on other samples. The method first estimates the influence of samples on each other from the distance between their hidden-layer representations, then estimates the potential gain each candidate can bring from its influence and the uncertainty of the unlabeled samples, and iteratively selects the sample with the highest global gain for annotation. On a series of tasks across several domains, the proposed method is compared with other active learning strategies, and experimental results demonstrate that it outperforms all competitors on all tasks. Further quantitative analysis shows that it balances uncertainty and diversity well, and it reveals which factors should be emphasized at different stages of active learning.
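The selection rule described in the abstract, influence estimated from representation distance combined with unlabeled-sample uncertainty into a global gain, can be written compactly. The kernel form exp(-distance) and the toy numbers below are assumptions for illustration, not the paper's exact formulation.

```python
import math

def entropy(probs):
    # Predictive entropy as the uncertainty of an unlabeled sample.
    return -sum(p * math.log(p) for p in probs if p > 0)

def influence(u, v):
    # Influence estimated from the distance between hidden-layer
    # representations: closer samples affect each other more.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return math.exp(-dist)

def select(candidates, reprs, probs):
    # Global gain of a candidate = uncertainty of every unlabeled
    # sample weighted by how strongly the candidate influences it
    # (including its own uncertainty, since influence(x, x) = 1).
    def gain(i):
        return sum(influence(reprs[i], reprs[j]) * entropy(probs[j])
                   for j in candidates)
    return max(candidates, key=gain)

reprs = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]        # hidden representations
probs = [[0.5, 0.5], [0.8, 0.2], [0.99, 0.01]]      # model predictions
print(select([0, 1, 2], reprs, probs))  # sample 0: uncertain AND central
```

Sample 2 is confidently classified and isolated, so it contributes little gain; sample 0 is both uncertain and close to another uncertain sample, so it wins.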
-
Social Bots Detection Based on Multi-relationship Graph Attention Network
孟令君, 陈鸿昶, 王庚润. 基于多关系图注意力网络的社交机器人检测[J]. 计算机科学, 2025, 52(1): 298-306.
MENG Lingjun, CHEN Hongchang, WANG Gengrun. Social Bots Detection Based on Multi-relationship Graph Attention Network[J]. Computer Science, 2025, 52(1): 298-306.
- Computer Science. 2025, 52 (1): 298-306. doi:10.11896/jsjkx.231100161
-
Abstract
PDF(3244KB) ( 178 )
- References | Related Articles | Metrics
-
At present, social bots are used extensively across social platforms, and their existence allows the online public opinion environment to be artificially manipulated. This not only undermines a healthy and harmonious online atmosphere but also significantly disrupts people's regular online activities. Existing detection methods can be divided into feature-based, text-based, and graph-based methods. However, graph-based detection methods predominantly ignore heterogeneous relationships and cannot go deep because of the over-smoothing phenomenon in graph neural networks. To solve these problems, a social bot detection method based on a multi-relationship graph attention network is proposed. We first extract subgraphs for the different relationships, then apply an attention mechanism to aggregate the nodes within each subgraph and learn node representations under the different relationships. Channel attention then fuses the representations of the same node under different relationships, while a post-connection operation based on LSTM attention allows nodes to adaptively select neighborhoods for aggregation, thereby alleviating over-smoothing. Experiments on three datasets, Cresci15, Twibot20, and MGTAB, show that, compared with the best of 11 baseline models, the accuracy of the proposed model increases by 0.47%, 1.19%, and 0.38%, respectively, demonstrating the effectiveness of the multi-relationship graph attention network for social bot detection.
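The per-relation aggregation step described above can be sketched minimally: attend over a node's neighbors within each relation subgraph, then fuse the per-relation representations. Plain averaging stands in for the paper's channel attention, and the dot-product scoring, relation names, and toy vectors are illustrative assumptions.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(h, neighbors):
    # Score each neighbor by its dot product with the target node,
    # then aggregate neighbors with the softmax-normalized weights.
    scores = softmax([sum(a * b for a, b in zip(h, n)) for n in neighbors])
    d = len(h)
    return [sum(w * n[i] for w, n in zip(scores, neighbors)) for i in range(d)]

def multi_relation(h, subgraph_neighbors):
    # One representation per relation subgraph, fused by simple
    # averaging (standing in for channel attention).
    reps = [attend(h, nbrs) for nbrs in subgraph_neighbors.values()]
    d = len(h)
    return [sum(r[i] for r in reps) / len(reps) for i in range(d)]

h = [1.0, 0.0]                                  # target account embedding
rels = {"follow": [[1.0, 0.2], [0.8, -0.1]],    # neighbors per relation
        "reply": [[0.0, 1.0]]}
print(multi_relation(h, rels))
```

Separating relations before fusing keeps, say, follow-graph and reply-graph evidence from washing each other out, which is the point of the multi-relationship design.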
-
Dialogue Generation Model Integrating Emotional and Commonsense Knowledge
程金凤, 蒋宗礼. 融合情感和常识知识的对话生成模型[J]. 计算机科学, 2025, 52(1): 307-314.
CHENG Jinfeng, JIANG Zongli. Dialogue Generation Model Integrating Emotional and Commonsense Knowledge[J]. Computer Science, 2025, 52(1): 307-314.
- Computer Science. 2025, 52 (1): 307-314. doi:10.11896/jsjkx.231100130
-
Abstract
PDF(1899KB) ( 140 )
- References | Related Articles | Metrics
-
With the development of deep learning technology, the open-domain dialogue system, an important branch of human-machine dialogue systems, has also developed rapidly. However, responses generated by existing open-domain dialogue models still suffer from poor empathy and low diversity. To address these problems, a dialogue generation model integrating emotional and commonsense knowledge is proposed. A commonsense knowledge vector for each word is first obtained from the emotion dictionary and the commonsense knowledge graph, and this vector is fed into the encoder together with the word's own embedding. A two-stage decoding process then generates the response: the first stage predicts the emotional intensity of the word to be generated and obtains the corresponding emotion vector; the second stage combines the encoding result of the first stage with the embedding of the generated word and its commonsense knowledge vector as input to predict the next word. Experimental results show that the responses generated by the proposed model are more empathetic and diverse, with improvements over the baseline models on the PPL, BLEU, ACC, and DISTINCT metrics.
-
Generation of Enrich Semantic Video Dialogue Based on Hierarchical Visual Attention
赵倩, 郭斌, 刘宇博, 孙卓, 王豪, 陈梦琦. 基于层次化视觉注意力的富语义视频对话生成[J]. 计算机科学, 2025, 52(1): 315-322.
ZHAO Qian, GUO Bin, LIU Yubo, SUN Zhuo, WANG Hao, CHEN Mengqi. Generation of Enrich Semantic Video Dialogue Based on Hierarchical Visual Attention[J]. Computer Science, 2025, 52(1): 315-322.
- Computer Science. 2025, 52 (1): 315-322. doi:10.11896/jsjkx.231100107
-
Abstract
PDF(2442KB) ( 124 )
- References | Related Articles | Metrics
-
Video dialogue has emerged as an important research direction in multimodal human-computer interaction. The large amount of temporal and spatial visual information and the complex multimodal relationships make it challenging to design efficient video dialogue systems. Existing systems use cross-modal attention mechanisms or graph structures to capture the correlation between video semantics and dialogue context, but they process all visual information at a single coarse granularity. This loses fine-grained temporal and spatial information, such as the continuous motion of the same object or inconspicuous positional information in an image, while processing all visual information at fine granularity increases latency and degrades dialogue fluency. We therefore propose a hierarchical visual attention-based method for generating semantically rich video dialogue. First, according to the dialogue context, global visual attention captures global visual semantic information and localizes the temporal/spatial scope of the video associated with the dialogue input. Second, a local attention mechanism further captures fine-grained visual information within the localized area, and the dialogue response is generated via multi-task learning. Experimental results on the DSTC7 AVSD dataset show that the dialogue generated by the proposed method has higher accuracy and diversity, with its METEOR score improving by 23.24%.
-
Multi-agent Pursuit Decision-making Method Based on Hybrid Imitation Learning
王焱宁, 张锋镝, 肖登敏, 孙中奇. 基于混合模仿学习的多智能体追捕决策方法[J]. 计算机科学, 2025, 52(1): 323-330.
WANG Yanning, ZHANG Fengdi, XIAO Dengmin, SUN Zhongqi. Multi-agent Pursuit Decision-making Method Based on Hybrid Imitation Learning[J]. Computer Science, 2025, 52(1): 323-330.
- Computer Science. 2025, 52 (1): 323-330. doi:10.11896/jsjkx.240800072
-
Abstract
PDF(4135KB) ( 138 )
- References | Related Articles | Metrics
-
To address the limitations of traditional imitation learning approaches in handling diverse expert trajectories, particularly the difficulty of effectively integrating fixed-modality expert data of varying quality, this paper integrates the multiple-trajectories generative adversarial imitation learning (MT-GAIL) method with temporal-difference-error behavioral cloning (TD-BC) to construct a hybrid imitation learning framework. The framework not only enhances the model's adaptability to complex and dynamic expert strategies but also improves its robustness in extracting useful information from low-quality data. The resulting model is directly applicable to reinforcement learning: only minor adjustments and optimization are needed to train a readily usable reinforcement learning model grounded in expert experience. Experimental validation in a two-dimensional pursuit scenario with mixed dynamic and static targets demonstrates the method's strong performance. The results indicate that the proposed method effectively assimilates expert knowledge, providing a high-starting-point and effective initial model for subsequent reinforcement learning training.
-
Decomposition-based Multi-objective Evolutionary Algorithm for Industrial Dynamic Pickup and Delivery Problems
蔡俊创, 朱庆灵, 林秋镇, 李坚强, 明仲. 面向工业动态取送货问题的分解多目标进化算法[J]. 计算机科学, 2025, 52(1): 331-344.
CAI Junchuang, ZHU Qingling, LIN Qiuzhen, LI Jianqiang, MING Zhong. Decomposition-based Multi-objective Evolutionary Algorithm for Industrial Dynamic Pickup and Delivery Problems[J]. Computer Science, 2025, 52(1): 331-344.
- Computer Science. 2025, 52 (1): 331-344. doi:10.11896/jsjkx.231200132
-
Abstract
PDF(2916KB) ( 139 )
- References | Related Articles | Metrics
-
Due to the constraints of industrial dynamic pickup and delivery problems (DPDPs), such as docks, time windows, capacity, and last-in-first-out loading, most existing vehicle routing algorithms optimize only a single weighted objective function, which makes it difficult to maintain the diversity of solutions, so they easily get stuck in a locally optimal region and stop converging. To alleviate this issue, this paper introduces a decomposition-based multi-objective evolutionary algorithm with efficient local search for solving such DPDPs. The algorithm first models the DPDP as a multi-objective optimization problem (MOP), which is decomposed into multiple subproblems solved simultaneously. Crossover is then used to enhance the diversity of solutions, followed by an efficient local search to speed up convergence. In this way, the algorithm better balances the diversity and convergence of solutions when solving the MOP. Finally, the best solution is selected from the population to complete the pickup and delivery tasks. Simulation results on 64 test problems from a practical scenario of Huawei demonstrate that the algorithm outperforms other competitive algorithms on DPDPs. The algorithm is also tested on 20 large-scale delivery problems from JD Logistics to validate its generalization.
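Decomposing a multi-objective formulation into scalar subproblems, as the algorithm above does, is commonly realized with the Tchebycheff scalarization. The sketch below uses illustrative objective values (e.g. travel cost vs. timeout penalty; the paper's exact objectives and scalarization are not specified here) to show how different weight vectors steer subproblems toward different trade-offs.

```python
def tchebycheff(objs, weights, ideal):
    # Scalarize one subproblem: the worst weighted deviation from the
    # ideal point across all objectives (to be minimized).
    return max(w * abs(f - z) for f, w, z in zip(objs, weights, ideal))

def best_for_subproblem(population_objs, weights, ideal):
    # Pick the population member that best serves this weight vector.
    return min(range(len(population_objs)),
               key=lambda i: tchebycheff(population_objs[i], weights, ideal))

# Two objectives, three candidate routing solutions (illustrative values).
objs = [(10.0, 2.0), (6.0, 6.0), (2.0, 10.0)]
ideal = (2.0, 2.0)
print(best_for_subproblem(objs, (0.9, 0.1), ideal))  # prefers index 2 (low 1st objective)
print(best_for_subproblem(objs, (0.1, 0.9), ideal))  # prefers index 0 (low 2nd objective)
```

Because each weight vector defines its own scalar subproblem, solving many of them simultaneously maintains a diverse front of solutions instead of collapsing to one weighted optimum.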
-
Adversarial Sample Detection in Computer Vision:A Survey
张鑫, 张晗, 牛曼宇, 姬莉霞. 计算机视觉领域对抗样本检测综述[J]. 计算机科学, 2025, 52(1): 345-361.
ZHANG Xin, ZHANG Han, NIU Manyu, JI Lixia. Adversarial Sample Detection in Computer Vision:A Survey[J]. Computer Science, 2025, 52(1): 345-361.
- Computer Science. 2025, 52 (1): 345-361. doi:10.11896/jsjkx.240300080
-
Abstract
PDF(3366KB) ( 135 )
- References | Related Articles | Metrics
-
With the increase in data volume and improvements in hardware performance, deep learning (DL) has made significant progress in computer vision. However, deep learning models are vulnerable to adversarial samples, which cause significant changes in their output. As an effective defense, adversarial sample detection can prevent adversarial samples from affecting a deep learning model without changing the model structure. This paper organizes recent research on adversarial sample detection, analyzes the relationship between adversarial sample detection and training data, classifies methods according to the characteristics they use, and systematically and comprehensively introduces adversarial sample detection methods in computer vision. Detection methods that combine cross-domain technologies are then introduced in detail, and the experimental configurations used to train and evaluate detection methods are statistically analyzed. Finally, technologies that are expected to be applied to adversarial sample detection are summarized, and future research challenges and development directions are discussed.
-
Federated Graph Learning:Problems,Methods and Challenges
王鑫, 熊书博, 孙凌云. 图联邦学习:问题、方法与挑战[J]. 计算机科学, 2025, 52(1): 362-373.
WANG Xin, XIONG Shubo, SUN Lingyun. Federated Graph Learning:Problems,Methods and Challenges[J]. Computer Science, 2025, 52(1): 362-373.
- Computer Science. 2025, 52 (1): 362-373. doi:10.11896/jsjkx.240500118
-
Abstract
PDF(1667KB) ( 168 )
- References | Related Articles | Metrics
-
The graph has long been widely used across fields as an efficient, flexible, and versatile data structure. In recent years, graph-based deep learning algorithms have emerged and achieved significant success in areas such as social networks, bioinformatics, and recommendation systems. Although publicly available graph data is increasing, high-quality data remains scattered among different owners, and with society's growing demand for data privacy protection, existing graph learning algorithms require enhancement. Federated graph learning is a novel approach that addresses this issue. This paper systematically reviews research progress in federated graph learning over the past five years. The core problems of the field are divided into three parts, vertically integrated with their relationships explained progressively: 1) structural heterogeneity arising from differences in raw graph data; 2) model aggregation issues due to the characteristics of federated graph learning; 3) overall model tuning. For each part, representative works and their advantages and disadvantages are analyzed in detail, and the typical applications and future challenges of federated graph learning are summarized.
-
Network Microburst Traffic Measurement Method Based on Sketch Data Structure
王佳宇, 于俊清, 李冬, 赵君杨. 基于概要数据结构的网络微突发流量检测方法[J]. 计算机科学, 2025, 52(1): 374-382.
WANG Jiayu, YU Junqing, LI Dong, ZHAO Junyang. Network Microburst Traffic Measurement Method Based on Sketch Data Structure[J]. Computer Science, 2025, 52(1): 374-382.
- Computer Science. 2025, 52 (1): 374-382. doi:10.11896/jsjkx.231200080
-
Abstract
PDF(2592KB) ( 138 )
- References | Related Articles | Metrics
-
Microburst traffic is a common type of traffic in data center networks that grows rapidly within a very short period of time; it has a serious effect on network performance and is difficult to detect. Existing microburst detection methods cannot provide both fine-grained detection and low-resource transmission. This paper proposes a lightweight fine-grained microburst detection method based on a sketch data structure. First, the architectural characteristics of the programmable switch are used to measure the queuing delay of each packet; a microburst detection algorithm processes the network traffic and filters out the microburst traffic, achieving fine-grained detection. A sketch then stores the microburst traffic information, which is sent to the controller at the end of the time slice or at the end of the microburst flow via mirrored transmission, achieving lightweight transmission. Finally, the microburst detection system is implemented on a P4 programmable switch in a real-world network environment. Experiments show that the method achieves good microburst measurement accuracy and greatly reduces the bandwidth overhead required for transmitting microburst information.
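A count-min sketch is a representative sketch structure of the kind used above to keep per-flow microburst statistics in a fixed, switch-friendly memory footprint. The row/column sizes and the use of SHA-256 as the row hash below are illustrative choices, not the paper's data-plane implementation.

```python
import hashlib

class CountMinSketch:
    # Fixed-size summary: flow statistics fit in constant memory and
    # can be mirrored to the controller at the end of a time slice.
    def __init__(self, rows=3, cols=64):
        self.rows, self.cols = rows, cols
        self.table = [[0] * cols for _ in range(rows)]

    def _index(self, row, key):
        # Independent hash per row (salted SHA-256 for illustration;
        # a switch would use cheap hardware hashes like CRC).
        h = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(h[:4], "big") % self.cols

    def add(self, flow_id, nbytes):
        for r in range(self.rows):
            self.table[r][self._index(r, flow_id)] += nbytes

    def estimate(self, flow_id):
        # Takes the minimum across rows: collisions can only inflate
        # a counter, so the estimate never undercounts.
        return min(self.table[r][self._index(r, flow_id)]
                   for r in range(self.rows))

cms = CountMinSketch()
for _ in range(3):
    cms.add("10.0.0.1->10.0.0.2", 1500)  # three 1500-byte packets
print(cms.estimate("10.0.0.1->10.0.0.2"))
```

Exporting only this small table, rather than per-packet records, is what makes the transmission lightweight.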
-
Study on Malicious Access Detection in Industrial Control Networks Based on Dynamic Bayesian Games
刘浩含, 陈泽茂. 基于动态贝叶斯博弈的工业控制网络恶意接入检测研究[J]. 计算机科学, 2025, 52(1): 383-392.
LIU Haohan, CHEN Zemao. Study on Malicious Access Detection in Industrial Control Networks Based on Dynamic Bayesian Games[J]. Computer Science, 2025, 52(1): 383-392. - LIU Haohan, CHEN Zemao
- Computer Science. 2025, 52 (1): 383-392. doi:10.11896/jsjkx.231200083
-
-
In view of security issues such as unauthorized access, denial-of-service attacks, spoofing attacks, and information disclosure in the remote access scenario of industrial control networks (ICNs), the STRIDE threat modeling method is used to analyze the potential threats in this scenario, and an access detection framework based on a dynamic Bayesian game is proposed. The framework screens and blocks illegal and malicious requests attempting to access the ICN. At the same time, it exploits multiple consecutive rounds of game iteration and the flexible, dynamic characteristics of SDN to adjust policy parameters in real time, preventing the same malicious access source from gaining access again. Simulation results show that as the number of game rounds increases, compared with two existing classes of malicious access defense methods, the detection accuracy of the framework increases by more than 3%, the false positive rate decreases by more than 1.2%, detection efficiency improves by more than 14.7%, and the framework exhibits good robustness.
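The defender's belief revision over repeated game rounds can be sketched as a standard Bayesian posterior update over the access source's type. This is a textbook illustration of the general mechanism, not the paper's game model; the prior and likelihood values below are made up.

```python
def bayes_update(prior_malicious, p_signal_given_malicious, p_signal_given_benign):
    """Posterior probability that an access source is malicious,
    after observing one signal (e.g. a suspicious request pattern)."""
    num = prior_malicious * p_signal_given_malicious
    den = num + (1 - prior_malicious) * p_signal_given_benign
    return num / den

# Belief sharpens over several rounds of suspicious signals
belief = 0.1  # prior: assume 10% of access sources are malicious
for _ in range(3):
    belief = bayes_update(belief, 0.8, 0.2)

print(round(belief, 3))  # 0.877
```

This is the sense in which detection accuracy can grow with the number of game rounds: each round's observed signal feeds the next round's prior, so repeated suspicious behavior drives the malicious-type belief toward 1 and past any blocking threshold.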
-
Anti-semantic Analysis Script Fusion Technology
田博文, 杨巨, 熊小兵, 段爽, 魏然. 抗语义分析的脚本融合技术[J]. 计算机科学, 2025, 52(1): 393-400.
TIAN Bowen, YANG Ju, XIONG Xiaobing, DUAN Shuang, WEI Ran. Anti-semantic Analysis Script Fusion Technology[J]. Computer Science, 2025, 52(1): 393-400. - TIAN Bowen, YANG Ju, XIONG Xiaobing, DUAN Shuang, WEI Ran
- Computer Science. 2025, 52 (1): 393-400. doi:10.11896/jsjkx.231100181
-
-
In recent years, script programs have been widely used in computer science. Scripts are increasingly adopted in today's network environment owing to their powerful functionality and high execution efficiency, and because they are simpler to write and smaller in file size than binary programs. The main types of script obfuscation currently include encoding obfuscation, structural obfuscation, and encryption obfuscation. However, existing script obfuscation methods exhibit obvious features and are at risk of being deobfuscated; once a script is deobfuscated, its functionality can easily be analyzed and understood. To address this issue, an anti-semantic-analysis script fusion technique is proposed. Camouflage code and the target code to be protected are divided into blocks and deeply merged, so the fused code contains code from both scripts, with the semantics and logic of the different scripts intertwined and interdependent, making semantic analysis more difficult: understanding and analyzing the fused code requires stronger semantic reasoning and contextual understanding capabilities. Experimental results on PowerShell scripts show that the control flow complexity of fused scripts increases by 81.51% on average, greatly enhancing obfuscation strength. The technique effectively obscures script semantics, alters control flow characteristics, and performs well against semantic analysis by ChatGPT.
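The block-level fusion idea can be pictured with a toy dispatcher that interleaves basic blocks of a camouflage routine and a protected routine into one block list, with a per-script execution order restoring each script's semantics. This is a heavily simplified illustration of the concept only; the real technique operates on actual PowerShell code and uses far stronger entanglement.

```python
def fuse(blocks_a, blocks_b):
    """Interleave basic blocks of two 'scripts' into one fused list,
    returning the execution order that recovers each script."""
    fused, order_a, order_b = [], [], []
    for i in range(max(len(blocks_a), len(blocks_b))):
        if i < len(blocks_a):
            order_a.append(len(fused)); fused.append(blocks_a[i])
        if i < len(blocks_b):
            order_b.append(len(fused)); fused.append(blocks_b[i])
    return fused, order_a, order_b

def run(fused, order, state):
    for idx in order:  # dispatcher walks only this script's blocks
        fused[idx](state)
    return state

# Protected script builds a string; camouflage script does arithmetic
target = [lambda s: s.update(msg="hel"), lambda s: s.update(msg=s["msg"] + "lo")]
decoy  = [lambda s: s.update(x=1),       lambda s: s.update(x=s["x"] + 41)]
fused, order_t, order_d = fuse(target, decoy)
print(run(fused, order_t, {})["msg"])  # hello
```

Reading `fused` top to bottom mixes the two logics, so a semantic analyzer must first recover the hidden execution order before it can separate the protected behavior from the camouflage.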
-
Identity-based Key-insulated Provable Multi-copy Data Possession in Multi-cloud Storage
周杰, 王化群. 基于身份的密钥隔离的多云多副本可证数据持有方案[J]. 计算机科学, 2025, 52(1): 401-411.
ZHOU Jie, WANG Huaqun. Identity-based Key-insulated Provable Multi-copy Data Possession in Multi-cloud Storage[J]. Computer Science, 2025, 52(1): 401-411. - ZHOU Jie, WANG Huaqun
- Computer Science. 2025, 52 (1): 401-411. doi:10.11896/jsjkx.231200081
-
-
Provable data possession (PDP) allows users to verify that their outsourced data is intact without downloading all of it. To improve the availability and security of outsourced data, many users store multiple copies of their data on a single server; in case of a cloud server failure or other unexpected circumstances, those copies may be damaged and the original data cannot be restored. At the same time, many PDP schemes rely on public key infrastructure (PKI), which brings key management problems. In addition, most existing PDP schemes use the key to process data on the client side; because clients often have weak security awareness or low security settings, the key may be exposed, and once a malicious cloud obtains the client's key, it can hide a data loss event by forging false proofs of data possession. To address these problems, we propose an identity-based key-insulated provable multi-copy data possession scheme for multi-cloud storage. The identity-based design eliminates the complex certificate management of PKI. Multiple copies across multiple clouds ensure that even if all copies on one cloud server are tampered with or corrupted, users can still obtain copies from other cloud servers and recover the data. Meanwhile, key-insulation provides forward and backward security: even if the key is exposed in a certain time period, the security of cloud storage auditing in other periods is unaffected. The formal definition, system model, and security model of the scheme are given, and its security is proved under a standard hardness assumption. Security analysis shows that the scheme offers strong resistance to key leakage, detectability, and unforgeability of data block authenticators and proofs. Experimental results show that, compared with existing multi-cloud multi-copy schemes, the proposed scheme is relatively efficient.
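The basic PDP interaction, stripped of the identity-based and key-insulated machinery of the proposed scheme, can be sketched as a challenge-response over per-block authenticators. The HMAC-based toy below is an illustrative assumption for exposition; real PDP schemes use homomorphic authenticators so the server can return one short aggregated proof instead of the blocks themselves.

```python
import hashlib
import hmac
import os

def tag_blocks(key, blocks):
    # Client computes one authenticator per data block before outsourcing;
    # binding the block index prevents the server from swapping blocks
    return [hmac.new(key, i.to_bytes(8, "big") + b, hashlib.sha256).digest()
            for i, b in enumerate(blocks)]

def prove(stored_blocks, challenge):
    # Server answers a random challenge over block indices
    return [(i, stored_blocks[i]) for i in challenge]

def verify(key, tags, proof):
    return all(hmac.compare_digest(
                   tags[i],
                   hmac.new(key, i.to_bytes(8, "big") + b, hashlib.sha256).digest())
               for i, b in proof)

key = os.urandom(32)
blocks = [b"block-%d" % i for i in range(4)]
tags = tag_blocks(key, blocks)         # kept by the verifier
print(verify(key, tags, prove(blocks, [0, 2])))       # True
tampered = blocks[:]; tampered[2] = b"corrupted"
print(verify(key, tags, prove(tampered, [0, 2])))     # False
```

The key-exposure problem the paper targets is visible even in this toy: whoever holds `key` can re-tag corrupted blocks, which is exactly why the proposed scheme evolves the signing key across time periods.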
-
RF Fingerprint Recognition Based on SE Attention Multi-source Domain Adversarial Network
苏超然, 张大龙, 黄勇, 董安. 基于SE注意力多源域对抗网络的射频指纹识别[J]. 计算机科学, 2025, 52(1): 412-419.
SU Chaoran, ZHANG Dalong, HUANG Yong, DONG An. RF Fingerprint Recognition Based on SE Attention Multi-source Domain Adversarial Network[J]. Computer Science, 2025, 52(1): 412-419. - SU Chaoran, ZHANG Dalong, HUANG Yong, DONG An
- Computer Science. 2025, 52 (1): 412-419. doi:10.11896/jsjkx.231100076
-
-
RF fingerprinting uses hardware features of the RF front-end as identifiers to recognize devices. Existing RF fingerprinting research ignores interference from the receiver's own hardware features, resulting in poor generalization of models across different receiver devices. To address this, an RF fingerprinting method based on a squeeze-and-excitation (SE) attention multi-source domain adversarial network is proposed. Labeled data from multiple source domains and a small amount of unlabeled target-domain data are used for adversarial training to extract receiver-domain-independent features. An SE attention mechanism is incorporated to enhance the model's ability to learn the transmitter's RF fingerprint features. Model parameters are then fine-tuned with a very small amount of labeled target-domain data to further improve transmitter identification performance. Experimental results on the WiSig dataset show that the method effectively identifies transmitter devices in cross-receiver scenarios, with an average accuracy of up to 83.1%, which can be further improved to 93.1% by fine-tuning with a small amount of labeled data.
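The SE attention mechanism referenced above recalibrates channel responses by squeezing each channel to a scalar and learning per-channel scaling weights. Below is a minimal NumPy sketch of a generic SE block over a 1-D signal feature map; the random weights and shapes are illustrative assumptions, not the paper's network.

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-excitation over a (channels, length) feature map."""
    squeeze = x.mean(axis=1)                       # squeeze: global average pool per channel
    hidden = np.maximum(0.0, w1 @ squeeze)         # excitation: FC + ReLU (reduction)
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # FC + sigmoid -> per-channel weights in (0,1)
    return x * scale[:, None]                      # rescale each channel of the input

rng = np.random.default_rng(0)
channels, length, reduction = 8, 32, 4
x = rng.standard_normal((channels, length))        # e.g. features of an I/Q signal segment
w1 = rng.standard_normal((channels // reduction, channels))
w2 = rng.standard_normal((channels, channels // reduction))
y = se_block(x, w1, w2)
print(y.shape)  # (8, 32)
```

Because the sigmoid weights lie in (0, 1), the block can only attenuate channels, letting the network emphasize channels that carry transmitter hardware fingerprints and suppress those dominated by receiver-specific distortion.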