Computer Science

Survey on Positional Encoding Algorithms in Deep Learning

YANG Geer, WANG Xin, SUN Wei, WANG Xinge, HU Zhongrui, MENG Wenjun, ZHANG Junqiang, WU Xinghui, LIU Jinshan, YAN Yuming

Computer Science. 2026, 53 (6A): 250300107-16. doi:10.11896/jsjkx.250300107

Abstract

In deep learning, positional encoding is a critical component for enhancing neural networks' ability to understand sequence structure. Particularly within Transformers and their variants, positional encoding addresses an inherent limitation of the self-attention mechanism, which cannot intrinsically capture sequential order. This paper systematically reviews the theoretical foundations of positional encoding, the conceptual design of various encoding strategies, and their applications across diverse neural network architectures. First, the paper revisits traditional models such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs), discussing how they implicitly model sequence positions and examining the theoretical motivations behind the introduction of explicit positional encoding in Transformers. It then gives a detailed exposition of absolute positional encoding strategies (including sinusoidal positional encoding and learnable positional embeddings), relative positional encoding methods such as Transformer-XL and RoPE, bias-based positional encoding methods such as ALiBi and KERPLE, and recent optimization techniques tailored for extremely long sequence tasks, notably NTK-aware RoPE, YaRN, and CoPE. Moreover, the paper analyzes in depth the impact of positional encoding on model performance, covering computational efficiency, extrapolation capability, and the modeling of long-range dependencies. Frontier topics including numerical stability and frequency spectrum optimization are also addressed. Finally, the study summarizes current research trends in positional encoding and outlines its future prospects in areas such as large-scale sequence modeling, hybrid network architectures, and hierarchical data structure modeling. The overarching aim is to provide researchers and practitioners with a comprehensive and detailed reference to facilitate the selection of appropriate positional encoding methods for specific tasks and to foster further advances in related fields.
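
As an illustration of the simplest strategy named above, the following is a minimal sketch of sinusoidal positional encoding in plain Python. The function name, the `base` value of 10000, and the interleaved sin/cos layout follow the common Transformer convention and are assumptions here, not details taken from the survey.

```python
import math


def sinusoidal_positional_encoding(num_positions, d_model, base=10000.0):
    """Return a num_positions x d_model table of sinusoidal encodings.

    Assumed convention: even dimensions hold sin, odd dimensions hold cos,
    and each sin/cos pair shares one frequency base ** (-i / d_model).
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even in this sketch")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(0, d_model, 2):
            angle = pos / (base ** (i / d_model))
            row.append(math.sin(angle))  # even dimension
            row.append(math.cos(angle))  # odd dimension
        table.append(row)
    return table


# Small table: 4 positions, 8 dimensions per position.
pe = sinusoidal_positional_encoding(num_positions=4, d_model=8)
```

Because the table depends only on the position index and the dimension, it needs no training and can be computed for positions beyond those seen during training, which is one reason extrapolation is a recurring theme for this family of methods.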

Cognitive LLM Agent for Master Education Based on Hybrid Reasoning

ZHENG Jiaqi, PENG Shihao, ZHAO Junjie, HONG Daocheng, ZHU Dandan, SANG Jinqiu, ZHANG Guixu

Computer Science. 2026, 53 (6A): 250400056-7. doi:10.11896/jsjkx.250400056

Abstract

LLM-driven agents have led a cognitive revolution in education, transforming traditional static Q&A systems into digital tutors with dynamic knowledge integration and intelligent interaction capabilities. However, existing general-purpose LLMs face significant challenges in master popularization education, including knowledge hallucination, static knowledge representation, and teaching strategies that adapt poorly to learners. To address these challenges, this paper proposes the Master Education Cognitive LLM Agent (MECA), chatMaster, based on a three-layer cognitive architecture comprising a perception layer, a reasoning layer, and an action layer. MECA establishes a closed-loop decision-making paradigm of “Intention Perception → Knowledge Reasoning → Educational Execution” and introduces a dynamic cognitive enhancement mechanism to improve the precision of knowledge generation and the capacity for personalized adaptation. In the proposed framework, the perception layer employs a lightweight language model to analyze user intent, while the reasoning layer incorporates an intent-enhanced hybrid reasoning mechanism that integrates domain knowledge, user needs, and teaching strategies for multi-dimensional reasoning. To build a solid data foundation, the paper develops the first high-quality Q&A dataset specifically for master education information in China, leveraging both human review and LLM-based language understanding. The dataset covers renowned scholars, educators, and academicians in Shanghai, focusing on key figures with significant contributions. It spans multiple dimensions, including biographical details, academic ideas, and educational contributions, addressing the lack of high-quality domain-specific corpora in this field. Experimental results demonstrate that chatMaster achieves significant improvements across multiple evaluation metrics, enhancing cognitive interaction capabilities and the precision of knowledge dissemination for master popularization education information. This research provides a generalizable paradigm for constructing educational agents, promoting the evolution of intelligent education towards greater specialization and dynamism.

Design and Application of Semantic Model for Medical Record Knowledge Graph Querying

CHU Xiaolong, DU Jinlian, LUO Fangyuan, JIN Xueyun

Computer Science. 2026, 53 (6A): 250900023-9. doi:10.11896/jsjkx.250900023

Abstract

The Cypher language is widely used for knowledge graph querying. However, it has inherent limitations: insufficient semantic support in domain-specific scenarios and cumbersome expression of complex queries, both of which hinder the design of query engines. By analyzing the characteristics of clinical diagnosis and treatment query applications based on a medical record (MR) knowledge graph, this paper designs a set of query meta-operations for the MR knowledge graph and a query application semantic model built by combining these meta-operations. Combined into a hierarchical semantic model, the meta-operations provide a clear and structured way to represent clinical queries. The model serves as an effective logical-layer abstraction for query engines, using a production rule system to efficiently convert queries into Cypher execution plans. The rules define direct mappings from the model's constructs to Cypher syntax, enabling automatic translation of high-level query intentions into executable statements. Experiments show that the semantic model performs well in expressiveness and accuracy for query applications oriented to the medical record knowledge graph, and that the query engine built on it is operationally robust.

Study on Financial Text Sentiment Analysis Method Based on Large Language Models with Market Feedback Supervision

ZHANG Yongyu, GUO Chenjuan, FEI Xueqin, LI Feng

Computer Science. 2026, 53 (6A): 250500073-14. doi:10.11896/jsjkx.250500073

Abstract

In financial markets, market sentiment has a profound impact on asset prices and volatility. Although large language models bring opportunities for financial text sentiment analysis, current research still struggles with the specialized and dynamic nature of financial texts and with poor consistency between predicted sentiment and market reactions. This study constructs a financial text sentiment analysis system. It integrates the advantages of multiple large language models and combines knowledge graph enhancement with chain-of-thought techniques to optimize the hybrid language model framework. Moreover, it adopts a sliding analysis method that combines multiple time windows and dynamic weights, constructs a market index evaluation system, and develops an adaptive dynamic update algorithm to strengthen the market feedback supervision mechanism. Empirical analysis shows that the system achieves high sentiment analysis accuracy and high consistency with market reactions, significantly outperforming the comparison models. This research provides new perspectives and tools for financial market research and investment decision-making, and is of theoretical and practical significance in the financial field.

Research on Fact Prediction by Integrating Knowledge Graph Embeddings and Large Models

YANG Hua, WANG Baohui

Computer Science. 2026, 53 (6A): 250600055-5. doi:10.11896/jsjkx.250600055

Abstract

This paper proposes a fact prediction algorithm that integrates knowledge graph embedding with a large language model, aiming to address the challenge of judging the authenticity of triples in the bidding domain. In view of the insufficient generalization ability of traditional fact prediction algorithms and the limitations of a single large language model in handling structured knowledge, the paper employs the TransR model to produce low-dimensional embeddings of the entities and relationships extracted from bidding documents. At the same time, the Qwen2.5-1.5B large language model is fine-tuned with LoRA to extract text semantic features, and the two types of information are deeply integrated in a feature-level fusion module. Experiments are conducted on a real bidding dataset. The results show that the proposed method achieves a precision of 86.4%, a recall of 93.2%, and an F1 score of 89.7% on the fact prediction task. Compared with traditional knowledge graph embedding algorithms, the F1 score improves by 14 percentage points, and compared with fine-tuning the large language model alone, it improves by 11.3 percentage points.
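
The reported F1 score can be checked against the reported precision and recall with the standard harmonic-mean definition. The short sketch below uses only the figures given in the abstract; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision 86.4% and recall 93.2%, as reported in the abstract.
f1 = f1_score(0.864, 0.932)
f1_percent = round(f1 * 100, 1)
```

Under this definition the two reported values are consistent with the reported F1 of 89.7%.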

Construction and Application of Dataset Knowledge Graph Based on Metadata Semantic Enhancement

SHEN Jianwei, CHEN Jiawen, CHEN Hanlin, MA Xinjian, CHEN Xing

Computer Science. 2026, 53 (6A): 250500052-10. doi:10.11896/jsjkx.250500052

Abstract

The rapid expansion of data resources has placed significant emphasis on the effective organization, discovery, and utilization of datasets in data management. Conventional approaches that rely on metadata matching or statistical retrieval often fail to capture intricate semantic relationships, which reduces the accuracy and interpretability of dataset retrieval. In response, this study proposes a method for constructing and retrieving dataset knowledge graphs through metadata semantic enhancement, with the objective of improving the semantic retrieval of datasets. First, it standardizes dataset metadata in accordance with the W3C DCAT specification to establish a foundational knowledge graph that covers essential attributes such as titles, keywords, subject categories, and data items. Then, to address the shortcomings of metadata semantic descriptions, it incorporates the Wikidata general knowledge graph to enrich entity semantics via cross-domain semantic expansion. In the retrieval phase, the BERT-BiLSTM-CRF model is employed to extract key entities from user queries and construct semantic relationship subgraphs. By integrating entity vector representations generated via Wikipedia2vec, it implements structured semantic retrieval ranking using cosine similarity. Experiments conducted on the government open data platforms of Fuzhou and Shenzhen demonstrate that the proposed method achieves Top-10 hit rates of 97.92% and 98.25%, respectively, representing improvements of 8.92% to 12.04% over traditional BM25 and 5.72% to 8.96% over Word2Vec-enhanced methods. The results show that semantic enhancement via Wikidata and structured graph matching significantly boost retrieval accuracy by explicitly modeling entity relationships and enriching metadata semantics. This study provides a feasible technical solution for enhancing dataset discovery in scenarios such as open data platforms and research data management, demonstrating the effectiveness of integrating semantic enrichment with knowledge graph structures.
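
The ranking step described above (scoring candidates by cosine similarity between vector representations) can be sketched as follows. The two-dimensional toy vectors, dataset names, and function names are hypothetical; in the paper the vectors come from Wikipedia2vec and the matching is over semantic subgraphs, which this sketch does not reproduce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def rank_datasets(query_vec, dataset_vecs, top_k=10):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in dataset_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings for three datasets and one query.
datasets = {
    "air_quality": [1.0, 0.0],
    "bus_routes": [0.0, 1.0],
    "weather": [1.0, 1.0],
}
ranking = rank_datasets([1.0, 0.2], datasets, top_k=2)
```

A Top-10 hit rate, as reported in the experiments, would then be the fraction of queries whose relevant dataset appears in the list returned with `top_k=10`.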

Dual-stream Heterogeneous Social Graph for Micro-video Popularity Prediction

ZHANG Xinliang, LIU Lilong, CHEN Shangheng, CHEN Ziyang, QIAN Shengsheng

Computer Science. 2026, 53 (6A): 250800073-8. doi:10.11896/jsjkx.250800073

Abstract

With the rapid rise of short videos on digital platforms, micro-video popularity prediction (MVPP) has become an important research area. Short videos contain rich multimodal content, including video frames, text, and social network interaction data, all of which significantly influence their popularity. However, existing methods have two main shortcomings: they typically rely only on the multimodal content features of the short video itself and fail to effectively model the complex social network structure formed by user interactions (such as comments, likes, and shares); and when handling large-scale social multimodal graphs, existing graph learning methods often lose valuable multimodal signals because of neighbor sampling strategies. To address these shortcomings, this paper proposes a novel approach, DHSGL (Dual-Stream Heterogeneous Social Graph Learning Framework). The core innovations of the method are: 1) an efficient graph learning pre-calculation strategy, which aggregates the complete graph structure information through a single global propagation and constructs unimodal graphs to preserve the original modality features, thus significantly reducing information loss; 2) a social multimodal graph that integrates social interactions and multimodal content to fully leverage the neglected social structure information. Experimental results demonstrate a significant improvement in prediction performance, validating the effectiveness of integrating social network structure with multimodal content to enhance micro-video popularity prediction.

Multi-objective Charging Optimization Strategy Based on Improved Black-winged Kite Algorithm

ZHAO Xuejian, ZHOU Jingjing, DONG Wenkai, TIAN Hao, ZHANG Hongzhu

Computer Science. 2026, 53 (6A): 250800063-11. doi:10.11896/jsjkx.250800063

Abstract

Existing charging strategies for high-power charging technology suffer from limitations such as insufficient adaptability of static parameter settings, simplified models that fail to adequately characterize multi-physics coupling effects, and traditional optimization algorithms that are prone to local optima. To address these issues, this paper proposes a multi-objective charging optimization strategy based on an improved black-winged kite algorithm (Improved BKA). First, an electrochemical-thermal-aging multi-physics coupled model is established, encompassing electrical dynamics, thermal management, and aging mechanisms, to accurately characterize the complex physical behavior during high-power charging. A multi-objective optimization function is constructed with the goals of minimizing charging time, maximizing energy efficiency, and maximizing the state-of-health (SOH) retention rate. Second, an efficient solution method for the objective function based on the Improved BKA is proposed. This method enhances the quality of initial solutions through a hybrid initialization strategy combining uniform and exponential distributions, dynamically balances global exploration and local exploitation via an adaptive step size mechanism grounded in Lyapunov stability theory, and efficiently handles safety constraints using convex projection-based constraint repair techniques. Finally, simulation experiments show that, compared with six representative algorithms including PSO and GWO, the Improved BKA strategy achieves at least a 17.0% improvement in charging speed, reaches 91.3% energy efficiency, keeps the peak temperature below 44.2 ℃, and reduces the capacity fade rate by at least 21.0%.

Multi-objective Evolutionary Method by Training Front Modeling Based on MOEA/D

LI Li, YI Jiali, LI Youjun, LI Guangpeng

Computer Science. 2026, 53 (6A): 250500081-11. doi:10.11896/jsjkx.250500081

Abstract

The multi-objective evolutionary algorithm based on decomposition (MOEA/D) is a widely employed optimization strategy in real-world applications. However, choosing a decomposition strategy that does not suit the curvature of the Pareto Front (PF) can produce unsatisfactory results on multi-objective optimization problems. To address this sensitivity of decomposition strategies to PF curvature, this paper designs a MOEA/D-based multi-objective evolutionary algorithm that trains PF models, named MOEA/D-ECM. The algorithm trains a generic PF model to predict the curvature of the PF and then selects an appropriate decomposition strategy based on the predicted curvature. In addition, to maintain population diversity, a niche technique and a distribution strategy are incorporated into the MOEA/D algorithm to select mating parents and improve the quality of the offspring. To evaluate its performance, the algorithm is compared with several multi-objective evolutionary algorithms on test problems with concave, convex, and linear PFs. The experimental results demonstrate that MOEA/D-ECM can effectively solve multi-objective optimization problems with PFs of different curvatures and is competitive with the compared algorithms.
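
To make the notion of a decomposition strategy concrete, the sketch below implements two standard MOEA/D scalarizations, the weighted sum and the Tchebycheff approach. The abstract does not say which strategies MOEA/D-ECM chooses among, so these two are shown only as well-known examples; the sample objective values, weights, and ideal point are hypothetical.

```python
def weighted_sum(objectives, weights):
    """Weighted-sum scalarization: sum_i w_i * f_i."""
    return sum(w * f for w, f in zip(weights, objectives))


def tchebycheff(objectives, weights, ideal_point):
    """Tchebycheff scalarization: max_i w_i * |f_i - z*_i|."""
    return max(w * abs(f - z)
               for w, f, z in zip(weights, objectives, ideal_point))


# Hypothetical two-objective solution, weight vector, and ideal point.
f = [1.0, 2.0]
w = [0.5, 0.5]
z_star = [0.0, 0.0]

ws_value = weighted_sum(f, w)          # 0.5 * 1.0 + 0.5 * 2.0
tch_value = tchebycheff(f, w, z_star)  # max(0.5 * 1.0, 0.5 * 2.0)
```

The weighted sum is known to miss solutions on concave parts of a front, while the Tchebycheff form can reach them, which illustrates why the best choice depends on PF curvature.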

Dynamic Adjustment Technology of Eye Movement Input Based on TCN-AttnRNN Model

CHEN Di, YIN Jibin

Computer Science. 2026, 53 (6A): 250300095-7. doi:10.11896/jsjkx.250300095

Abstract

This paper presents an eye movement input technology that dynamically adjusts key dwell time based on character prediction results. A character-level language model named TCN-AttnRNN is designed. In this model, the TCN extracts global spatial features and long-term dependencies of sequences, the RNN enhances long-term memory over time series, and the multi-head self-attention mechanism optimizes the network by allocating weights through a probability distribution to strengthen the role of key features. Experimental results show that the BPC values of the TCN-AttnRNN model on the PTB and DailyDialog datasets are 1.26 and 1.22, respectively, which are better than those of the current mainstream TCN, LSTM, and Transformer models. Based on the TCN-AttnRNN model, a dynamic adjustment technology for eye movement input is designed. Using the model for character prediction, this technology adjusts the dwell time of each key according to the probability that the user selects it next. Experimental results confirm the effectiveness of the technology: compared with the traditional fixed dwell time method, it increases users' input speed by 22.31% and reduces the correction error rate by 19.57%.
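
The idea of shortening dwell time for likely keys can be sketched with a simple linear mapping from predicted probability to dwell time. The mapping, the 400 ms and 1000 ms bounds, and the example probabilities are all hypothetical; the abstract does not specify how the paper converts probabilities into dwell times.

```python
def dwell_time_ms(probability, min_ms=400.0, max_ms=1000.0):
    """Map a next-key probability to a dwell time (hypothetical linear rule).

    A key predicted with probability 1 gets min_ms; a key with
    probability 0 keeps the full max_ms.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return max_ms - probability * (max_ms - min_ms)


# Hypothetical next-character probabilities from a language model.
predicted = {"e": 0.5, "a": 0.25, "z": 0.0}
dwell = {key: dwell_time_ms(p) for key, p in predicted.items()}
```

With these assumed bounds, the likeliest key "e" needs 700 ms of gaze while the unlikely key "z" keeps the full 1000 ms, so probable selections are faster and improbable ones stay protected against accidental activation.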

Integrated Optimization of Automated Warehouse Based on Improved Sparrow Search Algorithm

WU Fu, MA Yapeng, LI Zhongxue, GAO Lingxia

Computer Science. 2026, 53 (6A): 250800014-9. doi:10.11896/jsjkx.250800014

Abstract

To address stacker crane load imbalance and low storage/retrieval efficiency in multi-aisle automated storage and retrieval systems under dynamic operating conditions, this paper constructs an integrated optimization model for storage location assignment and job scheduling that considers load balancing, with the aim of enhancing overall warehouse operational efficiency. First, soft set theory is employed to partition the shelves, and a parameterized classification mechanism is used to divide goods into regions based on their characteristic attributes and demand, effectively reducing the search space for storage locations. Second, a two-stage mathematical model is developed for integrated optimization of stacker crane load balancing and job scheduling. Third, an improved sparrow search algorithm suitable for discrete sequencing problems is designed. Simulation experiments demonstrate that the improved sparrow search algorithm outperforms the comparison algorithms across different instruction scales. The load balancing model effectively improves the uniformity of the stacker crane operation time distribution, enhancing the overall optimization results. The proposed method performs well in terms of optimization efficiency, convergence speed, and robustness.

Bilinear Attention Network-based Drug-target Interaction Prediction

LU Biyao, XU Youran, LIU Ying, LIU Jindong, LIU Jian, YIN Wenfei, JIANG Ye

Computer Science. 2026, 53 (6A): 250800090-7. doi:10.11896/jsjkx.250800090

Abstract

Drug-target interaction (DTI) prediction is a core component of new drug development. In recent years, the integration of DTI prediction with deep learning has emerged as an important direction in drug discovery. However, existing methods still face numerous challenges in handling the three-dimensional spatial structural information of drug molecules and protein targets, in feature fusion strategies, and in computational complexity. To address these challenges, a model based on the bilinear attention network is proposed to accurately predict the interactions between drugs and targets. The model generates feature representations of drug molecules using a multi-layer graph attention network and creates feature representations of protein targets using a multi-layer convolutional neural network combined with an SE module. Subsequently, the bilinear attention network with a multi-head attention mechanism fuses these two types of feature representations and effectively reduces the number of model parameters through a block tensor decomposition module, thereby achieving superior predictive performance. Experiments are conducted on two public datasets, BindingDB and BioSNAP, and the proposed BAN_DTI model significantly outperforms the compared state-of-the-art methods on five evaluation metrics: AUROC, AUPRC, Specificity, Accuracy, and Sensitivity.

Social Text MBTI Personality Feature Recognition Method Based on Data Fusion and Deep Learning

FU Yue, SHI Wei

Computer Science. 2026, 53 (6A): 250500101-8. doi:10.11896/jsjkx.250500101

Abstract

With the widespread adoption of social networking platforms, users increasingly express their personal opinions, emotions, and attitudes through text content. These social texts not only carry linguistic information but also implicitly reflect users' behavioral patterns and personality traits. As a fundamental component in areas such as user profiling, personalized recommendation, and mental health analysis, personality recognition has gained increasing research attention. However, current methods still face challenges such as insufficient accuracy and limited model generalization when processing unstructured textual data. To improve the accuracy and efficiency of personality recognition in social texts, a personality trait recognition method based on data mapping and deep learning is proposed. The method first introduces a dataset mapping algorithm to effectively unify the feature space of multi-source data and alleviate issues related to inconsistent sample distributions. In terms of model design, multiple mainstream pre-trained language models (such as BERT, RoBERTa, and ERNIE) are fine-tuned to extract personality-related cues from the semantic level of the text. Experiments conducted on a standard social text dataset demonstrate that the ERNIE model achieves the best performance, with an accuracy of 89.942% and an F1 score of 88.576%, significantly outperforming the other models. These results validate the effectiveness of multi-source data integration and deep semantic modeling in personality recognition tasks. The proposed method enhances classification performance and provides technical support and a methodological reference for future research and practical applications in personality modeling.

Quantitative Diagnostic Model for Generative AI Information Cocoons Based on Multidimensional Semantic Contrastive Learning

LI Xiao, SUN Xinyu

Computer Science. 2026, 53 (6A): 251200031-16. doi:10.11896/jsjkx.251200031

Abstract

This paper proposes GCI-MVP, a theory-driven multidimensional diagnostic model that quantifies information cocoon risks in generative AI dialogues. Unlike traditional recommender systems, generative AI exhibits “unidirectional compliance”. This tendency risks constructing closed cognitive spaces through human-AI co-construction, manifested as topic narrowing, opinion homogenization, and cognitive frame repetition. Existing studies remain largely theoretical and lack fine-grained, interpretable diagnostic tools for dynamic conversational flows. GCI-MVP addresses this gap through three synergistic innovations. First, it establishes a theory-computable-explainable triadic paradigm. This paradigm encodes three core information cocoon mechanisms (cognitive selection bias, machine cognitive resonance, and cognitive frame repetition) into learnable neural computation units: topic prototypes, semantic anchors, and frame probes. Second, it synthesizes three interpretable metrics, the topic cocoon index (TCI), semantic uniformity index (SUI), and frame repetition index (FRI), via a multi-branch diagnostic architecture. Third, it enables end-to-end risk assessment through a theory-driven linear fusion layer. Experiments on real-world Chinese dialogues (WildChat) demonstrate that GCI-MVP effectively diagnoses information cocoon risks at different levels, achieving 85.4% accuracy and a macro F1-score of 0.825 in three-level risk classification. It significantly outperforms LDA-based topic diversity, lexical diversity, and fine-tuned BERT baselines. Bootstrap tests and McNemar's test jointly confirm the statistical significance of this advantage (p<0.001). Systematic ablation studies validate the necessity of each diagnostic dimension, and typical case analyses further reveal strong alignment between the proposed metrics and theoretical mechanisms such as “cognitive selection bias” and “machine cognitive resonance.” Cross-model generalization tests on GLM-4, ERNIE 4.0, and BLOOMZ-7B demonstrate that the model achieves stable diagnosis without fine-tuning (macro F1=0.803), exhibiting strong generalizability. GCI-MVP provides a computable, interpretable, and auditable tool for assessing cognitive safety risks in generative AI, offering theoretical value and application prospects for building secure and trustworthy human-AI dialogue systems.
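
The linear fusion of the three indices into a three-level risk label can be sketched as below. The equal weights, the cut-off values, and the assumption that each index lies in [0, 1] are hypothetical placeholders; the abstract states only that a linear fusion layer is used, and in the paper its parameters are presumably learned or theory-derived.

```python
def fuse_risk(tci, sui, fri, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Linear fusion of the three indices (weights are hypothetical)."""
    w_tci, w_sui, w_fri = weights
    return w_tci * tci + w_sui * sui + w_fri * fri


def risk_level(score, low_cut=0.33, high_cut=0.66):
    """Map a fused score to one of three levels (cut-offs are hypothetical)."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


# Hypothetical dialogue: strong topic narrowing, moderate homogenization,
# little frame repetition.
score = fuse_risk(tci=0.9, sui=0.6, fri=0.3)
level = risk_level(score)
```

Because the fusion is linear, each index's contribution to the final score can be read off directly from its weight, which fits the interpretability goal described in the abstract.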

Semantic Modeling and Co-attention Mechanism for Multimodal Sarcasm Detection Method

WEI Wei, LI Bicheng, ZHU Zhenshui, ZUO Jun

Computer Science. 2026, 53 (6A): 250400127-6. doi:10.11896/jsjkx.250400127

Abstract

Sarcasm is widely used in social media and other forms of computer-mediated communication. Multimodal sarcasm detection, which leverages both textual and visual information, is challenging because of the diversity and complexity of content, and because sarcasm often relies on implicit contrast and semantic conflict across modalities. To better capture such cross-modal semantic discrepancies, this paper proposes a method that integrates semantic modeling with a co-attention mechanism (Co-Attention Transformer). Leveraging the representational power of the CLIP pre-trained model, the approach employs co-attention to enhance deep interaction and feature fusion across modalities. Moreover, it incorporates syntactic dependency trees for graph-based modeling and introduces semantic similarity enhancement to improve semantic alignment between text and image. Experiments on a public sarcasm detection dataset demonstrate the superiority of the proposed method over traditional baselines.

Criminal Law Adaptation for Digital Civilization:Paths Regulation and System Construction of Artificial Intelligence Crime

LI Xiao, LI Xinyi

Computer Science. 2026, 53 (6A): 251000061-9. doi:10.11896/jsjkx.251000061

Abstract

Artificial intelligence technology, characterized by its autonomy, lack of explicability, and systemic impact, is fostering new forms of criminal activity. While traditional criminal offenses can partially address technology-abuse crimes through expansive interpretation, their limitations regarding the specificity of protected legal interests and the constituent elements of acts have become apparent. The current regulation of AI crime faces a dual dilemma: the absence of legislation for new types of criminal acts and the lack of specific laws addressing the misuse of AI technology. This paper addresses the structural challenges that AI crime poses to the traditional criminal law system against the backdrop of the transition to digital civilization, aiming to construct an integrated response framework combining “path regulation” and “system construction”. Regarding the path of regulation, it seeks to systematically integrate governance resources across three dimensions (ex-ante, interim, and ex-post): technical regulation embeds legal obligations into the code architecture, procedural innovation tackles the algorithmic black box in judicial proof, and collaborative governance, based on controllability, achieves precise imputation. In terms of system construction, an approach that “emphasizes both interpretative and legislative theories” is adopted. This involves conducting limited expansive interpretations of traditional offenses to cover new behavioral patterns such as virtual property infringement and system deception, while simultaneously introducing new charges such as the crime of unlawful algorithmic manipulation and the crime of deepfake information abuse to fill the normative void arising from technological autonomy. Furthermore, it establishes judicial identification standards for AI technical facts, institutionalizing procedural innovation and placing the technical verification process on a legal footing, so that the substantive rules can be implemented in practice and the aforementioned paths and systems have procedural safeguards in judicial application. Ultimately, this research aims to promote the transformation of criminal law from a punitive law of the industrial era to a safeguarding law of the digital age, achieving a dynamic balance between technological innovation and criminal justice.

Research on Cooperative Trajectory Optimization of Multi-truck-UAV System Based on UAV Exchange

JIN Kehan, JIA Riheng

Computer Science. 2026, 53 (6A): 250900105-9. doi:10.11896/jsjkx.250900105

Abstract

For last-mile delivery logistics, a novel synchronous truck-drone routing problem is proposed. The framework is extended with two mechanisms: a drone-swapping mechanism that allows drones to be launched and retrieved by different trucks, and a dynamic waiting mechanism that enables trucks to wait at customer nodes for optimal drone retrieval. These mechanisms improve routing flexibility and delivery efficiency. The problem is formulated as a multi-objective mixed-integer linear programming problem that aims to minimize the total transportation cost, maximize customer satisfaction through timely delivery, and optimize the makespan. The genetic algorithm (GA) is improved and compared with three meta-heuristic algorithms, namely the simulated annealing algorithm (SA), the adaptive large-neighborhood search algorithm (ALNS), and the ant-colony optimization algorithm (ACO). Experimental results show that, compared with the traditional fixed truck-drone model, the new mechanisms significantly improve the cost, satisfaction, and time metrics (by 5% to 15%). Comparative analysis highlights the advantages of the improved genetic algorithm, which adapts well to the new mechanisms. The results emphasize the potential of flexible drone-truck collaboration for efficient last-mile logistics.

Triple Extraction Based on Pixel Difference Convolutional Network and Attention Mechanism

FENG Guang, LIN Jianzhong, ZHONG Ting, ZHOU Yuanhua, ZHENG Runting, LIU Tianxiang

Computer Science. 2026, 53 (6A): 250400136-10. doi:10.11896/jsjkx.250400136

Extracting relational triples from unstructured text is crucial for building knowledge graphs. Traditional models often suffer from relational redundancy and overlap due to insufficient contextual information capture. To tackle this, this paper proposes a relation extraction model based on pixel difference convolutional networks and attention mechanisms. It uses BERT to encode sentence representations and generate subject, object, and relation markers. By capturing contextual semantic information from local and global perspectives, the proposed model enhances entity pair interaction, reduces error propagation via bidirectional extraction, and strengthens sentence-entity connections through conditional normalization. A double imitation mechanism is employed to predict triples. Experiments on the NYT and WebNLG datasets show that the proposed model outperforms baselines in extracting overlapping triples.

Study on Text-to-SQL Approach Integrating Chain-of-Thought Reasoning with Retrieval Augmentation

XU Yafei, LIU Chuanyou, LIU Shaohua

Computer Science. 2026, 53 (6A): 250900107-7. doi:10.11896/jsjkx.250900107

Text-to-SQL technology enables the automatic conversion of natural language into SQL, significantly lowering the barrier for non-experts to interact with databases. However, in vertical domains such as finance, it still faces two major challenges: complex query intents and ambiguous expressions. To address these, this paper proposes a collaborative framework that integrates Chain-of-Thought reasoning with retrieval augmentation. First, it designs an iterative distillation algorithm that leverages a compact 3-billion-parameter model to automatically generate and verify over 20,000 high-quality Chain-of-Thought seed examples with detailed reasoning steps on the Spider, BIRD, and BookSQL datasets, effectively compensating for the limitations of small models in complex reasoning tasks. Second, it introduces a dual-vector retrieval mechanism based on “question skeletons” and “SQL skeletons” to eliminate interference from table and column names, dynamically incorporating historically similar queries and implicit semantics as examples within prompts, thereby achieving precise alignment of ambiguous domain-specific expressions. Experimental results demonstrate that Qwen2.5-Coder, with only 3 billion parameters, attains 86.5% execution accuracy on Spider-dev (comparable to powerful models such as GPT-4), achieves 59.6% on the more challenging BIRD-dev, outperforming many larger models, and reaches 81.2% on a proprietary financial dataset, exceeding existing methods by over 1 percentage point. This approach delivers high-precision SQL generation for complex and ambiguous natural language queries at low cost and with a relatively small model size.
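The idea of an “SQL skeleton” that ignores table and column names can be illustrated with a small sketch. This is not the paper's implementation: the function name `sql_skeleton`, the placeholder symbols `_` and `?`, and the regex-based masking are all assumptions made for illustration, and the abstract does not say how skeletons are actually extracted.

```python
import re

def sql_skeleton(sql, schema_names):
    """Mask literals with '?' and schema identifiers with '_' (hypothetical scheme)."""
    out = re.sub(r"'[^']*'", "?", sql)                 # string literals
    out = re.sub(r"\b\d+(\.\d+)?\b", "?", out)         # numeric literals
    # Mask longer names first so a short name never clips a longer one.
    for name in sorted(schema_names, key=len, reverse=True):
        out = re.sub(rf"\b{re.escape(name)}\b", "_", out, flags=re.IGNORECASE)
    return " ".join(out.split())                       # normalize whitespace

# Two queries over different (made-up) schemas share one structural skeleton.
q1 = "SELECT name FROM customers WHERE balance > 1000 AND city = 'Paris'"
q2 = "SELECT title FROM invoices WHERE amount > 50 AND status = 'paid'"
s1 = sql_skeleton(q1, {"name", "customers", "balance", "city"})
s2 = sql_skeleton(q2, {"title", "invoices", "amount", "status"})
print(s1)
```

Because both queries reduce to the same skeleton, a retriever that indexes skeletons could return one as an in-prompt example for the other even though their schemas differ.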

FIN-GDAN:Sentiment Adversarial Transfer Network for Shanghai Gold Futures News

ZHAO Jingyun, LIU Keying, GUO Wenke

Computer Science. 2026, 53 (6A): 250700179-9. doi:10.11896/jsjkx.250700179

By improving market ecology and enhancing pricing autonomy, Shanghai gold futures have become a crucial pillar for the high-quality development of China's gold futures market. However, research on sentiment classification for Shanghai gold futures remains relatively limited, primarily facing the challenges of insufficient samples and strong domain-specific semantic features. To address this issue, this study proposes a dual adversarial network for small-sample financial text sentiment modeling, termed FIN-GDAN (Financial Domain-Guided Dual Adversarial Network). This model incorporates a cross-domain knowledge transfer mechanism and a dual adversarial training strategy. While preserving the expressive power of pre-trained language models, it guides the model to focus on finance-relevant sentiment semantics and effectively mitigates domain bias interference. Experimental results demonstrate that the FIN-GDAN framework achieves outstanding performance on the three-class sentiment classification task for Shanghai gold futures news, attaining an accuracy of 88.48%, which represents a 30.89% improvement over baseline models. Furthermore, the adaptive adversarial coefficient within various FIN-GDAN configurations consistently enhances learning efficacy.

Sparse Graph Generation Method for the Traveling Salesman Problem Based on Frequency Graphs

YANG Yahui, WANG Yong

Computer Science. 2026, 53 (6A): 250300087-7. doi:10.11896/jsjkx.250300087

TSP is a challenging problem in which the search space for the optimal solution grows exponentially with the problem size. To reduce the search space for the optimal Hamiltonian cycle, a novel optimization strategy is proposed. This strategy generates a sparse TSP graph based on frequency graphs, significantly reducing the search space of the optimal Hamiltonian cycle and lowering the problem's complexity. First, LOP4s are computed from the weighted complete graph. The frequencies of edges appearing in the LOP4s are then calculated to form a frequency graph. Based on this frequency graph, a sparse graph for TSP is constructed. Initially, the AF of all edges is set as the frequency threshold, and edges with frequencies below AF are removed to generate the first-generation sparse graph. Subsequently, further edge deletions are performed according to different vertex degree thresholds m ranging from 5 to min{n/4, 45}. Specifically, the frequencies of all edges connected to each vertex are summed to determine the corresponding vertex frequency. Vertices are sorted by their frequencies, and the lowest-frequency edges connecting the highest-frequency vertices are iteratively deleted, provided that the degree of each vertex remains no less than m. This process yields multiple second-generation sparse graphs. Finally, the second-generation sparse graphs are added together, and a dual-frequency is introduced to determine which edges to retain in the third-generation sparse graph. Experiments conducted on 20 standard TSP datasets demonstrate the algorithm's effectiveness. Results show that the third-generation sparse graph retains all edges of the optimal Hamiltonian cycle while containing only a small fraction of the edges of the complete graph, which greatly reduces the search space of the optimal Hamiltonian cycle. Using the online Concorde system, the optimal Hamiltonian cycles are searched on both the complete graph and the sparse graph. The search time on the sparse graph is observed to be shorter than that on the complete graph. This study provides a new perspective for reducing the difficulty of TSP and lays a foundation for future integration with other heuristic algorithms or machine learning techniques to solve TSP.
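The first-generation thresholding step can be sketched as follows. The sketch assumes that AF denotes the average edge frequency, which the abstract does not spell out; the function name and the example frequencies are hypothetical, and the LOP4 computation that produces the frequencies is not shown.

```python
from statistics import mean

def first_generation_sparse_graph(edge_freq):
    """Drop edges whose frequency is below the average frequency (assumed AF)."""
    af = mean(edge_freq.values())
    return {edge: f for edge, f in edge_freq.items() if f >= af}

# Made-up frequencies for the 6 edges of a complete graph on 4 vertices.
edge_freq = {
    (0, 1): 5, (0, 2): 1, (0, 3): 4,
    (1, 2): 5, (1, 3): 0, (2, 3): 3,
}
sparse = first_generation_sparse_graph(edge_freq)
print(sorted(sparse))
```

Here the average frequency is 3, so the two edges with frequencies 1 and 0 are removed and the remaining four edges form the sparse graph passed on to the degree-threshold stage.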

Application of Quantum Information Fusing Artificial Lemming Algorithm in Qubit Mapping

DU Zuoqiang, LIU Shujuan, LI Hui

Computer Science. 2026, 53 (6A): 250700147-7. doi:10.11896/jsjkx.250700147

Traditional qubit mapping algorithms for quantum circuits are limited by the circuit structure and hardware coupling, resulting in insufficient global optimization and a large number of additional SWAP gates. To address these problems, this paper proposes a quantum information fusing artificial lemming algorithm (QALA) and applies it to the qubit mapping process of quantum circuits. Based on the traditional ALA, Bloch-sphere quantum coding is introduced to expand the population, enlarging the solution space while allowing individuals to explore in multiple directions in the early stage of system evolution. An individual mutation mode using a t-distribution based on quantum rotation is designed to enhance the diversity of population evolution, and the quantum tunneling effect is utilized to avoid falling into local optima. An adaptive search direction factor is designed, and methods for balancing global search and local exploitation are discussed to ensure the flexibility and speed of the global optimization process. Results on 30 benchmark test circuits show that, compared with the traditional ALA, the additional gate number of QALA is reduced by 100%. Meanwhile, in the t|ket〉 and Qiskit compilers, compared with the traditional IBM benchmark test methods, the number of SWAP gates added by QALA is decreased by an average of 35.8% and 47.8%, the number of CNOT gates is decreased by an average of 12.9% and 13.8%, and the execution time is decreased by an average of 5.8% and 6.4%, respectively. Experimental results show the applicability of the proposed algorithm in different compilation environments.

Intelligent Recommendation System of Chinese Patent Medicine Based on Cloud Service and RAG Technology

TANG Lingshuang, LI Wei, HUANG Pingping, HUANG Xihang, WANG Qingxiang, LIU Jihong

Computer Science. 2026, 53 (6A): 250300167-7. doi:10.11896/jsjkx.250300167

In recent years, although significant progress has been made in applying large language models (LLMs) in the medical field, they still face two core challenges in traditional Chinese medicine (TCM) scenarios: first, the tension between the dynamic demand for computing resources and cost control; second, the difficulty that traditional on-premises deployment architectures have in supporting real-time updates of the TCM knowledge base and the demand for complex reasoning. This paper proposes an intelligent recommendation system for Chinese patent medicines based on a cloud-native architecture and retrieval-augmented generation (RAG) technology. It constructs a dynamically updated online Chinese patent medicine knowledge base using the elastic computing resources of Alibaba Cloud's Bailian platform and the Tongyi Qianwen (Qwen-Max-Latest) large model, combined with the DashScope API and the low-code development platform AppBuilder. Taking colds and influenza as the validation scenario, the system integrates the Chinese Pharmacopoeia, the TCM clinical diagnosis and treatment terminology system, and personalised consultation data on colds and influenza from Foshan Hospital of Traditional Chinese Medicine from 1 January 2022 to 31 December 2024, and combines in-depth knowledge retrieval with generative reasoning driven by RAG technology. Through multiple rounds of interactive consultation and dynamic recommendation strategies, the system generates a personalised medication plan based on patients' symptom characteristics, constitution identification, and real-time feedback, in line with the principles of TCM diagnosis and treatment. Through the elastic scaling mechanism of cloud services and the efficient knowledge integration of RAG technology, the system effectively addresses the dual challenges of static knowledge bases and constrained dynamic allocation of computing resources in traditional Chinese patent medicine recommendation systems. It provides a feasible technical solution for the intelligent development of traditional Chinese medicine, while significantly improving the precision of treatment plans and patient satisfaction.

Exploring the Generalization Ability of Prompt-based Large Language Models for Text Classification

XU Rui, LIU Jin, LIU Xudong, GUAN Jian, DONG Wei

Computer Science. 2026, 53 (6A): 250400092-7. doi:10.11896/jsjkx.250400092

LLMs have advanced text classification, yet prompt-based performance varies across models, tasks, and languages. This study investigates how model size, task type, and category semantics shape prompt generalization. It evaluates three model families (DeepSeek, Qwen, and GPT-4o) on AG News, THUCNews, IMDb, and ChnSentiCorp under Zero-Shot and 1/3/5-Shot settings. The results show that larger models deliver more stable Few-Shot gains; that sentiment analysis benefits from Few-Shot prompts, whereas news classification prefers Zero-Shot unless high-quality examples are provided; and that category representativeness and separability largely determine prompt efficacy. Based on these insights, this study distils a four-step decision workflow and a semantics-aware guideline for prompt design, offering practical advice for deploying LLMs in real-world classification.

SoftLexicon-BERT-GlobalPointer-based Approach for Chinese Named Entity Recognition in High-voltage Circuit Breaker

ZHENG Mingkun, PANG Chunjiang, WANG Xinying

Computer Science. 2026, 53 (6A): 250500048-6. doi:10.11896/jsjkx.250500048

This paper presents a hybrid model based on SoftLexicon-BERT-GlobalPointer for Chinese named entity recognition (NER) in the domain of high-voltage circuit breakers, addressing challenges such as complex technical terms, fuzzy entity boundaries, and long-distance dependencies. The proposed method integrates SoftLexicon with a domain-specific lexicon to enhance word representation, leverages a BERT pre-trained model to capture contextual semantics, and applies the GlobalPointer decoding strategy to improve entity boundary detection. Experiments on a self-constructed high-voltage circuit breaker dataset show that the model achieves an F1 score of 90.62%, significantly outperforming traditional BiLSTM-CRF and baseline BERT models. This approach offers a robust and efficient NLP solution for intelligent operation and maintenance in the power equipment field.

Application of Multi-strategy Fusion Crayfish Optimization Algorithm in Quantum Circuit Scheduling

LI Hui, JU Mingmei, WANG Jiepeng, JI Yingsong

Computer Science. 2026, 53 (6A): 250400088-7. doi:10.11896/jsjkx.250400088

With the rapid development of quantum computing technology, the running time of quantum circuits and the cost of additional gate insertions have become the main challenges in achieving efficient quantum circuit scheduling. To address these, this paper proposes a multi-strategy fusion crayfish optimization algorithm (MPF-COA). By customizing the initial population and combining an entropy-based adaptive mechanism with a pheromone transmission mechanism, the efficiency and accuracy of the algorithm are optimized, and the optimization efficiency of quantum circuit scheduling under complex constraints is significantly improved. The initial population is optimized with a dependency-based SWAP insertion (DBSI) strategy, ensuring a higher-quality starting solution. On this basis, the entropy-based adaptive temperature adjustment mechanism avoids getting trapped in local optima. The pheromone transmission mechanism effectively improves the search efficiency for the global optimum by guiding the search direction. The performance evaluation is conducted using the 2QAN quantum computing framework on a benchmark test set with qubit scales ranging from 4 to 22. The results show that, compared with 2QAN, MPF-COA reduces the average number of SWAP gates by approximately 3.2% in t|ket〉 and by approximately 10.68% in Qiskit, and reduces the number of CNOT gates by approximately 4.69% and 11.89%, respectively. This study demonstrates the potential for the deep integration of bionic algorithms and quantum circuit scheduling, providing a sustainable research foundation for future scheduling and optimization of larger-scale quantum circuits.

Review of Small Object Detection Based on Deep Learning

CHEN Nuo, ZHAO Peng, HUAN Haisheng

Computer Science. 2026, 53 (6A): 250700022-9. doi:10.11896/jsjkx.250700022

As a key difficulty and important branch in the field of object detection, small object detection has long been a research hotspot in computer vision because its targets are tiny, have blurred features, and are vulnerable to background interference. In recent years, the rapid development of convolutional neural networks has significantly improved the performance of small object detection. This paper comprehensively reviews deep learning methods for object detection. It first summarizes the main challenges in small object detection, including the loss of spatial information during feature extraction, the lack of available information in the target itself, and the weak generalization ability of models caused by an insufficient number of annotated samples. In response to these problems, it then analyzes the methods and optimization strategies for small object detection. Next, it turns to typical application scenarios such as autonomous driving, UAV detection, and medical imaging, and discusses in detail the practical applications and innovative achievements of related detection methods. Finally, it looks ahead to future research directions for small object detection.

Review of SAM-based Vision Applications

ZHANG Xu, WANG Anzhi, YANG Chenbang, WU Jintao

Computer Science. 2026, 53 (6A): 250600115-9. doi:10.11896/jsjkx.250600115

The Segment Anything Model (SAM), a general-purpose large model for visual segmentation, brings new opportunities to the field of computer vision by virtue of its powerful zero-shot generalization ability and interactive segmentation capability. Through large-scale data training and a flexible prompting mechanism, SAM builds an efficient adaptation capability for cross-domain tasks and can quickly adapt to a wide range of vision tasks, such as medical image analysis and autonomous driving. To gain a deeper understanding of the performance bottlenecks and technical challenges of SAM in various vision applications, this paper explores its optimization path in vision tasks. First, the core framework of SAM is introduced; then, the improved models are classified and their applicable scenarios are analyzed. On this basis, the current research status of SAM in different visual tasks is summarized, and experimental comparisons are made on the common datasets and evaluation metrics of each application scenario. Finally, the limitations and future development directions of SAM in different visual tasks are analyzed and discussed in depth.

Pyramid Pooling Visual State Space Model for UAV-Satellite Cross-view Geo-localization

YUE Wenjie, JIANG Jie, ZHAN Lixin, ZHOU Bingquan, ZHOU Tianjian

Computer Science. 2026, 53 (6A): 250700192-7. doi:10.11896/jsjkx.250700192

Cross-view geo-localization between UAV and satellite images has emerged as a promising alternative to GNSS-INS, particularly in environments where satellite signals are weak or obstructed. However, significant visual discrepancies caused by differences in viewpoint, illumination, and resolution pose considerable challenges for image matching. To address this issue, a novel method called P2VSSM (Pyramid Pooling Visual State Space Model) is proposed. By integrating a pyramid pooling self-attention mechanism into the Mamba architecture, the model enhances feature extraction capabilities for cross-view images. The proposed PPSA module aggregates multi-scale contextual information, improving both semantic abstraction and global modeling. Additionally, the InfoNCE loss is introduced to replace the traditional triplet loss, thereby avoiding the heavy burden of hard negative mining and significantly improving the diversity of negative samples and the stability of contrastive learning during training. Experimental results on two public UAV-satellite datasets, University-1652 and SUES-200, demonstrate that the proposed method achieves state-of-the-art performance in both UAV-to-satellite and satellite-to-UAV retrieval tasks. Extensive ablation studies further confirm the effectiveness and robustness of the proposed approach.
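As a rough illustration of why InfoNCE needs no hard negative mining, the following sketch computes the loss for one query against one positive and any number of negatives at once. It is a generic formulation of InfoNCE over cosine similarities, not the paper's training code; the temperature value and the toy 2-D embeddings are assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def info_nce(query, positive, negatives, temperature=0.1):
    """-log( exp(s_pos/t) / sum_j exp(s_j/t) ) over the positive and all negatives."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[0]

# Toy embeddings: a UAV-view query and candidate satellite-view embeddings.
query = [1.0, 0.0]
loss_matched = info_nce(query, [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
loss_mismatched = info_nce(query, [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
print(loss_matched, loss_mismatched)
```

Every negative contributes to the denominator, so all other samples in a batch can serve as negatives without selecting the hardest ones, in contrast to a triplet loss that scores one negative per anchor.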

LitchiNet:Lightweight Litchi Variety Recognition Network with Fused Multi-scale Gated Attention and Class Imbalance Awareness

SU Ye, XU Xin, ZHAO Longlong, LI Xiaoli, CHEN Pan, CHEN Jinsong

Computer Science. 2026, 53 (6A): 250600127-8. doi:10.11896/jsjkx.250600127

Accurate and efficient recognition of litchi varieties is essential for intelligent postharvest quality assessment. However, existing deep learning models face several challenges in this task, including fine-grained feature discrimination, limited sample numbers, class imbalance, and constraints on deployment resources. To tackle these challenges, a lightweight litchi variety recognition model named LitchiNet is proposed. LitchiNet adopts a pretrained SqueezeNet1.0 as its backbone and integrates a novel Multi-Scale Gated Attention (MSGA) module. By combining multi-scale convolutional branches, channel attention, and a lightweight gating mechanism within a residual framework, MSGA enhances the model's ability to capture subtle inter-class differences and emphasize key feature regions. In the final stage of LitchiNet, a computationally efficient classifier structure is designed to ensure high inference speed and deployment friendliness. To further tackle class imbalance, LitchiNet introduces a Class Imbalance Awareness Loss (CIA Loss) that incorporates both class weighting and a difficulty-aware modulation term, enabling more robust learning from minority classes. Experiments on a public litchi variety dataset demonstrate that LitchiNet achieves excellent performance, reaching a recall of 99.40% and outperforming four state-of-the-art lightweight models across all metrics. With only 3.210×10⁶ parameters, the model is well-suited for edge deployment. Comparative experiments with four state-of-the-art attention modules further reveal that the inclusion of MSGA leads to faster convergence, lower final loss, and better recognition accuracy. Moreover, the modular design of LitchiNet ensures compatibility with various backbone networks, offering strong generalizability and scalability. LitchiNet provides a practical and effective solution for fine-grained litchi variety recognition, and contributes a novel approach to lightweight agricultural AI applications.

Occlusion Head Pose Estimation Algorithm Based on Riemann Optimization

WANG Baohui, TAN Yingjie, CHEN Jixuan

Computer Science. 2026, 53 (6A): 250300109-9. doi:10.11896/jsjkx.250300109

Human head pose estimation is an important task in the field of deep learning, especially in in-vehicle driver assistance systems. As a key means of detecting fatigued driving, it has broad application prospects. However, in practical applications, facial occlusion often results in the loss of key feature points, seriously reducing model accuracy. In response to this problem, this paper proposes the 6DRepLKNet-RGO algorithm. Based on 6DRepNet, this algorithm optimizes the network structure and enhances feature extraction capabilities through a structural re-parameterization design. At the same time, a Riemannian manifold gradient optimization strategy is used to optimize the learning process of the three-dimensional pose representation and reduce the training error. To further improve the model's pose estimation accuracy under occlusion, random-erasing data augmentation is added. Experiments show that 6DRepLKNet-RGO reduces errors by more than 5% compared to 6DRepNet on public datasets such as BIWI and AFLW2000, and surpasses other advanced models in terms of the MAE metric, verifying its effectiveness.

Rain and Fog Weather Object Detection Algorithm Based on Improved YOLOv8 Model

ZHANG Shouyi, SHEN Qiang, GUO Yiran, WANG Hanyu

Computer Science. 2026, 53 (6A): 250300090-7. doi:10.11896/jsjkx.250300090

To address the issue of reduced detection accuracy of traditional object detection algorithms under rainy and foggy weather conditions, this paper proposes an improved YOLOv8-based object detection method. First, the FogEnhanceNet dehazing enhancement module is introduced to improve the contrast and clarity of target regions at the model input stage, thereby enhancing feature distinguishability. Second, an Adaptive Contrast Attention (ACA) mechanism is incorporated to dynamically adjust the weights of channel and spatial information, optimizing target feature representation in low-contrast environments. Finally, a lightweight C2f-Ghost-GF structure is designed to reduce model parameters while leveraging guided filtering (GF) to enhance edge feature extraction for foggy images. Experimental results show that the improved model achieves an 11.3% increase in mAP without a significant increase in model parameters, providing an effective solution for target detection in complex weather conditions.

HCKD:Lightweight Skin Lesion Classification Method Based on Dermoscopic Images

LI Siyu, QIAN Wenhua

Computer Science. 2026, 53 (6A): 250600143-9. doi:10.11896/jsjkx.250600143

Skin cancer is one of the most common malignant tumors. Early diagnosis and active treatment can significantly improve the survival rate of patients. Existing research on the automatic diagnosis of skin diseases based on convolutional neural networks is devoted to developing deeper architectures, and the resulting increase in computational overhead and parameter count restricts lightweight deployment of the model. At the same time, the distribution of skin disease data is imbalanced, which degrades the model's performance in identifying rare diseases. To address the dual challenges of high computational complexity and imbalanced data distribution faced by current automatic skin disease diagnosis systems, this paper proposes a hierarchical collaborative knowledge distillation method, HCKD. By jointly optimizing response knowledge distillation at the fully connected layer, structural relationship knowledge distillation at the embedding layer, and channel feature knowledge distillation at the convolutional layer, a hierarchical knowledge transfer mechanism is constructed to achieve efficient compression of the model. At the same time, a weighted cross-entropy loss function is introduced to enhance the model's ability to recognize rare diseases. In the classification task on the ISIC 2019 dataset, the student model trained with HCKD achieves an accuracy of 85.7% and a balanced accuracy of 82.6%, with a significantly lower parameter count and computational overhead than the teacher model. The ability to identify rare diseases is improved, and the best results are achieved in comparison with three currently popular knowledge distillation methods.
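Two of the ingredients named above, the weighted cross-entropy loss and response-level distillation on the output logits, can be sketched in a few lines. This is a generic, textbook-style formulation and not HCKD itself: the structural-relationship and channel-feature terms are omitted, and the temperature, the mixing coefficient `alpha`, and the class weights are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(student_logits, label, class_weights):
    """Cross-entropy scaled by a per-class weight (larger for rare classes)."""
    return -class_weights[label] * math.log(softmax(student_logits)[label])

def response_distillation(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s))
    return temperature ** 2 * kl

def combined_loss(student_logits, teacher_logits, label, class_weights, alpha=0.5):
    ce = weighted_cross_entropy(student_logits, label, class_weights)
    kd = response_distillation(student_logits, teacher_logits)
    return (1.0 - alpha) * ce + alpha * kd

# Toy 3-class example; class 2 plays the role of a rare disease with a larger weight.
student, teacher = [1.0, 0.5, 2.0], [0.5, 0.2, 3.0]
print(combined_loss(student, teacher, label=2, class_weights=[1.0, 1.0, 3.0]))
```

Raising the weight of a rare class increases the penalty for misclassifying it, while the distillation term pulls the student's softened output distribution toward the teacher's.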

Improved Method for Radar Target Tracking Under Time-varying Non-Gaussian Observation Noise

YANG Hankun, ZHU Bowei, WANG Zuoshuai, XU Yidong

Computer Science. 2026, 53 (6A): 250300058-7. doi:10.11896/jsjkx.250300058

This paper proposes an improved Minimum Error Entropy Kalman Filter (MEEKF) algorithm aimed at addressing the radar target tracking problem in complex time-varying non-Gaussian observation noise environments. By introducing an adaptive noise covariance adjustment strategy and observation data smoothing techniques, the algorithm can dynamically and effectively adapt to sea surface noise characteristics. Simulation experiments show that, in time-varying heavy-tailed noise environments, the improved algorithm reduces the average absolute error in the x and y directions by approximately 67.44% and 69.09%, respectively. In time-varying skewed noise environments, the reductions are approximately 71.99% and 70.90%, respectively. These results demonstrate that the improved MEEKF algorithm has significant advantages in handling time-varying non-Gaussian observation noise environments, providing an effective solution.

Conditional Dual-network Fusion for Illumination-adaptive Infrared and Visible Image

WANG Rongshuo, WANG Jiajia, JIA Zhenhong, ZHOU Gang

Computer Science. 2026, 53 (6A): 250600235-9. doi:10.11896/jsjkx.250600235

Fusing infrared and visible images can preserve both details and targets in complex scenes. However, the difference between day and night illumination creates an inherent conflict over which image type should be emphasized: during the daytime, the texture structure of the visible image should be preserved, while at night, fusion relies on the infrared image to highlight the target. A single network struggles to optimize for both illumination conditions at the same time, which degrades performance. The research objective of this paper is to resolve this policy conflict across illumination conditions and develop a robust fusion method that adapts to changes in illumination. To this end, this paper proposes a conditional dual-network framework that adaptively assigns day/night scenes through illumination sensing and designs complementary information extraction and soft switching mechanisms in the fusion process to handle continuous illumination transitions smoothly. Experiments on the MSRS, M3FD, and TNO datasets demonstrate that the method achieves superior performance on both structural fidelity and target saliency metrics, significantly alleviating the performance bottleneck caused by day-night conflicts. The results verify that illumination-adaptive modelling is an important way to improve the robustness of infrared and visible image fusion.
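One simple way to picture soft switching between a day branch and a night branch is a sigmoid gate driven by scene brightness. The sketch below is purely illustrative: the abstract does not describe the actual gating function, and the names `day_weight` and `soft_switch`, the `threshold` and `softness` values, and the use of mean luminance in [0, 1] are all assumptions.

```python
import math

def day_weight(mean_luminance, threshold=0.4, softness=0.1):
    """Sigmoid gate in (0, 1): near 1 for bright scenes, near 0 for dark ones."""
    return 1.0 / (1.0 + math.exp(-(mean_luminance - threshold) / softness))

def soft_switch(day_output, night_output, mean_luminance):
    """Blend the two branch outputs pixel by pixel instead of picking one."""
    w = day_weight(mean_luminance)
    return [w * d + (1.0 - w) * n for d, n in zip(day_output, night_output)]

# Hypothetical per-pixel outputs of the day and night fusion branches.
day_branch, night_branch = [0.8, 0.6, 0.9], [0.2, 0.7, 0.1]
for lum in (0.05, 0.4, 0.9):  # night, dusk, daytime
    print(lum, soft_switch(day_branch, night_branch, lum))
```

Because the gate varies continuously with brightness, scenes at dusk receive a mixture of both branches rather than an abrupt jump from one to the other.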

Q&A Model for Agricultural Diseases Based on Transformer

DUAN Pengsong, LUO Yu, WANG Chao

Computer Science. 2026, 53 (6A): 250400114-9. doi:10.11896/jsjkx.250400114

To address issues such as insufficient recognition accuracy and the lack of pest control recommendation generation in agricultural pest and disease identification, this paper proposes an automatic question-answering model that integrates computer vision techniques with instruction tuning strategies. An improved Vision Transformer (ViT) model is employed for classifying agricultural crop pest and disease images, incorporating an asymmetric convolution embedding module and a channel attention mechanism to enhance feature extraction capabilities and improve classification accuracy on large-scale datasets. Based on the classification results, LoRA (Low-Rank Adaptation) is applied to fine-tune the Baichuan large language model through instruction tuning, generating more precise and practical prevention and control recommendations, thereby enhancing the model's applicability in agricultural scenarios. The entire experiment is conducted on the Huawei MindSpore deep learning framework, leveraging the high-performance computing capabilities of the Ascend 910 NPU for efficient model training and inference. Experimental results demonstrate that combining the improved ViT model with the instruction fine-tuning strategy not only significantly improves classification accuracy but also generates highly actionable prevention and control recommendations.
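The core of LoRA is a low-rank additive update to a frozen weight matrix, W' = W + (alpha / r) · B · A, where only the small matrices A and B are trained. The sketch below shows this arithmetic in plain Python; it is a generic illustration rather than the paper's MindSpore code, and the matrix sizes and the value of `alpha` are assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_weight(w, b, a, alpha):
    """Effective weight W + (alpha / r) * B @ A, with r = number of rows of A."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

# Frozen 2x2 weight with a rank-1 adapter: B is 2x1 and A is 1x2.
w = [[1.0, 0.0], [0.0, 1.0]]
adapted = lora_weight(w, b=[[1.0], [2.0]], a=[[3.0, 4.0]], alpha=2.0)
print(adapted)
```

With B initialized to zeros the adapted weight equals the frozen weight, so fine-tuning starts from the pretrained model's behavior and only the adapter parameters need to be stored per task.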

Vehicle Re-identification Based on RWM and Multi-scale Attention

LI Yalong, WANG Hairui, ZHU Guifu, LU Shiyu

Computer Science. 2026, 53 (6A): 250400017-8. doi:10.11896/jsjkx.250400017

Abstract

This paper proposes a vehicle re-identification method based on RWM and multi-scale attention to address the problems of large intra-class differences and high inter-class similarity in existing vehicle re-identification tasks, which lead to insufficient extraction of key features and insufficient fusion of global and local features. Firstly, a Region Weighted Mapping (RWM) is designed to enhance the feature representation of key regions in the image, effectively reducing the interference of background information. Secondly, based on the self-attention mechanism of the Transformer structure, a multi-scale attention module (MAB) is introduced, which combines a large-kernel receptive field with multi-scale characteristics to effectively model global structural information while enhancing the expression of local details and improving the model's discriminative ability. Finally, a mixed loss function is constructed to optimize the feature learning process of the model, making the features of different categories of vehicles more distinguishable and improving generalization ability. Experiments are conducted on the VeRi-776 and VehicleID datasets. The CMC@1 values reach 97.4% and 85.8%, respectively, while the CMC@5 values reach 98.9% and 97.7%, respectively. The results show that the proposed method can extract more discriminative vehicle features.
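
The CMC@1 and CMC@5 figures reported above are rank-k matching rates. A minimal sketch of how such a score can be computed is shown below; the data layout (one ranked list of gallery identities per query) and all names are assumptions made for this example, not the evaluation code used in the paper.

```python
def cmc_at_k(ranked_gallery_ids, query_ids, k):
    """Fraction of queries whose true identity appears in the top-k results.

    ranked_gallery_ids: for each query, gallery identities sorted from most
                        to least similar (hypothetical layout).
    query_ids: the true identity of each query.
    """
    hits = sum(1 for ranked, qid in zip(ranked_gallery_ids, query_ids)
               if qid in ranked[:k])
    return hits / len(query_ids)


# Three toy queries with hand-written rankings.
rankings = [["a", "b", "c"], ["c", "b", "a"], ["b", "a", "c"]]
truth = ["a", "a", "c"]

cmc1 = cmc_at_k(rankings, truth, 1)
cmc3 = cmc_at_k(rankings, truth, 3)
```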

Monocular Real-time 6D Pose Estimation for Weakly Textured Workpieces

FENG Yingbin, KANG Xueshi, WANG Tianlong

Computer Science. 2026, 53 (6A): 250800006-7. doi:10.11896/jsjkx.250800006

Abstract

To address the varying sizes, occlusion and stacking, and lighting changes of weakly textured workpieces in industrial scenes, a monocular 6D pose estimation method, RAAS-PVNet, is proposed. A resolution-adaptive rectangular convolution, RARConv, is designed to dynamically adjust the size of the convolutional kernel and the number of sampling points, which addresses the insufficient ability of traditional convolutional structures to model multi-scale information. An angular-distance collaborative weighted voting strategy, AS, is proposed: the vertical distance constraint of the direction vector extension line is introduced, and the credibility of each voting point is measured by combining it with a continuous weight fusion mechanism, so that the voting results focus on high-quality points and the anti-occlusion ability of the model is improved. To address the lack of industrial part datasets in the field of pose estimation, a dataset production method that combines real and synthetic data in proportion is designed to construct the workpiece dataset 6DInd. Experiments show that the 2D Projection and ADD(-S) metrics of RAAS-PVNet on 6DInd increase by 10.22% and 10.26%, respectively, that the method is robust under occlusion and lighting changes, and that its processing speed of 30 fps meets real-time requirements.

RGB-IR Multi-modal Fusion-based Tomato Small Object Detection

DONG Ye, LIAN Xinyue, WANG Yuyang, OU Xinyu

Computer Science. 2026, 53 (6A): 250700173-8. doi:10.11896/jsjkx.250700173

Abstract

Automated tomato harvesting plays a pivotal role in enhancing agricultural efficiency and ensuring produce quality, but faces significant challenges in complex orchard environments where single-modal systems often fall short. Variability in lighting, occlusion by foliage, and the subtle characteristics of small targets in RGB images, coupled with inadequate feature extraction from single-sensor data, significantly impede precise detection. This study introduces the robust tomato detection with multi-modal fusion (RTDMF) model, designed to address these limitations by integrating RGB and infrared (IR) imaging to bolster detection robustness. Built on the YOLOv5 framework, RTDMF incorporates lightweight depthwise separable convolutions and adaptive anchor boxes to enhance sensitivity to small targets. The dual-branch architecture of RTDMF processes RGB and IR data independently, fusing color, texture, and thermal features through self-attention mechanisms and specialized fusion modules. Furthermore, Mosaic data augmentation and dynamic learning rate strategies are employed to enhance the model's generalization and convergence. Evaluated on a multimodal tomato dataset encompassing varying levels of maturity, diverse lighting conditions, and occlusion scenarios, RTDMF demonstrates a notable improvement: it achieves a 9.7% increase in mean average precision (mAP) and a 0.6% higher recall rate compared to single-modal models. It also reduces the miss and false detection rates by 2.3% and 3.1%, respectively. Visual analysis confirms the model's effectiveness in low-contrast and heavily occluded scenarios, showcasing its adaptability to real-world agricultural challenges. This multi-modal approach delivers a robust solution for automated harvesting systems in dynamic environments, marking a significant advancement in the field of agricultural automation.

Multi-scale Feature Screening-integrated Lightweight Algorithm for Blast Heap Ore Image Segmentation in Open-pit Mines

GU Qinghua, MA Xiang, LI Xuexian

Computer Science. 2026, 53 (6A): 250500016-8. doi:10.11896/jsjkx.250500016

Abstract

With the rapid development of smart mines, the real-time and precise identification of large rocks during ore loading after blasting has become a key requirement for ensuring transportation safety and efficiency. To address the challenges posed by highly irregular shapes, significant overlap among particles, low image resolution, and sparse features in images of blasted heap ore, this paper proposes a lightweight image segmentation algorithm for blasted heap ore that balances accuracy and efficiency through multi-dimensional model optimization. Firstly, the topological characteristics of the DynamicHGNetv2 (Dynamic High Performance GPU Network version 2) hierarchical graph network are utilized to reconstruct the backbone network, compressing redundant features through a dynamic routing mechanism, which reduces the model size by 42.4%. Secondly, the HSFPN (High-level Screening-feature Fusion Pyramid) is designed as the neck network, employing a multi-scale feature screening mechanism guided by channel attention, which reduces the computational load by 27.4% while enhancing cross-scale feature fusion. Subsequently, a lightweight segmentation head is constructed, further optimizing computational efficiency through depthwise separable convolutions and feature distillation techniques. Finally, the EMASlideLoss (Exponential Moving Average SlideLoss) loss function is introduced, dynamically adjusting the weights of difficult samples based on an exponential moving average strategy and significantly improving the model's edge segmentation accuracy for low-quality ore targets. Experimental results indicate that, compared to the YOLO11n-seg benchmark model, the proposed method reduces the number of parameters and the computational cost by 42.4% and 27.4%, respectively, while mAP50 and mAP50:95 improve by 0.1% and 2.1%, respectively. The method not only meets the need for high-precision real-time segmentation in mining scenarios but can also be deployed directly on edge computing devices, providing reliable technical support for early warning systems for large ore in smart shoveling systems.
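
To give a sense of why depthwise separable convolutions reduce model size, the sketch below compares the weight count of a standard convolution with that of a depthwise separable one. The layer sizes are hypothetical and bias terms are ignored; the figures are not taken from the paper's network.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = kernel * kernel * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
reduction = 1 - separable / standard
```

For this hypothetical layer the separable form needs 8768 weights instead of 73728, a reduction of roughly 88% for that single layer.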

Prenatal Diagnosis of Fetal Cerebellum Based on Brain Anatomical Structures

WU Xiaoxiao, WU Xinglong

Computer Science. 2026, 53 (6A): 250400049-7. doi:10.11896/jsjkx.250400049

Abstract

Fetal Cerebellar Hypoplasia (CH) is a severe developmental disorder of the central nervous system, and its early diagnosis is crucial for the health of the fetus. This paper proposes a brain anatomy-based network (BAB-Net) for prenatal diagnosis of CH, aiming to improve the accuracy of ultrasound-based diagnosis. BAB-Net takes ultrasound images and brain anatomical features as inputs and uses an anatomy-constrained network for feature extraction and fusion. Ultrasound image data collected at a tertiary hospital between September 2019 and September 2023 include a total of 301 cases of CH-affected fetuses and 547 cases of normal fetuses. In these cases, the boundaries of the cerebellum, cisterna magna, and skull are marked by experienced sonographers. After training, the classification accuracies of BAB-Net on two independent test sets reach 0.9778 and 0.9222, respectively, notably superior to other mainstream networks. In cases where the gestational age is less than 30 weeks, BAB-Net shows higher accuracy. Further analysis finds that the anatomical structures of the fetal cerebellum and cisterna magna influence network performance more than the skull structure does. By incorporating the anatomy-constrained network, BAB-Net effectively improves the diagnostic accuracy of fetal CH, provides a new approach for prenatal screening of CH, and offers important references for clinicians in pregnancy management and precise intervention.

Diabetic Retinopathy Grading Based on Label Relaxation Multi-view Feature Fusion

DUAN Lian

Computer Science. 2026, 53 (6A): 250200048-6. doi:10.11896/jsjkx.250200048

Abstract

Diabetic retinopathy is a common complication of diabetes, and accurately identifying its stages is crucial for subsequent treatment. Fundus images play a key role in the grading of diabetic retinopathy. With the advancement of artificial intelligence technologies, many researchers have extracted deep features and radiomic features from fundus images to study the grading of diabetic retinopathy. This study combines deep features and radiomic features to design a feature fusion algorithm. Firstly, deep features are extracted from fundus images using convolutional neural networks, while radiomic features are obtained through radiomic methods. Subsequently, a label relaxation-based multi-view learning algorithm is designed for feature fusion. The primary goal of label relaxation is to enhance the distinguishability of training samples in the label space, thereby improving the classification accuracy of the model. Furthermore, this study introduces a graph constraint based on manifold learning methods to mitigate the overfitting caused by label relaxation. Finally, the effectiveness of the proposed method is validated on two fundus image datasets: the DR1 dataset and the MESSIDOR dataset.

Water Meter Reading Recognition Based on Deep Learning and Prior Correction

CHU Chunyu, JIANG Feilong

Computer Science. 2026, 53 (6A): 250300143-7. doi:10.11896/jsjkx.250300143

Abstract

Existing deep learning-based water meter reading recognition methods generally recognize each digit or pointer of the water meter in isolation and then simply splice the per-position results into the final reading. However, because of the meshing gaps between the counting gears of the water meter, possible structural errors in the meter itself, and the shooting angle, the word wheel digits may be displayed incompletely, a word wheel may be turning between two digits, or a pointer indication may deviate. In such cases, simply combining the recognition results of each digit or pointer may lead to errors in the final reading. To address these problems, this paper proposes a water meter reading recognition method based on deep learning and prior correction. The method is based on the PaddlePaddle framework, uses the lightweight model architectures MobileNetV3 and SVTR to read the word wheel region, and uses image processing techniques to read the pointers. Finally, it takes full advantage of prior knowledge of the correlation between each digit in the word wheel and the reading of each pointer to correct the recognition results. The recognition and correction methods for the word wheel area and the pointer area are discussed, applied to water meter images for experimental testing, and compared with existing methods. The results show that the proposed method can effectively improve the accuracy of water meter reading recognition.
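
The abstract does not spell out its correction rule, so the following is only a hypothetical sketch of how a prior between adjacent positions could resolve a word wheel that is caught turning between two digits. The carry rule, the threshold of 5, and the function name are all assumptions made for illustration.

```python
def resolve_transition_digit(lower, upper, next_lower_reading):
    """Choose between two candidate digits on a wheel that is mid-turn.

    lower, upper: the two partially visible digits, with
                  upper == (lower + 1) % 10.
    next_lower_reading: reading in [0, 10) of the next lower-order wheel
                        or pointer.

    Assumed rule (not from the paper): a wheel only finishes advancing once
    the lower-order position has rolled over, so a high lower-order reading
    means the carry has not happened yet.
    """
    return lower if next_lower_reading >= 5 else upper


# The wheel shows parts of 3 and 4; the lower-order pointer reads 9.2,
# so the carry has not happened and the digit is still 3.
before_carry = resolve_transition_digit(3, 4, 9.2)

# Same wheel, but the pointer has just rolled over to 0.4.
after_carry = resolve_transition_digit(3, 4, 0.4)
```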

Accurate Recognition of Dialect Based on CTC-Conformer Model

SHEN Yingchun, FENG Xiaohan, LI Qian

Computer Science. 2026, 53 (6A): 250600112-8. doi:10.11896/jsjkx.250600112

Abstract

With the rapid development of speech recognition technology, dialect speech recognition has become significant in various application scenarios. To address challenges such as phonetic variations, speech speed differences, and noise interference in dialect recognition, this paper proposes a dialect speech recognition method based on the CTC-Conformer model, aiming to improve recognition accuracy and robustness for dialect speech. The model combines the Conformer architecture and the CTC mechanism. The encoder uses convolutional neural networks and multi-head self-attention mechanisms to extract local features and long-range dependencies from audio, enhancing the understanding of dialect speech. The decoder adopts the CTC mechanism and a dual-attention mechanism, reducing the need for alignment and enhancing contextual modeling ability. A multi-task learning strategy optimizes the balance between CTC loss and cross-entropy loss, further improving recognition accuracy. Experimental results show that the proposed CTC-Conformer model achieves a character accuracy of 79.08% on standard test sets and maintains stable performance in noisy environments, with an accuracy of 65.20% even under severe noise, demonstrating its robustness and precision. In summary, the proposed CTC-Conformer model provides an efficient and robust solution for dialect speech recognition, with broad potential for real-world applications.
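
Two ideas in this abstract can be sketched compactly: CTC removes the need for frame-level alignment by collapsing repeated labels and blanks, and the multi-task objective balances the CTC and cross-entropy losses. The code below is a generic illustration; the blank index, the interpolation form, and the weight value are assumptions, not the paper's settings.

```python
def ctc_collapse(frame_labels, blank=0):
    """Greedy CTC post-processing: merge repeated labels, then drop blanks."""
    output = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return output


def joint_loss(ctc_loss, ce_loss, ctc_weight=0.3):
    """Weighted sum of the two losses (the weight 0.3 is an assumed value)."""
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * ce_loss


# Per-frame argmax labels; 0 is the blank symbol in this toy example.
frames = [0, 1, 1, 0, 1, 2, 2, 0]
decoded = ctc_collapse(frames)

combined = joint_loss(2.0, 4.0, ctc_weight=0.5)
```

The blank between the two runs of label 1 is what lets CTC emit the same character twice in a row.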

Multi-layer Graph Convolutional Action Recognition Method Based on Topological Information

HUANG Haixin, HE Tianyu, HOU Guangshuai

Computer Science. 2026, 53 (6A): 250600147-5. doi:10.11896/jsjkx.250600147

Abstract

Human action recognition identifies human behaviors by analyzing spatiotemporal features in videos. As one of the important research topics in the field of computer vision, its efficient and accurate recognition performance has demonstrated wide application value in scenarios such as human-computer interaction and intelligent security. Graph Convolutional Networks (GCNs), owing to their significant advantages in modeling human skeletal topology, have become a mainstream method for action recognition tasks. However, existing approaches generally model the entire skeleton structure in a unified way, overlooking the hierarchical characteristics of the human body, which is composed of multiple functional regions. This limitation restricts model performance in complex action recognition tasks. To address this limitation, this paper proposes a Topology-informed Multi-layer Graph Convolutional Network (TM-GCN). The model employs a multi-branch architecture to partition and model the human skeleton, effectively capturing spatial dependencies between skeletal nodes. Additionally, it introduces a Topology Perception Unit (TPU) to extract and integrate topological features during graph convolution, enhancing the model's representation capability for skeletal topology. Experimental results on the NTU-RGB+D dataset show that TM-GCN achieves excellent performance in human skeletal action recognition tasks and effectively improves the accuracy of action recognition.

Aerial Image Object Detection Model Based on Dual-domain Attention and Feature Fusion

MAO Lihong, TANG Jianjun, CHEN Tong, ZHANG Rui

Computer Science. 2026, 53 (6A): 250600036-7. doi:10.11896/jsjkx.250600036

Abstract

With the escalating strategic importance of China's low-altitude economy, precise object detection in aerial imagery under complex scenarios has emerged as a pivotal technology. However, challenges including dense object clustering, intricate background environments, and the prevalence of small targets in aerial images continue to impede feature extraction and detection accuracy for deep learning-based models. This paper introduces an aerial imagery object detection framework integrating dual-domain attention mechanisms and hierarchical feature fusion. Firstly, a channel-spatial dual-domain attention module is engineered to suppress background interference while amplifying salient feature channels through adaptive weight calibration. Secondly, a cross-layer multi-scale feature fusion architecture is developed, incorporating residual fusion pathways and learnable weighting coefficients to enable effective multi-resolution feature interactions. Finally, a dedicated 20×20 pixel small-object detection branch is appended to enhance fine-grained target recognition capabilities. Experimental evaluations on the VisDrone2019 dataset demonstrate substantial performance gains over state-of-the-art baselines: the proposed model achieves relative mAP_0.5 improvements of 97.3%, 18.6%, 22.7%, 33.7%, 12.2%, 18.4%, and 37.9% compared to Faster R-CNN, LFET-Net, YOLOv5x, YOLOv8, YOLOv9, YOLOv10, and YOLOv11, respectively. These results demonstrate the effectiveness of the proposed model in aerial image object detection tasks.

Infrared and Visible Image Homography Estimation for Power Equipment Based on Improved MobileNetV4

WANG Sheng, ZHANG Linghao, ZHANG Juling, PANG Bo, XI Ning, SHE Wenkui

Computer Science. 2026, 53 (6A): 250400077-7. doi:10.11896/jsjkx.250400077

Abstract

The homography estimation of infrared and visible images is one of the key techniques for improving the positioning accuracy and defect detection accuracy of power equipment. To address the insufficient accuracy and large model size of existing methods for homography estimation of infrared and visible images of power equipment, a lightweight homography estimation method based on an improved MobileNetV4 is proposed. Firstly, MobileNet is applied to the homography estimation task for the first time, and a lightweight estimation model is designed. Secondly, an improved MobileNetV4 model, CBMobileNet, is proposed, which highlights the key features in the feature map by introducing the CBAM module into each stage of MobileNetV4. Finally, the number of parameters and the computational complexity of the model are significantly reduced using the L1-norm pruning algorithm with only a small performance loss. The experimental results show that the average corner error of the proposed method decreases from 5.06 to 4.95 compared to the suboptimal algorithm on the synthetic benchmark dataset. In addition, compared to the original model, the pruned model reduces the parameters from 10.04 MB to 6.91 MB and the FLOPs from 1029.48 MB to 755.11 MB, while the average corner error increases only slightly, from 4.93 to 4.95.
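
L1-norm pruning, as named in this abstract, typically ranks convolution filters by the sum of their absolute weights and removes the smallest ones. The sketch below shows that ranking step on flattened toy filters; the keep ratio and data are hypothetical, and the paper's actual pruning schedule is not described in the abstract.

```python
def l1_prune_filters(filters, keep_ratio):
    """Return the indices of the filters to keep, in their original order.

    filters: list of filters, each a flat list of weights.
    keep_ratio: fraction of filters to retain (assumed hyperparameter).
    """
    norms = [sum(abs(w) for w in f) for f in filters]
    n_keep = max(1, int(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])


# Four toy filters; the two with the smallest L1 norms are pruned.
toy_filters = [[0.1, -0.1], [1.0, 2.0], [-3.0, 0.5], [0.0, 0.2]]
kept = l1_prune_filters(toy_filters, keep_ratio=0.5)
```

In practice the pruned network is usually fine-tuned afterwards to recover accuracy.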

Armory Equipment Detection Based on Improved YOLOv5

ZHOU Wenwu, LEI Lei, XUAN Xin

Computer Science. 2026, 53 (6A): 250800049-6. doi:10.11896/jsjkx.250800049

Abstract

To address the low efficiency and poor real-time performance of military warehouse equipment management, as well as the shortcomings of existing detection models in complex warehouse scenarios (missed detection of small targets, false detection of dense targets, poor environmental robustness, and anchor mismatch), this paper proposes an improved YOLOv5-based detection method for warehouse equipment. Based on the YOLOv5s model, the proposed approach incorporates the following improvements: constructing a multi-scale feature enhancement network to improve the recognition of small targets; adopting DIoU-NMS to enhance the accuracy of dense target detection; introducing the CBAM attention mechanism and a frequency-domain illumination suppression module to strengthen the model's adaptability to complex environments; and optimizing anchor matching through K-means++ re-clustering. Experimental results show that the improved model remains lightweight while significantly increasing average precision, with both precision and recall outperforming the original model. This method can effectively support intelligent inventory and management of warehouse equipment.
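
DIoU-NMS replaces the plain IoU test in non-maximum suppression with DIoU, which subtracts a normalized center-distance term from IoU so that overlapping boxes with distant centers are less likely to be suppressed. The following is a minimal plain-Python sketch with hypothetical boxes and threshold, not the paper's implementation.

```python
def box_iou(a, b):
    """IoU of two boxes (x1, y1, x2, y2), assuming x1 < x2 and y1 < y2."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def diou(a, b):
    """IoU minus squared center distance over squared enclosing-box diagonal."""
    center_sq = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    diag_sq = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return box_iou(a, b) - center_sq / diag_sq


def diou_nms(boxes, scores, threshold):
    """Keep the highest-scoring box, drop boxes whose DIoU with it exceeds
    the threshold, and repeat on the remainder."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


toy_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
toy_scores = [0.9, 0.8, 0.7]
kept_boxes = diou_nms(toy_boxes, toy_scores, threshold=0.5)
```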

Improved YOLOv5s-based Algorithm for Emergency Situation Detection in Airport Terminals

LIU Dai, AN Pengyu, WANG Kai

Computer Science. 2026, 53 (6A): 250300174-7. doi:10.11896/jsjkx.250300174

Abstract

To meet the urgent response requirements for emergencies in airport terminals, this paper optimizes the YOLOv5s object detection model to improve emergency situation detection in terminals. The study covers three types of detection targets: flames, smoke, and people falling. Specifically, it improves the model by using MPDIoU as the loss function instead of the original one, replacing the NMS algorithm with the Soft-NMS algorithm, and integrating the BiFormer attention mechanism into the small target detection layer. The study employs ablation experiments and comparative trials to validate the effectiveness of the improvements. Experimental results demonstrate that the enhanced model exhibits superior performance after training on a custom-built dataset, achieving significant improvements in the average precision metrics mAP@0.5 and mAP@0.5:0.95, which reach 93.1% and 63.5%, respectively. Compared to the original YOLOv5s model, these metrics increase by 1.7% and 4.2%, and the model outperforms the latest YOLOv11 model by 1.1% and 5% on the respective metrics. The improved model works well with real-time video feeds from airport terminals. It fulfills the requirements for incident detection and has strong application prospects in such critical places.
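
Soft-NMS, which this abstract substitutes for standard NMS, lowers the scores of overlapping boxes instead of discarding them outright. Below is a minimal sketch of the Gaussian-decay variant; the `sigma` and `score_threshold` values and the toy boxes are assumptions for illustration, and the abstract does not say which variant the authors use.

```python
import math


def box_iou(a, b):
    """IoU of two boxes (x1, y1, x2, y2), assuming x1 < x2 and y1 < y2."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: returns (index, final_score) pairs in selection order."""
    scores = list(scores)                 # work on a copy
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        kept.append((best, scores[best]))
        for i in remaining:
            overlap = box_iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)   # decay, do not delete
        remaining = [i for i in remaining if scores[i] >= score_threshold]
    return kept


toy_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
toy_scores = [0.9, 0.8, 0.7]
result = soft_nms(toy_boxes, toy_scores)
```

In this toy case the heavily overlapping second box survives with a reduced score, which is the behavior that helps when genuine objects overlap.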

SeguGAN: Research on Super-resolution Reconstruction of License Plate Images Utilizing Generative Adversarial Networks

HUANG Haixin, HOU Guangshuai, HE Tianyu

Computer Science. 2026, 53 (6A): 250600070-5. doi:10.11896/jsjkx.250600070

Abstract

In intelligent transportation systems, super-resolution (SR) reconstruction of license plate images is crucial due to common issues like poor lighting, motion blur, and low resolution in surveillance footage. Existing SR methods often produce artifacts and lose high-frequency details, leading to blurred outputs. To tackle these problems, this paper proposes SeguGAN, a novel framework that integrates semantic cues using CLIP-based features via a semantic-aware module (SeMSCA) in the discriminator. The generator employs a gating mechanism to merge multi-branch features (from RRDB and CAMConv), enhancing reconstruction quality. It also introduces Dynamic Tanh (DyT) to replace layer normalization, simplifying the architecture while improving performance. Evaluated on the CCPD2019 and CCPD2020 datasets, SeguGAN achieves 33.20 dB PSNR and 0.906 SSIM, outperforming ESRGAN, SRGAN, ECBSR, RCAN, and SwinIR by 5.9% in PSNR and 1.9% in SSIM on average. The results confirm its effectiveness in license plate SR reconstruction.
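
Dynamic Tanh (DyT) is an element-wise replacement for layer normalization, commonly written as gamma * tanh(alpha * x) + beta with a learnable scalar alpha. The sketch below applies that formula to a short feature vector; the parameter values are hypothetical, and how SeguGAN initializes or places DyT is not stated in the abstract.

```python
import math


def dynamic_tanh(x, alpha, gamma, beta):
    """Apply gamma * tanh(alpha * x) + beta element-wise.

    x: input features (list of floats)
    alpha: learnable scalar controlling how quickly values saturate
    gamma, beta: per-feature scale and shift, as in layer normalization
    """
    return [g * math.tanh(alpha * v) + b for v, g, b in zip(x, gamma, beta)]


# Hypothetical parameters for a 3-feature vector.
features = [0.0, 100.0, -100.0]
out = dynamic_tanh(features, alpha=1.0,
                   gamma=[2.0, 1.0, 1.0], beta=[1.0, 0.0, 0.0])
```

Unlike layer normalization, no mean or variance is computed; extreme activations are squashed by the tanh saturation instead.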

Semantic Perception Active Learning Method for the Datum Map of Scene Matching Navigation System

SHAN Chengcheng, MEI Chun, LI Weiting, GUO Yuanyuan, QIAN Weixing, XIONG Zhi

Computer Science. 2026, 53 (6A): 250600228-8. doi:10.11896/jsjkx.250600228

Abstract

The high-precision datum semantic information obtained by aerial remote sensing target detection can effectively improve the perception dimension of scene matching navigation systems. However, the large scale, high target density, and high annotation cost of remote sensing images limit the training and application of high-performance detection models. To address the difference in data distribution between the source and target domains and the insufficient annotated data in the target domain in semantic object detection for scene matching, an efficient active learning method is proposed to optimize the selection of target domain samples. Using the labeled data in the source domain and the unlabeled data in the target domain, the method selects high-information samples from the target domain for manual labeling through an active learning strategy, compensating for the insufficient labeled data in the target domain. This paper proposes three active learning scoring functions, namely a consistency score, a discriminator score, and a cosine difference score, which evaluate the labeling value of target domain samples from the perspectives of detection box prediction inconsistency, domain membership probability, and feature difference, respectively. In addition, an evaluation framework for object detection tasks is constructed, which considers the overall annotation effect of each image and quantifies the annotation cost of each detection box. Experiments show that the proposed method improves the object detection performance of active learning by about 3.29% on the target domain and reduces the number of labeled bounding boxes required in the target domain by about 17.6%. Under limited annotation resources, the method effectively improves object detection and domain transfer performance and copes with the model performance degradation caused by distribution differences between the source and target domains. It provides a new solution for cross-domain object detection in remote sensing scenes and reliable data support for the construction of high-precision semantic datum maps, thereby improving the accuracy and reliability of scene matching navigation.

Improved Stereo Matching Algorithm Based on Weighted Guided Image Filtering

ZHANG Ben, ZHU Denglin

Computer Science. 2026, 53 (6A): 250800043-5. doi:10.11896/jsjkx.250800043

Abstract

To achieve real-time and effective stereo matching, this paper builds on existing local stereo matching algorithms and proposes a local stereo matching algorithm based on weighted guided filtering. Firstly, the matching costs are computed from multiple measures. Then, in the cost aggregation stage, weighted guided filtering is applied, and the regularization parameters are adjusted adaptively using the Canny method to achieve more accurate cost aggregation. Finally, the disparity maps obtained by the winner-takes-all (WTA) optimization strategy are densified after a left-right consistency (LRC) check. The proposed algorithm is applied to stereo matching of images in the Middlebury database, and the experimental results verify that it is effective and robust.
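
The last two stages named in this abstract can be sketched on a single scanline: WTA selects, for each pixel, the disparity with the lowest aggregated cost, and the LRC check flags pixels whose left and right disparities disagree so that they can be filled in later. The cost values, the tolerance of one pixel, and the function names below are assumptions for illustration.

```python
def wta_disparity(cost_row):
    """cost_row[x][d] is the aggregated cost of disparity d at pixel x.
    Returns the lowest-cost disparity for each pixel."""
    return [min(range(len(costs)), key=costs.__getitem__) for costs in cost_row]


def lrc_mask(disp_left, disp_right, tol=1):
    """Left-right consistency check on one scanline.

    A left pixel x with disparity d corresponds to right pixel x - d; the
    match is accepted if both disparities agree within tol pixels."""
    valid = []
    for x, d in enumerate(disp_left):
        xr = x - d
        ok = 0 <= xr < len(disp_right) and abs(d - disp_right[xr]) <= tol
        valid.append(ok)
    return valid


# Toy costs for two pixels and three candidate disparities.
winners = wta_disparity([[3, 1, 2], [0, 5, 5]])

# Toy left and right disparity scanlines.
mask = lrc_mask([0, 1, 1, 3], [1, 1, 0, 0])
```

Pixels marked False are typically occlusions or mismatches and are the ones a densification step would fill.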

Asynchronous Dynamic Image Stitching Method Based on Parameter-adaptive Grey Wolf Optimization Algorithm

SHAN Chengcheng, LI Weiting, MEI Chun, ZHAO Hui, QIAN Weixing, ZENG Qinghua

Computer Science. 2026, 53 (6A): 250600169-8. doi:10.11896/jsjkx.250600169

Abstract

To address the inefficiency and dynamic object distortion in asynchronous image stitching under dynamic scenes, this paper proposes a parameter-adaptive image stitching method based on the Grey Wolf Optimizer (GWO) algorithm. The method integrates the swarm intelligence search mechanism of GWO into the RANSAC framework, mapping keypoint subsets to “wolf pack individuals” and utilizing the guided search mechanism of the α, β, and δ wolves to make keypoint selection “optimization-oriented”, ensuring the integrity and continuity of dynamic objects in asynchronous states. Additionally, to improve the robustness and processing efficiency of the algorithm in different scenarios, a two-stage parameter adaptation mechanism is introduced, including low-resolution pre-computation and dynamic termination conditions, which automates the adjustment of core parameters such as error tolerance and iteration count. Experiments on the StabStitch-D dataset show that, under the same iterative conditions, GWO-RANSAC improves the inlier matching rate by 4.91% compared to traditional RANSAC, increases PSNR by 11.4% (from 32.5 dB to 36.2 dB), and increases SSIM by 5.1% (from 0.881 to 0.926), while effectively reducing black borders and misalignment in stitched images and ensuring the integrity and continuity of dynamic objects even in complex scenarios. Theoretical analysis shows that the method has significant advantages in resource-constrained environments and dynamic asynchronous scenarios, effectively complementing deep learning methods.
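
The percentage gains quoted above follow from the before and after values given in the abstract. As a quick check, the relative change can be recomputed as follows; the helper name is ours.

```python
def relative_gain_percent(baseline, improved):
    """Relative change from baseline to improved, in percent."""
    return (improved - baseline) / baseline * 100.0


# Values quoted in the abstract.
psnr_gain = round(relative_gain_percent(32.5, 36.2), 1)    # PSNR in dB
ssim_gain = round(relative_gain_percent(0.881, 0.926), 1)  # SSIM
```

Both results agree with the reported 11.4% and 5.1%.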

Defect Detection of Transmission Line Fittings Based on Multiscale Feature Fusion Attention and Cross-layer Aggregation

CHEN Dianlong, LIU Tengbin, GAO Xiong, TIAN Zijian, ZHU Wenbing, ZOU Shun, WANG Qiang

Computer Science. 2026, 53 (6A): 250600110-7. doi:10.11896/jsjkx.250600110

Abstract

This paper proposes a transmission line fitting defect detection method based on multiscale feature fusion attention and cross-layer aggregation, built upon the general framework of the YOLO series of models. To address the low detection accuracy of transmission line fittings in complex environments, a Multiscale Dual-branch Attention (MDA) mechanism is introduced into the feature extraction part of the network. This mechanism captures cross-dimensional interactions of multiscale features and establishes long-term dependencies between dimensions, thereby significantly improving detection performance. Additionally, to mitigate the loss of detail during feature transfer, a Cross-Layer Aggregation (CLA) module is proposed. This module aggregates multilevel feature layers from the backbone network with multilevel detection layers in the detection neck, preserving fine-grained information that might otherwise be lost during feature transmission. Compared to other state-of-the-art object detection models, the proposed method achieves higher detection accuracy on real-world transmission line fitting defect datasets, particularly excelling in small target defect detection and background noise suppression, which demonstrates its practical value in transmission line maintenance.

Object Detection Method Based on Phased Training Strategy and Multi-scale Feature Fusion

QU Jiewu, LU Xinxi, SUN Jian, LIU Yan, GAO Ling, XU Binbin

Computer Science. 2026, 53 (6A): 250700088-7. doi:10.11896/jsjkx.250700088

Abstract

To overcome limitations such as computational bottlenecks and the difficulty of balancing real-time performance with accuracy during inference in the DETR (Detection Transformer) family of object detection methods, this paper proposes an enhanced approach that combines a phased training strategy with multi-scale feature fusion. Specifically, the multi-layer encoder structure of DETR is simplified to reduce computational complexity, while the phased training strategy improves feature representation and accelerates model convergence. In the first phase, one-to-many label matching is adopted to obtain high-quality two-dimensional multi-scale features. In the second phase, the weights from the first phase are frozen, and a parallel attention-convolutional fusion module is introduced to further refine the features. Experimental results demonstrate that the proposed method achieves a 5× increase in inference speed and a 1.5-point AP gain over the baseline model on the COCO dataset, effectively alleviating DETR's inference inefficiency. In addition, it yields a 1.4-point AP improvement on the BitVehicle dataset.

Intelligent Recognition Method Based on Multimodal Feature Fusion

ZHONG Hao, KONG Qingxuan, CAI Xianqing, LI Zhizhong, SUN Hao

Computer Science. 2026, 53 (6A): 250700065-10. doi:10.11896/jsjkx.250700065

Abstract

To address the issue of degraded vehicle recognition performance caused by changes in lighting,pose differences,and occlusions in complex traffic scenarios,this paper proposes a lightweight multi-modal fusion framework,called MM-ASTFL.Its core innovation lies in a triple attention mechanism.First,Adaptive Channel-Space Attention(ACSA) dynamically weights MobileNetV2 features to significantly enhance the representation of key areas such as license plates and headlights.Second,Time-Aware Attention-Enhanced LSTM(TA-LSTM) uses multi-head self-attention to capture temporal key frames,accurately depicting driving behaviours such as lane changes and turns.Third,cross-modal cross-attention achieves bidirectional guidance and deep aggregation of visual and temporal information.Experiments on VeRi-776 demonstrate that MM-ASTFL achieves a vehicle re-identification mAP of 89.8% and Rank-1 of 92.6%,representing improvements of 1.7% and 1.6% over SOTA;driving behaviour classification F1-score reaches 92.4%,an increase of 2.2%;trajectory prediction ADE and FDE are reduced to 0.68 m and 1.25 m,respectively,with error reductions of 13.3% and 15.4%.Ablation experiments confirm that each module contributes significant gains,and visualisation analysis further validates its robustness under extreme conditions such as backlighting and occlusion,providing an efficient and reliable solution for intelligent transportation systems.

Research on Lightning Arrester Fault Identification Technology Based on Multi-source Image Fusion

WANG Haozhao, FU Fangda, WU Yuyi, WANG Luliang, YU Yang, QI Yifan

Computer Science. 2026, 53 (6A): 250700042-8. doi:10.11896/jsjkx.250700042

Abstract

To address the problems of incomplete information representation and easy loss of small-target features in lightning arrester fault detection using traditional single-modality images,this study proposes a lightning arrester fault identification method that integrates the multiscale residual pyramid attention network(MSRPAN) and the lightweight target detection model YOLO11-CGB.MSRPAN extracts multi-scale deep features from multi-modality images and combines the residual attention mechanism to avoid vanishing gradients,enhancing the feature expression ability to address the feature defects of single-modality images.The designed YOLO11-CGB model embeds the convolutional block attention module(CBAM) in the backbone network,uses GhostConv to replace the traditional convolutional layer to reduce the computational complexity,and combines the bidirectional feature pyramid network(BiFPN) to optimize multi-scale feature fusion,improving the small-target detection ability in complex backgrounds.Experiments show that the MSRPAN fusion method is superior to common fusion algorithms such as IHS,Brovey,PCA,WT,and CNN in both subjective and objective evaluations.The YOLO11-CGB model achieves a mean average precision at IoU=0.5(mAP@0.5) of 94.88% and a recall rate of 94% on the self-built dataset.The recognition confidence levels for the damage(P),flashover(S),and crack(L) faults of the fused images can reach up to 0.88,0.85,and 0.84 respectively,which are better than those of single-modality images(infrared and visible-light images).

HIBA:Verifiable and Efficient Query Architecture for Blockchain Based on Hybrid Index

LI Xiaogang, ZHAO Hui

Computer Science. 2026, 53 (6A): 250600212-7. doi:10.11896/jsjkx.250600212

Abstract

To address the challenges of low efficiency in complex queries and high verification overhead in blockchain systems,this paper proposes a Hybrid Index Blockchain Architecture(HIBA) that integrates a dynamic Bloom filter,inverted index,and Merkle tree,along with a collaborative verification mechanism.The approach employs the dynamic Bloom filter to rapidly pre-screen candidate blocks,reducing query complexity from linear to logarithmic and effectively avoiding full-chain traversal.Combined with inverted indexing,it precisely locates data associated with semantic keywords within blocks,further enhancing query efficiency.Additionally,a Merkle tree-based interval proof mechanism is introduced,enabling light nodes to perform incremental verification of query results,significantly reducing computational and communication overhead.The proposed solution achieves notable breakthroughs in multi-keyword semantic query support,query efficiency,and verifiability,demonstrating strong practicality and scalability.
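The pre-screening idea in the abstract above can be illustrated with a minimal sketch. It assumes one plain (static) Bloom filter per block and scans the filters linearly, so it shows neither the dynamic Bloom filter nor the logarithmic query complexity reported for HIBA; the class, function names, and parameters (BloomFilter, build_block_filters, candidate_blocks, num_bits, num_hashes) are hypothetical and not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a single Python integer

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def build_block_filters(blocks):
    """blocks: list of keyword sets, one per block (assumed layout)."""
    filters = []
    for keywords in blocks:
        bf = BloomFilter()
        for kw in keywords:
            bf.add(kw)
        filters.append(bf)
    return filters


def candidate_blocks(filters, keyword):
    """Indices of blocks that may contain the keyword; the rest are skipped."""
    return [i for i, bf in enumerate(filters) if bf.might_contain(keyword)]


blocks = [{"alice", "pay"}, {"bob"}, {"alice", "refund"}]
filters = build_block_filters(blocks)
hits = candidate_blocks(filters, "alice")
```

Only the candidate blocks would then be searched through the inverted index and verified against the Merkle proof; a false positive costs one extra block lookup but never hides a matching block.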

Multi-RAG:Distributed Retrieval-augmented Generation Framework for Cross-domain Data

SHEN Jianwei, CHEN Hanlin, CHEN Xing

Computer Science. 2026, 53 (6A): 250900159-7. doi:10.11896/jsjkx.250900159

Abstract

The increasing application of large language models(LLMs) in natural language processing tasks has established retrieval-augmented generation(RAG) as a critical technique for enhancing factual accuracy.However,the distributed storage of cross-domain data presents significant challenges,including barriers to data aggregation,insufficient scalability,and the lack of native distributed coordination.These challenges render traditional centralized RAG frameworks inadequate for cross-domain scenarios.To address this limitation,a distributed RAG framework named Multi-RAG is proposed for handling cross-domain data.This framework enables individual nodes to maintain independent embedding models and vector indexes tailored to their local data characteristics.Queries are routed in parallel to relevant nodes through a query distribution module.Each node retrieves and returns locally high-scoring document segments.A global re-ranking module then performs global semantic reranking on these segments.The optimally re-ranked context is subsequently fed into the LLM for answer generation.Experiments conducted within a synthetically constructed distributed environment using the MultiHop-RAG dataset demonstrate Multi-RAG's effectiveness.The framework achieves a Hits@10 score of 0.765 9,representing a 72% improvement over single-node retrieval(0.445 2) and maintaining performance within 3.1% of a centralized approach.Answer generation accuracy using the DeepSeek-R1 model marks a 48% increase compared to the single-node baseline.The study indicates that through its streamlined distributed coordination mechanism and global information fusion strategy,Multi-RAG effectively enhances retrieval and generation performance in cross-domain settings without requiring raw data consolidation.This framework provides a practical and efficient solution for collaborative knowledge utilization across institutions and domains.
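The retrieve-locally, re-rank-globally flow described above can be sketched as follows. This is an illustration under strong assumptions: token overlap stands in for each node's own embedding model, a Jaccard score stands in for the global semantic re-ranker, and the loop is sequential rather than parallel. All function names and the default values of local_k and global_k are hypothetical.

```python
def tokens(text):
    return set(text.lower().split())


def local_top_k(query, documents, k):
    """Node-local retrieval: rank by token overlap (stand-in for a local embedding model)."""
    q = tokens(query)
    scored = [(len(q & tokens(d)), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def global_rerank(query, candidates, k):
    """Re-score all candidates with one shared scorer so results are comparable across nodes."""
    q = tokens(query)

    def jaccard(d):
        t = tokens(d)
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    # Deduplicate, then sort by descending score with the text as a tie-breaker.
    return sorted(set(candidates), key=lambda d: (-jaccard(d), d))[:k]


def multi_node_retrieve(query, nodes, local_k=2, global_k=3):
    """nodes: mapping of node name to its local document list (assumed layout)."""
    candidates = []
    for docs in nodes.values():
        candidates.extend(local_top_k(query, docs, local_k))
    return global_rerank(query, candidates, global_k)


nodes = {
    "node_a": ["solar power output rises", "football scores"],
    "node_b": ["solar power", "wind power grows"],
}
context = multi_node_retrieve("solar power", nodes)
```

Re-scoring with a single shared scorer matters because scores produced by independent per-node models are not directly comparable; only document segments, not raw corpora, leave each node.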

SINDy-GSN:Sparse Identification of Network Dynamics for Group Behavior in Social Graphs

WANG Yuhan, MA Fuyuan, MA Shixuan, WANG Ying

Computer Science. 2026, 53 (6A): 250700014-10. doi:10.11896/jsjkx.250700014

Abstract

The evolution of group behavior in social networks is often marked by nonlinearity,multi-agent coupling,and structural heterogeneity,posing challenges for traditional modeling methods in uncovering the underlying dynamics.To better capture these dynamics,this paper proposes an improved method for dynamic identification based on a structure-coupled function library,SINDy-GSN.The method leverages feature-driven discrete simulation,integrating user behavior states,adjacency structures,and topic information to construct a tripartite state vector,generating a high-dimensional nonlinear function library suited for social networks.The library incorporates first-order neighbor influence,normalized diffusion,and topic propagation coupling,effectively capturing the dynamic interplay between individual behavior and network structure.Using real-world social platform data,a simulation network is created,and discrete evolution models the propagation of group stances,enabling sparse modeling and identification of group behavior dynamics.Results show that SINDy-GSN maintains interpretability and sparsity while accurately identifying group propagation mechanisms,offering a versatile framework for modeling and predicting complex social behaviors,with strong adaptability and scalability.

Model-agnostic Cross-domain Few-shot Learning Framework Based on Invariant Risk Minimization

AN Yuexuan, ZHAO Xingyu

Computer Science. 2026, 53 (6A): 250900009-8. doi:10.11896/jsjkx.250900009

Abstract

Few-Shot Learning(FSL) aims to build efficient predictive models using only a small number of labeled samples,thereby reducing the reliance on large-scale annotated data and improving the learning efficiency and practical value of models.However,when there is a significant distribution shift between the test domain and the training domain,traditional methods often suffer a severe performance drop due to domain shift.Existing few-shot learning methods designed for domain generalization scenarios mostly rely on specific model architectures or alignment strategies,making them difficult to integrate with other methods to enhance generalization capabilities.Moreover,they often struggle to balance the learning of task-relevant features and domain-invariant features.To address these issues,this paper proposes the Model-agnostic Cross-domain Few-shot Learning framework based on the strategy of Invariant Risk Minimization(IRM).This framework can be integrated with various existing few-shot learning methods,enabling these models to effectively learn domain-invariant features from samples,thereby significantly improving their cross-domain predictive performance.Experiments on multiple benchmark datasets demonstrate the effectiveness of the proposed framework.

Academic Performance Prediction Model Based on Dynamic Graph Isomorphism Network

LI Fan

Computer Science. 2026, 53 (6A): 250800035-4. doi:10.11896/jsjkx.250800035

Abstract

To address the limitations of existing academic performance prediction models in effectively capturing implicit student relationships,exhibiting high sensitivity to sparse data,and suffering from computational redundancy,this paper proposes a dynamically optimized prediction model based on graph isomorphism network.The model optimizes graph structure and information propagation through three core mechanisms.Firstly,a bimodal fusion graph construction technique generates dynamic adjacency matrices by integrating cosine similarity and standardized Euclidean distance,significantly enhancing relational representation robustness.Secondly,a parameter self-adaptation mechanism enables hierarchical feature fusion through learnable parameters,effectively improving adaptability to heterogeneous data.Finally,a K-value decay optimization method accelerates convergence while reducing computational complexity via progressive pruning.On the CGPA and Grade-Class datasets,the proposed model achieves prediction accuracies of 73.3% and 69.4%,outperforming optimal benchmarks including GNN by 8.3% and 4.8% respectively.The K-value decay strategy further reduces training time by 1.7 seconds compared to fixed k-value models(k=10) on Grade-Class dataset,demonstrating effective balance between computational efficiency and prediction accuracy.
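The graph-construction and K-value decay steps described above can be sketched in a few lines. The abstract does not give the fusion formula or the decay schedule, so the weighted sum controlled by alpha, the 1/(1+distance) conversion, the exponential decay, and every function name below are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def feature_stds(rows):
    """Population standard deviation of each feature column (1.0 if constant)."""
    n = len(rows)
    stds = []
    for col in zip(*rows):
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        stds.append(math.sqrt(var) or 1.0)
    return stds


def standardized_euclidean(u, v, stds):
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(u, v, stds)))


def knn_adjacency(rows, k, alpha=0.5):
    """Symmetric 0/1 adjacency linking each student to its k best-scoring peers."""
    n = len(rows)
    stds = feature_stds(rows)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        scores = []
        for j in range(n):
            if i == j:
                continue
            cos = cosine_similarity(rows[i], rows[j])
            dist = standardized_euclidean(rows[i], rows[j], stds)
            # Assumed fusion: blend similarity with an inverse-distance term.
            scores.append((alpha * cos + (1 - alpha) / (1 + dist), j))
        scores.sort(reverse=True)
        for _, j in scores[:k]:
            adj[i][j] = 1
            adj[j][i] = 1
    return adj


def decayed_k(k0, epoch, decay=0.9, k_min=1):
    """Assumed schedule: shrink the neighbour count as training progresses."""
    return max(k_min, int(k0 * decay ** epoch))


students = [[1.0, 1.0], [1.1, 0.9], [5.0, 0.2]]
adjacency = knn_adjacency(students, k=1)
```

Rebuilding the adjacency matrix with decayed_k(k0, epoch) at each epoch prunes edges progressively, which is one plausible reading of how a decaying K lowers the cost of message passing.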

Spatiotemporal Prediction of African Precipitation Based on Wavelet-Recurrent Neural Network Fusion

MA Xiangxiang

Computer Science. 2026, 53 (6A): 250600171-5. doi:10.11896/jsjkx.250600171

Abstract

Based on ERA5 data from 1979 to 2024,monthly precipitation data for the African continent from 1979 to 2019 and daily precipitation data from 2020 to 2024 are selected as indicators.Wavelet transform and other methods are employed to investigate the periodic characteristics of precipitation across different African regions.A hybrid wavelet-recurrent neural network(RNN) forecasting model is developed and compared with standalone RNN models.The results reveal that African precipitation exhibits significant non-stationarity and multi-scale periodicity,including seasonal cycles of 12-15 months,short-term cycles of 1-4 months,and long-term cycles of 96-128 months,reflecting the influences of monsoons and climate variability.The prediction accuracies of standalone RNN models(RNN,GRU,LSTM) are 0.522,0.516,and 0.515,respectively.In contrast,the hybrid wavelet-RNN model significantly improves forecasting accuracy,reducing MAE by approximately 60%,RMSE by approximately 66%,and increasing R² by approximately 70%.The MAE values reach 0.000 339,0.000 338,and 0.000 337,the RMSE values reach 0.000 626,0.000 628,and 0.000 622,and the R² values reach 0.889,0.886,and 0.891,respectively.The LSTM-WT model performs best in long-term trend prediction(R²≈0.88) and demonstrates enhanced capability in predicting 4-6 mm precipitation events.This study provides a scientific basis for water resource management,agricultural planning,and sustainable development in Africa.

Dual-channel Spatiotemporal Hypergraph Convolutional Network for Traffic Speed Prediction

CHEN Hongfeng, ZHAO Zhenzhen

Computer Science. 2026, 53 (6A): 250500107-7. doi:10.11896/jsjkx.250500107

Abstract

Traffic speed forecasting plays a vital role in tasks such as traffic congestion recognition and signal control in intelligent transportation systems.However,since traffic data contains spatial relationships that change dynamically over time,there are also correlations between non-directly adjacent road nodes in the road network,thus deriving implicit cross-regional collaborative features.Therefore,this paper proposes a dual-channel spatiotemporal hypergraph convolutional network for implicit feature extraction of traffic data to solve the above problems.Specifically,the network uses a clustering algorithm to discover global spatial features.Then,a dual-channel convolution method of hypergraph and line graph is established to capture the implicit spatial relationships in traffic data.Finally,a long short-term memory network with a convolutional structure is used to capture temporal features.Experiments in real scenarios show that the performance of this framework is better than the state-of-the-art baseline models.

CA-MLNet:Dual-stream Memory and Channel Attention Based High-precision Trajectory Prediction Model

ZHANG Xiaohan, YANG Fei, MA Jingyao, ZHAO Hanyue, ZHAO Xu

Computer Science. 2026, 53 (6A): 250600118-8. doi:10.11896/jsjkx.250600118

Abstract

To address the insufficient prediction accuracy of single models in complex environments,this study proposes a novel prediction architecture named Channel Attention-enhanced Dual-stream Memory Network(CA-MLNet),which integrates improved state space models with memory networks.The core innovations of this work include:(1)reconstruction of the Structured State-Space Model(SSM),which enhances its spatial feature selection capability in dynamic environments;(2)integration of xLSTM's mLSTM module as an auxiliary temporal modeling unit,which establishes a dual-branch feature fusion architecture.The improved SSM module effectively captures long-range spatial dependencies,while the mLSTM module enhances local temporal feature extraction through exponential gating mechanisms.These components achieve complementary advantages via adaptive weight fusion mechanisms.Experimental results on the GeoLife GPS Trajectories dataset demonstrate that the proposed model achieves a prediction accuracy of 99.01%,representing an 8.18% improvement over the baseline Mamba model and a 3.14% enhancement compared to the xLSTM architecture.Ablation experiments verify that the SSM module modification contributes 43.6% to spatial feature selection accuracy,with dual-module collaboration reducing trajectory offset errors.This approach provides a high-precision solution for intelligent traffic early warning systems.

Second-order Multi-channel Directed Graph Convolution for Gene Regulatory Inference

SHEN Yajie, WANG Jishu, JIN Kui, ZI Tong, TANG Mingjing

Computer Science. 2026, 53 (6A): 250700116-10. doi:10.11896/jsjkx.250700116

Abstract

The rapid development of single-cell RNA sequencing(scRNA-seq) technology has led to an exponential increase in single-cell gene expression data,thereby resulting in the accumulation of extensive gene expression datasets.Therefore,there is a pressing need for computational methods capable of leveraging these datasets to uncover potential regulatory relationships between genes.In recent years,the advancements in deep learning and the expansion of known regulatory relationship datasets have facilitated the development of numerous supervised inference methods,particularly those based on graph neural networks(GNNs).However,most of these current methods model the prior regulatory network as an undirected graph,neglecting the directed nature of regulatory relationships between genes,which makes it impossible to extract directional information.In addition,due to the limited availability of known regulatory information,genes with similar or correlated expression patterns may not necessarily have known direct connections.Most current methods rely solely on extracting first-order neighborhood information,which may hinder the ability to fully capture the richer information contained in prior networks and expression data.To address these challenges,this paper proposes a gene regulatory inference method based on second-order multi-channel directed graph convolution.By leveraging prior regulatory networks,the method constructs a first-order adjacency matrix,a second-order in-degree adjacency matrix,and a second-order out-degree adjacency matrix.Additionally,it employs a directional Laplacian matrix to more accurately represent the structure of directed graphs,thereby enhancing the performance of network inference and the ability to model complex regulatory patterns.Experimental results with multiple datasets and evaluation metrics demonstrate that the proposed method can more accurately predict potential regulatory relationships between genes compared to existing work.Meanwhile,extensive ablation studies confirm that these different modules of the proposed method contribute to improving model performance.

MFR-GCN:Key Node Identification via Multi-feature Fusion and Ranking Optimization in Graph Convolutional Networks

SONG Lin, WANG Yuning, SHI Keren, OU Yuan

Computer Science. 2026, 53 (6A): 250700049-9. doi:10.11896/jsjkx.250700049

Abstract

To address the limitations of existing key node identification methods for complex networks,such as shallow feature fusion,insufficient dynamic adaptability,and weak differentiation of node significance,this paper proposes a graph convolutional network model based on multi-feature fusion and ranking optimization(MFR-GCN).The model innovatively incorporates deep feature interaction encoding and a learnable contrastive enhancement mechanism.It achieves dynamic and robust key node detection through hierarchical adaptive gating and conditional global information injection.Firstly,eight representative features spanning local attributes,global attributes,positional attributes,and random-walk properties(including the proposed LASPN centrality) are extracted from the network graph.These features are combined with node embeddings to construct feature vectors.Next,the vectors are fed into a hybrid layer integrating graph convolutional network(GCN) and graph attention network(GAT) for deep feature learning,while skip connections aggregate multi-scale information.Finally,an enhanced multi-component loss function(incorporating ranking loss,variance loss,and clustering loss) is designed for model training and optimization.During inference,a contrastive reinforcement layer amplifies the score differences of key nodes to further distinguish them.Validation experiments using the SIR propagation model are conducted on real-world datasets including Cora,Email,and C.elegans.Results demonstrate that compared to traditional methods like Degree Centrality and Betweenness Centrality,the key nodes identified by MFR-GCN achieve a significantly higher average final infection scale(for example,exceeding the suboptimal method by approximately 6.22% on the USairport network).This highlights the model's superior global propagation potential and applicability.

Optimization of HAN-based GNN-Transformer Collaborative Contrastive Learning Framework

ZHANG Zihao, WU Zezhong

Computer Science. 2026, 53 (6A): 250900103-8. doi:10.11896/jsjkx.250900103

Abstract

GNNs have demonstrated strong performance in graph representation learning due to their message-passing mechanism.However,they encounter challenges such as over-smoothing and insufficient capture of multi-hop neighbor information when processing heterogeneous graphs.The GNN-Transformer collaborative contrastive learning framework(GTC) combines the local information aggregation capability of GNNs with the global information modeling capability of Transformers.This framework implements self-supervised heterogeneous graph representation learning through cross-view contrastive learning,effectively mitigating the over-smoothing problem that GNNs experience during neighbor information aggregation.This study enhances the GNN branch of the GNN-Transformer collaborative contrastive learning framework by incorporating the node-level and semantic-level attention mechanisms from the heterogeneous graph attention network(HAN).This optimization enables more effective capture of information from different node and edge types in heterogeneous graphs during neighbor aggregation.Experiments on the ACM dataset demonstrate that the improved GNN-Transformer collaborative contrastive learning framework achieves superior performance in node classification and node clustering tasks.For node classification,with 20,40,and 60 labeled nodes,the model exhibits average improvements of 0.52%,3.07%,and 3.14% in AUC,macro-F1,and micro-F1 scores,respectively.In node clustering,normalized mutual information(NMI) and adjusted rand index(ARI) increase by 6.57% and 6.38%,respectively.These results confirm that HAN's hierarchical attention mechanism enables finer neighbor aggregation and metapath semantic fusion in heterogeneous graph representation learning,offering a novel approach to alleviating the over-smoothing problem in GNNs.

Imbalanced Data Learning Approach Utilizing Feature Value Based Class Overlap Degree

SUN Bo, WANG Zhijun, ZHOU Zhunan, LI Qingjie, WANG Yun, GENG Xia, ZHANG Yan, SUN Chenxuan

Computer Science. 2026, 53 (6A): 250600199-8. doi:10.11896/jsjkx.250600199

Abstract

Class imbalance problem is an important challenge in supervised machine learning field.In an imbalanced training set,although the minority class is significantly outnumbered by the majority class,it usually attracts more attention from the practitioners and has higher misclassification cost than the latter one.Most classifier learning algorithms usually employ the overall classification accuracy as the optimization goal,and thus easily misclassify the minority class examples that make less contribution to overall classification accuracy.Existing imbalance learning approaches often utilize the class imbalance ratio(IR) of a training set as the classification complexity measure as well as the optimization goal.However,it has recently been indicated that,compared with IR,class overlap can more objectively measure the learning difficulty of an imbalanced dataset.Considering the importance of class overlap in evaluating the data complexity,the imbalance problem is solved from the class overlap perspective,and an imbalanced dataset learning approach FO-RBU utilizing the class overlap information of a training set is proposed.Specifically,the distribution concerning the ratios of feature based class overlap examples is employed to evaluate the learning difficulty of an imbalanced dataset,and further utilized as a theoretical guideline in determining the proper undersampling extent of Radial-Based Undersampling approach.Experimental results show that the feature values based class overlap information is a good indicator in the proper undersampling ratio determination process,and the proposed class imbalance learning approach FO-RBU is effective.

Dynamic Sparsity and Heterogeneous Knowledge Distillation for Top-k Recommendation

FU Shiqi, ZHU Jinxia, XU Qichen, DU Zeyu

Computer Science. 2026, 53 (6A): 250700121-9. doi:10.11896/jsjkx.250700121

Abstract

Current recommendation algorithms mainly focus on using deep learning techniques to enhance recommendation accuracy,thereby providing users with a collection of content they are interested in.However,the recommendation results obtained by such methods often have high computational costs and model redundancy,making them unsuitable for resource-constrained scenarios.To address these issues,a collaborative optimization framework(DySparseHKD) that integrates dynamic sparsity and heterogeneous knowledge distillation is proposed.This framework builds a lightweight recommendation model that reduces the number of parameters while retaining key features.A dynamic sparsity rate allocation method based on interaction redundancy is proposed to capture more efficient parameter configurations.The training trajectory of the teacher model is utilized to achieve progressive knowledge transfer,alleviating the knowledge gap between heterogeneous models.The knowledge transfer granularity is dynamically adjusted according to the current learning state of the student model to improve transfer efficiency.Finally,the deep decoupling of model complexity and recommendation performance is achieved through joint optimization objectives.Experiments on three real datasets show that the proposed model achieves an organic integration of model efficiency and recommendation effect while maintaining lower complexity.

Model-based Trajectory Anomalies Detection Algorithm for Longitudinal Data

DONG Dong, JIN Pengchao

Computer Science. 2026, 53 (6A): 250800004-8. doi:10.11896/jsjkx.250800004

Abstract

Longitudinal data have attracted considerable attention in fields such as public health because they capture the dynamic trajectories of the same subjects over time.The data cleaning process ensures the quality of longitudinal data modelling and directly influences downstream analysis quality.To address this issue,this paper proposes a novel anomaly detection method which integrates generalized linear model(GLM)-based polynomial fitting and adaptive clustering.The approach assigns binary normal and abnormal labels to individual trajectories and compares them with ground-truth annotations.On four independent datasets(two simulated longitudinal cohorts,one additional UCR dataset,and one real-world clinical dataset),a systematic comparison against established R-package methods is conducted.Experimental results demonstrate superior detection performance and robust generalizability across diverse settings.Applying the proposed method to six years of height data from primary school students in a specific city further demonstrates its effectiveness and accuracy in detecting outlier trajectories in practical longitudinal health data.This method offers robust support for public health surveillance and intervention,disease progression assessment,and clinical decision support.

Carbon Emission Prediction Algorithm Based on TransLSTM-GAN Model

ZHANG Xiaozhu, CHEN Hongyou, QU Lingfeng, WANG Yuechenjia, TIAN Baodan, FAN Yong

Computer Science. 2026, 53 (6A): 250400146-11. doi:10.11896/jsjkx.250400146

Abstract

Carbon emission prediction is crucial for several aspects,including international cooperation,addressing climate change,and energy security.Because the factors affecting carbon emission prediction at the scale of national geographic areas and over longer time intervals are numerous and complex,higher demands are placed on the feature representation learning ability of prediction models.To address these problems,a carbon emission prediction model that integrates Transformer,long short-term memory(LSTM) neural network,and generative adversarial network(GAN) is proposed,called TransLSTM-GAN.In this work,the attention mechanism and high-performance feature representation learning ability of the Transformer and LSTM networks are utilized to improve the model's learning ability in processing complex carbon emission data and long sequence data.An adaptive improved whale optimization algorithm(IWOA) is designed for automatic hyperparameter learning of TransLSTM,reducing training difficulty and improving training effectiveness.Using the pre-trained TransLSTM as a generator and a deep residual network(ResNet) as a discriminator,a GAN is constructed to fine-tune the generator parameters and further improve prediction accuracy.To validate the performance of this model,experiments are conducted on carbon emission datasets from China,America,and Europe.The experimental results indicate that the TransLSTM-GAN model can better adapt to national regions and predict long-term carbon emissions.

Fraud Detection Model Based on Dual-space Heterogeneous Graph Neural Network

HAN Zhigeng, FU Chunshuo

Computer Science. 2026, 53 (6A): 250600050-6. doi:10.11896/jsjkx.250600050

Abstract

GNNs have emerged as a dominant approach for fraud detection due to their inherent ability to model graph-structured data and capture complex relational patterns in fraudulent activities.However,existing GNN-based fraud detection models face critical limitations:homogeneous GNNs struggle with heterogeneous relationships in fraud graphs,while heterogeneous GNNs typically process such relationships in only a single attribute or structural space,restricting their detection performance.To overcome these challenges,this paper proposes a novel dual-space heterogeneous GNN for fraud detection,which models user relationships as a multi-relational heterogeneous directed graph and employs a multi-layer graph convolutional architecture.Each convolutional layer integrates three key modules:(1)a heterophily learning module that separately learns node heterophily in attribute and structural spaces using labeled node information,then fuses these features via a weighted strategy;(2)a cross-space graph aggregation module that computes attention weights from the fused heterophily and updates node representations through multi-relational aggregation;(3)a prototype-guided classification module that constructs class prototypes from labeled nodes to guide the classification of unlabeled nodes.To address data scarcity and class imbalance,the model adopts a balanced sampling strategy for semi-supervised training.Experimental results on YelpChi and Amazon datasets demonstrate that the proposed model has significant improvements,with Recall increasing by 0.962 6% and 0.644 4%,and AUC rising by 0.859 4% and 0.147 9%,respectively,outperforming nine baseline models.

Time Series Prediction Method Based on Multi-level Wavelet Decomposition Bidirectional Mamba

LIU Peng, SHEN Jiying, LIU Dongsheng, CHEN Guibo, SONG Yuanwei

Computer Science. 2026, 53 (6A): 250600172-8. doi:10.11896/jsjkx.250600172

Time series forecasting can provide insights into future trends and patterns, which is crucial for applications such as weather forecasting and power load forecasting. However, existing time series prediction models suffer from problems such as large numbers of parameters, high computational complexity, and insufficient use of the frequency-domain information in the data. To address these issues, a new state-space-based model is proposed for long-term time series prediction. First, the model uses multi-level wavelet decomposition to decouple the original time series into multiple subsequences in different frequency bands. Second, it applies an independent bidirectional Mamba module to each subsequence to capture its unique dynamic patterns. Finally, the predictions for each frequency band are fused into the final prediction through wavelet reconstruction. Experimental results on seven publicly available datasets, including ETT, show that the method achieves the best performance at multiple prediction lengths, with an average MSE reduction of 4.12% compared to the current best baseline model. These results demonstrate the method's effectiveness and its potential for practical applications.
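
The decompose, predict per band, and reconstruct pipeline described in this abstract can be illustrated with a minimal sketch. The abstract does not name the wavelet family, so the Haar wavelet is assumed here purely for simplicity, the function names are hypothetical, and the per-band bidirectional Mamba predictors are omitted; only the decomposition and its exact inverse are shown.

```python
import math

def haar_step(x):
    """One Haar level: split x into a low-frequency and a high-frequency half."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_decompose(x, levels):
    """Multi-level decomposition; len(x) must be divisible by 2**levels."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)          # details[0] is the highest-frequency band
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose by merging bands from coarsest to finest."""
    s = math.sqrt(2.0)
    x = list(approx)
    for d in reversed(details):
        merged = []
        for a, dd in zip(x, d):
            merged.append((a + dd) / s)
            merged.append((a - dd) / s)
        x = merged
    return x
```

In a pipeline of the kind the abstract describes, each band returned by `haar_decompose` would be forecast by its own model before `haar_reconstruct` fuses the band-wise forecasts.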

Data Quality Measurement Method Based on Metadata

CHEN Lianyong, SONG Jinyu, LI Zhixia, SI Changzhe, YANG Wenkai, WANG Jing

Computer Science. 2026, 53 (6A): 250600204-10. doi:10.11896/jsjkx.250600204

Data quality is a practical requirement for unlocking the potential of data elements and realizing data value. To find problematic data and improve data quality, this paper studies metadata-based data quality measurement methods, constructs a metadata-based set of data quality measurement indices, and formulates the generating elements and formal composition of measurement rules, so as to measure data quality in terms of consistency, integrity, and validity. A corresponding data quality measurement tool is also developed. Using the teaching evaluation dataset of a university, the feasibility and effectiveness of the proposed measurement method and tool are verified.

Torlink:High-performance Streaming ML Framework for Dynamic Flow-rate Data

LIANG Zheheng, YU Ran, CUI Lei, QIN Zheng, ZHANG Jinbo, ZHANG Ziyang, WU Mingchao

Computer Science. 2026, 53 (6A): 250800062-11. doi:10.11896/jsjkx.250800062

With the advent of the big data era,streaming machine learning theories and methods have gained widespread attention and application.Their core lies in the ability to process continuously arriving data streams in real time and respond quickly to dynamic changes in data.The typical streaming machine learning frameworks lack support for general streaming learning algorithms and effective performance optimization mechanisms when handling dynamic data flow rates.To address these issues,this paper first analyzes and summarizes the application and computational characteristics of streaming machine learning,designing a relatively general streaming machine learning data flow.For existing frameworks,it analyzes their potential performance bottlenecks and proposes two performance optimization methods:a distance-based dynamic sampling mechanism and a gradient-based window pre-aggregation mechanism.Finally,a prototype system,Torlink,is implemented based on Flink,and experiments are conducted on four typical datasets.Results show that Torlink achieves an overall throughput approximately 4.1 times higher than existing frameworks on a 4-node cluster,with a horizontal speedup ratio of up to 3.3.

MDBCache:Lightweight Data Caching Solution for Relational Databases Based on Memory-mapping

WANG Hai, LIU Zhongyi, ZHANG Chenyang, CUI Hua, FU Dan

Computer Science. 2026, 53 (6A): 250500104-12. doi:10.11896/jsjkx.250500104

Relational databases face issues of low read efficiency in high-concurrency real-time scenarios.In efforts to enhance the performance of traditional database systems,NoSQL(Not Only SQL) in-memory databases and embedded databases are often employed as data caching layers for acceleration.However,these approaches have limited potential for improving read and write efficiency,and generally incur high migration costs due to differences in storage structures and read/write methods.This paper proposes a data caching solution MDBCache tailored for relational databases.By leveraging fixed-address memory-mapped caching,multi-dimensional custom index reading,and incremental update technology for in-memory data,it achieves efficient data sharing without the need for cross-process or cross-user-space operations,significantly reducing data request response times and migration costs.Experimental results demonstrate that,compared to the caching solutions of the in-memory database MMDB and the embedded database Berkeley DB,MDBCache achieves a maximum improvement of 1.49× and 6.88× in read efficiency(single-thread mode),and 8.59× and 5.44× in data update efficiency,respectively.This solution boasts mature underlying technology,high-performance service,low implementation complexity,and high practicality,making it a valuable reference in the design of data solutions for high-concurrency real-time scenarios.

Challenges and Methods for Robust Cyber-Physical Systems Under Uncertainty:A Systematic Review

HAN Liping, YU Le, HU Mingzhe, NIE Tingting

Computer Science. 2026, 53 (6A): 250700017-8. doi:10.11896/jsjkx.250700017

Cyber-Physical Systems(CPSs) are complex systems that are deeply coupled between the physical and digital worlds.They have been widely applied in fields such as intelligent transportation,industrial automation,and smart energy.However,the design and operation of CPS often involve various uncertainties.As a result,robustness against these uncertainties has become a critical requirement for ensuring stable and reliable system performance.Currently,the key technologies for ensuring CPS robustness include uncertainty testing,robustness evaluation,and robustness optimization.Uncertainty testing focuses on constructing diverse disturbance scenarios to reveal potential vulnerabilities of the system under complex and uncertain conditions.Robustness evaluation quantifies the system's stability and reliability under various disturbances using multi-dimensional metrics.Robustness optimization,in turn,targets the identified weak points by adjusting system architecture,control strategies,or resource configurations to enhance the system's resilience and adaptability.This paper reviews the progress in these three research areas.It also analyzes the major challenges in current CPS robustness assurance.These include difficulties in identifying and modeling multi-source uncertainties,the absence of unified standards and efficient evaluation methods,and the complexity of deploying robustness optimization strategies in practice.On this basis,the paper outlines potential future research directions,such as test case generation for evolving multi-source uncertainties,AI-driven predictive robustness evaluation,and adaptive robustness repair strategies.This paper aims to provide a structured technical reference for CPS robustness research and promote the development of highly reliable CPS applications.

Fuzzing Driver Generation Based on Large Language Models

WEI Qing, ZHANG Yupeng, LIU Shaoxun, ZHANG Jinfeng, ZHANG Yuezhong, CHEN Haoyang

Computer Science. 2026, 53 (6A): 250400113-8. doi:10.11896/jsjkx.250400113

With the widespread adoption of software systems,security issues have become increasingly prominent.Fuzzing,as an effective vulnerability detection technique,plays a crucial role in software development.However,traditional fuzzing tools rely on manually written driver programs,suffering from inefficiency and insufficient coverage.To address these challenges,this study proposes an automated fuzzing driver generation method based on large language models(LLMs).The approach incorporates an intelligent code parsing module to extract function interfaces and structure definitions,leverages the code generation capabilities of LLMs to automatically produce driver programs compliant with the Honggfuzz framework,and introduces a feedback-based correction mechanism to improve driver generation success rates.Experimental results demonstrate that the proposed method achieves a 100% driver generation success rate and fuzzing interface coverage in the open-source cJSON library and an in-house TBox project.For the open-source Libtiff library,the driver generation success rate reaches 76.2%,with a fuzzing interface coverage of 40.5%.Ablation studies on Qwen2.5-coder(14 B parameters) and Qwen2.5-coder(32 B parameters) indicate that the feedback correction mechanism further optimizes driver generation success rates,improving them by 5.9% and 2.4%,respectively.This method significantly enhances the automation level and coverage of fuzzing,providing an efficient solution for vulnerability detection in complex software systems.Future work may focus on optimizing the code parsing module,refining prompt templates,and enhancing LLM adaptability to further improve the method's generalizability and vulnerability discovery capabilities.

Research on Intelligent Compiler Optimization Techniques Based on Program Features

HUANG Liangming, ZHANG Jiahui, CAI Chunhao

Computer Science. 2026, 53 (6A): 260300057-9. doi:10.11896/jsjkx.260300057

Traditional compiler optimizations face challenges such as rigid rules,limited adaptability across scenarios,and the efficiency ceiling imposed by manual design,making it difficult to meet the diverse demands of efficient compilation for different programs.To address this,researchers have proposed algorithms such as auto-tuning and Bayesian optimization to search for optimal optimization flags and parameters.However,existing methods suffer from two major drawbacks.Firstly,they require iterative execution of the target code,which is computationally expensive and infeasible for large-scale programs.Secondly,they treat programs as black boxes,leading to dependence on initial configurations,susceptibility to local optima,and the need for re-optimization when applied to new programs.To overcome these limitations,this paper proposes an intelligent compiler optimization technique based on program features.The approach involves collecting program feature information through the compiler and training a deep learning model on a dataset containing both program features and high-performing optimization flag combinations,enabling the model to predict suitable optimization options based on input program characteristics.Furthermore,for optimization processes such as loop unrolling,fine-grained parameter tuning is achieved by combining machine learning models with program features to enable intelligent selection of key parameters.Evaluated on the standard-scale SPEC CPU2017 integer benchmark suite,the proposed method achieves a 7.2% performance improvement over the O3 optimization level,with an average model inference time of only 7.11 seconds and a feature extraction overhead of just 1.02% of total compilation time.Experimental results demonstrate that the technique can effectively predict appropriate optimization flags and parameters for different programs,offering significant cost advantages over existing approaches.

PID-Dynamic LSTM Generation Model for MCU Driver Code Based on Dynamically-tuned Cross-entropy Loss

LIU Zixuan, TANG Xiaoyong

Computer Science. 2026, 53 (6A): 250800005-9. doi:10.11896/jsjkx.250800005

To address the issues of model overfitting and training instability caused by noisy data in deep learning, this paper introduces, for the first time, a dynamic error compensation mechanism from control theory into code generation tasks. It proposes a code generation model named PID-Dynamic LSTM, based on a dynamically-tuned cross-entropy loss function (PID-CE Loss). Under noisy conditions, traditional cross-entropy loss is vulnerable to interference from anomalous samples, leading to deviations in gradient updates and reduced convergence speed. To mitigate this, the proposed loss integrates proportional (P), integral (I), and derivative (D) control terms to construct a dynamic error compensation mechanism: (1) the proportional term preserves the immediate error response characteristic of cross-entropy; (2) the integral term incorporates an exponential moving average (EMA) differential to capture long-term trends in loss variation, thereby correcting accumulated bias; (3) the derivative term suppresses noise-induced prediction fluctuations by constraining the mean squared error (MSE) between the probability distributions of adjacent training steps. Experimental results demonstrate that during 500 epochs of noisy training, the proposed method achieves 96.28% validation accuracy on the test dataset (a 3.42% improvement over baselines). It also reduces the number of epochs required to first reach 80% accuracy by 31.7% (from 224 to 153 epochs). Furthermore, it reduces the overfitting gap by 6.4% and decreases loss fluctuation by 18.5%. Ablation experiments further verify the key role and parameter characteristics of PID-CE in noise suppression. This method establishes a theoretically interpretable and engineering-friendly paradigm for noise-robust optimization, demonstrating significant application potential in noise-sensitive scenarios.

Research on C2 Style Oriented Software Architecture Evolution Path Planning

ZHONG Linhui, LIAO Zichen, ZHENG Yi, QU Qiaoqiao, HU Zhen, LI Zhuoyu, LIU Wenxuan

Computer Science. 2026, 53 (6A): 250900102-7. doi:10.11896/jsjkx.250900102

The C2 style of software architecture is a typical hierarchical style that emphasizes communication between software components through connectors and other rules. Converting existing software systems to the C2 style can effectively enhance the flexibility and scalability of the system. However, existing software systems face issues such as unclear connectors caused by direct component connections, and ambiguous paths during the evolution of the software system. This paper proposes a rule-based clustering method centered around “core classes” to identify connectors. In an improved multi-objective optimization genetic algorithm, the CodeT5 model is introduced to guide the mutation process, optimizing the C2 style distance, modular quality, and evolutionary cost, thereby searching for the optimal C2 style software architecture. Finally, based on a reflection model, the differences in software architecture before and after the evolution of the software system are extracted, and domain files such as PDDL are generated. Combined with a PDDL interpreter, the evolutionary path is then generated. Experiments show that the RBCOC method can effectively identify connectors. The CT5NSGA3 algorithm demonstrates promising performance in the architecture search of C2-style systems. Comparative experiments also verify that the CT5NSGA3 algorithm outperforms the traditional NSGA3 algorithm on several indicators.

Design of Trend-aware Branch Predictor Based on RISC-V Processor

SUN Andong, ZHANG Qingyi

Computer Science. 2026, 53 (6A): 250300124-7. doi:10.11896/jsjkx.250300124

In recent years, RISC-V-based processors have attracted growing attention from both academia and industry. Within a processor's micro-architecture, branch predictors critically affect overall performance: in pipelined designs, higher prediction accuracy mitigates pipeline flushes and shortens execution time. Conventional predictors rely on saturating counters whose state transitions and decision rules directly bound accuracy. To better track the dynamic tendencies of program control flow, this paper redesigns the counter mechanism and, on top of the classic competitive branch predictor, implements a trend-aware branch predictor. Experimental evaluation shows that, relative to the competitive branch predictor, the proposed scheme improves prediction accuracy by 19.26% and instruction throughput by 2.12%; compared with a static branch predictor, the gains rise to 53.85% and 9.60%, respectively. These benefits come at higher hardware cost: logic for the prediction function increases by 40.83% over the competitive branch predictor and 1220% over the static branch predictor, while verification logic grows by 0.826% and 10.95%, respectively.
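
For context, the conventional counter-based baseline mentioned in this abstract can be sketched as a textbook 2-bit saturating counter. This is only the classic scheme, not the paper's redesigned trend-aware counter, whose state transitions the abstract does not specify; the class and function names are illustrative.

```python
class TwoBitCounter:
    """Textbook 2-bit saturating counter: states 0..3, predict taken when state >= 2."""

    def __init__(self, state=1):
        self.state = state  # 1 = weakly not-taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes predicted correctly by a single counter."""
    counter = TwoBitCounter()
    correct = 0
    for taken in outcomes:
        if counter.predict() == taken:
            correct += 1
        counter.update(taken)
    return correct / len(outcomes)
```

Because the counter needs two consecutive mispredictions to flip its decision from a strong state, a single loop exit does not disturb the prediction for the next loop entry, which is the property that any redesigned counter has to improve on.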

ALHC:Floating-point Time Series Adaptive Lossless Compression Tool Based on Hybrid Prediction Architecture

ZHANG Xulin, WANG Lei, NIU Mengjin, LIANG Junda

Computer Science. 2026, 53 (6A): 250700136-8. doi:10.11896/jsjkx.250700136

Against the backdrop of explosive growth in floating-point time series data within domains such as industrial IoT and financial technology,significant challenges have emerged for data storage and transmission,rendering floating-point time series data compression of paramount importance.Typically,time-series data suffers from encoding redundancy caused by outlier contamination,disruption of temporal continuity due to missing values,and failure of prediction models triggered by unstructured time series.Moreover,single prediction models exhibit marked performance disparities when dealing with linear short-cycle scenarios versus sparse and complex ones.To address these issues,this paper proposes the ALHC algorithm.Aiming to meet the requirements of floating-point compression concerning data accuracy,continuity,distribution characteristics,and residual calculation,a targeted data cleaning process is designed.A hybrid architecture is constructed,dynamically switching between a TCN-LSTM neural network predictor and an adaptive linear predictor based on the LMS algorithm,enabling adaptation to different scenarios.The adaptive predictor is employed as a post-processing module for residual correction,enhancing prediction accuracy,reducing residuals,and improving entropy coding efficiency.Experimental evaluations of the proposed compression algorithm on 14 public time series datasets demonstrate that under lossless conditions,the average compression ratio reaches 0.24.
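
A minimal sketch of an LMS-based adaptive linear predictor of the kind this abstract mentions is given below. The filter order, step size, and function name are assumptions chosen for illustration, and the residual is shown as a plain arithmetic difference; the TCN-LSTM predictor, the switching logic, and the entropy coder of ALHC are not modeled.

```python
import math

def lms_residuals(signal, order=4, mu=0.05):
    """Predict each sample from the previous `order` samples and return residuals.

    The weights are updated only from past samples, so a decoder running the
    same loop could rebuild the same predictions.
    """
    weights = [0.0] * order
    residuals = []
    for n in range(order, len(signal)):
        window = signal[n - order:n][::-1]          # most recent sample first
        prediction = sum(w * x for w, x in zip(weights, window))
        error = signal[n] - prediction
        residuals.append(error)
        # LMS update: step along the instantaneous gradient of the squared error.
        weights = [w + mu * error * x for w, x in zip(weights, window)]
    return residuals

# Illustrative input: a smooth periodic series (hypothetical data).
demo = [math.sin(0.5 * n) for n in range(500)]
demo_residuals = lms_residuals(demo)
```

As the weights adapt, the residuals on a regular series shrink toward zero, and smaller residuals are what make the subsequent entropy coding stage more efficient.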

Conjugate Gradient Preconditioner Adaptive Selection Algorithm via Deep Learning

LI Qin, WU Siyuan, YANG Haoyuan, DU Qin, LING Xu, XIAO Guoqing

Computer Science. 2026, 53 (6A): 250900126-6. doi:10.11896/jsjkx.250900126

The Preconditioned Conjugate Gradient (PCG) algorithm is an iterative algorithm for solving large-scale sparse linear systems and is widely used in fields such as scientific and engineering computing and artificial intelligence. Existing research focuses on using deep learning to generate preconditioners to improve solving speed. However, a fixed preconditioner lacks generality and is difficult to apply to all sparse matrices because of the spatial complexity of sparse matrices. To address this issue, an adaptive preconditioner selection algorithm based on deep learning and its optimization method are proposed. Firstly, a convolutional neural network (PCNN) is designed to capture the spatial structural characteristics of sparse matrices. Secondly, an adaptive classification prediction model combining multi-layer perceptrons is constructed to select the optimal preconditioner. Finally, experimental results on the publicly available Florida dataset show that the proposed method achieves a classification accuracy of 70.49%, higher than that of methods such as MLP and SVM. Compared with the PCG algorithm based on Jacobi, ICCG, and SSOR, the proposed method improves performance by 5.5, 4.3, and 6.2 times, respectively.
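
To make the role of the preconditioner concrete, the following sketch shows PCG with the Jacobi (diagonal) preconditioner, one of the baselines named in this abstract. It uses dense Python lists for readability and assumes a symmetric positive definite matrix; the function name is illustrative, and the paper's learned selection model is not shown.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A with Jacobi-preconditioned CG."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]   # Jacobi preconditioner M^-1
    x = [0.0] * n
    r = list(b)                                     # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]      # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, k
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter
```

Swapping `inv_diag` for a different approximation of the inverse of `A` changes the iteration count without changing the solution, which is why choosing the preconditioner per matrix can pay off.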

Collaborative Scheduling Strategies for Superconducting Quantum Processors and Heterogeneous Computing Systems

WANG Dezhi, CHENG Kun

Computer Science. 2026, 53 (6A): 250600165-5. doi:10.11896/jsjkx.250600165

This study addresses scheduling challenges arising from the integration of superconducting quantum processors into heterogeneous computing environments. It introduces a unified modeling framework for hybrid task graphs and proposes a joint scheduling strategy that minimizes communication overhead while leveraging predictive allocation of quantum execution windows. The proposed method is evaluated in terms of its potential to reduce execution delays and improve overall resource utilization. Further analysis explores the model's robustness and scalability under computationally intensive scenarios, along with the adaptability of task partitioning approaches to varying system constraints. The findings contribute to practical methodologies for enabling quantum-classical hybrid computing and advancing intelligent resource coordination in next-generation high-performance systems.

Survey of UAV Cooperative Optimization Algorithms

WANG Yan, SHI Junling, LI Hanyu

Computer Science. 2026, 53 (6A): 250700048-9. doi:10.11896/jsjkx.250700048

As an important carrier of multimodal intelligent systems, UAVs have shown revolutionary potential for cross-domain applications in the past decade. From precision agricultural monitoring and urban 3D modeling to military reconnaissance and target tracking, the evolution of UAV technology continues to expand the boundary of applications. However, single-UAV operation has inherent limitations in execution efficiency, system robustness, and other aspects, making it difficult to meet the needs of diverse tasks. Through real-time information interaction and collective decision-making, UAV clusters based on distributed cooperation mechanisms show significant advantages in large-scale coverage and fault tolerance. For core technical bottlenecks such as formation control and path planning, scholars at home and abroad have carried out extensive research for decades and proposed many classic algorithms. In the past decade, various optimization methods have emerged, covering collision avoidance, path planning, task allocation, and formation reorganization, laying a solid foundation for the efficiency and practicality of UAV cluster systems. To give a clear picture of the research status of cooperative optimization algorithms for UAV clusters at home and abroad, this paper classifies and summarizes the commonly used optimization algorithms. According to the principle of each algorithm, the cooperative optimization algorithms are first divided into traditional heuristic algorithms and machine learning algorithms. The paper then introduces how scholars have applied and improved these algorithms in four directions. Finally, future development directions of UAV cooperative optimization algorithms are discussed to provide a reliable reference for beginners.

Two-layer Optimization Deployment Method for Multi-source Heterogeneous Sensors

CHENG Qing, HUANG Yichuan, LUO Zhihao

Computer Science. 2026, 53 (6A): 250600229-10. doi:10.11896/jsjkx.250600229

As a core area of military struggle, the border defense region can achieve all-weather situation awareness by deploying multi-source heterogeneous sensor networks (WSNs). In large-scale multi-source heterogeneous sensor deployment applications, the overall performance of the sensor network must be balanced against the cost of sensor deployment. To address this issue, this paper proposes a multi-source heterogeneous sensor deployment framework based on double-layer optimization, constructing an upper layer whose objective is to minimize manufacturing, deployment, and maintenance costs, and a lower layer with a multi-objective optimization model that integrates network coverage, reliability, and life cycle. Moreover, a two-population improved NSGA-II algorithm is designed to solve the double-layer optimization model. It optimizes the Pareto front distribution through a population divide-and-conquer strategy and an interquartile range resolution mechanism, thereby overcoming the difficulty of solving the nested structure of the double-layer optimization model. Finally, simulation experiments are conducted to demonstrate the effectiveness of the model and the superiority of the algorithm.

Identifier-driven Computing-Storage-Forwarding Convergence Mechanism for Space Communications

CAI Hezhuo, SUN Tao, HU Chenhan, SUN Jianan, LIU Yang, JIANG Yangyi, ZHENG Tao

Computer Science. 2026, 53 (6A): 260300153-9. doi:10.11896/jsjkx.260300153

With the rapid advancement of space communication technologies in emerging fields such as low-altitude intelligent networks, interplanetary Internet, and satellite-terrestrial integrated systems, traditional network architectures designed for relatively static terrestrial environments are increasingly inadequate. These conventional architectures face significant challenges in coping with highly dynamic node mobility and frequent topology changes. To address these fundamental limitations, this paper proposes a novel identifier-based computing-storage-forwarding convergence mechanism and constructs a heterogeneous converged network architecture tailored for space communication. The proposed mechanism centers around a unified identifier system that establishes multi-layer mapping relationships among identifier types, including bundle identifiers, network function identifiers, component identifiers, group identifiers, and terminal identifiers. This multi-layer mapping decouples network services from underlying hardware, facilitates unified scheduling of heterogeneous network resources, and supports intelligent resolution of service demands across diverse network domains. On this basis, a three-layer converged architecture is designed, comprising the bundle service layer, the convergence adaptation layer, and the network component layer. By integrating store-and-forward strategies with distributed networking methods, the architecture mitigates the effects of intermittent links and long propagation delays typical of space environments. Furthermore, a distributed mobile network prototype is developed to validate the proposed architecture and mechanisms. Experimental results demonstrate that the proposed mechanism reduces average transmission latency by over 35% versus conventional approaches under mobile conditions. In addition, the system shows significantly enhanced robustness against link disruptions and improved communication resource utilization efficiency. These findings confirm the effectiveness and feasibility of the identifier-based heterogeneous convergence mechanism in addressing highly dynamic network topologies, providing a viable technical pathway for developing future space communication systems with identifier-driven intelligent scheduling capabilities.

Dual-frequency Index Modulation Assisted Non-orthogonal Multiple Access System Design and FPGA Implementation

XIANG Nantian, ZHENG Xing, SHI Changhan

Computer Science. 2026, 53 (6A): 250700081-7. doi:10.11896/jsjkx.250700081

With the evolution from 5G to 6G communication, mobile communication systems are facing massive user connections and data transmission. Non-orthogonal multiple access (NOMA), one of the candidate technologies for future communication, significantly improves spectrum efficiency by multiplexing user signals in the power domain, but it suffers from problems such as high energy consumption and the need for power allocation optimization. To alleviate these problems, this paper proposes a transmission method based on a dual-frequency index modulation assisted non-orthogonal multiple access system (DFIM-NOMA), which combines the high spectrum utilization of non-orthogonal multiple access with the flexibility and high energy efficiency of index modulation. At the transmitter, dual-frequency index modulation is used to increase the proportion of information carried by implicit signals. At the receiver, serial interference cancellation (SIC) and maximum likelihood (ML) detection are used to demodulate the mixed transmission signal of two users. A digital model of the test system is built in MATLAB for simulation verification, and the system baseband is implemented on a field programmable gate array. Experimental results show that, when using the same spectrum resources and sending the same signal, increasing the proportion of index modulation signals reduces the number of actual signals to be sent compared with traditional index modulation assisted non-orthogonal multiple access (IM-NOMA); with little difference in bit error rate performance, the average amplitude of the transmitted signal can be reduced by about one-third, and the transmission energy consumption can theoretically be reduced by about 50%.

Research on Edge Intelligent Computing Services for Space-based Remote Sensing

WU Kankan, WANG Shaolin, JIANG Xiaoyong, LI Linwei, SANG Feng

Computer Science. 2026, 53 (6A): 250700029-9. doi:10.11896/jsjkx.250700029

In response to the requirements of intelligent satellite processing, task collaboration, and efficient transmission, space-based remote sensing data processing has shifted from being provided by ground cloud computing centers to being provided at the space-based sources where the data are generated, and is developing towards multi-satellite distributed edge intelligent processing services for space-based data. Space-based edge intelligent computing adopts a distributed collaborative architecture of “ground cloud-satellite edge-payload end”: the ground cloud adopts the cloud-native Kubernetes framework, and the satellite edge adopts the KubeEdge edge computing framework. By controlling the data plane network, data exchange among the payload end, edge computing units, edge controllers, and gateways is achieved, supporting cloud-edge collaborative processing and edge-end data transmission. The control plane uses Gigabit time-triggered Ethernet to provide deterministic network communication services, while the data plane uses 10 Gigabit Ethernet to provide high-bandwidth network communication services. A unified five-layer protocol stack is adopted to achieve bidirectional data transmission between space networks and satellite networks, meeting the unified network infrastructure requirements of mixed services such as cloud-edge collaboration, edge-end collaboration, onboard processing, and space network interaction.

Power Regulation Backscatter Communication Sensing System for Distributed Photovoltaic Power Generation Systems

PENG Linyu, WANG Tao, LONG Jiao, CAO Zhongye, WU Yijie, WANG Wei

Computer Science. 2026, 53 (6A): 250400020-6. doi:10.11896/jsjkx.250400020

Distributed photovoltaic (PV) systems, characterized by flexible deployment and wide distribution, align with the development trends of power system intelligence and efficient utilization of renewable energy. However, managing such systems requires not only real-time acquisition of the operational status and environmental parameters of PV components but also precise localization of data collection terminals to support fault diagnosis, resource optimization, and efficiency enhancement. This imposes stringent demands on terminal energy consumption, large-scale access communication capabilities, and sensing and localization abilities. Existing technologies struggle to simultaneously meet the collaborative requirements of low-power communication, large-scale terminal access, and high-precision localization, leading to significant challenges in the practical application of distributed PV systems. To address these needs, this paper proposes a low-power backscatter communication and sensing system based on power-controlled non-orthogonal multiple access (NOMA). The system faces the following challenges during implementation. Firstly, existing backscatter communication technologies lack sufficient precision in power control, making it difficult to adapt to the dynamically changing power allocation requirements in distributed PV scenarios. Secondly, spatial distribution differences reduce the effectiveness of power control, affecting overall system performance. Finally, severe concurrent signal interference during multi-tag localization limits positioning accuracy. To tackle these challenges, this paper designs a reflected power control scheme based on tunnel diodes to achieve fine-tuning of signal power, proposes a dynamic power control strategy to optimize the overall bit error rate (BER) of the system, and combines angle-of-arrival (AoA) information embedded in aliased signals with the power-domain characteristics of NOMA to design a multi-tag localization scheme, enabling simultaneous multi-tag communication and position estimation. Experimental results show a BER below 0.01% at SNR>15 dB and a localization angle error under 5° at SNR>5 dB.

Time-deterministic Service Network for DMR Emergency Communication Systems

WANG Ruijia, WANG Wenkai, LUO Haoxiang, REN Xingliang, DING Lei, ZHANG Dongpo

Computer Science. 2026, 53 (6A): 250400072-7. doi:10.11896/jsjkx.250400072

In traditional DMR networks,connections are made via Ethernet.When facing a high volume of burst traffic,long queues often occur,leading to high latency.This can easily result in a high rate of retransmissions and packet loss.Implementing flow control for IP network end systems to reduce traffic flow conflicts at switching nodes,and thereby avoid long queues,may be an effective solution.Based on this,this paper explores key issues related to the integration of DMR emergency communication networks with deterministic networks under IP interconnection.It proposes a traffic forwarding and scheduling method for DMR emergency communication networks,and designs a deterministic forwarding unit.This unit utilizes a scheduling table to ensure staggered transmission and relay of traffic flows,allowing for controlled traffic flow transmission according to scheduling rules.Tests performed with DMR equipment demonstrate that the method increases jitter stability by 19.80%.

Variance-reduced Distributed Stochastic Compositional Optimization Algorithm over Directed Networks

LYU Qingguo, HE Chenglong, HU Hanqing, ZHANG Wei, DAI Xiangguang, ZHANG Keke, GUAN Mingyu

Computer Science. 2026, 53 (6A): 250700050-10. doi:10.11896/jsjkx.250700050

This paper investigates distributed stochastic compositional optimization over directed communication networks,where each agent holds a private compositional objective function.The goal is to minimize the weighted sum of all agents' objectives through local computation and limited communication.The directed network topology introduces asymmetric information exchange,and the compositional structure leads to noise in inner function estimation,which affects convergence and efficiency.To address these challenges,this paper proposes a variance-reduced distributed algorithm designed for directed networks.It integrates a variance reduction strategy into a gradient tracking framework,improving the accuracy of inner function estimation and reducing steady-state bias,while also lowering the per-iteration computational cost.The algorithm employs a push-sum protocol to handle communication asymmetry.Theoretical analysis proves that,under strong convexity and smoothness,the algorithm achieves linear convergence to the global optimum.Numerical experiments validate its advantages in convergence speed,steady-state accuracy,and communication efficiency.
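
As a rough illustration of the push-sum protocol mentioned in the abstract above, the following Python sketch averages scalar values over a directed graph. It is not the paper's algorithm: the variance reduction and gradient tracking steps are omitted, and the function name `push_sum_average`, the equal-share mixing rule, and the example graph are assumptions made for illustration only.

```python
def push_sum_average(values, out_neighbors, rounds=200):
    """Estimate the network-wide average on a directed graph with push-sum.

    values[i] is the initial value of node i; out_neighbors[i] lists the nodes
    that node i can send to. Each node keeps a value x and a weight w, splits
    both equally among its out-neighbors and itself (an assumed mixing rule),
    and uses the ratio x / w as its estimate.
    """
    n = len(values)
    x = [float(v) for v in values]
    w = [1.0] * n
    for _ in range(rounds):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            targets = list(out_neighbors[i]) + [i]  # self-loop keeps w positive
            share = 1.0 / len(targets)
            for j in targets:
                new_x[j] += share * x[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    return [xi / wi for xi, wi in zip(x, w)]


if __name__ == "__main__":
    # Hypothetical strongly connected directed graph: 0->1->2->3->0, plus 2->0.
    graph = [[1], [2], [3, 0], [0]]
    print(push_sum_average([1, 2, 3, 10], graph))
```

Dividing by the weight `w` is what corrects for the asymmetric (column-stochastic) mixing; without it, nodes with more in-links would be biased toward larger values.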

Bibliometric Investigation of Research on Blockchain Technology

LYU Mingfeng, LIU Haijiang, DI Yangchen, YUAN Yecheng, GAO Xizhang

Computer Science. 2026, 53 (6A): 250600062-9. doi:10.11896/jsjkx.250600062

Blockchain is a distributed ledger that utilizes cryptographic technology to append consensus-confirmed blocks in sequence,which can ensure the authenticity,non-tamperability and irrevocability of data.Currently,it has been widely applied across various fields.Based on the CNKI and Web of Science core journal databases,a bibliometric method is employed to systematically analyze the research progress of blockchain technology from 2008 to 2024 by using CiteSpace software.The research shows that:1)Whether in CNKI Chinese journals or SCI/SSCI English journals,the research on blockchain technology remains at a very low level from 2008 to 2013,starts slowly around 2014 and develops rapidly around 2018.Comparatively,research in CNKI journals starts earlier,while research in SCI/SSCI journals develops more rapidly.2)The research hotspots are concentrated in areas such as blockchain system architecture,digital currency,privacy security,data sharing,Internet of Things,big data,cloud computing,artificial intelligence,metaverse,etc.Moreover,blockchain technology is also applied to finance,healthcare,agriculture,electricity,copyright and other fields.3)In terms of time,the blockchain research from 2014 to 2020 mainly focuses on digital currency,marking the emerging phase of the research.From 2016 to 2023,it mainly includes studies on the blockchain system architecture,security and privacy,and data-related operations,running through most of the research process.From 2018 to 2024,the focus shifts to the integration with new digital technologies and the application in other research fields,pointing to future research directions.

Review of Evaluation Methods for Resource Public Key Infrastructure Deployment Levels

YANG Xue, JIANG Bowen, LIU Yongxiang, ZHANG Likun, DENG Guiying

Computer Science. 2026, 53 (6A): 251100048-7. doi:10.11896/jsjkx.251100048

RPKI(Resource Public Key Infrastructure),as a key mechanism for enhancing the security of the BGP(Border Gateway Protocol),directly affects the trustworthiness and resilience of the Internet routing system.Therefore,existing methods for evaluating RPKI deployment based on ROA(Route Origin Authorization) issuance rates are systematically reviewed,and it is pointed out that relying on a single metric makes it difficult to comprehensively reflect the actual deployment effectiveness.Perspectives such as authoritative DNS services,global traffic,and critical asset protection rates are introduced to compensate for these limitations,and the advantages and limitations of each perspective are comparatively analyzed.Analysis based on multidimensional data shows that global RPKI deployment has achieved phased progress,with significant effectiveness in critical infrastructure and core service layers,where more than 70% of authoritative DNS servers and nearly 80% of top global websites have been protected.However,the actual adoption rate of ROV(Route Origin Validation) filtering is only 21.15%,indicating a significant imbalance between resource-side deployment and network-side enforcement,and further advancement of RPKI deployment still requires coordinated efforts in technology,policy,and other aspects.

Malicious Traffic Detection Method of ICMP Covert Channel Based on Baseline Features

DUAN Haiying, WANG Baohui, HUANG He

Computer Science. 2026, 53 (6A): 250200069-11. doi:10.11896/jsjkx.250200069

ICMP is a protocol used for network management.Network attackers often use it to carry out illegal actions such as remote control,data theft and malicious attacks.It has become a common method of covert communication in network attacks in recent years,which brings serious security risks to the victim host.In view of the increasingly severe situation of ICMP covert channel attacks and the complex,hard-to-identify and highly concealed nature of ICMP data flows,existing research that uses machine learning to detect malicious traffic in ICMP covert channels is found to suffer from insufficient feature extraction,and the robustness and generalization ability of its models are poor.Therefore,a baseline feature-based malicious traffic detection method for ICMP covert channels is proposed to address these challenges.Firstly,the baseline analysis of ICMP benign traffic and covert channel traffic is carried out,and five features with good discrimination are proposed:average data packet length,data packet frequency,session duration,ratio of request to reply packets,and message data information entropy.Then,a binary classifier is constructed by combining multiple machine learning models for malicious traffic detection.The experimental results show that the accuracy,recall and F1 value of the proposed method reach 99.53%,99.51% and 99.5% respectively,which are 2.83 percentage points,2.97 percentage points and 2.88 percentage points higher than those of the existing methods.In addition,considering that the baseline features are easily bypassed by attackers through dynamic adjustment or obfuscation techniques,this paper adds an ICMP tunnel detection method based on adversarial training and ensemble learning,which enhances the robustness of the model by generating adversarial samples,and combines the advantages of deep attention network and traditional machine learning models to effectively identify covert tunnel traffic.The proposed method mainly uses PGD attacks to generate adversarial samples,introduces a multi-head attention mechanism to extract deep features,and predicts the results through MLP.The final accuracy is improved to 99.63%.Experiments show that the proposed method improves the detection accuracy and robustness in the adversarial environment.In addition,the proposed method has millisecond-level traffic analysis and detection capabilities,which can effectively adapt to the actual ICMP covert channel traffic detection requirements.
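
The five baseline features named in the abstract above could be computed per session roughly as in the following Python sketch. It is only an illustration under stated assumptions: the packet representation `(timestamp, is_request, payload)`, the function names, and the choice to measure length and entropy over payload bytes are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def payload_entropy(payload: bytes) -> float:
    """Shannon entropy of the payload bytes, in bits per byte (0.0 if empty)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def baseline_features(packets):
    """Compute five session-level features.

    packets: non-empty list of (timestamp_seconds, is_request, payload_bytes).
    This tuple layout is an assumption made for the sketch.
    """
    times = [t for t, _, _ in packets]
    duration = max(times) - min(times)
    requests = sum(1 for _, is_request, _ in packets if is_request)
    replies = len(packets) - requests
    data = b"".join(payload for _, _, payload in packets)
    return {
        "avg_packet_length": len(data) / len(packets),
        # Packets per second; falls back to the packet count for zero duration.
        "packet_frequency": len(packets) / duration if duration > 0 else float(len(packets)),
        "session_duration": duration,
        "request_reply_ratio": requests / replies if replies else float("inf"),
        "payload_entropy": payload_entropy(data),
    }


if __name__ == "__main__":
    session = [(0.0, True, b"abcd"), (1.0, False, b"abcd")]
    print(baseline_features(session))
```

Tunnelled data that is encrypted or compressed tends to push the entropy toward 8 bits per byte, whereas the fixed padding patterns of ordinary echo traffic give much lower values, which is presumably why entropy discriminates well.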

DDoS Attack Detection Based on Attention Mechanism TCN-BiLSTM

LI Jie, WANG Baohui, ZHANG Jingyuan

Computer Science. 2026, 53 (6A): 250300060-9. doi:10.11896/jsjkx.250300060

DDoS attacks pose a great threat to personal and national data security.How to accurately detect and identify DDoS attacks is of great significance.Aiming at the problems such as low prediction efficiency,overfitting and poor generalization ability in traditional DDoS attack detection,a new DDoS attack detection algorithm based on a multi-scale spatiotemporal fusion attention network is proposed.Firstly,the features of heterogeneous data are distinguished and supplemented,and white noise is injected to enhance the robustness of the model to random disturbance.Secondly,at the algorithm level,a hierarchical strategy of multi-scale TCN and BiLSTM in parallel is proposed to cover multiple dependencies from short time to long time,and the layered output feature matrix is compressed by depthwise separable convolution to extract core timing patterns and effectively control network complexity.Finally,the compressed vector sequence is transferred to the Transformer self-attention mechanism to realize global correlation modeling of cross-scale and cross-channel features,dynamically highlight timing context slices with high discriminating power,and identify abnormal traffic of DDoS attacks.Comparison experiments and ablation experiments are conducted on the CIC-IDS-2017 dataset.The results show that the prediction accuracy of the multi-scale spatiotemporal fusion attention network algorithm can reach 99.82%,the recall rate is 99.35%,and the F1 value is 99.58%.The accuracy is 4.28% higher than that of the TCN and BiLSTM models,showing that the algorithm can effectively identify DDoS attacks.

Symbolic Execution-based Automated Verification Method for Binary Vulnerabilities

ZHANG Yuanyuan, LIU Tieming, LIU Guoan, GE Xueshuai

Computer Science. 2026, 53 (6A): 250600035-7. doi:10.11896/jsjkx.250600035

With the widespread application of software systems,the number of binary vulnerabilities continues to grow,posing serious threats to information security.Existing research lacks comprehensive solutions that organically integrate vulnerability detection,root cause analysis,and automated exploitation,resulting in inefficient and limited-coverage vulnerability verification.This study aims to propose an automated,cross-architecture binary vulnerability verification method that integrates the entire process from vulnerability detection and root cause localization to exploit generation,while producing structured verification reports.It designs an automated verification framework based on symbolic execution,which consists of five components:preprocessing,vulnerability detection,root cause analysis,exploit generation,and verification.By combining vulnerability-oriented and path-guided symbolic execution strategies with taint analysis and constraint solving,the framework supports multi-architecture analysis.Validation on 14 test samples demonstrates that the proposed method successfully detects various types of vulnerabilities including stack overflow,heap overflow,use-after-free(UAF),and format string vulnerabilities,accurately locates root causes at the basic-block level,and achieves automated exploit generation for most samples.This research not only enhances the automation and accuracy of binary vulnerability verification but also provides a reliable basis for vulnerability remediation and defense.Future work will focus on improving heap vulnerability exploit generation,expanding verification capabilities for large-scale applications,and exploring automated repair techniques.

Blockchain Scheme for Medical Data Sharing Based on Main-Side Chain and Traceable Ring Signature

ZHAO Rui, GAO Hancheng, HUANG Haiping, FENG Haoshi, LU Xinyi

Computer Science. 2026, 53 (6A): 260100018-8. doi:10.11896/jsjkx.260100018

Secure sharing of medical data and privacy protection are crucial for improving healthcare service quality.To address critical challenges in current medical systems,including data silos,privacy breaches and auditing difficulties,a data security sharing and privacy protection scheme for medical consortium blockchain is proposed.Firstly,the scheme constructs a patient-ID-indexed “main-side” chain storage architecture,where the main chain stores personal information,while the side chain stores medical records and addition trajectories,enabling efficient data retrieval.Secondly,traceable ring signature technology is introduced to preserve user anonymity while supporting identity traceability in the case of medical disputes,balancing privacy protection and regulatory needs.Furthermore,a threshold signature mechanism without a trusted center is adopted,leveraging distributed key generation and threshold signature technology to provide decentralized approval and verification for medical record additions,achieving access control and secure sharing.Finally,security analysis and experimental simulations demonstrate that the proposed scheme outperforms existing solutions in terms of computational and communication overhead,which demonstrates its practical feasibility and effectiveness.

ChainGrid-V:A Highly Scalable Blockchain Storage and Consensus Framework for Cross-domain Internet of Vehicles

ZHANG Xiqian, CHEN Siyu, ZHANG Hanwen, WANG Deguang, TIAN Hong

Computer Science. 2026, 53 (6A): 250800060-8. doi:10.11896/jsjkx.250800060

Real-time sharing in IoVs imposes stringent demands for high trust,high privacy,and high concurrency,whereas existing blockchains struggle to simultaneously address storage overhead,off-chain tampering,and robustness under dynamic topologies.To this end,this paper proposes ChainGrid-V,a highly scalable framework built upon three core technologies:decentralized dual-Merkle verification that leverages a local-global tree structure to reduce consistency-check complexity to O(log n+log k),sustaining an 80% detection rate under 30% shard tampering for single nodes and 65% under multi-node collusion with repair latency ≤8.5 s;RAW hybrid consensus that fuses Proof-of-Authority for leader election,Proof-of-Reputation for weighted voting,and lightweight Proof-of-Work as a fallback,augmented by a dual-decay exponential reputation-recovery function enabling millisecond-level weight reclamation for dishonest nodes and seamless replacement by trusted ones,maintaining >500 TPS in a 70-node network and achieving 93.5% consensus success even with 30% Byzantine nodes,with system recovery in only 1.6 s;a lightweight privacy stack where on-board units locally generate zero-knowledge proofs whose on-chain verification latency is below 10 ms,enabling raw-data privacy protection and integrity verification without trusted hardware.Joint simulations with 200 OBUs,50 RSUs,and 30 storage nodes show that ChainGrid-V's throughput and latency differ by <7% from Raft-PoA and DAG-IoV,while its security margin improves by 15%~20%,fully satisfying automotive-grade high-frequency low-latency requirements and demonstrating large-scale deployment potential.

Black-box Physical Adversarial Attack Against Multimodal Object Detector

ZHENG Haibin, LIN Xiuhao, HAN Ye, CHEN Jinyin, LI Beibei

Computer Science. 2026, 53 (6A): 250700023-10. doi:10.11896/jsjkx.250700023

For real-world complex working conditions,deep learning-based multimodal(visible,infrared,etc.) target detectors improve the detection effect by fusing data features from different bands.However,it has been found that multimodal detectors are susceptible to adversarial attacks,resulting in the output detection frames being severely off-target or the detection frames disappearing,which reduces their reliability for use in the physical world.Prior work has explored black-box physical adversarial attacks for multimodal depth detectors,but there are still problems such as inefficient modal attacks,limited target detectors and fusion strategies,and poor physical domain attack assignment.Aiming at the above problems,this paper proposes a multimodal adversarial color patch(MAC-Patch) generation method to achieve efficient,general,and robust attacks on multimodal deep detectors.Specifically,a stochastic gradient descent optimizer is utilized to generate strong adversarial patches against different modalities on the equivalent model,which can still effectively interfere with the target model in a black-box setting without accessing the internal structure of the target model.A patch location optimization method based on differential evolution is proposed to adaptively select the optimal attack location under multiple target fusion strategies,target detection models,and defense settings.Finally,the attack effectiveness,generalization and migration of MAC-Patch are tested on 2 models,3 image fusion strategies and 4 datasets respectively;in the physical domain,expectation translation transformation is adopted,and the actual attack effect under different environment brightness and different patch rotation angles is tested to verify its robustness.Experimental results show that MAC-Patch is optimal in terms of attack success rate,AP reduction value and other indexes.For example,compared with the three advanced attacks MAP,MIC,and UAP,the AP reduction value of MAC-Patch is improved by 62.6%.

Web Application Fingerprinting Method Based on Multi-level SimHash and Digital Feature Snapshots

GU Xianjun, HUANG Mengqi, LIU Ming, HAN Fuji, TIAN Cong, ZHU Dongjun

Computer Science. 2026, 53 (6A): 250600030-11. doi:10.11896/jsjkx.250600030

Web application fingerprinting is a fundamental technique in cyberspace mapping,Web vulnerability exploitation,and cybersecurity situational awareness.Existing mainstream approaches primarily rely on manually crafted text-based rules and regular expression matching to identify Web applications.However,these methods face several limitations,such as the difficulty of rule extraction,challenges in distinguishing between similar sub-versions,and susceptibility to failure when page content changes.To address these issues,this paper proposes a Web application fingerprinting model based on a multi-level SimHash algorithm and digital feature snapshots.The method extracts representative page content that reflects core characteristics of Web applications,and maps it into high-dimensional digital fingerprints using multiple SimHash calculations to form computable and comparable feature snapshots.On this basis,a general fingerprinting model is constructed,with systematic definitions of its structure,key algorithms,and parameters.Furthermore,a practical implementation of the model is developed using HTML page content,and a series of experiments are conducted on various mainstream Web applications.Experimental results demonstrate that the proposed method outperforms traditional rule-based approaches in recognition accuracy,supports automatic fingerprint generation and sub-version identification,and exhibits robustness to page modifications to a certain extent.
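
To make the SimHash idea in the abstract above concrete, the following Python sketch computes a 64-bit SimHash over a token list and compares two fingerprints by Hamming distance. It is a generic, unweighted SimHash, not the paper's multi-level model: the MD5-based token hash, the function names, and the idea of hashing several page parts into a tuple-style snapshot are assumptions for illustration.

```python
import hashlib


def simhash(tokens, bits=64):
    """Unweighted SimHash: each token votes +1/-1 on every bit position."""
    votes = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def snapshot(page_parts):
    """Hypothetical 'feature snapshot': one SimHash per named page part."""
    return {name: simhash(tokens) for name, tokens in page_parts.items()}


if __name__ == "__main__":
    old = snapshot({"title": ["example", "cms"], "tags": ["html", "head", "body", "div"]})
    new = snapshot({"title": ["example", "cms"], "tags": ["html", "head", "body", "span"]})
    print({k: hamming_distance(old[k], new[k]) for k in old})
```

Because similar token sets yield fingerprints with a small Hamming distance, a threshold on that distance can tolerate minor page changes, which plain regular-expression rules cannot.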

Robust Time Series Anomaly Detection Model Based on Multi-view Cross Filtering

ZHANG Juling, ZHAO Yibing, WANG Sheng, XI Ning, SHE Wenkui

Computer Science. 2026, 53 (6A): 250600105-10. doi:10.11896/jsjkx.250600105

Amid the rapid advancement of digital transformation and intelligent technologies,fields such as industrial manufacturing,financial transactions,and energy management increasingly rely on vast amounts of time series data to support critical decision-making.The sudden occurrence of anomalous events not only poses a threat to system performance but also severely affects overall security.Thus,efficiently identifying anomalies from large-scale,structurally complex data has become a pressing challenge.This paper focuses on the issue of time series anomaly detection by investigating the interference caused by noise in industrial datasets during model training and proposing an improved strategy.Data collected in industrial environments often exhibit characteristics such as high dimensionality and the presence of multiple noises.When noise is incorporated into the training samples,the learning process of the model is easily disrupted,leading to reduced robustness.Previous studies mainly adopted a single indicator to identify and filter noisy samples,a method that may introduce cumulative errors during training and consequently affect the accuracy of anomaly detection.To address the aforementioned issues,this paper proposes a robust time series anomaly detection model based on multi-view cross filtering(MVCF-AD).The model first introduces the concept of non-neighbor attention and combines it with reconstruction error to construct a dual-indicator system for noise discrimination.Subsequently,a multi-view cross filtering strategy is built,and a dual-network parallel training approach is employed.By utilizing loss ranking,the model dynamically identifies and filters noisy samples.Experimental results demonstrate that MVCF-AD exhibits excellent detection performance and robustness under various noise levels,thereby proving its effectiveness in addressing the noise issue in datasets.

Signature Scheme Based on Traceable Attributes

HUANG Shoumeng, YANG Boxiong, YANG Ming

Computer Science. 2026, 53 (6A): 250300121-8. doi:10.11896/jsjkx.250300121

In order to provide reliable data transmission in the era of big data,signature technology is often applied to data communication.Attribute-based signature technology can ensure signature anonymity by allowing users to create and verify signatures without disclosing sensitive information of the signer.However,validators can also verify whether the attribute period has expired normally,and users without signing authorization can use existing signature keys and attributes to sign.To solve this problem,a signature scheme based on traceable attributes with anonymous certificates is proposed.Firstly,the validity of the signer is verified by using the pseudonym ID contained in the user list in the cloud server.If an unauthorized signer creates a signature using an existing signature key and attribute,the cloud server will use the user list to verify the signer,whose identity is represented as a transfer-based pseudonym ID.Next,based on signature verification to determine the storage of signed messages,cloud servers can prevent unauthorized signers from continuously generating and sending signatures.If the signature obtained by the verifier is unclear or problematic,the identity of the signer can be verified through tracking by AA and TA.Finally,the anonymity of the signer is provided to the verifier,and the integrity of the message is ensured through the verification phase.This signature scheme guards against signers who abuse anonymity by disclosing their signature keys and attributes for certain purposes,and prevents unauthorized users from using public information to sign and send messages.

Lightweight Network Security Vulnerability Risk Awareness Method Based on RAG

GU Xianjun, QIN Sihang, SHU Yifeng, MA Baoxin, LIU Feixue, LIU Ming

Computer Science. 2026, 53 (6A): 250300034-10. doi:10.11896/jsjkx.250300034

In recent years,large model-based network security vulnerability risk awareness has gradually become a research hotspot.However,existing methods still suffer from slow response speed and low semantic quality in terms of intelligent operational efficiency and fine-grained information perception.To address these issues,this paper proposes a lightweight network security vulnerability risk awareness method based on Retrieval-Augmented Generation(RAG).Firstly,by constructing a cross-domain knowledge base and integrating a cross-domain vectorization algorithm,efficient vectorization of network vulnerability information is achieved.Then,a multi-objective retrieval algorithm is designed to adaptively extract highly relevant fine-grained vulnerability information from the knowledge base.Finally,by incorporating metadata-based secondary enhancement and a local large model,intelligent vulnerability risk awareness response is completed.Experimental results show that the proposed method significantly outperforms existing approaches in both semantic quality and operational efficiency,enabling rapid responses to network security risks and providing high-quality countermeasures,fully meeting the demands of intelligent network security operations.

Research on Health Evaluation Technology of AHP-FEC Meteorological Equipment Based on Lasso Optimization

YAO Ye, GUO Kangning, ZHU Yian, LIAO Shaochun, ZHANG Ni

Computer Science. 2026, 53 (6A): 250400123-16. doi:10.11896/jsjkx.250400123

Meteorological equipment operates for long periods of time under extreme climatic conditions,and obtaining timely and accurate health status is essential to ensure its continued,reliable operation.However,the health assessment of meteorological equipment faces the problems of complex evaluation indexes,low degree of correlation between devices,long data collection period,many parameters of qualitative indexes and lack of quantitative indexes.Current health evaluation methods are difficult to apply directly to the health assessment of meteorological equipment.To address these issues,this study aims to systematically propose an evaluation index system with value for engineering practice,and to obtain the data set through multiple data preprocessing methods.On this basis,this paper firstly uses Lasso regression for feature extraction to get the priority queue of each indicator in the assessment result.Then,combining AHP and expert scoring,an importance matrix is constructed and the weights of the indicators are calculated.Finally,the fuzzy comprehensive evaluation method is used to construct comment sets and their corresponding membership functions for different indicators,and the health status of meteorological equipment is analysed by combining the weights and membership matrix calculation.The experimental test results show that the proposed method exhibits good performance in the recognition of health status,with an accuracy of 0.949,a recall rate of 0.966,and an F1 score of 0.957.This study provides a new method for assessing the health of meteorological equipment,which can help to better maintain and manage meteorological equipment and ensure its stable operation under extreme weather conditions.
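
The weighting and fuzzy evaluation steps described in the abstract above can be sketched in Python as follows. This is a minimal illustration, not the paper's procedure: the geometric-mean approximation of AHP weights, the weighted-average fuzzy operator, and all matrices in the example are assumptions, and the Lasso ranking step is not shown.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP weights by normalizing the geometric mean of each row.

    pairwise[i][j] states how much more important indicator i is than j.
    """
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


def fuzzy_evaluate(weights, membership):
    """Weighted-average fuzzy comprehensive evaluation: B = W . R.

    membership[i][k] is the degree to which indicator i belongs to grade k.
    """
    grades = len(membership[0])
    return [
        sum(w * row[k] for w, row in zip(weights, membership))
        for k in range(grades)
    ]


if __name__ == "__main__":
    # Hypothetical importance matrix for three indicators.
    importance = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
    # Hypothetical membership degrees over grades (healthy, degraded, faulty).
    membership = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.2, 0.5, 0.3]]
    w = ahp_weights(importance)
    print(w, fuzzy_evaluate(w, membership))
```

The grade with the largest entry in the result would be taken as the health status under the maximum-membership principle; a consistency check on the importance matrix is omitted here for brevity.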

Harmonic and Interharmonic Analysis Method Based on Improved SSA-OMP Atomic Search

WANG Haonan

Computer Science. 2026, 53 (6A): 250500053-7. doi:10.11896/jsjkx.250500053

In modern power systems,the issue of harmonic and interharmonic pollution is increasingly severe.Aiming at the bottleneck problem of balancing detection accuracy and computational efficiency in existing harmonic and interharmonic analysis methods,this paper innovatively proposes a joint optimization method that integrates the improved Sparrow Search Algorithm(SSA) with Orthogonal Matching Pursuit(OMP).By constructing an overcomplete sine atomic library based on continuous parameters,it breaks through the limitations of traditional discretized atomic search.Combining SSA for global optimization of atomic parameters in continuous space significantly improves matching accuracy.Meanwhile,by introducing an orthogonal iterative mechanism and an adaptive termination condition based on signal correlation,it effectively reduces redundant calculations and suppresses noise interference.Simulation results show that the proposed SSA-OMP algorithm achieves a maximum reconstruction signal-to-noise ratio of 49 dB in harmonic/interharmonic frequency,amplitude,and phase detection,with a frequency error below 0.0134%,and its noise immunity is superior to traditional methods.Compared with the Particle Swarm Optimization-OMP algorithm,the computational efficiency is improved by 20%,providing an innovative solution with high accuracy and low complexity for real-time monitoring in complex harmonic scenarios of power systems.

Accurate Prediction of Electric Vehicle Charging Loads Approach Based on Multi-branch Fusion and Multi-head Attention Residual Network

WANG Hongbiao, ZHAN Qiankun, GAO Ge, LEI Ming

Computer Science. 2026, 53 (6A): 250300074-5. doi:10.11896/jsjkx.250300074

The accuracy of electric vehicle(EV) charging load prediction is significantly influenced by the coupling relationship between charging prices and multi-source loads,yet traditional prediction models often neglect this dynamic interaction.To address this,this paper proposes a novel charging load prediction method based on multi-branch fusion and a multi-head attention residual network.Firstly,historical charging loads,charging prices,and temporal features are dynamically decoupled to construct a price-load decoupled feature space.Secondly,a multi-branch parallel network is designed to separately extract decoupled temporal features,price-sensitive features,and spatial correlation features,while cross-branch feature interaction is achieved through a multi-head self-attention mechanism.Finally,residual connections are introduced to optimize gradient propagation and mitigate degradation in deep networks.Experimental results based on real-world charging station data demonstrate that the proposed method reduces the mean absolute error by 16.15% compared to benchmark models such as CNN-BiLSTM and GRU.This approach provides high-precision prediction support for power system dispatching and validates the synergistic effectiveness of price-decoupled features and attention mechanisms in load forecasting.

Software System Architecture of New Intelligent Hardware

MENG Lin

Computer Science. 2026, 53 (6A): 250600017-12. doi:10.11896/jsjkx.250600017

Supporting the localization and independent creation of core software technologies,so as to avoid being “choked” by foreign forces,is a strategic task for China's technology industry.In the field of intelligent hardware products and equipment,practitioners are actively exploring ideas for technological innovation.In this paper,according to users' multi-scenario requirements for the new generation of intelligent mechatronics products,and based on the actual application of advanced information technology and control technology in the industry,an advanced software system architecture is proposed,covering devices,desktop,mobile and cloud platforms.Device software can run on low-cost embedded terminals to perform high-precision real-time tasks such as motion control,motor drive,and perception,as well as non-real-time application tasks such as IoT and AI.Desktop software is based on cross-platform industrial software technology and is mainly used for modeling,simulation,and task planning.The mobile end and cloud platform rely on the Internet and other information technologies to achieve remote control,multi-user interoperability,resource management,data analysis,protocol adaptation and other functions.This software system solution is designed for industrial and consumer products,and can be applied to products such as general robots,CNC systems,3D printers,and smart home appliances.

Research on Mixing of Virtual and Actual Reality for Transformer Augmented Assembly

MA Wei, ZHOU Haofeng

Computer Science. 2026, 53 (6A): 250600225-7. doi:10.11896/jsjkx.250600225

The quality of the fusion between virtual and real content directly affects the realism and immersion of an augmented assembly system. Correct occlusion relationships and collision interaction between virtual and real objects are essential to this fusion. An occlusion handling method based on a “virtual avatar” of the transformer equipment is proposed, which offers strong robustness and real-time performance. To address the excessive number of mismatched feature point pairs during 3D registration, a 3D registration method based on ORB-Cosine similarity measurement is proposed to improve the accuracy of feature point matching. Current collision detection methods for virtual and real objects are computationally expensive and have low detection accuracy. This paper therefore proposes an octree layered collision detection optimization algorithm based on a voxel representation of the model surface and on volume difference, which improves detection accuracy while preserving computational efficiency. Experimental results show that the proposed methods achieve high real-time performance and accuracy, which greatly enhances the realism and immersion of the assembly system.
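As an illustration of the idea of screening ORB matches by cosine similarity, the following is a minimal sketch and not the authors' method, whose details the abstract does not give. It assumes that binary descriptors are treated as 0/1 vectors and that a candidate match is kept only when the cosine similarity of its two descriptors reaches a threshold. The function names, the threshold value of 0.8, and the toy 4-byte descriptors are all hypothetical (standard ORB descriptors are 32 bytes long).

```python
from math import sqrt


def popcount(data: bytes) -> int:
    """Number of 1 bits in a binary descriptor."""
    return sum(bin(b).count("1") for b in data)


def cosine_similarity(a: bytes, b: bytes) -> float:
    """Cosine similarity of two binary descriptors viewed as 0/1 vectors."""
    # For 0/1 vectors the dot product is the number of shared 1 bits,
    # and each squared norm is the descriptor's own popcount.
    dot = popcount(bytes(x & y for x, y in zip(a, b)))
    norm = sqrt(popcount(a) * popcount(b))
    return dot / norm if norm else 0.0


def filter_matches(desc_a, desc_b, matches, threshold=0.8):
    """Keep candidate matches (i, j) whose similarity reaches the threshold.

    The threshold is a hypothetical value chosen for illustration only.
    """
    return [
        (i, j)
        for i, j in matches
        if cosine_similarity(desc_a[i], desc_b[j]) >= threshold
    ]


# Toy 4-byte descriptors (assumed data, not from the paper).
desc_a = [bytes([0xF0, 0xAA, 0xFF, 0x0F]), bytes([0x0F, 0x33, 0x00, 0xF0])]
desc_b = [bytes([0xF0, 0xAA, 0xFF, 0x0E]), bytes([0xF0, 0xCC, 0xFF, 0x0F])]

# Pair (0, 0) differs by one bit; pair (1, 1) shares no set bits.
kept = filter_matches(desc_a, desc_b, [(0, 0), (1, 1)])
```

In this toy input the first pair differs in a single bit and survives the filter, while the second pair has no set bits in common and is rejected as a mismatch.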

Reliability Evaluation of Autonomous and Controllable Substation Communication Network Based on Fault Tree

WANG Pengyang, SONG Daoxun, LIU Qiang, CHEN Can, CHEN Guanyu, SONG Xiaofan

Computer Science. 2026, 53 (6A): 250800016-7. doi:10.11896/jsjkx.250800016

The communication network of the new generation of autonomous and controllable substations features a complex architecture, hierarchical partitioning, dual-network redundancy, and a high degree of logical coupling. In response to these characteristics, a fault tree modeling and evaluation method oriented to communication function failures is proposed. The method combines the partition structure, equipment redundancy configuration, and multi-path forwarding mechanism of the communication network to construct a multi-level fault logic model covering key units such as the station control layer, spacer layer, isolation devices, and firewalls. It systematically characterizes the typical fault propagation paths and coupling logic of the communication network of the autonomous and controllable substation and, combined with engineering examples, quantitatively analyzes its communication availability and identifies key fault nodes. Test results indicate that after the introduction of dual-link redundancy and isolation mechanisms, the availability of communication functions in the Core Business Security Zone I reaches 99.85%, the overall availability of typical interval devices remains above 99.6%, and the overall availability of the entire station communication system reaches 96.53%. The quantitative analysis shows that the dual-network redundancy and partition isolation architecture of the new generation of autonomous and controllable substations has advantages in communication reliability and maintains its overall level across a wider range of business scenarios.
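To make the effect of dual-link redundancy on availability concrete, the following is a minimal sketch of the standard series and parallel availability formulas that underlie fault tree evaluation. It is not the paper's model: the component names and availability values are hypothetical and do not reproduce the reported figures, and component failures are assumed to be independent.

```python
from math import prod


def series(*availabilities: float) -> float:
    """Availability when every component must work.

    In fault tree terms, the failure of any one component causes the
    top event (an OR gate over component failures).
    """
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Availability when at least one redundant component must work.

    In fault tree terms, the top event requires all redundant
    components to fail (an AND gate over component failures).
    """
    return 1.0 - prod(1.0 - a for a in availabilities)


# Hypothetical component availabilities (assumed, not from the paper).
switch = 0.995
firewall = 0.999

# One communication link: the switch and the firewall in series.
single_link = series(switch, firewall)

# Dual-link redundancy: two identical, independent links in parallel.
dual_link = parallel(single_link, single_link)
```

Under these assumed values a single link is available about 99.40% of the time, and duplicating it raises the figure to about 99.996%, which shows why redundant paths dominate the availability of a partitioned network.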