Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 51 Issue 11A, 16 November 2024
  
Intelligent Computing
Survey of Recommender Systems for Large Language Models
KA Zuming, ZHAO Peng, ZHANG Bo, FU Xiaoning
Computer Science. 2024, 51 (11A): 240800111-11.  doi:10.11896/jsjkx.240800111
Large language models have emerged as highly effective tools in the field of natural language processing (NLP) and have recently garnered considerable attention in the domain of recommendation systems (RS). These models have undergone extensive training on vast datasets through self-supervised learning, achieving remarkable results in learning universal representations. With efficient transfer techniques such as fine-tuning and prompt tuning, they have the potential to significantly enhance various aspects of recommendation system performance. The crux of leveraging language models to optimize recommendation quality lies in fully utilizing their high-quality text feature representations and extensive external knowledge bases to establish strong connections between items and users. To gain a comprehensive and in-depth understanding of current recommendation systems based on large language models, this paper categorizes these models into two main types: discriminative large language models for recommendation (DLLM4Rec) and generative large language models for recommendation (GLLM4Rec). Furthermore, the latter is subdivided into constrained generation and free generation, and a detailed summary of relevant research on these two approaches is provided. Additionally, this paper identifies key challenges in this field and shares valuable findings, hoping to provide researchers and practitioners with inspiration and insights.
Product Improvement Based on UGC: Review on Methods and Applications of Attribute Extraction and Attribute Sentiment Classification
SUI Haoran, ZHOU Xiaohang, ZHANG Ning
Computer Science. 2024, 51 (11A): 240400070-9.  doi:10.11896/jsjkx.240400070
User-generated content (UGC) contains a wealth of authentic user feedback on products and their attributes. With the continuous advancement of digital technology, enterprises are increasingly relying on UGC to gain insights into user needs and guide product improvements. In this process, attribute extraction and attribute sentiment classification are considered the two core steps. Attribute extraction aims to identify key product attributes from UGC and is mainly categorized into supervised and unsupervised learning methods. Attribute sentiment classification, meanwhile, focuses on analyzing users' emotional attitudes towards these extracted attributes, primarily including approaches based on dictionaries and rules, statistical machine learning, and deep learning. First, this paper systematically outlines the theoretical frameworks and technical essentials of attribute extraction and attribute sentiment classification methods. Subsequently, these methods are illustrated through practical applications, aiming to offer valuable references for enterprises and researchers utilizing UGC to inform product enhancements. Finally, this paper explores the current challenges faced by attribute extraction and sentiment classification, as well as directions for future research.
Study on DistMult Decoder in Knowledge Graph Entity Relationship Prediction
HAN Yijian, WANG Baohui
Computer Science. 2024, 51 (11A): 231200118-5.  doi:10.11896/jsjkx.231200118
State Grid Gansu Electric Power Academy hopes to construct a knowledge graph of the power industry from a large amount of scientific research literature and to explore in depth the potential correlations in the knowledge graph. The relation prediction model is a key technology for solving such problems and an important technology in knowledge graphs, and it has been a research hotspot in recent years. A large number of papers and experiments have demonstrated that the framework combining an encoder and a decoder performs well in relation prediction tasks. Under this framework, owing to the advancement of graph neural network technology, many recent works have improved the performance of relation prediction by using graph neural networks as encoders and optimizing them, while neglecting the role of decoders. Taking inspiration from cosine similarity, this paper proposes a novel decoder, COS DistMult, based on DistMult and conducts comparative experiments on real datasets. The experimental results indicate that the evaluation indicator Hits@10 of the relation prediction model increases by 2%. This demonstrates that optimizing the decoder structure is an effective method in relation prediction tasks based on an encoder-decoder framework.
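For readers unfamiliar with DistMult, the following minimal sketch shows the standard DistMult triple score next to one possible cosine-style normalization. The abstract does not define COS DistMult, so the function `cosine_distmult_score` is a hypothetical illustration of the general idea and not the authors' decoder; all names and the toy vectors are assumptions.

```python
import numpy as np

def distmult_score(h, r, t):
    """Standard DistMult score: sum_i h_i * r_i * t_i."""
    return float(np.sum(h * r * t))

def cosine_distmult_score(h, r, t, eps=1e-12):
    """Hypothetical cosine-style variant (an assumption, not the paper's
    formula): cosine similarity between (h * r) and t, which makes the
    score independent of embedding magnitudes."""
    hr = h * r
    denom = np.linalg.norm(hr) * np.linalg.norm(t) + eps
    return float(np.dot(hr, t) / denom)

# Toy head, relation, and tail embeddings (illustrative values only).
h = np.array([1.0, 2.0, 0.0])
r = np.array([0.5, 1.0, 3.0])
t = np.array([2.0, 1.0, 4.0])

print(distmult_score(h, r, t))         # 3.0
print(cosine_distmult_score(h, r, t))  # a value in [-1, 1]
```

The raw DistMult score grows with the norm of the embeddings, whereas the normalized variant stays in [-1, 1]; this is one plausible reading of "inspired by cosine similarity".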
Construction of Fine-grained Medical Knowledge Graph Based on Deep Learning
WANG Yuhan, MA Fuyuan, WANG Ying
Computer Science. 2024, 51 (11A): 230900157-7.  doi:10.11896/jsjkx.230900157
As a powerful tool for integrating massive medical information, medical knowledge graphs are widely used in platforms such as clinical decision support systems and medical question-answering systems. At present, large-scale medical knowledge graphs are emerging one after another, but most of them focus on increasing the number of entities. Medical terminology is lengthy and difficult to understand. Therefore, building a fine-grained knowledge graph can greatly improve the practicality of such systems and provide more precise diagnostic guidance for question-answering systems. This paper targets a large-scale medical knowledge base crawled from vertical websites, with the goal of fine-grained processing of long medical texts. BiLSTM is used to model complete contextual information for each word from both directions of a long sentence. At the same time, we introduce the pre-trained model BERT to enhance the modeling of word context semantics and combine it with a CRF model, whose learned state transition matrix maintains the consistency of the label sequence. The model efficiently identifies entities in long sentences, and a fine-grained medical knowledge graph is built through entity alignment and attribute filling. Comparative experiments on the fine-grained medical entity task demonstrate that the BERT+BiLSTM+CRF model is better than other models, and the visualization results also illustrate the fine-grained effect of this method.
Spiking Neural Network Classification Model Based on Multi-subnetwork Pre-training
ZHUO Mingsong, MO Lingfei
Computer Science. 2024, 51 (11A): 240300191-6.  doi:10.11896/jsjkx.240300191
The spiking neural network (SNN) is widely regarded as the most biologically plausible model, aligning closely with the mechanisms of the biological brain. It has garnered increasing research attention due to its event-driven nature, high energy efficiency, and interpretability. However, the training methods of SNNs still have some limitations due to the binary output and non-differentiability of the spike. This paper proposes an SNN classification method based on multi-subnetwork pre-training, which draws inspiration from the way cortical memory units store memory information through local networks. This approach leverages sample label information to optimize the feature extraction process, employs enhanced spike-timing-dependent plasticity learning rules for pre-training multiple single-class feature extraction subnetworks, and conducts unsupervised feature fusion on these pre-trained subnetworks to effectively enhance the network's feature classification capability. Furthermore, the effectiveness of this method is analyzed through weight visualization and t-SNE visualization tools. Finally, classification accuracies of 97.40% and 88.81% are achieved on the MNIST and Fashion MNIST datasets, respectively.
Balanced Weighted Graph Coloring Problem and Its Heuristic Algorithms
OU Kaiming, JIANG Hua
Computer Science. 2024, 51 (11A): 231200103-7.  doi:10.11896/jsjkx.231200103
Given an undirected graph G and a number of colors k, the graph k-coloring problem (GCP) refers to assigning one of k colors to each vertex in G such that any two adjacent vertices receive different colors. Balanced resource allocation distributes resources as evenly as possible among participants, aiming to achieve fair utilization of resources and reasonable sharing of tasks. Because the traditional graph coloring problem cannot model balanced resource allocation, a new variant of the graph coloring problem, the balanced weighted graph coloring problem, is proposed, whose goal is to find a legal coloring that minimizes the standard deviation of the sums of the weights of the color classes. A HEA-TLS algorithm that combines two local searches into an evolutionary algorithm is proposed to find an optimal solution to this problem. The novelty-based local search aims to find a legal solution. The second local search starts from a legal solution and improves its balance. A balanced weight crossover is designed in the evolutionary algorithm, which adaptively selects the color classes to be passed to the offspring according to the changes in the weights of the parent color classes, so as to produce a more balanced coloring through genetic evolution of the population. In a comparative evaluation against the general-purpose solver CPLEX on DIMACS graphs, HEA-TLS achieves almost optimal results in all tests, validating the effectiveness of the proposed method.
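The objective described above can be made concrete with a small sketch. It checks whether a coloring is legal and computes the standard deviation of the per-color weight sums. The abstract does not say whether the population or sample standard deviation is used; the population form (`statistics.pstdev`) is an assumption here, as are the function names and the toy graph.

```python
from statistics import pstdev

def is_legal(edges, coloring):
    """A coloring is legal if no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[v] for u, v in edges)

def balance_objective(weights, coloring, k):
    """Standard deviation of the total vertex weight of each color class.
    Population standard deviation is an assumption of this sketch."""
    totals = [0.0] * k
    for vertex, color in coloring.items():
        totals[color] += weights[vertex]
    return pstdev(totals)

# Toy instance: a path 0-1-2-3 with vertex weights, colored with k = 2.
edges = [(0, 1), (1, 2), (2, 3)]
weights = {0: 3, 1: 1, 2: 1, 3: 3}
coloring = {0: 0, 1: 1, 2: 0, 3: 1}

print(is_legal(edges, coloring))               # True
print(balance_objective(weights, coloring, 2)) # 0.0 (both classes sum to 4)
```

A search procedure such as the HEA-TLS algorithm named in the abstract would minimize `balance_objective` over colorings for which `is_legal` holds; the search itself is not sketched here.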
Zebra Optimization Algorithm Improved by Multi-strategy Fusion
REN Qingxin, FENG Feng
Computer Science. 2024, 51 (11A): 240100203-7.  doi:10.11896/jsjkx.240100203
To address problems of the zebra optimization algorithm (ZOA), such as its tendency to fall into local optima and its slow convergence, this paper proposes a multi-strategy fusion improved zebra optimization algorithm (MSI-ZOA). Firstly, the random sequence generated by the Tent chaotic map is used to initialize the population, which improves the distribution quality of the initial population in the search space and strengthens the global exploration ability. Secondly, taking advantage of the heavy-tailed property of Lévy flight, the search space coverage is increased, and the global exploration ability in the foraging stage of ZOA is strengthened. Next, a sine cosine optimization algorithm with a hyperbolic cosine enhancement factor is applied to the predator-resistance stage of ZOA, which helps the search escape local optima and improves the convergence speed. Finally, MSI-ZOA, ZOA, the African vulture optimization algorithm (AVOA), the artificial hummingbird algorithm (AHA), the gorilla troop optimization algorithm (GTO), the arithmetic optimization algorithm (AOA) and the northern goshawk optimization algorithm (NGO) are tested on eight benchmark functions, and the results show that MSI-ZOA is superior to the other six algorithms in convergence speed and global search.
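The Tent-map initialization step can be sketched as follows. The abstract does not give the map parameters, so the skewed tent map with breakpoint `a = 0.7`, the seed `x0`, and the function name are all assumptions chosen for illustration (a skewed breakpoint is used because the symmetric map with slope 2 collapses to zero in binary floating point).

```python
import numpy as np

def tent_map_population(pop_size, dim, lower, upper, a=0.7, x0=0.37):
    """Initialize a population from a skewed Tent chaotic sequence.
    The parameters `a` and `x0` are illustrative assumptions."""
    x = x0
    samples = np.empty((pop_size, dim))
    for i in range(pop_size):
        for j in range(dim):
            # Skewed tent map on [0, 1]: rising branch below a, falling above.
            x = x / a if x < a else (1.0 - x) / (1.0 - a)
            samples[i, j] = x
    # Scale the chaotic values from [0, 1] to the search bounds.
    return lower + samples * (upper - lower)

population = tent_map_population(pop_size=5, dim=3, lower=-5.0, upper=5.0)
print(population.shape)  # (5, 3)
```

Compared with pseudo-random uniform sampling, the chaotic sequence is deterministic for a given seed, which makes runs reproducible while still spreading the individuals across the search range.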
Visual Servoing Predictive Control for Omnidirectional Mobile Robots with Suppression of Velocity Abrupt Change
LIN Yegui, DAI Zhijian, HE Defeng, XING Kexin
Computer Science. 2024, 51 (11A): 240300003-6.  doi:10.11896/jsjkx.240300003
During the visual servoing task of an omnidirectional mobile robot, sudden changes in speed can be caused by changes in feature points, wheel slippage, dynamic obstacles and other situations. To solve this problem, this paper proposes a neurodynamics-based quasi-min-max MPC visual servoing strategy. Because the sudden change of visual error is the main cause of the sudden change of speed, the strategy processes the visual error by introducing a neurodynamics model. A neurodynamics-based time-varying prediction model for the linear parameters of the visual servo of the mobile robot is established, and the quasi-min-max MPC strategy is used to obtain the optimal velocity solution, thus suppressing sudden changes in velocity. Ultimately, the mobile robot is guaranteed to reach the desired position with a smooth velocity. Simulation results verify the effectiveness of the proposed strategy in suppressing abrupt velocity changes.
Mobile Robots' Path Planning Method Based on Policy Fusion and Spiking Deep Reinforcement Learning
AN Yang, WANG Xiuqing, ZHAO Minghua
Computer Science. 2024, 51 (11A): 240100211-11.  doi:10.11896/jsjkx.240100211
Deep reinforcement learning (DRL) has been applied successfully to mobile robots' path planning. DRL-based path planning methods are suitable for high-dimensional environments and are a crucial means of achieving autonomous learning in mobile robots. However, training DRL models requires a large amount of interaction experience with the environment, which leads to heavy computational cost. In addition, the limited memory capacity within DRL algorithms hinders the effective utilization of experiences. Spiking neural networks (SNNs), one of the main tools for brain-inspired computing, are suitable for robots' environmental perception and control because of their unique bio-plausibility and their ability to incorporate spatial and temporal information simultaneously. In this paper, we combine SNNs, convolutional neural networks (CNNs), and policy fusion for DRL-based mobile robots' path planning, and make the following contributions: 1) We propose the SCDDPG (spike convolutional DDPG) algorithm, which employs CNNs for multi-channel feature extraction of input states and SNNs for spatio-temporal feature extraction. 2) Based on SCDDPG and the designed state constraint policy, the SC2DDPG (state constraint SCDDPG) algorithm is proposed to constrain the robot's operation states, which avoids unnecessary environment exploration and improves the convergence speed of the DRL model in SC2DDPG. 3) Based on SCDDPG, the PFTDDPG (policy fusion and transfer SCDDPG) algorithm is proposed. PFTDDPG implements the “wall-follow” policy to pass the wedge-shaped obstacles in the environment. Additionally, PFTDDPG incorporates transfer learning to transfer prior knowledge between policies in mobile robots' path planning. PFTDDPG not only completes path planning tasks that cannot be completed solely by RL, but also yields the optimal collision-free paths. Furthermore, PFTDDPG improves the convergence speed of the DRL model and the performance of the planned path. Experimental results validate the effectiveness of the proposed path planning algorithms. Comparative experiments indicate that, among the SpikeDDPG, SCDDPG, SC2DDPG and PFTDDPG algorithms, PFTDDPG achieves the best performance in path planning success rate, training convergence speed, and planned path length. This paper not only proposes new ideas for mobile robots' path planning, but also enriches the solution policies of DRL in mobile robots' path planning.
Optimization of Low-carbon Oriented Logistics Center Distribution Based on Genetic Algorithm
JIANG Yibo, ZHOU Zebao, LI Qiang, ZHOU Ke
Computer Science. 2024, 51 (11A): 231200035-6.  doi:10.11896/jsjkx.231200035
The transportation industry, as one of the main contributors to carbon emissions, urgently needs effective carbon reduction reforms to help the country achieve the carbon peaking and neutrality goals. For the current mainstream logistics-center distribution model, a low-carbon oriented multi-objective logistics optimization model is established with the goals of minimizing carbon emissions per unit freight turnover, minimizing freight costs, and minimizing delivery time. The NSGA-II multi-objective genetic algorithm is improved based on the characteristics of this model and of the scenario. An abstracted data sample from an express company is used to test the effectiveness and advantages of the multi-objective optimization model and the improved NSGA-II algorithm. Experimental results show that, from the perspectives of optimization scheduling and path planning, optimizing the entire distribution process through search and solution can effectively achieve the preset goals of cost control and carbon reduction, and provide a theoretical basis for the distribution decision-making of logistics enterprises. The research results also indicate that carbon reduction and cost control are constraints in logistics, and different target preferences can have a significant impact on decision-making.
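NSGA-II ranks candidate solutions by Pareto dominance over the objectives. The sketch below shows that dominance relation for the three minimized objectives named in the abstract (carbon emissions, freight cost, delivery time). It illustrates only the first non-dominated front, not the paper's improved NSGA-II; the plan values and function names are hypothetical.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every minimized objective
    and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for i, p in enumerate(points)
            if not any(dominates(q, p)
                       for j, q in enumerate(points) if j != i)]

# Hypothetical distribution plans as (carbon, cost, time); lower is better.
plans = [(10, 5, 3), (8, 6, 3), (12, 6, 4), (8, 6, 2)]
print(non_dominated_front(plans))  # [(10, 5, 3), (8, 6, 2)]
```

The two surviving plans trade lower cost against lower emissions and delivery time, which is the kind of preference-dependent choice the abstract refers to.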
Clinical Findings Recognition and Yin & Yang Status Inference Based on Doctor-Patient Dialogue
LIN Haonan, TAN Hongye, FENG Huimin
Computer Science. 2024, 51 (11A): 231000084-7.  doi:10.11896/jsjkx.231000084
Clinical findings recognition and Yin & Yang status inference are important tasks in the field of intelligent healthcare. The goal is to identify clinical findings such as diseases and symptoms from doctor-patient dialogue records, then determine their Yin & Yang status. The main weaknesses of existing research are as follows: (1) The semantic information and dialogue structure in doctor-patient dialogues are not modeled, leading to low model accuracy. (2) The task is implemented as a two-stage process, which causes error accumulation. This paper proposes a unified generative method that incorporates dialogue information. It constructs a static-dynamic fusion graph to model semantic and structural information in doctor-patient dialogues, enhancing the model's understanding of conversations. It also utilizes a generative language model to unify clinical findings recognition and Yin & Yang status inference into a sequence generation task, mitigating the problem of error accumulation. Additionally, it improves the accuracy of Yin & Yang status inference by identifying Yin & Yang status indicator words. Experimental results on the CHIP2021 evaluation dataset CHIP-MDCFNPC show that the proposed method achieves an F1 score of 71.83%, which is 2.82% higher than the baseline model on average.
Multi-task Emotion-Cause Pair Extraction Method Based on Position-aware Interaction Network
FU Mingrui, LI Weijiang
Computer Science. 2024, 51 (11A): 231000086-9.  doi:10.11896/jsjkx.231000086
The task of emotion-cause pair extraction is to extract emotion clauses and cause clauses simultaneously. Previous methods regard emotion-cause pair extraction as three independent tasks of emotion extraction, cause extraction, and emotion-cause pair extraction, which cannot effectively capture the connection between tasks. In addition, the existing two-stage models suffer from error propagation problems, and the relative position distribution between emotion clauses and cause clauses is unbalanced. This paper proposes a new emotion-cause pair extraction model, MK-BERT, based on BERT, a sentiment lexicon and a position-aware interaction module. The model first uses BERT enhanced by the sentiment lexicon for document encoding. To solve the problem of label position imbalance, a position-aware interaction module is designed according to the relative distance between the emotion clause and the cause clause to capture the position information and construct the features of the emotion-cause pair. Then, through interactive encoding between the emotion prediction module and the cause prediction module, the shared information among multiple tasks is fully mined. Experimental results on the Chinese emotion-cause pair extraction dataset show that the proposed model can effectively extract emotion-cause pairs and achieve good performance on positionally imbalanced samples.
Regular Expression Generation Based on Natural Language Syntax Information
WANG Hao, WU Junhua
Computer Science. 2024, 51 (11A): 231200017-6.  doi:10.11896/jsjkx.231200017
Regular expressions are composed of a series of characters and metacharacters, defining a matching pattern that can be used to check whether a string matches the desired criteria. Many developers find it difficult to write regular expressions during the software development process. Therefore, generating regular expressions from natural language requirements has become a research focus. In recent years, systems that transform natural language descriptions into regular expressions have achieved some research results, but often only for simple serialized texts. This paper explores methods for converting natural language queries into regular expressions that can execute their intended functionality. Given the successful application of syntactic parsing in natural language processing, our model utilizes the structural information of natural language by embedding syntax parse trees in a hierarchically aggregated manner. We employ the Tree-transformer architecture, suitable for tree-structured input, to perform self-attention encoding on natural language descriptions. The decoder uses cross-attention to predict the regular expression. The model is validated on two public datasets. Experimental results demonstrate that our model effectively improves the quality of generated regular expressions. It outperforms existing models on the DFA-Equal-Acc evaluation metric.
Deep Learning-based Method for Mining Ocean Hot Spot News
QIN Xianping, DING Zhaoxu, ZHONG Guoqiang, WANG Dong
Computer Science. 2024, 51 (11A): 231200005-10.  doi:10.11896/jsjkx.231200005
The rapid development of the mobile Internet and the popularity of modern mobile clients promote the vigorous development of the online news industry, social media and self-media, providing users with diverse and rich information. With the steady advancement of China's maritime power strategy and the significant enhancement of national maritime awareness, the Internet is flooded with multifaceted information on the ocean field, with relevant media reports and public opinions proliferating online and hotspot events occurring frequently. Aiming at multi-source and multi-attribute online marine information, and based on multi-source text clustering and automatic summarization technology, an automatic deep learning-based ocean hot news mining system is proposed, including five functional modules: automatic collection of multi-source ocean-related data, data preprocessing, feature extraction, text clustering, and automatic summarization. Specifically, the web crawler program collects diverse and scattered ocean data from multiple data sources, automatically structures the data and stores it in the database; clustering analysis is performed based on the similarity of text features and relationships between texts, which provides data support for subsequent summary generation and topic discovery. Additionally, an automatic summary generation method for ocean news is proposed, leveraging the powerful contextual understanding and rich language expression abilities of pre-trained language models. Multiple experiments demonstrate the effectiveness of the proposed method on each evaluation index, highlighting its superiority in mining news on multi-source heterogeneous networks. This method provides a feasible solution for processing scattered marine information and generating more readable content summaries, significantly contributing to the enhancement of marine information retrieval efficiency, monitoring public opinion trends, and promoting the application and dissemination of marine information.
Aspect-based Sentiment Analysis Based on BERT Model and Graph Attention Network
LIN Huang, LI Bicheng
Computer Science. 2024, 51 (11A): 240400018-7.  doi:10.11896/jsjkx.240400018
Aspect-based sentiment analysis is a fine-grained sentiment analysis task that aims to analyze the sentiment polarity of specific aspects in a given text. Current syntax-based methods rely heavily on a single parsing result from the dependency tree, and most of the existing research lacks sufficient integration of semantic and syntactic features. Therefore, this paper proposes an aspect-based sentiment analysis approach based on the BERT model and a graph attention network. This method can effectively exploit the semantic and syntactic information in sentence structures and fuse this information through an interactive attention mechanism to obtain more accurate sentiment features. Firstly, the BERT pre-trained model is utilized to obtain the initial vectors of the text, and an attention mechanism is employed to associate the global semantic information of aspect words, resulting in the semantic features of the text. Secondly, a syntax parser is used to construct a phrase structure graph and a dependency graph, and a graph attention network is applied to encode the node information, leading to the syntactic features of the text. Finally, through an interactive attention mechanism, the learned semantic and syntactic features are combined to achieve a comprehensive understanding of aspect-opinion relationships from multiple perspectives. Experimental results show that the proposed method outperforms the existing state-of-the-art methods in terms of ACC and F1 values on multiple datasets.
Study on Named Entity Recognition of NOTAM Based on BiLSTM-CRF
XIANG Heng, YANG Mingyou, LI Meng
Computer Science. 2024, 51 (11A): 240300148-6.  doi:10.11896/jsjkx.240300148
To address the problem that current research by the International Civil Aviation Organization on digital NOTAMs considers only compatibility with the environment of textual NOTAMs, but not of digital NOTAMs, a named entity recognition model for NOTAMs based on BiLSTM-CRF is proposed to realise the automatic recognition of relevant entities in textual NOTAMs and to provide the necessary basic data for conversion to digital NOTAMs. A tagged NOTAM corpus dataset is constructed, and comparative experiments are carried out with three models: LSTM, BiLSTM and BiLSTM-CRF. The experimental results show that the precision, recall and F1 value of the proposed method are 95%, 95% and 95%, respectively, which verifies the effectiveness of the proposed method in the field of NOTAMs and shows that it can effectively obtain the important entity information in NOTAMs.
Multi-task Learning Model for Text Feature Enhancement in Medical Field
GUO Ruiqiang, JIA Xiaowen, YANG Shilong, WEI Qianqiang
Computer Science. 2024, 51 (11A): 240200041-7.  doi:10.11896/jsjkx.240200041
The recognition and standardization of medical named entities are the foundation for constructing high-quality medical knowledge graphs. This paper proposes a multi-task learning model based on text feature enhancement, aiming to address the inadequate utilization of text features in existing models for medical entity recognition and standardization. The model incorporates word-level features, character-level features, and contextual semantic information to enhance text representation. Through four hierarchical sub-tasks, it jointly models the medical entity recognition and standardization tasks. Experiments indicate that the proposed model can learn common features for both the entity recognition and entity standardization tasks, effectively improving the accuracy of learning. Satisfactory results are achieved on two datasets, NCBI and BC5CDR, with F1 scores for the NER and NEN tasks of 1.09%, 91.02%; 92.05%, 92%, respectively.
Sentiment Analysis of Image-Text Based on Multiple Perspectives
GAO Weijun, SUN Zibi, LIU Shujun
Computer Science. 2024, 51 (11A): 231200163-8.  doi:10.11896/jsjkx.231200163
In the realm of social media, facial expressions of characters in pictures often captivate our attention first, directly evoking strong emotional responses. However, for a truly comprehensive emotional expression, scenes play a pivotal role, serving as a crucial backdrop and support for emotional analysis. Scenes provide context, setting the tone and atmosphere for the emotions being expressed. Regrettably, numerous scholars have failed to fully recognize the significance of scenes in emotional expression, often focusing solely on facial expressions. This oversight has led to suboptimal outcomes in sentiment analysis, missing out on the rich emotional nuances that scenes can provide. To address these challenges, we propose the multi-view image text sentiment analysis network (MITN). This approach takes into account both facial expressions and scenes, providing a more comprehensive analysis of emotional expression. In MITN, we enhance image feature extraction by incorporating an attention mechanism that captures the facial expressions of characters. At the same time, dilated convolution is introduced to broaden the receptive field, focusing on the intricate details of the scene. Moreover, we leverage the Places dataset for transfer learning training of Scene-VGG. This allows us to fully utilize the vast amount of scene information available, enhancing the accuracy and depth of our emotional analysis. The effectiveness of MITN is tested through experiments on the multimodal sentiment dataset MVSA. Utilizing BERT+BiGRU to extract text expression features, our model demonstrates superior performance in sentiment analysis, accurately capturing the emotional nuances present in both facial expressions and scenes. This comprehensive approach offers a new perspective in sentiment analysis, paving the way for a more accurate and nuanced understanding of emotional expression in social media.
Fine-grained Entity Recognition Model in Audit Domain Based on Adversarial Migration of Sample Contributions
PANG Bowen, CHEN Yifei, HUANG Jia
Computer Science. 2024, 51 (11A): 240300197-8.  doi:10.11896/jsjkx.240300197
Fine-grained named entity recognition (NER) identifies entity information in pro-poor texts in the auditing domain, which is crucial for optimising the analysis and evaluation of pro-poor policy effectiveness. In recent years, deep learning has achieved significant results in fine-grained NER tasks, but this specific domain still faces problems such as the lack of corpora, the increasing incompatibility of fine-grained features in transfer learning, and data imbalance. To address these issues, we formulate a fine-grained pro-poor audit entity labelling system and construct a fine-grained pro-poor audit corpus (FG-PAudit-Corpus) to address the scarcity of datasets in the audit domain. A fine-grained entity recognition model (FGATSC) based on adversarial migration of sample contributions is proposed, which performs adversarial migration training and incorporates the sample contribution weights into the migrated features to solve the incompatibility problem of fine-grained features. Meanwhile, to counter the imbalance between high-resource samples in the source domain and low-resource samples in the pro-poor audit domain, a balanced resource adversarial discriminator (BRAD) is proposed to reduce this effect. Experimental results show that the F1 value of the FGATSC model on FG-PAudit-Corpus is 75.83%, which is an improvement of 9.03% over the baseline model and of 4.01% to 6.53% over the other mainstream models. In the generalisation validation on the Resume dataset, the F1 value is improved by about 0.14% to 1.31% compared with the mainstream models of recent years, and reaches 95.77%. In summary, the validity and generalizability of the FGATSC model are verified.
Autonomous Exploration Methods for Unmanned Aerial Vehicles Based on Deep Reinforcement Learning
TANG Jianing, LI Chengyang, ZHOU Sida, MA Mengxing, SHI Yang
Computer Science. 2024, 51 (11A): 231100139-6.  doi:10.11896/jsjkx.231100139
Faced with unstructured and unknown environments, such as mountains and jungles, UAVs must simultaneously perform environment sensing and trajectory planning without prior information. Traditional methods are constrained by factors such as algorithms and sensors, resulting in limited exploration range, low efficiency, and susceptibility to interference from environmental changes. To solve this problem, this study proposes an autonomous exploration method for UAVs based on deep reinforcement learning. The method builds on the normalized advantage functions (NAF) algorithm and introduces three algorithmic enhancement mechanisms to improve the exploration range and efficiency of UAVs in unstructured and unknown environments. Experiments in a self-designed simulation environment show that the improved NAF algorithm achieves a larger exploration range and higher efficiency than the original version, while exhibiting superior convergence and robustness.
Model Counting Method for Pseudo-Boolean Constraints
ZHENG Suhao, NIU Qinzhou, TAO Xiaomei
Computer Science. 2024, 51 (11A): 240300161-5.  doi:10.11896/jsjkx.240300161
The Pseudo-Boolean constraints problem is a kind of combinatorial optimization problem similar to the Boolean constraints problem. The core of solving such problems lies in encoding Pseudo-Boolean constraints in different mathematical forms, such as linear programming, integer programming, and other forms of combinatorial optimization. The currently popular approach is to convert the problem into Boolean formulas and then solve these formulas with a conflict-driven clause learning (CDCL) SAT solver. This paper proposes a new method for the model counting problem in the Pseudo-Boolean constraints problem. Firstly, the related concepts of knowledge compilation and extension rules are introduced. Then, the transformation of a Pseudo-Boolean constraints problem into a binary decision diagram (BDD) by knowledge compilation is discussed in detail, with emphasis on the characteristics of the BDD structure. Finally, a model counting method based on extension rules is adopted to solve the model counting problem in the Pseudo-Boolean constraints problem. Experimental results show that the proposed method performs better when dealing with clauses with higher complementarity factors.
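To make the notion of model counting for a Pseudo-Boolean constraint concrete, the following is a minimal Python sketch. It is not the extension-rule method of the paper: it simply counts the 0/1 assignments that satisfy a single linear constraint by branching on one variable at a time and memoizing on the pair (variable index, remaining bound), which loosely mirrors how equivalent nodes are shared in a decision diagram. The function name `count_models` and the single-constraint form `sum(coeffs[i] * x[i]) <= bound` are assumptions made for illustration.

```python
from functools import lru_cache


def count_models(coeffs, bound):
    """Count 0/1 assignments x with sum(coeffs[i] * x[i]) <= bound.

    Illustrative only: one linear constraint with integer coefficients.
    """
    n = len(coeffs)

    @lru_cache(maxsize=None)
    def count(i, remaining):
        # All variables decided: the assignment is a model iff the
        # left-hand side did not exceed the bound.
        if i == n:
            return 1 if remaining >= 0 else 0
        # Branch on x[i] = 0 and x[i] = 1; identical sub-problems
        # (same index, same remaining bound) are counted only once.
        return count(i + 1, remaining) + count(i + 1, remaining - coeffs[i])

    return count(0, bound)


# 2*x1 + 3*x2 + 4*x3 <= 5 is satisfied by 000, 100, 010, 001 and 110.
print(count_models([2, 3, 4], 5))
```

Because the base case is only checked once every variable is decided, the sketch also handles negative coefficients, at the cost of exploring branches that an optimized counter would prune early.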
Tourism Route Planning Based on Colored Traveling Salesman Problem
GU Chan, FU Yan, YE Shengli
Computer Science. 2024, 51 (11A): 231200072-8.  doi:10.11896/jsjkx.231200072
Studying the tourism route planning problem is crucial for promoting the development of urban tourism and improving tourists' experience, and researchers usually abstract it as a colored traveling salesman problem. However, when solving for optimal routes over large numbers of attractions, existing methods suffer from poor flexibility and slow convergence. Therefore, the colored traveling salesman problem and its application in tourism route planning are explored based on automata. Firstly, the roadmap of attractions is modeled with automata, and the problem is analyzed in a decentralized manner using the hierarchical automata method. Secondly, the selection of attractions is handled flexibly, and invalid paths are deleted by exploiting the structural properties of the model. Finally, the shortest route is computed with an ant colony algorithm on the simplified model. The experiment uses attractions in and around Xi'an city as sample data. The results show that, compared with the traditional ant colony algorithm and the simulated annealing algorithm, the proposed algorithm reduces the complexity of the problem, searches within the effective range, and converges to the shortest path within 20 iterations. It can also flexibly plan reasonable tourist routes according to tourists' personalized needs.
Design and Application of Attention-enhanced Dynamic Self-organizing Modular Neural Network
ZHANG Zhaozhao, PAN Haoran, ZHU Yingqin
Computer Science. 2024, 51 (11A): 231000069-9.  doi:10.11896/jsjkx.231000069
In response to the complexity and nonlinear characteristics of chaotic time series, this paper proposes a novel neural network model: the attention-enhanced dynamic self-organizing modular neural network (ADAMNN). Grounded in the divide-and-conquer philosophy, the model employs an attention mechanism to compute the similarity between different sub-networks and the input data, which enables an adaptive partitioning of sub-networks through hierarchical clustering. Subsequently, a dynamic growth mechanism based on hierarchical clustering adjusts the size of sub-network clusters, and the activated sub-network clusters are employed for online learning of input samples. In addition, a novel attention-based weighted ensemble output method for sub-networks is introduced, which integrates traditional ensemble output approaches. Experiments were conducted on the Mackey-Glass time series, a rapidly varying MG time series, a nonlinear system identification task, and gas concentration datasets from coal mining operations. The ADAMNN model demonstrated real-time updating of sub-network centers and dynamic formation of sub-network clusters. Moreover, compared with dynamic self-organizing modular neural networks based on Euclidean space, ADAMNN exhibits an approximately 40% improvement in prediction accuracy.
Dynamic Multi-Objective Optimization Algorithm with Irregularly Varying Number of Objectives
LI Sanyi, LIU Shuang
Computer Science. 2024, 51 (11A): 231000079-11.  doi:10.11896/jsjkx.231000079
In this paper, a hybrid strategy based initial population prediction algorithm (HIPPA) is proposed to solve the dynamic multi-objective optimization problem in which the number of objectives varies irregularly with time. HIPPA determines whether the environment has changed according to the number of objectives, and classifies the environment type according to that number. In the population initialization stage, the initial population is generated by three mechanisms. First, an improved neural network algorithm is trained on historical population information to generate one part of the initial population. Second, an improved elite strategy uses historical population information to generate another part. Finally, an improved random strategy generates the remaining part to maintain population diversity. The effectiveness of the proposed algorithm is verified on the benchmark experiments F1-F5, and the results are compared with other dynamic optimization algorithms. Experimental results show that HIPPA solves the dynamic multi-objective optimization problem with an irregularly varying number of objectives more effectively.
High-generalization Ability EEG Emotion Recognition Model with Differential Entropy
LI Zhengping, LI Hanwen, WANG Lijun
Computer Science. 2024, 51 (11A): 231200066-7.  doi:10.11896/jsjkx.231200066
With the advent of deep learning, the study of EEG signals has developed further, and the commonly used methods for emotion classification include artificial neural networks (ANN) and deep learning (DL). However, EEG datasets are small, while deep learning networks require large amounts of training data to complete classification tasks, so improving classification performance and generalization with a limited amount of data is a research focus. To address the influence of the real environment on EEG signals and the generalization of neural network models in EEG research, this paper fully exploits the information contained in the EEG signal, proposes a deep learning model that considers both the original EEG signal and the differential entropy (DE) feature, and designs the data acquisition and processing procedures of the experiment. Experiments are carried out on the DEAP dataset, the SEED dataset, and self-collected experimental data to evaluate the performance and generalization ability of the network, and to explore the correlation between deep learning networks in emotion classification on EEG signals. The network model and feature processing method constructed in this paper obtain an accuracy of 85.62% in three-class emotion classification on the SEED dataset. On the DEAP dataset, accuracies of 59.38% and 61.70% are obtained in binary emotion classification on the valence and arousal dimensions of the original EEG, respectively.
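As a point of reference for the DE feature mentioned above, the differential entropy of a signal segment is commonly computed under a Gaussian assumption, in which case it reduces to 0.5 * ln(2 * pi * e * variance). The Python sketch below shows that closed form only. The Gaussian assumption, the use of the population variance, and the function name `differential_entropy` are assumptions for illustration; the abstract does not state how the authors compute the feature or which frequency bands they use.

```python
import math
from statistics import pvariance


def differential_entropy(samples):
    """Differential entropy (in nats) of a segment, assuming the
    samples are drawn from a Gaussian distribution."""
    variance = pvariance(samples)  # population variance of the segment
    return 0.5 * math.log(2 * math.pi * math.e * variance)


# A toy segment with unit variance, and the same segment at twice
# the amplitude.
unit = differential_entropy([-1.0, 1.0])
doubled = differential_entropy([-2.0, 2.0])
print(unit, doubled - unit)
```

Doubling the amplitude multiplies the variance by four and therefore adds ln 2 to the feature, which illustrates that DE under this assumption is a logarithmic measure of signal power.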
Parallel Computing of Reentry Vehicle Trajectory by Multiple Shooting Method Based on OPENMP
LI Siyao, LI Shanglin, LUO Jingzhi
Computer Science. 2024, 51 (11A): 231000019-6.  doi:10.11896/jsjkx.231000019
The implementation is based on the multiple shooting method. The trajectory optimization problem of a boost-glide vehicle is converted into a nonlinear programming problem in order to optimize a three-degree-of-freedom reentry trajectory. Using OpenMP together with a sequential quadratic programming (SQP) optimizer, the integrals in the equality constraints can be computed in parallel. A reentry trajectory optimization algorithm is constructed with the multiple shooting method, and parallel computation is performed on the model. The MATLAB version uses the interior point method as the optimizer, while the C version uses the sequential quadratic programming algorithm; the C program is converted from the MATLAB version. The CAV-H model is selected for the simulation experiment. With OpenMP, parallel computing achieves a speedup of 8.398 times. The results of the multiple shooting method and the direct shooting method are basically consistent: with minimum heat absorption as the objective function, the heat absorption obtained by the multiple shooting method differs little from that of the direct shooting method. In simulation, the speedup is highest when the number of threads is 13, and the average relative efficiency is more than 93%.
Image Processing & Multimedia Technology
Research Progress of 3D Point Cloud Data Processing Methods
GUO Zhangxiang, YAN Tianhong, ZHOU Guoqiang
Computer Science. 2024, 51 (11A): 240100132-13.  doi:10.11896/jsjkx.240100132
The point cloud is one of the important representations for understanding 3D scenes, and 3D point clouds have important applications in reverse modeling of offshore platforms, seabed topography mapping, damage measurement of mooring systems of deep-water floating structures, and visualization of submarine pipelines. On this basis, this paper reviews point cloud data processing methods and divides them into two categories: traditional processing algorithms and deep learning-based methods. The traditional processing algorithms are introduced and summarized from three aspects: filtering, object recognition and classification, and registration. The deep learning-based methods are introduced and summarized from three aspects: point cloud, voxelization, and multi-view. The advantages and disadvantages of the various algorithms are summarized and compared, and future development trends and directions of 3D point cloud processing technology are discussed.
Classification of Thoracic Diseases Based on Attention Mechanisms and Two-branch Networks
SONG Ziyan, LUO Chuan, LI Tianrui, CHEN Hongmei
Computer Science. 2024, 51 (11A): 230900116-6.  doi:10.11896/jsjkx.230900116
Thoracic disease classification based on chest radiographs is important for improving diagnostic accuracy and reducing the pressure on the healthcare system. The main challenge in this task is the huge variation in the size of the regions affected by different thoracic diseases. When classifying diseases with small lesion regions, most of the image consists of noisy regions, and traditional methods struggle to cope effectively with the large size differences among diseases. To address this problem, a mask construction method combining multi-scale features is proposed. Using DenseNet-121 as the feature extractor, a two-branch network is constructed in which the global branch is used for overall classification and tiny lesion regions are fed into the local branch to mitigate the interference of noisy regions. A branch feature fusion module based on the attention mechanism adaptively fuses the classification features from the two branches. Comparison experiments, ablation experiments, and parameter sensitivity analyses are performed on the ChestX-ray14 dataset. The experimental results show that the average AUC of the proposed method for classifying 14 thoracic diseases is higher than that of existing methods, and that the method is effective and insensitive to parameters.
Study on Automatic Segmentation Method of Retinal Blood Vessel Images
ZHAO Yanli, XING Yitong, LI Xiaomin, SONG Cai, WANG Peipei
Computer Science. 2024, 51 (11A): 231000061-7.  doi:10.11896/jsjkx.231000061
With the rapid advancement of computer science and technology, digital image processing is widely used in medical diagnostics. Because human health is closely correlated with retinal vascular characteristics, evaluating retinal images has become crucial for medical diagnosis. Traditional manual retinal vessel segmentation is time-consuming and lacks reproducibility, and no longer meets current demands. Consequently, this paper introduces an automatic retinal vessel segmentation algorithm. Firstly, the retinal image is preprocessed and enhanced using the RGB color model, histogram equalization, and morphological methods. Secondly, the Otsu threshold method is used to segment and extract the main blood vessels of the image; the threshold is then dynamically adjusted through Gaussian matched filtering to segment the small blood vessels, and the segmented main-trunk and small-vessel images are merged and optimized. Finally, the performance of the proposed algorithm is assessed on 20 images from the DRIVE image database, yielding an average accuracy, sensitivity, and specificity of 96.2%, 77.3%, and 97.9%, respectively, which affirms its effectiveness and reliability.
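The Otsu step mentioned above selects the gray level that maximizes the between-class variance of the two pixel classes it separates. The following pure-Python sketch shows that criterion on a flat list of 8-bit intensities. It is a generic illustration of Otsu's method, not the authors' pipeline; the function name `otsu_threshold` and the convention that pixels less than or equal to the threshold form the background class are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight0 = 0      # number of pixels in class 0 so far
    sum0 = 0         # intensity sum of class 0 so far
    best_t, best_score = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue  # class 0 still empty
        if weight1 == 0:
            break     # class 1 empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance, up to a constant factor 1 / total**2.
        score = weight0 * weight1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Toy "image": dark background at 10, bright vessels at 200.
toy = [10] * 50 + [200] * 50
t = otsu_threshold(toy)
vessel_mask = [p > t for p in toy]
print(t, sum(vessel_mask))
```

On this bimodal toy input every threshold between the two modes gives the same score, so the sketch returns the first one it meets; real retinal images have broader histograms and a single clear maximum is more typical.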
3D Reconstruction Algorithm for Lower Limb X-ray Images Based on Generative Adversarial Networks
YE Ruiwen, WANG Baohui
Computer Science. 2024, 51 (11A): 230900089-7.  doi:10.11896/jsjkx.230900089
Lower limb bone deformities have long been a common and difficult-to-treat condition in orthopedic medicine. Doctors usually need to judge the degree of deformity from the patient's anteroposterior and lateral X-ray images of the lower limb bones. Diagnosis and surgical planning rely heavily on the doctor's professional level and experience, which is a major challenge in the current medical field. To reduce the difficulty of diagnosis and provide doctors with more intuitive and accurate models of lower limb bone deformities, this study applies deep learning to medical image processing and 3D reconstruction, and proposes the PSSobel-X2CTGAN model to reconstruct 3D CT images from 2D X-ray films. The main research content includes: 1) investigating and organizing the data preprocessing steps of CT image normalization, cropping and scaling, and DRR generation, so that they can be better applied to the training and prediction of 3D reconstruction models; 2) applying the generative adversarial principle to model training and optimizing the sampling process in the generator so that the generated 3D model is closer to the real situation; 3) designing a loss function that builds on the basic reconstruction and projection losses and introduces a Sobel loss to sharpen the edges of the final image, making it more suitable for high-precision 3D bone model reconstruction. Experiments are conducted on open-source pelvic and knee joint data, and the results show that the model outperforms the original model on various evaluation indicators. Moreover, the visual results show that the model achieves satisfactory reconstructions and has practical value for the diagnosis of lower limb deformities.
EO-YOLOX Model for Insulators Detection in Transmission Lines
HU Yimin, QU Guang, WANG Xiabing, ZHANG Jie, LI Jiadong
Computer Science. 2024, 51 (11A): 240200107-6.  doi:10.11896/jsjkx.240200107
To ensure the safe operation of the power system, daily inspection of high-voltage insulators using UAV inspection techniques is necessary. However, the magnetic field of power lines and flight safety requirements reduce the number of pixels that represent insulators in the image data, which in turn reduces the accuracy of insulator detection. To address these issues, this paper proposes an efficient optimization YOLOX (EO-YOLOX) detection model. Firstly, the model draws on the idea of atrous convolution and introduces an atrous spatial pyramid pooling (ASPP) module, which suppresses irrelevant information in the image and improves the ability of the network to identify the region of interest. Secondly, an attentional feature fusion (AFF) module is added to the feature fusion stage, which improves insulator detection accuracy by supplementing the fused feature map with deep semantic and shallow detail information. Finally, because the traditional loss function cannot accurately reflect the distance between two bounding boxes, this paper proposes an optimised loss function that assesses the quality of bounding boxes more accurately. Experiments on the insulator dataset show that the proposed algorithm performs well in identifying insulators, with an improvement of about 2.59% in mAP over the traditional YOLOX method. The model processes 41.21 frames per second in real time, which effectively solves the insulator detection problem.
Graphical LCD Pixel Defect Detection Algorithm Based on Improved YOLOV8
ZHANG Feng
Computer Science. 2024, 51 (11A): 240100162-7.  doi:10.11896/jsjkx.240100162
During the inspection of industrial instrument LCD displays, pixel defects are difficult to detect because of their small size. Traditional computer vision methods are sensitive to environmental changes and require manual parameter setting. In response to these problems, this paper designs an LCD screen defect detection algorithm based on deep learning, which can identify pixel-level defects on the LCD screen with limited computing power. The main work includes: (1) To address the small number of positive samples produced for small targets during positive and negative sample assignment, an adaptive positive sample enhancement method for targets of different sizes is proposed. (2) To address the difficulty of training on small targets caused by the small IoU of their positive samples, an adaptive positive sample IoU compensation weighting method is proposed. (3) To address the sensitivity of small datasets to hyperparameters in the loss function, a classification loss function with imbalanced positive and negative cross-entropy weights is designed. (4) To address the difficulty of extracting detailed features of small targets, frequency channel attention is introduced in the backbone network to enhance the extraction of such features. Experiments show that the mAP_s reaches 63.3%, which is 3.7% higher than that of the baseline model YOLOV8. The mAP_s for pixel defects reaches 78.85%, an improvement of 4.5%, while the recall rate for pixel defects reaches 99.8%. The mAP_s for dust detection targets reaches 47.8%, an improvement of 3%. These results verify the effectiveness of the proposed algorithm.
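The small-IoU issue named in item (2) can be seen from the definition of intersection over union alone. The Python sketch below computes IoU for axis-aligned boxes and compares the effect of the same one-pixel localization error on a 4x4 box and on a 40x40 box. It illustrates the motivation only; it is not the compensation weighting proposed in the paper, and the box format `(x1, y1, x2, y2)` and the helper name `iou` are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The same 1-pixel horizontal shift applied to a small and a large box.
small = iou((0, 0, 4, 4), (1, 0, 5, 4))
large = iou((0, 0, 40, 40), (1, 0, 41, 40))
print(round(small, 3), round(large, 3))
```

The one-pixel shift costs the small box far more overlap than the large one, so a fixed IoU threshold for positive samples tends to leave small targets with fewer and lower-quality positives.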
Event-based Camera Object Detection Algorithm for Cross-modal Noisy Annotations Filtering
HU Gang, LIANG Dong, HUANG Shengjun
Computer Science. 2024, 51 (11A): 231000013-6.  doi:10.11896/jsjkx.231000013
Event-based cameras are commonly used for object detection in scenarios that are challenging for traditional cameras (high speed, strong light, low light, etc.) because of their high temporal resolution, high dynamic range, and low power consumption. However, the event sequences output by an event camera are difficult to label manually because of pixel asynchrony, so existing methods obtain event sequence annotations by transferring RGB image annotations. The transferred annotations contain numerous inaccurate bounding boxes, and some object textures in event sequences are blurry, which leads to poor model performance. To address this problem, an event-based camera object detection algorithm with cross-modal noisy annotation filtering is proposed. The method uses a pre-trained event-based camera detector to filter open-source RGB object detection datasets and selects the RGB images that are most valuable for training the event-based camera detector. The selected RGB images are combined with event images to construct cross-domain mixed images, helping the detector to identify and locate objects in event images more accurately. To mitigate the impact of noisy annotations on detector performance, a multi-stage joint optimization strategy for object detection is designed: after each training stage, noisy annotations are identified among the global annotations and corrected for use in the next stage. Experimental results on the 1Mpx Detection Dataset show that the proposed method provides an 8.35% gain over the baseline model, significantly outperforming noisy-label learning methods such as Co-teaching and O2U-net. Specifically, training with cross-modal mixed images and the joint optimization framework contribute gains of 6.44% and 4.77%, respectively.
Mountain Fire Detection Algorithm of Transmission Line Based on Multi-scale Features and Coordinate Information
CHEN Dong, ZHOU Hao, YUAN Guowu, YANG Lingyu, CHENG Qiuyan, REN Ying, MA Yi
Computer Science. 2024, 51 (11A): 230900155-7.  doi:10.11896/jsjkx.230900155
The variable scale and complex background of smoke and fire targets in transmission line mountain fires can lead to low detection accuracy and false alarms. To address these issues, this paper proposes an improved YOLOv5 object detection algorithm. Firstly, to tackle scale variability, and in particular the fact that the cascaded SPPF structure focuses only on local feature information during scale fusion, this paper proposes spatial pyramid pooling cross stage partial conv (SPPCSPC), which combines hierarchical and cross stage partial network (CSP) structures to extract and fuse multi-level, multi-scale global feature information and enhance the model's ability to detect smoke and fire targets. Secondly, to solve the misdetection problem, this paper proposes a new neck network, PCANet (path coordinate aggregation network). While integrating the backbone and neck feature maps, it embeds the target's position information along the vertical and horizontal directions of the feature maps into the channels. This strengthens the model's attention to the position of smoke and fire targets and reduces interference from complex backgrounds. Experiments are conducted on a transmission line smoke and fire dataset to evaluate the proposed model. The proposed algorithm increases mAP50 by 1.6%, mAP50:95 by 1.5%, and recall by 2.4%, which effectively improves the detection accuracy of smoke and fire targets, reduces false detections, and better completes the detection task.
ORB Algorithm Based on Key Point Density Optimization
JING Youxian, ZHU Qingsheng
Computer Science. 2024, 51 (11A): 240300048-5.  doi:10.11896/jsjkx.240300048
In a stereo vision inspection system, feature matching is crucial for identifying and aligning similar features between different images, and underpins tasks such as image comparison, object recognition, and 3D reconstruction. The quality of feature matching directly affects the accuracy of the whole stereo vision detection system. Feature point extraction is the basis of feature matching, and the quality of these points directly determines the accuracy of matching and the robustness of the algorithm. The ORB algorithm is widely used for feature matching because of its high efficiency, but the number of feature points and the uniformity of their distribution are deficient in complex scenes. In this paper, an improved adaptive sampling method based on keypoint density is proposed. It optimizes the distribution of keypoints in the ORB algorithm by combining the local contrast and gradient information of the image, so that keypoints are selected uniformly across the whole image and feature point extraction improves. Experimental results on the Middlebury stereo vision dataset show that the improved algorithm significantly increases the number of keypoints and the uniformity of their distribution compared with the traditional method, while maintaining an operational efficiency close to that of the original ORB algorithm. This study provides an effective solution to the shortcomings of the ORB algorithm in complex scenes and offers a new approach to optimizing feature point extraction and matching in computer vision.
Multi-scale Dual Self-attention Based Remote Sensing Image Change Detection
SHI Jingye, ZUO Yiping, ZHI Ruicong, LIU Jiqiang, ZHANG Mengge
Computer Science. 2024, 51 (11A): 231000097-9.  doi:10.11896/jsjkx.231000097
To address the varying scales of ground objects, insufficient context information, and the difficulty of recovering edge details in remote sensing images, a pixel-level change detection network (PixelNet) based on multi-scale dual self-attention is proposed for remote sensing image change detection. On the one hand, a multi-scale feature pyramid based on hybrid dilated convolution is used to extract convolutional features, and a dual self-attention module is added to obtain channel and spatial attention. This enlarges the receptive field of the features while taking both detail and semantic information into account, and further enriches the global context information. On the other hand, to reduce the smoothed, blurry boundaries of ground objects, a new edge repair module is implemented through automatic joint training with an edge-aware loss and a weighted contrastive loss. To address sample imbalance, a data processing strategy of weighted balanced sampling with a threshold is proposed, which reduces the training bias that arises because changed pixels are far fewer than unchanged pixels. Experiments on the remote sensing image datasets CDD and LEVIR-CD show that the proposed PixelNet outperforms state-of-the-art methods in both subjective visual quality and objective evaluation metrics on remote sensing change detection tasks. The detection accuracy is 98.0% and the F1 score is 96.7% on the CDD dataset; the accuracy reaches 95.8% and the F1 score reaches 87.2% on the LEVIR-CD dataset. The method effectively addresses sample imbalance, the lack of context information in bi-temporal features, and the misclassification of hard edge examples in remote sensing change detection.
Integration of Multi-scale and Attention Mechanism for Ancient Mural Detachment Area Localization
WANG Xinchao, YU Ying, CHEN An, ZHAO Huirong
Computer Science. 2024, 51 (11A): 231200162-8.  doi:10.11896/jsjkx.231200162
In response to the challenging problem of accurately and automatically localizing peeling areas in ancient murals, this paper proposes a lightweight network model based on a multi-scale fusion attention network. Firstly, a multi-scale fusion attention module is introduced to enable the network to learn features at different scales with a focus on the most critical features, thus improving the accuracy of peeling area localization. Depthwise separable convolutions are employed in the proposed multi-scale fusion attention module to make the network model more lightweight. Secondly, a combination of cross-entropy loss and Dice score is used as the loss function, and the Adam optimizer is applied to further enhance localization accuracy. Additionally, datasets of Dunhuang Mogao Grottoes murals and Yunnan Shiping Luose Temple murals are constructed, and their peeling areas are manually annotated. Experimental results demonstrate that the proposed network model accurately localizes peeling regions in ancient murals. Compared with existing deep learning methods, the model significantly reduces the number of parameters and performs better in terms of subjective visual quality, objective evaluation metrics, and generalization.
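One common way to combine cross-entropy with the Dice score for a binary segmentation mask is to add the binary cross-entropy to one minus a smoothed Dice coefficient. The pure-Python sketch below shows that combination on flattened masks. The equal weighting of the two terms, the smoothing constant, and the function name `bce_dice_loss` are assumptions for illustration; the abstract does not specify how the authors weight or implement the two terms.

```python
import math


def bce_dice_loss(probs, targets, eps=1e-7, smooth=1.0):
    """Binary cross-entropy plus (1 - Dice) over flattened masks.

    probs:   predicted foreground probabilities in [0, 1]
    targets: ground-truth labels, 0 or 1
    """
    n = len(probs)
    # Cross-entropy term; probabilities are clamped to avoid log(0).
    bce = -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for p, t in zip(probs, targets)
    ) / n
    # Soft Dice coefficient with additive smoothing.
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)
    return bce + (1 - dice)


mask = [1, 0, 1, 0]
perfect = bce_dice_loss([1.0, 0.0, 1.0, 0.0], mask)
uncertain = bce_dice_loss([0.5, 0.5, 0.5, 0.5], mask)
print(perfect, uncertain)
```

The cross-entropy term penalizes each pixel independently, while the Dice term measures overlap of the whole region, which is why the pair is often used when the foreground (here, peeling areas) occupies a small fraction of the image.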
Road Obstacle Detection Method Based on Self-attention and Bidirectional Feature Fusion
LI Ting, ZHAO Erdun, YANG Jun
Computer Science. 2024, 51 (11A): 240100138-5.  doi:10.11896/jsjkx.240100138
With the rapid development of technology, assisted driving has become an important direction for the future development of the automotive industry. In image-based road obstacle detection, existing methods have limited detection capabilities for targets with large scale changes, small targets, and occluded targets, often resulting in false and missed detections. To address this problem, a road obstacle detection method based on self-attention and bidirectional feature fusion (CoXT-FCOS) is proposed. The method introduces a grouped self-attention module, CoXT, in the backbone to enhance the network's ability to capture global information. To handle occlusion, a cross-stage pyramid pooling module, SPPCSPC, is introduced. In the feature fusion stage, a path enhancement network is introduced to form a bidirectional feature fusion module, ESPAFPN, which enhances the network's perception of small targets. Experiments show that the CoXT-FCOS model achieves high accuracy, with an mAP of 88% on the CODA dataset, and detects obstacles on the road more accurately.
Face Micro-expression Recognition Method Based on ME-ResNet
JIANG Sheng, ZHU Jianhong
Computer Science. 2024, 51 (11A): 231000053-7.  doi:10.11896/jsjkx.231000053
Abstract PDF(3086KB) ( 153 )   
References | Related Articles | Metrics
Facial micro-expressions are characterized by short duration and small movement amplitude. Factors such as the small sample size of available datasets also pose great challenges to micro-expression recognition. To solve these problems, this paper proposes a micro-expression recognition method based on the ME-ResNet residual network. First, in the pre-processing stage, a key frame sequence is extracted at equal intervals between the onset frame and the apex frame of the micro-expression video clip, and the improved Farneback optical flow method is then used to extract the motion features of the key frame sequence. Second, a ResNet50 network based on 3D convolution is constructed, and the CBAM spatial and channel attention mechanism is added to the network's Bottleneck module to enhance its ability to focus on key facial motion features. Next, the ME-ResNet network model is constructed, and the extracted facial optical flow motion features are fed to the network for training. Finally, data augmentation is used to increase the training sample size, and the ME-ResNet network model is applied to micro-expression recognition tasks. Experimental results on the CASME II, SMIC and SAMM datasets show that the recognition rate of the proposed algorithm reaches 84.42%, 72.56% and 70.41%, respectively, which is higher than that of the compared algorithms.
Lightweight and Efficient Recognition Method for Chinese Character Click-based CAPTCHA
JIN Xinhao, CHI Kaikai
Computer Science. 2024, 51 (11A): 240100031-9.  doi:10.11896/jsjkx.240100031
Abstract PDF(2732KB) ( 142 )   
References | Related Articles | Metrics
With the advent of digitalization, enterprises increasingly rely on robotic process automation technologies to reduce costs and improve efficiency, thus maintaining competitiveness. However, the level of automation is limited by the difficulty of recognizing Chinese character click-based CAPTCHAs in certain process steps. Existing research on this problem faces difficulties in dataset creation, poor model generalization, and an imbalance between model complexity and performance. To address these issues, this paper proposes a low-cost dataset creation approach and a lightweight Chinese character click-based CAPTCHA recognition method with excellent generalization performance. Specifically, a significantly lightweight version of the YOLOv8-n model, tailored for Chinese character detection, is employed. Subsequently, preprocessing operations such as segmentation and rectification are applied to the CAPTCHA images. The highly versatile PaddleOCR model is used for Chinese character recognition, reducing the cost of scene adaptation. Furthermore, the best matching result is obtained through the recognition probability matrix, further enhancing accuracy. Additionally, a semi-automatic process for constructing Chinese character detection datasets is designed and made publicly available. This research aims to promote the development of automated Chinese character click-based CAPTCHA recognition techniques and to raise the level of enterprise process automation.
Study on Identification of Concrete Sand and Gravel Aggregate Types Based on Improved Residual Network
CAO Qingyuan, ZHU Jianhong
Computer Science. 2024, 51 (11A): 231000082-6.  doi:10.11896/jsjkx.231000082
Abstract PDF(3078KB) ( 164 )   
References | Related Articles | Metrics
To solve the problem of low identification accuracy for complex types of concrete sand and gravel aggregates and to realize automatic identification of aggregate types, a CM-ResNet18 network model suitable for identifying concrete sand and gravel aggregate types is proposed. The ResNet18 model is selected as the backbone network, and the CBAM and MHSA modules are fused into it to enhance the model's feature extraction ability. Dropout is then added to improve the generalization performance of the network, transfer learning is introduced into training to accelerate convergence, and the learning rate of the last layer is increased to better adapt to the training data and improve model performance. Experimental results show that the CM-ResNet18 model achieves an accuracy of up to 99.09% in the identification of raw materials. Compared with the AlexNet, VGG19, EfficientNet, ResNet18 and ResNet34 network models, the CM-ResNet18 model improves recognition accuracy, precision, recall and F1-score, which indicates that the method is practical and feasible for identifying concrete sand and gravel aggregates.
Road Crack Detection Based on Separable Convolution and Wave Transform Fusion
LIU Yunqing, WU Yue, ZHANG Qiong, YAN Fei, CHEN Shanshan
Computer Science. 2024, 51 (11A): 240100141-9.  doi:10.11896/jsjkx.240100141
Abstract PDF(4406KB) ( 152 )   
References | Related Articles | Metrics
To address the weak detection ability and low segmentation accuracy of current methods for small cracks, an improved U-Net model is proposed for road crack detection. This paper designs a new module, the multi-scale depthwise separable convolution block (MSDWBlock), which is applied in the encoder and decoder sections. Its depthwise separable convolutions enhance the model's capability and expand its receptive field, and a C2G attention mechanism module is introduced in the skip connections to enhance the model's perception of crack features. In addition, atrous spatial pyramid pooling (ASPP) and the discrete wavelet transform (DWT) are introduced. ASPP helps to capture the characteristics of cracks by operating at multiple scales, while DWT reduces the loss of crack spatial information during convolution and pooling and preserves crack edge information. This structural design makes the network focus more on the characteristics of cracks, thereby improving the accuracy of crack detection. Experiments demonstrate that the accuracy of the proposed model is better than that of advanced models such as U-Net, Segnet, and U2net. On the CFD dataset, mIoU and F1 reach 78.51% and 0.868, respectively. These results indicate that the proposed method can effectively improve the performance of road crack detection.
Text-driven Generation of Emotionally Diverse Facial Animations
LIU Zengke, YIN Jibin
Computer Science. 2024, 51 (11A): 240100094-8.  doi:10.11896/jsjkx.240100094
Abstract PDF(3286KB) ( 153 )   
References | Related Articles | Metrics
This paper presents an innovative text-driven facial animation synthesis technique, which integrates emotion models to enhance the expressiveness of facial expressions. The methodology is composed of two core components: facial emotion simulation and the consistency between lip movements and speech. Initially, a deep analysis of the input text identifies the types of emotions contained and their intensities. Subsequently, these emotional cues are utilized to generate corresponding facial expressions using the three-dimensional free-form deformation algorithm (DFFD). Concurrently, phonemes and lip movement data from human speech are collected. These are then precisely aligned with the phonemes in the text over time using forced alignment technology, resulting in a sequence of changes in lip key points. Following this, intermediate frames are generated through linear interpolation to further refine the timeline of lip movements. Finally, the DFFD algorithm synthesizes the lip animation based on this time series data. By meticulously balancing the weights between facial emotions and lip animations, this approach successfully achieves highly realistic virtual facial expressions.
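The linear interpolation step that generates intermediate lip frames can be sketched as follows. This is only an illustration under assumptions: lip key points are taken to be 2D `(x, y)` tuples, and the function name `interpolate_keyframes` is hypothetical rather than part of the published method.

```python
def interpolate_keyframes(kf_a, kf_b, n_intermediate):
    """Return n_intermediate frames linearly interpolated strictly between two key frames.

    kf_a, kf_b -- lists of (x, y) lip key points in the same order (assumed 2D).
    """
    assert len(kf_a) == len(kf_b)
    frames = []
    for i in range(1, n_intermediate + 1):
        t = i / (n_intermediate + 1)  # interpolation factor in (0, 1)
        frames.append([
            (xa + t * (xb - xa), ya + t * (yb - ya))
            for (xa, ya), (xb, yb) in zip(kf_a, kf_b)
        ])
    return frames


# Usage: three intermediate frames between two aligned key frames.
mid_frames = interpolate_keyframes([(0, 0), (2, 2)], [(4, 0), (2, 6)], 3)
```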
Asymmetric Teacher-Student Network Model for Industrial Image Anomaly Detection
KONG Senlin, ZHANG Hui, HUANG Zhennan, LIU Youwu, TAO Yan
Computer Science. 2024, 51 (11A): 240200069-7.  doi:10.11896/jsjkx.240200069
Abstract PDF(2805KB) ( 169 )   
References | Related Articles | Metrics
Industrial image anomaly detection is a critical component in large-scale industrial manufacturing. Addressing challenges such as difficulty in annotating anomalous samples and obtaining prior information about anomalous regions in industrial image anomaly detection, a model based on asymmetric teacher-student networks for unsupervised image anomaly detection is proposed. Firstly, to tackle the problem of over-imitation mapping caused by high similarity in structure between teacher and student networks, an asymmetric teacher-student network is designed. Contextual Transformer modules are introduced into the residual blocks of the student network to add structural diversity to the teacher-student networks, preventing the student network from over-imitating the mapping of the teacher network. Secondly, to enhance the generalization difference between teacher and student networks, a moving average normalization layer is introduced into the teacher network to improve detection performance. Finally, a multi-scale abnormality map fusion mechanism is introduced to better detect anomalies of different sizes by fusing abnormality score maps of different scales. Experiments conducted on the MVTec AD public dataset show that the proposed method achieves an image-level AUROC of 95.7% and a pixel-level AUROC of 97.4%, verifying the feasibility and effectiveness of the approach.
Real-time Accurate Object Tracking for Resource-constrained Edge Devices
ZHANG Xinyi, TAN Guang
Computer Science. 2024, 51 (11A): 231200167-9.  doi:10.11896/jsjkx.231200167
Abstract PDF(4218KB) ( 147 )   
References | Related Articles | Metrics
Real-time video analysis tasks often involve running computationally intensive deep neural network (DNN) models for object tracking. In practical applications, offloading multi-stream video analysis tasks to edge devices near the cameras has become crucial. However, these edge devices often have limited computing resources, resulting in low tracking accuracy. This is primarily due to outdated detection results, accumulated tracking errors, and the inability to detect new objects. To address these issues, a prediction-correction based framework is proposed. The framework comprises three core components: 1) predictive detection propagation, which rapidly updates outdated object bounding boxes using a lightweight prediction model to match the current frame; 2) a frame difference corrector, which refines bounding boxes based on frame difference information; and 3) a new object detector, which discovers newly appearing objects during the tracking process by clustering frame difference features. Experimental results demonstrate that the framework achieves accuracy improvements ranging from 19.4% to 34.7% compared to baseline methods across various traffic scenarios while maintaining real-time execution speed.
Go Chessboard Recognition Based on Light-YOLOv8
ZHANG Lei, WU Wenzhe, BAI Xueyuan
Computer Science. 2024, 51 (11A): 230900037-7.  doi:10.11896/jsjkx.230900037
Abstract PDF(3967KB) ( 208 )   
References | Related Articles | Metrics
A real-time detection algorithm, Light-YOLOv8, based on a combination of a three-dimensional attention mechanism and lightweight convolution is proposed to achieve high-precision real-time chessboard recording during Go games. On the basis of YOLOv8, PWConv+PConv is used to replace the 3×3 convolution of the cross stage partial network in the backbone, which greatly reduces the computational complexity of the model. The CARAFE upsampling operator and the SimAM three-dimensional attention mechanism are added to improve the detection of small Go targets. The Wise-IoU loss function improves the model's localization ability and convergence speed, and improves its detection ability in cases of piece adhesion, piece overlap, and uneven lighting. The model is optimized and compressed for mobile deployment and deployed on different Android devices with an image resolution of 640×480. The average time for a single detection, including image preprocessing and post-processing, is 89 ms, and the average detection frame rate is 37.6 fps. In 50 rounds of score recording experiments, the average score recording accuracy exceeds 97% and the average winner/loser discrimination accuracy is 100%, showing that the method achieves a stable Go score recording function.
Bottleneck Multi-scale Graph Convolutional Network for Skeleton-based Action Recognition
HUANG Haixin, WANG Yuyao, CAI Mingqi
Computer Science. 2024, 51 (11A): 231000073-5.  doi:10.11896/jsjkx.231000073
Abstract PDF(2962KB) ( 136 )   
References | Related Articles | Metrics
Action recognition methods have achieved significant success in the field of computer vision. Graph convolutional networks (GCNs) are crucial techniques for action recognition tasks, especially for extracting features from graph-structured data. However, existing GCNs suffer from limitations such as an excessive reliance on predefined skeleton topological graphs and a lack of flexibility in handling large temporal convolution kernels, which significantly constrain their expressive power and robustness. In this paper, we propose an adaptive bottleneck multi-scale graph convolutional action recognition method based on skeleton data. The adaptive spatial module optimizes the skeleton topological graph structure and parameters, enhancing the model's flexibility. The bottleneck multi-scale temporal module improves temporal modeling capabilities while reducing channel width to save computational cost and parameters. Experimental results on the large-scale skeleton action recognition datasets NTU-RGB+D and NTU-RGB+D 120 show that the accuracy of our model is improved to a certain extent.
Fundus Vascular Image Segmentation Algorithm Based on Attention Mechanism
WANG Libin, WANG Shumei
Computer Science. 2024, 51 (11A): 231000003-6.  doi:10.11896/jsjkx.231000003
Abstract PDF(3754KB) ( 158 )   
References | Related Articles | Metrics
To narrow the semantic gap between the encoder and the decoder, a medical image segmentation algorithm based on the attention mechanism is proposed. Firstly, the CBAM attention module is used to enhance the model's feature extraction from medical images. Secondly, the feature map output by the CBAM module is used as the input of the feature refinement module proposed in this paper, which restores the vascular detail lost due to downsampling and thereby narrows the semantic gap. Finally, a scale attention module is used to combine the features of feature maps at different scales to form the final prediction. Compared with currently popular retinal vessel segmentation algorithms, the proposed algorithm improves mIoU by up to 2.3% on the DRIVE dataset, and by 0.4% over the closest approach. This demonstrates that the proposed model can effectively enhance segmentation accuracy and achieves good results in restoring subtle vascular pixels.
Impervious Surface Change Detection in Urban Fringe Areas Based on Multi-scale and Dual-attention Network
MU Zhengyang, DAI Jianguo, ZHANG Guoshun, HOU Wenqing, CHEN Peipei, CAO Yujuan, XU Miaomiao
Computer Science. 2024, 51 (11A): 231200064-8.  doi:10.11896/jsjkx.231200064
Abstract PDF(6217KB) ( 166 )   
References | Related Articles | Metrics
As an important feature of urbanization, impervious surface intuitively reflects the scope of urbanization. Using remote sensing imagery and computer vision to detect changes in impervious surface in urban fringe areas over time is an effective means of observing urban expansion, and is of great significance to urban construction planning and sustainable development. Nevertheless, peri-urban regions, situated between urban and natural environments, have intricate and irregular characteristics and exhibit a high degree of heterogeneity, which makes change detection a formidable challenge. To solve these problems, this paper adopts a squeeze-and-excitation (SE) structure and a multi-scale fusion module (MSFM) to improve and optimize the Deeplabv3+ network, and constructs a high-precision impervious surface change extraction model, MSDANet, to realize the automatic extraction of impervious surface change. Meanwhile, based on the Google Earth satellite imagery platform, high-resolution images of Urumqi's metropolitan outskirts in 2017 and 2021 are acquired, and a well-annotated high-resolution impervious surface change detection (HISCD) dataset is established and made publicly available. Compared with leading change detection networks, MSDANet achieves the best results with good change extraction capability, and is able to accurately extract multiple impervious surface change types. The OA, Precision, Recall, F1 and MIoU metrics on the HISCD test set reach 90.77%, 80.51%, 78.83%, 79.63% and 68.80%, respectively. This study provides a novel approach for urban expansion analysis and effective technical support for urban spatial planning.
Research and Application of Color Mapping Function Optimization Method for Segmented Pipeline Axial Data Visualization
LUO Yuetong, ZHAO Dongsheng, PENG Jun, DONG Ziqiu
Computer Science. 2024, 51 (11A): 240400039-4.  doi:10.11896/jsjkx.240400039
Abstract PDF(2475KB) ( 138 )   
References | Related Articles | Metrics
Axial data is often visualized in pseudo-color on 3D pipeline models, and the color mapping function has a decisive impact on the visualization effect. Although many color mapping functions now work well on data with common distribution characteristics, it is still difficult to achieve ideal results on data with special distributions, so color mapping functions for various special distributions are being studied. If the axial data of a three-dimensional pipeline is composed of several segments and exhibits “small differences within segments and large differences between segments”, common mapping functions struggle to represent subtle differences within segments and significant differences between segments at the same time, which degrades the visualization. To address this issue, this article proposes a mapping function optimization method based on control points to improve the visualization of segmented data. Experiments on synthetic data and on radiation data from the cooling tubes of fusion reactors both confirm the effectiveness of the method.
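To make the idea of a control-point mapping function concrete, here is a minimal sketch of a piecewise-linear mapping from data values to a normalized colormap position in [0, 1]. The abstract does not specify how the control points are optimized, so the control points below are hand-picked assumptions, and the name `make_mapping` is hypothetical.

```python
from bisect import bisect_right


def make_mapping(control_points):
    """Build a piecewise-linear map from data value to colormap position.

    control_points -- (value, position) pairs; values must be distinct.
    Positions are normalized colormap coordinates in [0, 1].
    """
    pts = sorted(control_points)
    xs = [v for v, _ in pts]
    ys = [p for _, p in pts]

    def mapping(v):
        # Clamp values outside the control-point range.
        if v <= xs[0]:
            return ys[0]
        if v >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, v) - 1  # segment with xs[i] <= v < xs[i + 1]
        t = (v - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return mapping


# Two data segments (about 10-11 and 100-101) with a large gap between them.
# Most of the color range is spent inside the segments, little on the gap.
segmented = make_mapping([(10, 0.0), (11, 0.45), (100, 0.55), (101, 1.0)])
uniform = make_mapping([(10, 0.0), (101, 1.0)])
```

With the uniform mapping, a within-segment value such as 10.5 lands at about 0.005 of the color range, whereas the segmented mapping places it at 0.225, so variation inside a segment becomes visible while the two segments still occupy distinct color ranges.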
Language Recognition Based on Improved MFCC and Energy Operator Cepstrum
CHEN Sizhu, LONG Hua, SHAO Yubin
Computer Science. 2024, 51 (11A): 231000065-6.  doi:10.11896/jsjkx.231000065
Abstract PDF(1983KB) ( 143 )   
References | Related Articles | Metrics
To address the low accuracy and poor robustness of language recognition for broadcast speech signals at low signal-to-noise ratio (SNR), a language recognition algorithm based on wavelet-packet-improved MFCC and energy operator cepstrum features is proposed. Firstly, the WMFCC feature parameters are obtained by using the wavelet packet transform in place of the Fourier transform and Mel filter in MFCC. While retaining the auditory perception characteristics of the human ear, this improves the high-frequency analysis ability and analysis accuracy for the speech signal and overcomes the limitations of the Fourier transform. Secondly, the Teager energy operator cepstrum is extracted to capture the instantaneous energy of the speech and is fused with the improved MFCC feature parameters to obtain a new feature parameter, TWMFCC. Finally, to further improve recognition of low-SNR speech, a VMD adaptive Wiener filtering denoising algorithm is proposed. Experiments compare the recognition performance of the proposed features with that of traditional features. The average recognition accuracy of the proposed features is 13.02% higher than that of traditional MFCC without speech denoising. The method effectively alleviates the low recognition accuracy of traditional features at low SNR and is robust to noise.
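For readers unfamiliar with the Teager energy operator, the standard discrete form is ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below implements only this operator; the cepstrum computation and the fusion with WMFCC described in the abstract are not reproduced, and the function name `teager_energy` is a hypothetical choice.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, since the first and last samples lack a neighbor.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# For a pure tone A*cos(omega*n), the operator yields the constant A^2 * sin(omega)^2,
# so it tracks both the amplitude and the frequency of the signal.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
```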
Dunhuang Mural Element Detection Algorithm Based on Improved Yolov8
ZHOU Yanlin, WU Kaijun, MEI Yuan, TIAN Bin, YU Tianxiu
Computer Science. 2024, 51 (11A): 231000034-6.  doi:10.11896/jsjkx.231000034
Abstract PDF(4381KB) ( 184 )   
References | Related Articles | Metrics
The Dunhuang murals have garnered significant attention for their artistic, historical, and research value. In research on and cultural tourism development around the murals, detecting the elements within them is crucial. However, factors such as flaking, pigment fading, pest damage, and large differences in element size make mural element detection difficult. For this reason, this paper improves and extends the Yolov8 algorithm and applies it to the mural element detection task. Specifically, an enhanced SPPCSPC module is designed to improve the feature perception ability of the model and expand its receptive field. Additionally, the CoordAttention mechanism is introduced at the end of the C2f module to improve the network's ability to focus on local and non-salient information, which addresses the variability in size and style of the elements. On the task of detecting elements within Dunhuang murals, the proposed algorithm outperforms five other cutting-edge detection algorithms in detection accuracy. Compared to the Yolov8 baseline, it achieves a 2.2% improvement in mAP, with a 12.2% improvement in detection accuracy for the main_buddha category. This offers significant support for future research on the analysis of Dunhuang murals.
Study on Deep Learning Algorithm for Foreground Subject Segmentation of Non-specific Category Images
CHEN Xianglong, LI Haijun
Computer Science. 2024, 51 (11A): 231000071-9.  doi:10.11896/jsjkx.231000071
Abstract PDF(4564KB) ( 148 )   
References | Related Articles | Metrics
By incorporating the SENet channel attention mechanism into the Mobile Unet network, the image foreground subject segmentation algorithm is improved. The algorithm introduces depthwise separable convolution to reduce the number of model parameters, while using skip connections and multi-scale feature fusion to improve segmentation accuracy. During training, a spatial pyramid pooling module with atrous (dilated) convolution is used to enlarge the receptive field and improve the model's recognition of large-scale objects. Experimental results show that the improved algorithm achieves 96% mIoU (mean Intersection over Union) on the PASCAL VOC2012 dataset, with an accuracy of 0.971, which is superior to various existing image segmentation algorithms, such as the FCN fully convolutional network. In terms of speed, the processing time of the model for each image is between 1.7 s and 2.5 s. The improved algorithm has a faster inference speed than traditional fully convolutional neural networks, making it suitable for real-time image segmentation on mobile devices. Comparative experiments evaluate the Mobile Unet models before and after the improvement, as well as the FCN model, on foreground subject segmentation of images under bright and dim conditions, and conclude that the improved Mobile Unet model performs best. Finally, the algorithm is deployed, a GUI is designed, and an .exe executable file is generated.
Study on Detection Method of Bridge Crack Based on YOLOv5
LI Jun, LIU Nian, ZHANG Shiyi
Computer Science. 2024, 51 (11A): 231200063-7.  doi:10.11896/jsjkx.231200063
Abstract PDF(4560KB) ( 168 )   
References | Related Articles | Metrics
To address the difficulty of recognizing different kinds of cracks in bridge crack identification, improve the model's fitting ability, and enhance crack feature extraction, this paper proposes an algorithm called YOLOv5-Crack, which is based on the fusion of YOLOv5 and EfficientNet and incorporates the CBAM attention mechanism. Firstly, the feature extraction network of YOLOv5 is replaced with the EfficientNet network, known for its high accuracy and efficiency, to extract crack features. Secondly, the convolutional block attention module (CBAM) is used to enhance the model's ability to capture the feature information of shallow targets by combining channel and spatial attention modules, thereby improving crack recognition accuracy. Finally, the model is trained on the bridge crack dataset “concrete crack images for classification”. The results show that YOLOv5-Crack detects large cracks more accurately than YOLOv5, with improved mAP@0.5, recall, and precision. Additionally, it consumes less computing power while meeting the requirements of crack detection.
Gender Recognition of Electronic Disguised Voices Based on MLP
ZHANG Xiao, GUAN Linyu
Computer Science. 2024, 51 (11A): 240400021-4.  doi:10.11896/jsjkx.240400021
Abstract PDF(1917KB) ( 136 )   
References | Related Articles | Metrics
A neural-network-based disguised voice recognition model is proposed to identify the gender of the speaker of disguised speech from parameters such as formant center frequency, bandwidth and sound intensity. The model uses a multi-layer perceptron (MLP) as the framework, obtains the gender recognition result through stacked fully connected non-linear layers, and uses L-BFGS for parameter optimization during training. SoundTouch is used to disguise the original male and female voices, linear predictive coding (LPC) is then used to extract parameters such as the center frequency, bandwidth and sound intensity of the formants, and outliers are eliminated. Experiments are then carried out to explore the influence of network structure and activation function on the model, as well as the adaptability of the recognition model to different electronic disguise methods. The experimental results show that the MLP-based recognition model can effectively distinguish the gender of the speaker of voices disguised by different methods. This lays the foundation for speaker recognition of electronically disguised voices.
Lightweight Terrain Map Building Approach Combining Laser and Vision
LI Yawen, ZHANG Botao, ZHONG Chaoliang, LYU Qiang
Computer Science. 2024, 51 (11A): 240400051-9.  doi:10.11896/jsjkx.240400051
Abstract PDF(6798KB) ( 195 )   
References | Related Articles | Metrics
The performance of robots in complex environments is closely related to their interaction with the environment, and traditional geometric mapping cannot adequately capture detailed information about the environment. To deal with this problem, this study proposes a lightweight terrain map building approach combining laser and vision (LTMB-LV). Based on temporal and spatial synchronization, this method extracts semantic terrain information with an improved CSPResnet and fuses it with point clouds to generate semantic point clouds containing terrain information, thereby building a terrain map with terrain descriptions. Meanwhile, a local submap stitching method based on an improved ICP for optimizing point cloud registration is employed for building terrain maps in large-scale scenarios, while a parallel method enhances real-time performance. Experimental results in real environments demonstrate that the proposed approach can efficiently detect many typical terrains and construct lightweight terrain maps with limited onboard computational power.
Road Extraction from Complex Urban Remote Sensing Images Based on Multi-task Learning
WANG Kunyang, LIU Yang, YE Ning, ZHANG Kai
Computer Science. 2024, 51 (11A): 240300095-8.  doi:10.11896/jsjkx.240300095
Abstract PDF(5352KB) ( 143 )   
References | Related Articles | Metrics
In this paper, we propose a new framework for road extraction from remote sensing images that aims to utilize the knowledge gained from road edge detection to improve the accuracy of road extraction. A multi-scale visual attention module that fuses multi-scale information and visual attention mechanisms is introduced in the study, and a cascading feature fusion module is constructed to integrate the network's prediction results at different scales. Based on this, we construct a multi-scale visual attention network (MSVANet) containing encoders and decoders. A multi-task learning framework that incorporates the MSVANet is also proposed, and a particle swarm optimization algorithm (PSO) is used to optimize the automatic selection of the two learning rate hyperparameters of the multi-task learning framework. The training and testing results on the RNBD dataset show that the proposed method outperforms other road extraction methods in terms of various segmentation accuracy metrics and generalization ability.
Improved YOLOv5s Algorithm for Detecting Fireworks in Urban Building Environments
YU Yongbo, SUN Zhen, ZHU Lingxi, LI Qingdang
Computer Science. 2024, 51 (11A): 240100051-7.  doi:10.11896/jsjkx.240100051
Abstract PDF(4666KB) ( 178 )   
References | Related Articles | Metrics
A smoke and fire detection algorithm based on an improved YOLOv5s is proposed to address the issues of low detection accuracy and long detection time in urban building environments. Firstly, the prior boxes for the smoke and fire dataset are re-clustered using K-means. The CA (coordinate attention) mechanism is embedded in the backbone feature extraction network of YOLOv5s to suppress noise interference. Drawing on the ideas of Efficient RepGFPN and BiFPN (bidirectional feature pyramid network) in DAMO-YOLO, a novel neck, BiGFPN, is designed to reconstruct the neck of YOLOv5s and promote multi-scale fusion. To make effective use of the semantic information in feature maps, the lightweight universal upsampling operator CARAFE (content-aware reassembly of features) is introduced. To offset the increase in parameters and computation caused by these improvements, GhostNet is used to reconstruct the neck of YOLOv5s. Replacing the bounding box regression loss function CIoU with SIoU accelerates the convergence of the model and improves accuracy. Experimental results show that the improved YOLOv5s has fewer parameters and lower computational complexity, and that mAP50 increases by 4.6%, which basically meets the requirements of smoke and fire detection.
PS YOLOv8:Enhancing Detection of Small-scale Damage in Power Lines Inspection
SONG Shangze, LI Li, TIAN Ye, BAI Jie
Computer Science. 2024, 51 (11A): 240100003-6.  doi:10.11896/jsjkx.240100003
Abstract PDF(3503KB) ( 162 )   
References | Related Articles | Metrics
In the field of power line inspection, accurately detecting minute cracks and small damages, which are often overlooked due to their scale and background complexity, is crucial. If not identified and addressed timely, these minor damages can escalate into major safety hazards. To address this challenge, this paper introduces the PowerScreen-YOLOv8 (PS-YOLOv8) model. Compared to the original YOLOv8, this model has made significant advancements in detecting small objects. It integrates six key improvements to enhance detection accuracy in complex environments. The model's superiority has been demonstrated through rigorous testing and benchmarking against leading algorithms. With an impressive accuracy rate of 90.3% and validated robustness in real-world drone-captured scenarios, PS-YOLOv8 represents a significant leap in power line inspection technology. It offers a more reliable, efficient, and safer approach to infrastructure maintenance.
Automatic Sleep Staging Based on Multimodal Data and Fusion Deep Network
ZHAO Ruonan, LI Duo, SONG Jiangling, ZHANG Rui
Computer Science. 2024, 51 (11A): 231100160-6.  doi:10.11896/jsjkx.231100160
Abstract PDF(3240KB) ( 178 )   
References | Related Articles | Metrics
Accurate sleep staging is an important basis for evaluating sleep quality and diagnosing related diseases.Aiming at the differences between the electroencephalogram(EEG) and the electrooculogram(EOG) in different stages of sleep,this paper proposes a new feature fusion deep network based on EEG and EOG,called MAFSNet,to realize automatic sleep staging.Specifically,we first design two different one-dimensional convolutional neural networks to extract effective sleep features from EEG and EOG signals.Secondly,an adaptive feature fusion module is constructed to assign different weights to the features according to their degree of contribution.By enhancing discriminative features and suppressing irrelevant features,an adaptive fusion feature containing multi-modal sleep information is obtained.Then,the temporal information in the sleep stage transition rules is learned using a bidirectional long short-term memory network.Finally,the public data set Sleep-EDF is used to verify the effectiveness of the proposed model for five-stage sleep staging.The results show that the proposed method has high classification performance in sleep staging,with an accuracy of 94.1%,a Cohen's Kappa coefficient of 88.2% and a macro-averaged F1-score of 81.9%,in which the recall rates of the N1 and REM sleep stages are significantly increased to 64.6% and 93.5%,respectively.
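The evaluation metrics named in the abstract above (accuracy,Cohen's Kappa and macro-averaged F1) can all be derived from a confusion matrix.The following is a minimal sketch,not the authors' evaluation code;the function names and the row/column convention (rows are true stages,columns are predicted stages) are assumptions made for illustration.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(k)) / n
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, computed from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    k = len(cm)
    scores = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[i][c] for i in range(k)) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / k
```

Because macro-averaging weights every class equally,a rarely occurring stage with low recall can pull the macro F1 well below the overall accuracy.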
Optimization of License Plate Image Restoration and Recognition in Video with Motion Blur Based on Frequency Domain Point Spread Function
ZHU Peiwu, GAO Shuhui, XIE Zhaoyu, FU Yu
Computer Science. 2024, 51 (11A): 231200046-9.  doi:10.11896/jsjkx.231200046
Abstract PDF(5202KB) ( 150 )   
References | Related Articles | Metrics
Vehicles are commonly used tools in criminal cases,and license plate recognition is one of the crucial criteria for identifying involved vehicles.The restoration of license plate images degraded by motion blur is an important research direction in digital image processing.This paper proposes a license plate image restoration algorithm based on the estimation of frequency domain point spread function parameters.It utilizes the two-dimensional discrete Fourier transform,the Radon transform,and the Wiener filtering algorithm to restore license plates affected by motion blur at surveillance video viewing angles.The process starts with preprocessing the motion-blurred image by converting it to grayscale and reducing noise.A Hanning window is applied to the image,followed by the two-dimensional discrete Fourier transform and a logarithmic operation,which yields the power spectrum of the image.The Radon transform is applied to the spectrum to estimate the blur direction,and a minimum value computation on the spectrum determines the blur length,which completes the estimation of the two parameters of the point spread function.The image is then deconvolved using the Wiener filtering algorithm to obtain the restored image.To address the susceptibility of traditional spectrum estimation to interference from central bright lines,a window function is added before the discrete Fourier transform.To validate the algorithm's restoration effect,experiments are conducted on motion-blurred images captured by road surveillance and compared with the non-windowed method,establishing a method for restoring motion-blurred license plate images.Experimental results demonstrate that the proposed algorithm has an advantage in preserving the semantic information of license plate images and can complement the field of criminal image technology in the restoration of blurred license plates.
Enhanced Brain Cancer Detection Algorithm Based on YOLOv8
WANG Zhe, ZHAO Huijun, TAN Chao, LI Jun, SHEN Chong
Computer Science. 2024, 51 (11A): 231100122-7.  doi:10.11896/jsjkx.231100122
Abstract PDF(4907KB) ( 163 )   
References | Related Articles | Metrics
Automatically detecting the location of brain tumors in magnetic resonance imaging is a complex and laborious task that requires a lot of time and resources.Traditional identification schemes are prone to misjudgment and omission,which affects the progress of patient treatment and poses a risk to patient safety.To further improve the identification results,this paper introduces four key improvement measures.Firstly,the efficient multi-scale attention(EMA) module is adopted,which can encode global information,recalibrate information,and further aggregate the output features of parallel branches for cross-dimensional interaction.Secondly,the BiFPN(bidirectional feature pyramid network) module is introduced to shorten the time required for each detection and improve image recognition results.Then,the MDPIoU loss function and Mish activation function are improved to further enhance detection accuracy.Finally,simulation experiments are conducted for verification.The experimental results show that the improved YOLOv8 algorithm achieves higher precision,recall,and mean average precision(mAP) in brain cancer detection.Specifically,precision,recall,mAP@0.5 and mAP@0.5:0.9 increase by 4.48%,2.64%,2.6% and 17.0%,respectively.
Grading Model for Diabetic Retinopathy Based on Graph Convolutional Network
YANG Yufan, YUAN Liming, WANG Ke, LI Hongyi, LI Yixuan, YAO Yujia, WANG Jingyi
Computer Science. 2024, 51 (11A): 231000042-5.  doi:10.11896/jsjkx.231000042
Abstract PDF(2500KB) ( 166 )   
References | Related Articles | Metrics
Diabetic retinopathy is a high-risk blinding disease.If it is detected early,it can be treated to slow or stop further vision loss in patients.There have been some successful cases of using deep learning for diabetic retinopathy detection.Nevertheless,these methods usually only consider the spatial relationship between pixels in images and do not take into account the relationship between deeper features of the images.For this reason,a graph convolutional network based diabetic retinopathy grading model is proposed with the aim of helping doctors and researchers to better grade and diagnose diabetic retinopathy images in clinical practice and scientific research.This model mainly uses the graph convolutional network to capture the important grading information embedded among deep features of an image,to obtain features with stronger semantic information,and to construct a two-branch network based on it.In addition,for better feature fusion,an adaptive weighting mechanism is used to further improve the grading performance.Experimental results show that the proposed method can fully learn the relationship between the deep features of the image by using the graph convolutional network,so as to improve the classification performance,and its classification accuracy reaches about 84.8% on the APTOS2019 dataset and about 68% on the Messidor-2 dataset.
Gaussian-bias Self-attention and Cross-attention Based Module for Medical Image Segmentation
LUO Huilan, GUO Yuchen
Computer Science. 2024, 51 (11A): 240300071-9.  doi:10.11896/jsjkx.240300071
Abstract PDF(3764KB) ( 285 )   
References | Related Articles | Metrics
To address the problems in medical image segmentation,such as varying target sizes,diverse representations of the same anatomical structures across slices,and low distinction between organs and background leading to excessive redundant information,a novel model based on Gaussian bias self-attention and contextual cross attention,named Gaussian bias and cross attention U-Net(GCA-UNet),is proposed.The model utilizes residual modules to establish spatial prior hypotheses,employs Gaussian bias self-attention and external attention mechanisms to learn spatial priors and enhance feature representations of adjacent areas,and uses external attention to understand inter-sample correlations.The cross attention gated mechanism leverages multi-scale feature extraction to reinforce structural and boundary information while recalibrating contextual semantic information and filtering out redundant data.Experimental results on the Synapse abdominal CT multi-organ segmentation dataset and the ACDC cardiac MRI dataset show that the proposed GCA-UNet achieves mean Dice scores of 81.37% and 91.69%,respectively,with a mean HD95 boundary distance of 16.01 on the Synapse dataset.Compared to other advanced medical image segmentation models,GCA-UNet offers higher segmentation accuracy with clearer tissue boundaries.
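The mean Dice score reported in the abstract above measures the overlap between a predicted and a reference segmentation,averaged over the organ classes.The sketch below is illustrative only:it assumes label maps flattened into plain lists of integer class ids,and the function names are hypothetical.

```python
def dice(pred, target, cls):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for one class id."""
    a = sum(1 for p in pred if p == cls)
    b = sum(1 for t in target if t == cls)
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    # Convention assumed here: a class absent from both maps scores 1.0.
    return 1.0 if a + b == 0 else 2 * inter / (a + b)


def mean_dice(pred, target, classes):
    """Unweighted mean of the per-class Dice coefficients."""
    return sum(dice(pred, target, c) for c in classes) / len(classes)
```

For example,with `pred = [0, 1, 1, 2]` and `target = [0, 1, 2, 2]`,classes 1 and 2 each score 2/3,so the mean Dice over the two foreground classes is 2/3.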
Improved Lightweight Aerial Photography Object Detection Model Based on YOLOv5s
CHEN Haiyan, MAO Lihong
Computer Science. 2024, 51 (11A): 231100119-8.  doi:10.11896/jsjkx.231100119
Abstract PDF(4943KB) ( 184 )   
References | Related Articles | Metrics
The difficulty of target detection is increased by complex backgrounds,dense targets,and a high proportion of small objects in unmanned aerial vehicle(UAV) aerial images.Deployment on the embedded devices of drones is difficult due to the high computational complexity of target detection models based on deep learning.Aiming at the above problems,an improved lightweight aerial image object detection model based on YOLOv5s is proposed.Firstly,the BottleNeck of the C3 module in the YOLOv5s backbone network is replaced with the lightweight ShuffleNetv2 to reduce model parameters and computational complexity.Secondly,cross-layer information fusion,the SE channel attention mechanism,and residual connections are introduced in the ShuffleNetv2 network to alleviate the reduction in the number of feature channels caused by convolution operations and the insufficient utilization of feature map information in the middle layers of the network.Then,the SE channel attention mechanism is introduced into the YOLOv5s multi-scale feature fusion network,augmenting the network's ability to capture and extract key features.Finally,the proposed model is further lightened by channel pruning.Experimental results on the NWPU VHR-10 dataset show that,compared with the YOLOv5s model,the proposed model improves precision by 3.5% and mean average precision by 1.9%.The number of parameters and the computational workload are reduced by 76% and 48.7%,respectively,the model size is compressed by 73.8%,and the detection speed is improved by 48%.
Partition-Time Masking:A Data Augmentation Method for Lip Reading
HU Yu, YIN Jibin
Computer Science. 2024, 51 (11A): 240300139-6.  doi:10.11896/jsjkx.240300139
Abstract PDF(2433KB) ( 142 )   
References | Related Articles | Metrics
This paper proposes a new data augmentation method for lip-reading called Partition-Time Masking.This method operates directly on the input data,dividing it into multiple subsequences,each undergoing a separate masking operation before being sequentially reassembled.This approach enhances the model's robustness to inputs with partial frame loss,thereby improving generalization.Five augmentation strategies are designed based on the number of divided subsequences and the source of the mask values.Comparative experiments are also conducted with the Time Masking method,a pivotal data augmentation technique in lip-reading research.Experiments are carried out on the LRW and LRW1000 datasets.The results indicate that the Partition-Time Masking method surpasses the Time Masking method in enhancing model performance.The optimal strategy is identified as using an average frame of each subsequence for masking,with the number of subsequences set to three.This approach improves the performance of the state-of-the-art lip-reading model DC-TCN from 89.6% to 90.0%.
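The strategy the abstract above identifies as best (three subsequences,each masked with its own average frame) can be sketched as follows.This is not the authors' implementation:the abstract does not say how the mask length and position are chosen,so the uniform sampling,the `max_mask` parameter and the function name are assumptions,and frames are represented as plain lists of floats.

```python
import random


def partition_time_masking(frames, num_parts=3, max_mask=2, rng=None):
    """Split frames into num_parts subsequences, mask a random run in each
    with that subsequence's mean frame, and reassemble them in order."""
    rng = rng or random.Random()
    n = len(frames)
    # Integer boundaries that split the sequence as evenly as possible.
    bounds = [i * n // num_parts for i in range(num_parts + 1)]
    out = []
    for start, end in zip(bounds, bounds[1:]):
        part = [list(f) for f in frames[start:end]]  # copy, input is untouched
        if part:
            dim = len(part[0])
            mean = [sum(f[d] for f in part) / len(part) for d in range(dim)]
            # Assumed policy: mask length and offset drawn uniformly at random.
            length = rng.randint(0, min(max_mask, len(part)))
            offset = rng.randint(0, len(part) - length)
            for i in range(offset, offset + length):
                part[i] = list(mean)
        out.extend(part)
    return out
```

Passing a seeded `random.Random` as `rng` makes the augmentation reproducible,which is convenient when comparing strategies.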
Study on Pedestrian Detection Method Based on Multi-level Feature Fusion
HUANG Lingwa, CUI Wencheng, SHAO Hong
Computer Science. 2024, 51 (11A): 231000106-7.  doi:10.11896/jsjkx.231000106
Abstract PDF(3621KB) ( 176 )   
References | Related Articles | Metrics
In view of the difficulty,low detection accuracy and high missed detection rate of occluded pedestrian detection,a pedestrian detection network model based on multi-layer feature fusion is proposed by optimizing the structure of YOLOv7,aiming at improving the accuracy of occluded pedestrian detection.The ELAN-C module is used in the feature extraction part of the backbone network to enhance the ability to extract pedestrian feature information,so as to improve the accuracy of pedestrian detection.At the same time,the global attention mechanism is introduced into the multi-scale feature fusion part to form multi-layer feature fusion.Through inter-dimensional information interaction,especially the focus on location information,the representation of detection target features is enhanced and the accuracy of pedestrian detection is improved.In addition,in order to accelerate the convergence of the model,EIoU is used as the loss function to further improve the positioning accuracy of the detection box.The model is trained and verified on the open data set CityPersons,and the log-average miss rate MR-2 decreases by 0.55%,0.91%,1.78% and 1.68% on the Bare,Partial,Reasonable and Heavy subsets,respectively,which effectively reduces the miss rate.
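The abstract above names EIoU as the bounding box regression loss without giving its form.The sketch below follows the commonly cited EIoU formulation (an IoU term plus penalties on centre distance,width difference and height difference,each normalized by the smallest enclosing box);it is an assumption that the paper uses exactly this variant,and the `(x1, y1, x2, y2)` box layout and function name are illustrative.

```python
def eiou_loss(box, gt, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1
    # Intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    iou = inter / (w * h + gw * gh - inter + eps)
    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    # Squared distance between the two box centres.
    d2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    return (1 - iou
            + d2 / (cw ** 2 + ch ** 2 + eps)
            + (w - gw) ** 2 / (cw ** 2 + eps)
            + (h - gh) ** 2 / (ch ** 2 + eps))
```

Unlike plain IoU,this loss still provides a useful signal when the boxes do not overlap,because the centre distance term remains non-zero.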
Remote Sensing Oriented Object Detection Method Based on Dual-label Assignment
DONG Yan, WEI Minghong, GAO Guangshuai, LIU Zhoufeng, LI Chunlei
Computer Science. 2024, 51 (11A): 240100058-9.  doi:10.11896/jsjkx.240100058
Abstract PDF(5631KB) ( 145 )   
References | Related Articles | Metrics
Due to the inherently diverse distribution characteristics of objects in remote sensing images,such as arbitrary orientation,large aspect ratio,and dense arrangement,preset anchor boxes can hardly match all real objects accurately.This limitation results in low detection accuracy,particularly for densely arranged oriented objects with large aspect ratios.To address this issue,an oriented object detection method for remote sensing images based on dual-label assignment is proposed.Firstly,a dual-label assignment strategy is proposed to assign the candidate boxes with the maximum and the second-highest intersection-over-union ratios to the real object.Then,the candidate boxes of adjacent objects are constrained by a repulsion loss(AP Loss) and an attraction loss(UP Loss) to improve the probability of correct object matching.In addition,to extract robust features suitable for the classification and regression branches,a feature enhancement module(FEM) is designed.This module constructs adaptive features based on polarization functions,which can effectively enhance the feature expression ability required for classification and regression tasks.Finally,a localization-guided classification(LGC) module is designed,which uses the localization task to guide the sampling positions of the classification task,performs localization refinement,and obtains key features for the classification task,thereby alleviating the inconsistency between classification and localization.Extensive experiments are conducted on three publicly available remote sensing oriented object detection datasets,namely DOTA,HRSC-2016,and DIOR-R.The experimental results demonstrate the effectiveness of the proposed method,whose accuracy(mAP) is better than that of existing mainstream methods.
Spatiotemporal Fusion Method for Remote Sensing Images Based on Dual Attention Mechanisms
FAN Xuejing, XUE Xiaorong, DU Yichao
Computer Science. 2024, 51 (11A): 240200097-6.  doi:10.11896/jsjkx.240200097
Abstract PDF(3332KB) ( 153 )   
References | Related Articles | Metrics
Due to the limitations of remote sensing imaging technology conditions,it is difficult to obtain sequences of remote sensing images with both high temporal and high spatial resolution simultaneously. However,spatiotemporal fusion techniques can generate remote sensing images with high temporal and high spatial resolution. In recent years,various spatiotemporal fusion methods have emerged,showing good performance but still falling short in effectively extracting meaningful information in feature extraction. A deep learning model based on dual attention mechanism improvement (ADCSTFN) is proposed to address this issue,which enhances the model's ability to preserve global information and reconstruct detailed scenes. In the experiments,Landsat and MODIS data were used as the research subjects,and the proposed method was tested using two open-source datasets and a local dataset,and compared with four commonly used spatiotemporal fusion methods. Experimental results show that the residual network and dual attention mechanism proposed in this paper better extract effective information from images. The use of a deep supervision loss function mitigates the issue of vanishing gradients during backpropagation,optimizes the learning process,and significantly improves the fusion results.
Multimodal Contrastive Learning Based Scene Graph Generation
ZHU Xudong, LAI Teng
Computer Science. 2024, 51 (11A): 231200185-5.  doi:10.11896/jsjkx.231200185
Abstract PDF(2515KB) ( 167 )   
References | Related Articles | Metrics
Scene graph generation(SGG) methods play a pivotal role in studying objects and their relationships within images,with widespread applications in visual understanding and image retrieval.However,existing SGG methods are limited by visual features or individual visual concepts such as objects,resulting in a low accuracy of relationship recognition and necessitating a substantial amount of manual annotation.To address the aforementioned issues,this paper integrates image and text features and proposes a multimodal contrastive learning based scene graph generation method,multimodal contrastive learning for scene graph(MCL-SG).This method begins by extracting features from both image and text inputs,obtaining image and text features.Subsequently,a Transformer Encoder is employed to encode and fuse feature vectors,enabling a synergistic integration of information from diverse sources.Notably,MCL-SG incorporates a self-supervised contrastive learning strategy,calculating the similarity between image and text features.Training is accomplished by minimizing the dissimilarity between positive and negative samples,eliminating the need for extensive manual annotation.In this study,experiments are conducted using the VG(Visual Genome) dataset,a substantial public dataset for scene graph generation.Experiments are structured into three distinct hierarchical subtasks:SGDet,SGCls,and PredCls and the results demonstrate that,in the mean Recall@100 metric,MCL-SG achieves a 9.8% improvement in scene graph detection,a significant 14.0% enhancement in scene graph classification,and an 8.9% boost in relationship classification,thus proving the effectiveness of MCL-SG.
Research and Implementation of Dynamic Scene 3D Perception Technology Based on Binocular Estimation
HE Weilong, SU Lingli, GUO Bingxuan, LI Maosen, HAO Yan
Computer Science. 2024, 51 (11A): 240300045-8.  doi:10.11896/jsjkx.240300045
Abstract PDF(4908KB) ( 153 )   
References | Related Articles | Metrics
Binocular stereo vision technology has always been of great significance in the field of computer vision research.Compared with monocular or multi-view technology,binocular stereo vision has the advantages of low cost,high versatility and ease of use,while still obtaining image depth accurately.Three-dimensional perception technology based on binocular vision can greatly improve a computer's ability to understand and interact with the real world,further enhance the adaptability of computer vision technology in complex and changeable scenes,and play an important role in fields such as autonomous driving,robot navigation,industrial inspection and aerospace.This paper focuses on 3D reconstruction and object perception technology in dynamic scenes.In most cases,dynamic objects in the field of view are the ones that need attention,while static objects,especially the background,which usually occupies most of the scene,can be ignored.However,static objects consume a large share of resources in actual computation,and spending computing resources on targets of no interest is clearly inefficient.In order to solve this problem,based on an in-depth study of current mainstream binocular stereo matching and image segmentation methods,this paper proposes a dynamic scene 3D perception technology based on binocular estimation.The main innovations and research achievements are as follows.Firstly,aiming at the low cost-effectiveness and efficiency of pixel-by-pixel computation and aggregation in traditional binocular stereo matching algorithms,a binocular stereo matching method based on two-dimensional scene instance segmentation is proposed,in which the target image after mask segmentation is used for stereo matching,which not only improves matching performance but also reduces the difficulty of dynamic target matching.At the same time,in order to solve the problem of insufficient segmentation accuracy,a mask edge filtering optimization method based on the RGB image is introduced to improve the efficiency and the reconstruction accuracy of the field-of-view point cloud.Secondly,real-time target point cloud generation is carried out based on a deep learning network for binocular estimation,and a real-time dynamic target perception algorithm based on GPU-accelerated adjacent-frame point clouds is proposed.Finally,a two-dimensional and three-dimensional real-time dynamic object perception technology is proposed,which can quickly recognize dynamic objects in the detection environment while realizing real-time three-dimensional reconstruction of the target scene.
Stereo Matching Network Based on Enhanced Superpixel Sampling
XU Haidong, ZHANG Zili, HU Xinrong, PENG Tao, ZHANG Jun
Computer Science. 2024, 51 (11A): 231100005-7.  doi:10.11896/jsjkx.231100005
Abstract PDF(4509KB) ( 185 )   
References | Related Articles | Metrics
Aiming at the accuracy challenges in stereo matching related to details,occlusion,and textureless regions,a stereo matching method based on improved superpixel sampling is proposed.Initially,an enhanced superpixel sampling method is employed to downsample the high-resolution input images used for stereo matching.Subsequently,the downsampled image pairs are input into the stereo matching network,where a convolutional network with shared weights is utilized for feature extraction.Using 3D convolution,a feature-fused Cost Volume is generated,leading to the creation of a disparity map.The outputted disparity map is then upsampled to reconstruct the final disparity map.To tackle the issue of potential detail loss during the superpixel sampling process,two innovations are introduced:the feature pyramid attention module(FPA)and an improved residual structure.Based on these two innovations,a stereo matching network named FPSMnet(feature pyramid stereo matching network)is proposed.This paper selects and partitions the image datasets BSDS500 and NYUv2 for training,validation,and testing of superpixel sampling.Experimental results in stereo matching demonstrate that,compared to the baseline method,the proposed algorithm achieves a reduction of 0.25 and 0.52 in average pixel errors on the SceneFlow and HR-VS datasets,respectively.These improvements are achieved without compromising runtime efficiency.
Deep Gait Recognition Network Based on Relative Position Encoding Transformer
REN Yuheng, ZHAO Yunfeng, WU Chuang
Computer Science. 2024, 51 (11A): 240400064-6.  doi:10.11896/jsjkx.240400064
Abstract PDF(1856KB) ( 153 )   
References | Related Articles | Metrics
Gait recognition is a rapidly evolving long-range biometric identification technique that has wide applications and advantages in various scenarios,including long distances,non-intrusive setups,and cross-view angles.Traditional biometric identification techniques,such as fingerprint recognition and facial recognition,often require close proximity or specific conditions to be effective,while gait recognition breaks through these limitations,making it possible to identify individuals in a wider range of environments.Previous research predominantly employed lightweight neural networks for gait feature extraction and achieved significant progress on popular datasets like CASIA-B,which feature cross-view angles and varying attire.However,experimental results indicate a substantial decline in recognition accuracy when neural network layers are simply stacked on the CASIA-B dataset.This paper proposes a deep gait recognition network that incorporates a relative position encoding transformer module.This module aims to avoid the pitfall of “local feature association” and enables continuous learning of temporal features within gait sequences.Compared to current mainstream approaches,the proposed method achieves higher recognition accuracy in indoor environments,as exemplified by the CASIA-B and OUMVLP datasets,as well as in outdoor settings typified by the Gait3D dataset.In particular,in the clothes-changing task,the proposed method surpasses benchmark approaches by 1.9%,achieving a recognition rate of 85.5%.
Big Data & Data Science
Online and Offline Multi-source Heterogeneous Data Fusion System for Recycling Information
QIU Mingxin, LEI Shuai, LIU Xianhui, ZHANG Yingyao
Computer Science. 2024, 51 (11A): 240100095-7.  doi:10.11896/jsjkx.240100095
Abstract PDF(1929KB) ( 143 )   
References | Related Articles | Metrics
In the recycling process of waste products in the resource recycling industry,a large amount of multi-source heterogeneous data is generated due to the collaborative work of multiple systems.Aiming at the problem that the online and offline recycling information of waste products is difficult to fuse and use effectively,an online and offline multi-source heterogeneous data fusion system for recycling information is proposed.Firstly,the system uses the Web API to access online and offline multi-source heterogeneous data,and preprocesses it through the steps of data parsing,data cleaning and data conversion.Secondly,aiming at the problem that existing data fusion methods based on clustering analysis usually need the number of clusters to be specified in advance,a fusion method based on multi-objective clustering is proposed,which aims to automatically determine the number of clusters in the fusion process.Through feature selection,label coding,data conversion and normalization of the preprocessed data,combined with the multi-objective clustering algorithm,feature extraction and clustering of typical data are completed,and data matching based on Euclidean distance is performed for the total and incremental data.Finally,the system uses a distributed database scheme based on MyCat middleware and MySQL master-slave replication to realize the storage,sharing and exchange of the fused data.The test shows that the data fusion system can realize the data fusion,sharing and exchange of online and offline multi-source heterogeneous recycling information of waste products.At the same time,compared to the method based on K-Means,the proposed data fusion method based on multi-objective clustering can automatically determine the optimal number of clusters on different data sets,and can obtain compactness and separation no worse than those of the K-Means fusion method.
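Two of the steps mentioned in the abstract above,normalization and data matching based on Euclidean distance,can be illustrated with a short sketch.It assumes min-max normalization and assumes that matching means assigning each record to its nearest cluster centre;the abstract specifies neither detail,and both function names are hypothetical.

```python
import math


def min_max_normalize(rows):
    """Scale every column of a list of numeric rows into [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    # A constant column has zero span; map it to 0.0 rather than divide by zero.
    return [[(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
            for row in rows]


def match_to_clusters(records, centroids):
    """For each record, return the index of the nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: math.dist(r, centroids[j]))
            for r in records]
```

Normalizing first keeps attributes with large numeric ranges from dominating the Euclidean distance.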
Urban Traffic Flow Prediction Based on Global Spatiotemporal Graph Convolutional Neural Network
WANG Jiahao, LI Wenbin, GUO Shiyao, XIANG Ping
Computer Science. 2024, 51 (11A): 240200045-9.  doi:10.11896/jsjkx.240200045
Abstract PDF(3083KB) ( 206 )   
References | Related Articles | Metrics
Traffic flow prediction plays an important role in intelligent transportation systems(ITS).The key challenge in traffic flow prediction is to efficiently and comprehensively extract the complex spatiotemporal correlations in cities.Traffic speed has not only short-term and long-term periodic dependencies in the temporal dimension,but also local and global dependencies in the spatial dimension.Existing methods have certain limitations in capturing the spatiotemporal dependencies of traffic data.To this end,this paper proposes a deep learning model based on the global spatiotemporal graph convolutional network(GSTGCN) to address the limitations of urban traffic speed prediction.There are three spatiotemporal components in the model,which model the three different spatiotemporal correlations in traffic data,namely,the recent,daily,and weekly cycles.Each spatiotemporal component consists of a time module and a spatial module.In order to better obtain the temporal dimension information of traffic data,the time module introduces the Informer mechanism to adaptively assign feature weights.In order to better obtain the spatial relationships of traffic data,the spatial module introduces a graph convolutional neural network to extract local and global spatial information of traffic data.In the experiments,the proposed model is tested on two different real-world datasets.The results show that the proposed GSTGCN outperforms the most advanced baseline models.
Time Series Prediction of Hybrid Neural Networks Based on Seasonal Decomposition
XU Junwen, CHEN Zonglei, LI Tianrui, LI Chongshou
Computer Science. 2024, 51 (11A): 231200008-7.  doi:10.11896/jsjkx.231200008
Abstract PDF(3206KB) ( 186 )   
References | Related Articles | Metrics
In recent years,time series forecasting has found widespread applications in various domains such as finance,meteorology,and military.Deep learning has begun to demonstrate significant potential and application prospects in time series forecasting tasks.However,recurrent neural networks often encounter issues like information loss and exploding gradients when dealing with time series predictions over extended periods.In contrast,Transformer models and their variants,when utilizing attention mechanisms,typically overlook the temporal relationships between variables in time series data.To address these challenges,this paper proposes a hybrid neural network time series forecasting model based on seasonal decomposition.This model employs a seasonal decomposition module to capture the variations in different periodic frequency components within the time series.Simultaneously,by integrating multi-head self-attention mechanisms and composite dilated convolution layers,the model leverages the interaction between global and local information to obtain multi-scale temporal positional information among the data.Finally,experiments are conducted on publicly available datasets from 4 different domains,and the results indicate that the predictive performance of the proposed model surpasses that of current mainstream methods.
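The abstract above does not describe the internals of its seasonal decomposition module.As a point of reference only,the sketch below shows the classical additive decomposition (a centred moving-average trend plus per-phase seasonal means),which is a swapped-in textbook technique and not the paper's module.The function name is hypothetical,and the series is assumed to span at least two full periods so that every phase of the cycle has a detrended value.

```python
def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual."""
    n, half = len(series), period // 2
    trend = [None] * n  # undefined near both ends of the series
    for t in range(half, n - half):
        if period % 2:  # odd period: plain centred mean
            total = sum(series[t - half:t + half + 1])
        else:           # even period: half weight on the two end points
            total = (0.5 * series[t - half]
                     + sum(series[t - half + 1:t + half])
                     + 0.5 * series[t + half])
        trend[t] = total / period
    # Average the detrended values at each phase of the cycle.
    phase_values = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            phase_values[t % period].append(series[t] - trend[t])
    phase_means = [sum(v) / len(v) for v in phase_values]
    offset = sum(phase_means) / period  # centre the seasonal component on zero
    seasonal = [phase_means[t % period] - offset for t in range(n)]
    residual = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual
```

On a series built from a linear trend plus an exactly periodic pattern,the function recovers both components and leaves a zero residual wherever the trend is defined.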
Attribute Reduction of Discernibility Matrix Based on Three-way Decision
SONG Shuxuan, ZHANG Yuhong, WAN Renxia, MIAO Duoqian
Computer Science. 2024, 51 (11A): 231100176-6.  doi:10.11896/jsjkx.231100176
Abstract PDF(2116KB) ( 163 )   
References | Related Articles | Metrics
Attribute reduction is one of the core topics of rough set theory.It aims to minimize redundant information and extract a set of attributes that is both representative and pivotal.In the process of attribute reduction,a discernibility matrix is commonly employed to measure the relationships between attributes.By analyzing the discernibility matrix,researchers can identify attributes that contribute similar information in describing the system's behavior,facilitating the process of attribute reduction.The proposed discernibility matrix attribute reduction algorithm based on three-way decision starts from the attributes of the discernibility matrix and characterizes the importance of attributes beyond the core.It establishes a novel approach to attribute reduction based on three-way decision theory,dividing the upper and lower approximations of traditional probabilistic rough sets into the positive region,negative region,and boundary region within the three-way decision framework.The proposed algorithm provides decision rules based on the different regions and controls the three-way decision thresholds through a decision loss function.Compared to similar algorithms,it yields more concise reduction sets and decision rules,and has a lower time complexity.
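The division into positive,boundary and negative regions described in the abstract above can be sketched with the standard decision-theoretic rough set construction,in which the thresholds alpha and beta follow from a loss table.This is a generic illustration and not the paper's reduction algorithm;the loss-table keys (first letter is the action P/B/N,second letter is whether the object is in the target set) and the function names are assumptions.

```python
from collections import defaultdict


def thresholds(loss):
    """(alpha, beta) from losses such as loss["BP"]: the cost of deferring (B)
    on an object that belongs to the target set (P)."""
    alpha = (loss["PN"] - loss["BN"]) / (
        (loss["PN"] - loss["BN"]) + (loss["BP"] - loss["PP"]))
    beta = (loss["BN"] - loss["NN"]) / (
        (loss["BN"] - loss["NN"]) + (loss["NP"] - loss["BP"]))
    return alpha, beta


def three_way_regions(rows, target, alpha, beta):
    """rows: (condition_tuple, decision) pairs. Each equivalence class of
    objects with equal condition values is placed in POS, BND or NEG."""
    counts = defaultdict(lambda: [0, 0])  # [objects in target, all objects]
    for cond, decision in rows:
        counts[cond][1] += 1
        if decision == target:
            counts[cond][0] += 1
    regions = {}
    for cond, (hits, total) in counts.items():
        p = hits / total  # conditional probability of the target set
        regions[cond] = "POS" if p >= alpha else "NEG" if p <= beta else "BND"
    return regions
```

With losses PP=0,BP=1,NP=4,NN=0,BN=1 and PN=4,the thresholds are alpha=0.75 and beta=0.25,so an equivalence class that is half inside the target set falls in the boundary region.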
BEML:A Blended Learning Analysis Paradigm for Hidden Space Representation of Commodities
ZHENG Qijian, LIU Feng
Computer Science. 2024, 51 (11A): 240300150-6.  doi:10.11896/jsjkx.240300150
With the advent of the Internet economy era, the efficient management of e-commerce platforms has garnered widespread attention from both academia and industry. Among various factors, the accuracy and automation level of product classification directly affect user experience and operational efficiency. In light of this, this study examines the latent space representation of product information and proposes a blended learning analysis paradigm for product latent space representation (BEML). The framework integrates bidirectional encoder representations from transformers (BERT) with traditional machine learning methods, aiming to significantly enhance the efficiency and accuracy of automated product classification through detailed analysis of the latent space of product information. Comparative analysis against current mainstream deep learning and machine learning algorithms validates the performance of the BEML framework in product classification tasks. Experimental results demonstrate that the BEML framework achieves a macro F1 score of 85.79% and a micro F1 score of 84.73%. Both exceed the current best F1 score of 83.3%, reaching the state of the art. Moreover, the framework is not only a theoretical innovation but also holds significant practical value for information management and automated processing in the e-commerce sector, providing an efficient and reliable blended learning analysis paradigm for the field of technology and business.
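Because the abstract reports both macro and micro F1, a short sketch of how the two averages differ may help. The code below is a generic single-label, multi-class computation, not the authors' evaluation script; the labels and predictions are invented.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred):
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    classes = sorted(set(y_true) | set(y_pred))
    # Macro: average the per-class F1 scores, so rare classes weigh equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    # Micro: pool the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


macro, micro = macro_micro_f1(["a", "a", "a", "b"], ["a", "a", "b", "b"])
```

In the single-label setting micro F1 equals plain accuracy, while macro F1 is pulled toward the scores of the smaller classes, which is why the two reported figures can differ.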
FCTNet:Bus Arrival Time Prediction Method Based on Dual Domain Deep Learning
ZHANG Mingze, LI Yi, WU Wenyuan, SHI Mingquan, WANG Zhengjiang
Computer Science. 2024, 51 (11A): 231000180-7.  doi:10.11896/jsjkx.231000180
Bus arrival time prediction is an important part of a smart bus system. It can provide passengers with accurate arrival times and help dispatchers make more reasonable scheduling arrangements. A bus arrival time prediction algorithm, FCTNet (FFT-Conv-Transformer), based on convolution, the attention mechanism, and the FFT, is proposed for dual-domain deep learning in the time domain and the frequency domain. The algorithm integrates the Fourier transform, a convolutional neural network, and the attention mechanism to predict bus arrival times at single stops and at multiple stops. The Fourier transform and the convolutional neural network are used to learn the spatiotemporal characteristics of the input data in the frequency domain while retaining the time domain signal. The attention mechanism is used to learn the global dependence of the input sequence and predict the final result. In arrival time prediction experiments on three bus lines in Chongqing (465, 506, and 262), the mean absolute percentage error and mean absolute error of FCTNet are better than those of the comparison algorithms. On the busiest line, No. 465, the mean relative error of FCTNet is 2.34% lower than that of the best existing model, and the mean absolute error is 4.59 s lower.
Domain Generalization and Long-tailed Learning Based on Causal Relationships
LYU Jiahao, LIU Jinfeng
Computer Science. 2024, 51 (11A): 240300041-8.  doi:10.11896/jsjkx.240300041
Deep learning, as a representative machine learning method, has been widely applied with many successes. However, dataset distribution shift and long-tailed distribution can significantly degrade the performance of traditional deep learning methods, and both issues often occur in real-world datasets. Although research on domain generalization and long-tailed learning has provided good solutions to these two problems separately, a domain generalization or long-tailed learning method used alone is not satisfactory in the complex scenario that combines distribution shift and long-tailed distribution (LT-DS). To address the LT-DS problem, a unified approach can be taken from a causal perspective to solve both issues simultaneously. For distribution shift, causal intervention and decomposition can be achieved through the Fourier transform, and cross-domain invariant causal feature representations can be obtained through decorrelation weighting. For long-tailed distribution, a causal effect classifier can be constructed through debiasing training to eliminate momentum-induced biases, and the impact of the long-tailed distribution can be further reduced through Balanced Softmax and logit adjustment. Experimental results show that this method outperforms the best existing methods by an average of 8% and 5% on the AWA2-LTS and ImageNet-LTS datasets, respectively, demonstrating competitive results on the LT-DS problem.
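The two long-tail corrections named at the end of the abstract can be sketched in a few lines. The code below shows the commonly cited forms (Balanced Softmax shifts training logits by the log class count; post-hoc logit adjustment subtracts a scaled log prior at inference). How the paper combines them with its causal classifier is not stated in the abstract, so the function names, the `tau` parameter, and the numbers are illustrative assumptions.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def balanced_softmax(logits, class_counts):
    """Training-time probabilities with logits shifted by log class count."""
    return softmax([z + math.log(n) for z, n in zip(logits, class_counts)])


def logit_adjusted_prediction(logits, class_counts, tau=1.0):
    """Inference-time argmax after subtracting tau * log(class prior)."""
    total = sum(class_counts)
    adjusted = [
        z - tau * math.log(n / total) for z, n in zip(logits, class_counts)
    ]
    return max(range(len(adjusted)), key=adjusted.__getitem__)


# A head class with 900 samples and a tail class with 100 samples.
counts = [900, 100]
plain = max(range(2), key=[2.0, 1.5].__getitem__)           # picks class 0
adjusted = logit_adjusted_prediction([2.0, 1.5], counts)    # picks class 1
```

The adjustment flips a narrow win for the head class into a prediction for the tail class, which is the intended bias correction.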
Parameterized Quantum Circuits Based Quantum Neural Networks for Data Classification
CHEN Chao, YAN Wenjie, XUE Guixiang
Computer Science. 2024, 51 (11A): 231200112-7.  doi:10.11896/jsjkx.231200112
Quantum neural networks combine the advantages of quantum computing and classical neural network models, and provide a new direction for the future development of artificial intelligence. Although quantum neural networks have been widely studied, the impact of data encoding methods and different training circuits on model performance has not yet been fully explored. Therefore, this paper proposes a new quantum neural network model for data classification, which explores the influence of different data encoding methods and differently structured training layers on classification tasks. The method first preprocesses the classical image, uses different data encoding methods to encode it into different parameterized quantum circuits for training, measures the output of the model, and uses the parameter shift rule to update the training parameters to complete the data classification. Experimental results on the MNIST handwritten digit dataset show that the proposed model achieves more than 97% classification accuracy on the digit {3,6} classification task. Compared with current mainstream methods, the proposed method significantly improves classification accuracy.
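The parameter shift rule mentioned above can be checked on the smallest possible circuit. The sketch below simulates a single qubit prepared by an RY(theta) rotation, measures the expectation of Z, and compares the parameter-shift gradient with the analytic derivative. It is a toy model only, unrelated to the circuits used in the paper; all names are illustrative.

```python
import math


def expectation_z(theta):
    """<Z> for the state RY(theta)|0> = [cos(theta/2), sin(theta/2)]."""
    a0 = math.cos(theta / 2)
    a1 = math.sin(theta / 2)
    return a0 * a0 - a1 * a1  # equals cos(theta)


def parameter_shift_gradient(f, theta):
    """Exact gradient for gates generated by a Pauli operator:
    evaluate the same circuit at theta +/- pi/2 and halve the difference."""
    shift = math.pi / 2
    return (f(theta + shift) - f(theta - shift)) / 2


# Gradient descent on <Z>, which is minimized at theta = pi.
theta = 0.4
for _ in range(50):
    theta -= 0.5 * parameter_shift_gradient(expectation_z, theta)
```

Unlike a finite-difference estimate, the two shifted evaluations give the exact derivative, which is why the rule is practical on hardware where only expectation values can be measured.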
Group Polarization Communication Model of Public Opinion in Emergencies Considering Network Media Report and Government Intervention
LIN Yuying, ZHANG Liang, MIAO Jiale
Computer Science. 2024, 51 (11A): 240400053-7.  doi:10.11896/jsjkx.240400053
Group polarization means that irrational and extreme group views and emotions tend to appear in the communication of public opinion in emergencies, which increases the risk of social instability. Network media reports and government intervention play important roles in the dissemination of public opinion. It is urgent to understand how they act on group polarization, so as to reduce the negative impact of public opinion dissemination through multiple efforts. According to their state in the communication of public opinion in emergencies, netizens are divided into five states: susceptible, exposed, infective, polarized, and recovered. The SEIPR-MG epidemic model is established and its equilibrium point is analyzed. Simulation experiments are conducted to study the transmission mechanism of group polarization of public opinion in emergencies. The SEIPR-MG model reveals the specific mechanism of public opinion communication and group polarization in emergencies under network media reports and government intervention. Reports by network we-media not only make public opinion spread faster and wider but also promote the emergence and development of group polarization. Government intervention can effectively control the direction in which online public opinion spreads and prevent the emergence and spread of polarized emotions and views.
Rumor Detection Based on Similarity-enhanced Propagation Structure
LIN Yidi, LI Bicheng, YANG Haijun
Computer Science. 2024, 51 (11A): 240200116-8.  doi:10.11896/jsjkx.240200116
The rapid rise of social media has led to the problem of rumor dissemination, causing negative impacts on society. Existing rumor detection algorithms mainly focus on the content and propagation structures of news, but often overlook the potential influence of user preference similarity. When browsing posts, users are more likely to encounter information spread by other users with similar preferences, which can further fuel the spread of rumors. Moreover, existing research frequently neglects the diversity of propagation structures and the relationship between news content and its propagation structure. Different types of news should exhibit different propagation patterns. Therefore, this paper proposes a model named “SEPS”, which establishes connections between users with similar preferences and then categorizes propagation structures into various forms to extract features of different propagation patterns. Finally, by introducing contrastive learning and co-attention modules, the model enhances the correlation between news content and propagation structure. Experiments demonstrate that the “SEPS” model can effectively detect rumors and outperforms the best baseline models.
Hybrid Index Structure for Trajectory Range Query Combined with Spatio-Temporal Keywords
MENG Xiangfu, LI Tianshuo, ZHANG Xiaoyan
Computer Science. 2024, 51 (11A): 240200114-8.  doi:10.11896/jsjkx.240200114
For large-scale trajectory datasets on road networks, existing methods for spatio-temporal range queries combined with keyword features have redundant storage structures and low query efficiency. In this paper, a spatio-temporal trajectory index structure combining text features, called IG-Tree, is proposed. The basic idea is to divide the road network graph into hierarchical subgraphs and generate a balanced tree structure, in which each tree node maintains its associated trajectories. In addition, the query algorithm designed in this paper utilizes the text features of the subgraphs associated with IG-Tree nodes and discards irrelevant trajectories at range boundaries to realize textual spatial range queries. Experimental results show that the proposed IG-Tree index structure achieves high accuracy and fast response on the Porto and LA datasets.
MB-ATMK:Multi-behavior Sequential Recommendation Integrating Attribute Weights and Temporal Meta-knowledge
CHEN Yuzhe, CAO Qiong, HUANG Xianying, ZOU Shihao
Computer Science. 2024, 51 (11A): 231100047-9.  doi:10.11896/jsjkx.231100047
Sequential recommendation predicts users' future preferences based on the sequence of interactions between users and items. However, existing methods often overlook the multi-behavior interactions (such as page view, favorite, and add to cart) in real-world scenarios. Additionally, users' preferences not only depend on temporal sequences but are also influenced by attribute information. Lastly, in multi-behavior sequential recommendation, users' multi-behavior interactions exhibit complex dependencies. Therefore, this paper proposes a multi-behavior sequential recommendation model with attribute weights and temporal meta-knowledge (MB-ATMK). Firstly, we incorporate users' multi-behavior interaction data and design a temporal-aware encoding module based on the timestamps of user interactions to capture users' dynamic preferences through temporal-aware attention. Secondly, we introduce rich attribute information on both the user and item sides and design an attribute-weighted meta-knowledge graph neural network. Using meta-knowledge, we refine users' multi-preference patterns and design an attribute-weighted attention mechanism based on graph neural networks to enhance the model's capture of users' fine-grained preferences. Finally, we propose a meta-knowledge prediction layer that includes a multi-behavior weight generation module and a preference transfer network, capturing users' cross-behavior dependencies through generated customized meta-knowledge. Extensive experiments on two datasets validate the effectiveness and superiority of the proposed model.
Properties and Applications of Average Approximation Accuracy
ZHANG Xiawei, KONG Qingzhao
Computer Science. 2024, 51 (11A): 240300108-5.  doi:10.11896/jsjkx.240300108
Average approximation accuracy is an important concept in rough set theory that has been proposed only in recent years. In this paper, the mathematical structure of average approximation accuracy is first analyzed, and a new interpretation of average approximation accuracy is provided. Then, several important properties of average approximation accuracy are discussed, and it is found that average approximation accuracy can characterize the knowledge representation ability of rough set models more effectively than traditional methods. Finally, the applications of average approximation accuracy in incomplete information tables and in feature selection are discussed. These results will enrich the content of rough set theory and expand its application to practical problems.
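As background for this abstract, the sketch below computes the classical (Pawlak) approximation accuracy of a target set: the size of its lower approximation divided by the size of its upper approximation. The abstract does not define the averaged variant studied in the paper, so that definition is not reproduced here; the partition and target set are invented.

```python
def approximations(classes, target):
    """Lower and upper approximations of `target` with respect to a
    partition of the universe into equivalence classes."""
    lower, upper = set(), set()
    for block in classes:
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


def approximation_accuracy(classes, target):
    """Pawlak accuracy |lower| / |upper|; 1.0 for an empty upper set."""
    lower, upper = approximations(classes, target)
    return len(lower) / len(upper) if upper else 1.0


partition = [{1, 2}, {3, 4}, {5}]
accuracy = approximation_accuracy(partition, {1, 2, 3})  # 0.5
```

An accuracy of 1 means the target set is exactly definable by the available knowledge; values below 1 measure how large the boundary region is.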
STK:Clustering Method Based on Contrastive Learning Embedding
LIU Jinxia, ZHANG Xi
Computer Science. 2024, 51 (11A): 240400011-6.  doi:10.11896/jsjkx.240400011
SimCSE, a contrastive learning method, has shown good performance in text embedding and clustering. The aim of this paper is to optimize the sentence embeddings generated by models trained with SimCSE so that they are suitable for clustering tasks. By combining multiple algorithms and adjusting training parameters, the problems of clustering algorithm selection, noise, and outliers can be addressed. This paper proposes an unsupervised clustering model, SimCSE t-SNE KMeans (STK), that combines KL divergence and the K-Means algorithm. SimCSE is used to encode the text, and then the t-SNE algorithm is used to reduce the dimensionality of the high-dimensional embeddings. By minimizing KL divergence and preserving the similarity relationships between high-dimensional data points in the low-dimensional space, the dimensionality is reduced while the text embedding representation is improved. Finally, the KMeans algorithm is used to cluster the reduced embeddings and obtain the clustering results. A comparison with the clustering results obtained by algorithms such as BERT, UMAP, and HDBSCAN shows that the proposed model achieves better clustering performance on hydrogen production patent and paper datasets, especially on the Silhouette coefficient.
Study on Linear Projection Method for Local Structure Adaptation
YANG Xing, WANG Shitong, HU Wenjun
Computer Science. 2024, 51 (11A): 240100054-7.  doi:10.11896/jsjkx.240100054
The core of manifold learning lies in capturing hidden geometric information within data by preserving local structures, which are typically assessed using redundant or noisy raw data. This implies that the local structures are unreliable, which raises the issue of insufficient confidence in local structures. To address this issue, a locally adaptive linear projection method is proposed. The essence of this method lies in two aspects: firstly, it enforces that the low-dimensional representation obtained through linear projection preserves local structures in the high-dimensional space; secondly, it updates the local structures in the high-dimensional space through the low-dimensional representation and achieves local structure adaptation through iterative cycles. Experimental results on real datasets demonstrate that the proposed method outperforms the comparative methods across various performance metrics.
Network & Communication
Survey of Combat Parallel Simulation Technology
LI Hao, YE Shuai, SHI Peiteng, SHI Zhijiang
Computer Science. 2024, 51 (11A): 240100127-7.  doi:10.11896/jsjkx.240100127
Combat parallel simulation establishes simulation models that conform to the nature of synergy and confrontation, based on the parallel nature of combat entities, and runs these models on a computer in a logically parallel manner so that the simulation reflects the interaction between combat entities. With the advent of the era of intelligent warfare, combat parallel simulation faces challenges such as complex operation logic, frequent entity interaction, and difficulty in dividing parallel tasks. This paper analyzes typical combat simulation systems, summarizes the parallel elements according to their operating characteristics, abstracts a technical framework for combat parallel simulation, and reviews the key technologies involved, including the execution mechanism, task allocation, load balancing, and parallel communication. Possible future development directions of combat parallel simulation are pointed out, providing a reference for the design and technical implementation of new combat parallel simulation systems in the era of intelligence.
Robust Distributed Monitoring Algorithm Under Limited Interference
YAN Xinyu, HUANG Zengfeng
Computer Science. 2024, 51 (11A): 240200050-7.  doi:10.11896/jsjkx.240200050
Distributed monitoring is a pivotal area in the field of distributed systems. It focuses on the coordination of computational tasks between multiple sensors and a central processor. Typically, sensors notify the central processor immediately upon receiving signals. This traditional communication mechanism wastes energy and is inefficient for several reasons. Firstly, the central processor often only needs summarized information, such as the total number of signals received over a period. Secondly, sensors buried within objects rely on battery power, and their batteries are hard to replace. Lastly, the energy consumed in communication exceeds that needed for computation. In contrast, distributed algorithms summarize results before transmitting them to the central processor, which is more economical. A good distributed tracking algorithm must accomplish tracking tasks with low communication cost, conserving sensor energy and prolonging sensor lifespan, while also maintaining accuracy. However, current research on distributed threshold monitoring problems based on preset probability distributions is relatively limited. Existing studies often rely on idealized assumptions, resulting in algorithms that lack robustness against real-world interference. This paper introduces interference to simulate the complexities of the real world, aiming to identify more robust distributed tracking algorithms. The proposed algorithm reduces the number of communication rounds by judiciously selecting the thresholds at which sensors send notifications to the central processor, significantly reducing communication costs. Additionally, it ensures accuracy in the presence of interference. The algorithm's accuracy is theoretically proven, and its communication cost can reach O(K log log N) when interference is limited. This study provides a fresh perspective on distributed tracking algorithms and supports the solution of practical tracking problems.
Joint Optimization of Delay and Energy Consumption of Tasks Offloading for Vehicular Edge Computing
LI Wenwang, ZHOU Haohao, DENG Su, MA Wubin, WU Yahui
Computer Science. 2024, 51 (11A): 231000080-7.  doi:10.11896/jsjkx.231000080
The combination of the Internet of Vehicles (IoV) and connected autonomous vehicles (CAV) has promoted the rapid development of autonomous driving technology, but it has also created a huge demand for computing resources, which is challenging for resource-constrained vehicles. Vehicular edge computing (VEC) offers an entirely new solution. By offloading tasks to edge servers deployed in roadside units (RSU), the IoV can be served more efficiently. However, resource preemption occurs when multiple vehicles send offloading requests at the same time, which increases the task processing delay. How to dispatch resources efficiently to maximize the quality of service is an urgent problem. To solve it, we treat it as a multi-objective optimization problem and propose a task offloading algorithm named NSGA2TO based on the non-dominated sorting genetic algorithm II. The algorithm can find Pareto optimal solutions of multi-objective optimization problems, and extensive simulation results verify that NSGA2TO outperforms its counterparts. In addition, we explore the relationship between the delay and energy consumption in the Pareto optimal solutions, which helps to better understand the complexity of the vehicle task offloading problem. By properly balancing delay and energy consumption, the performance and efficiency of the connected autonomous system can be further improved, providing users with a safer and more convenient travel experience.
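To make the notion of Pareto optimality over delay and energy concrete, the sketch below ranks candidate solutions into non-dominated fronts. It uses a simple repeated-filtering version of the sort rather than the bookkeeping-optimized procedure of NSGA-II, and it omits crowding distance and the genetic operators, so it is not NSGA2TO itself. The (delay, energy) pairs are invented.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(
        x < y for x, y in zip(a, b)
    )


def non_dominated_sort(points):
    """Return lists of indices: front 0 is the Pareto front, front 1 is
    the Pareto front of what remains, and so on."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(
                dominates(points[j], points[i]) for j in remaining if j != i
            )
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical offloading plans as (delay, energy) pairs.
plans = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
fronts = non_dominated_sort(plans)  # [[0, 1, 3], [2], [4]]
```

Plans 0, 1, and 3 trade delay against energy without any one beating another on both objectives, which is the kind of trade-off curve the abstract refers to.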
Joint Optimization Method for Node Deployment and Resource Allocation Based on End-Edge Collaboration
YANG Zheming, ZUO Lulu, JI Wen
Computer Science. 2024, 51 (11A): 240200010-7.  doi:10.11896/jsjkx.240200010
With the rapid development of IoT technology, edge computing shows unique advantages in diverse application scenarios. The placement of edge servers and the allocation of their resources become key factors in improving the efficiency of task processing. However, this process faces significant challenges due to the wide distribution of end devices and the heterogeneous nature of edge servers. To address these issues, this paper proposes a joint optimization method for node deployment and resource allocation based on end-edge collaboration, aiming to comprehensively improve the overall performance of edge computing systems. Our approach first uses a hierarchical clustering algorithm to divide end devices into several regions based on their functional and geographic similarities. Subsequently, the most suitable edge nodes in each region are selected based on key metrics of the edge servers such as processing power, storage space, and network bandwidth. Finally, task allocation is guided by jointly optimizing node deployment and resource utilization. To validate the effectiveness of the proposed method, we conduct simulation experiments with different methods on public datasets. Experimental results show that, compared with existing methods, the proposed method improves the load balancing level by more than 30% and reduces task processing latency and energy consumption by more than 10%.
Study on Unmanned Aircraft Formation Control Based on Multi-agent Collaboration
GAN Liangqi, DONG Chao
Computer Science. 2024, 51 (11A): 240100105-7.  doi:10.11896/jsjkx.240100105
The application of unmanned aerial vehicles (UAVs) is receiving more and more attention, especially on the complex and changing battlefield, where they have unique advantages. However, a single UAV operating on the battlefield is very limited in capability and can only accomplish a single task with low efficiency. Multi-UAV formation coordination can give full play to the advantages of UAV formations and achieve wider mission coverage and more efficient mission execution. Firstly, the application background and application scenarios of multi-agent networking and UAV formation cooperative dynamic networking are analyzed. A joint communication domain of “air-sky-low-altitude-land” is proposed, which can realize the transmission of information and data between different spatial levels, achieve unified command and cooperative operation on the battlefield, and improve combat efficiency. Then, the mathematical model of a single UAV is established, the flight controller of a single UAV is designed, and simulation shows that the response times of the roll angle, pitch angle, and yaw angle are 0.5 s, 0.3 s, and 2.5 s, respectively. After that, the UAV formation cooperative control system is designed based on tools such as the AirSim plug-in, Matlab/Simulink, and Python, which can realize the simulation of mission scenarios and the collection and processing of UAV flight data. Finally, in order to solve the problems of data interaction and decision making among UAVs, a cooperative communication control module is designed, the specific hardware principle and communication protocol for data interaction are given, and the UAV formation cooperative control system is completed through simulation and testing. The work provides some theoretical support for unmanned combat in modern warfare and has practical reference significance.
Method of Outdoor CSI Feedback for Massive MIMO Systems Based on Deep Autoencoder
CHEN Meng, QIAN Rongrong, ZHU Yujia, HUANG Zhenguo
Computer Science. 2024, 51 (11A): 231000191-6.  doi:10.11896/jsjkx.231000191
Most existing channel state information (CSI) feedback methods in massive multiple-input multiple-output (MIMO) systems suffer from low reconstruction accuracy and high complexity in outdoor scenarios with high compression. To address these problems, a deep autoencoder-based CSI compression feedback method is proposed. The method first uses a convolutional neural network in the encoder to extract the feature information of the original CSI, and then uses a fully connected network to compress it into a low-dimensional codeword that is fed back to the decoder. Considering that the spatial pattern of CSI in outdoor environments is more complicated and that more information is lost at high compression, the decoder employs parallel multi-resolution convolutional networks and fully connected networks in a residual structure to reconstruct the received feature codewords. This design enhances the reconstruction and generalization capabilities of the proposed method. Experimental results show that the reconstruction quality of the proposed method is significantly improved at different compression ratios.
Deep Learning Based Joint Beamforming in Intelligent Reflecting Surface Enhanced Wireless Communication Systems
CHEN Xiao, ZHANG Quanhao, SHI Jianfeng, ZHU Jianyue
Computer Science. 2024, 51 (11A): 231200125-5.  doi:10.11896/jsjkx.231200125
The intelligent reflecting surface (IRS) is one of the most promising technologies for next-generation wireless communication. However, existing IRS-assisted multiple-input multiple-output (MIMO) systems face a challenging problem: the beamforming methods require high computational capabilities of the antennas. To address this challenge, a deep learning (DL)-based joint beamforming design is proposed for IRS-aided multi-user MIMO communication systems, aiming to maximize the sum data rate of all users. The proposed DL-based beamforming scheme uses a convolutional neural network to jointly optimize digital beamforming at the base station and reflection beamforming at the IRS. The method predicts the essential features extracted from the beamforming matrix, which overcomes the difficulty of having the neural network predict the beamforming matrix directly. This significantly reduces the demand on the predictive capability of the neural network, and the trained and optimized beamforming design is used online, which significantly reduces the real-time computational complexity. Simulation results demonstrate that the proposed beamforming design can achieve a data rate improvement of over 0.5~1 bit/s/Hz, which grows with the number of users.
Collaborative Target Tracking of Mobile Sensor Networks Based on Force-directed Localization
WANG Zongyao, CUI Wendong, GUO Yue, YU Fangping
Computer Science. 2024, 51 (11A): 231100091-5.  doi:10.11896/jsjkx.231100091
A collaborative localization system and target-tracking controller for mobile sensor networks is proposed. The localization system uses onboard UWB ranging equipment to obtain the distances between sensor nodes and build a distance matrix for the sensor network. The pose information is recovered from the distance matrix through a force-directed algorithm. Compared with global positioning systems such as GPS or beacons, this system does not require base stations to be deployed in advance. Compared with visual positioning systems, the localization system uses UWB for ranging and positioning, which is not affected by illumination, visual occlusion, or limited perception range. The proposed sensor network system uses deep learning and hierarchical clustering to achieve target detection and image data fusion. The sensor network can automatically adjust its layout and track targets collaboratively. Because of these advantages, the sensor network system can be deployed in extreme environments such as battlefields, disaster areas, underground tunnels, and even outer space. This paper uses theoretical analysis to demonstrate the stability of the force-directed positioning algorithm and proves the feasibility and practicality of the system through simulation experiments and robot experiments.
Dynamic Partition Patrol Strategy of Multi-robot Under Visitor Access Trend
MA Wenjie, LI Zonggang, DU Yajiang, CHEN Yinjuan
Computer Science. 2024, 51 (11A): 231200088-9.  doi:10.11896/jsjkx.231200088
To address the increased patrol workload for robots in areas with high external visitor traffic, this paper proposes a multi-robot dynamic partitioning patrol strategy that takes visitor trends into account. The strategy aims to improve the efficiency of the multi-robot system in patrolling dynamic environments. Firstly, an improved k-means strategy is used to complete the static initial partition of the environment. Then, the robots perform patrolling tasks in their respective zones, taking into account the robots' access frequency requirements at different locations. Secondly, when visitors enter the environment to visit different nodes, the robots focus on the visitors' access trends, negotiate with the robots of neighboring partitions, and then transfer candidate nodes between neighboring regions multiple times to balance the workload of the partition robots and complete the real-time dynamic partitioning of the region. The simulation results demonstrate that the robots can effectively detect visitors while maintaining dynamic workload balancing. Additionally, the proposed multi-robot dynamic partitioning patrol strategy under the visitor access trend can significantly enhance the efficiency of multi-robot patrol in dynamic environments.
Queueing Theory-based Joint Optimization of Communication and Computing Resources in Edge Computing Networks
XUE Jianbin, YU Bowen, XU Xiaofeng, DOU Jun
Computer Science. 2024, 51 (11A): 240100103-9.  doi:10.11896/jsjkx.240100103
High reliability and low latency are among the most important research directions in edge computing networks for vehicular networking.To meet the complex and variable task requests in vehicular networks,communication and computation resources must be allocated effectively and efficiently.A multi-objective reinforcement learning strategy for intelligent communication and computation resource allocation based on the combination of task queuing theory model and edge computing model is proposed.The strategy combines the allocation of communication and computation resources to reduce the total system cost consisting of latency and reliability.The strategy can be decomposed into three algorithms.Firstly,the joint computational offloading and collaboration algorithm serves as a generic framework for the strategy:it uses the KNN method to select the offloading layer,such as the edge computing layer or the local computing layer,for the generated task requests.Then,when the local computing layer is selected to perform the task,an algorithm called collaborative vehicle selection is used to find the target vehicle to perform the collaborative computation.Finally,the allocation of communication and computational resources is defined as two independent objectives,and the algorithm called multi-objective resource allocation uses reinforcement learning at the mobile edge computing layer to achieve an optimal solution to the problem.Simulation results show that the proposed strategy effectively reduces the total cost of the system compared to random computing,all-edge computing and all-local computing.The KNN approach reduces the total cost of the system compared to the random offloading approach,and the reinforcement learning algorithm outperforms the traditional particle swarm algorithm in controlling the total cost of the system.
Cloud-Edge Collaborative Task Transfer and Resource Reallocation Optimization Based on Deep Reinforcement Learning
CHEN Juan, WANG Yang, WU Zongling, CHEN Peng, ZHANG Fengchun, HAO Junfeng
Computer Science. 2024, 51 (11A): 231100170-10.  doi:10.11896/jsjkx.231100170
In this paper,we investigate a heterogeneous cloud-edge environment consisting of multiple edge servers and cloud servers,where each node has computation,storage and communication capabilities.Due to the uncertainty and dynamics of the heterogeneous cloud-edge environment,dynamic scheduling is required to optimize resource and task allocation.The traditional deep learning framework only extracts the potential features from the input task data,mostly ignoring the network structure information of the cloud-edge environment.To solve this problem,this paper proposes a distributed SAC-GCN algorithm based on the Actor-Critic framework,using the self-evolutionary ability of the experience training of soft actor-critic(SAC) and the graph-based relationship inference ability of graph convolutional networks(GCN).The proposed SAC-GCN employs an adaptive loss function to provide effective scheduling strategies for different task migration requirements by capturing dynamic task information and heterogeneous node resource information.We use the real-world Bit-brain dataset and carry out a large number of simulations with CloudSim.Experimental results show that,compared with the existing algorithms,the proposed SAC-GCN can reduce the system energy consumption by 4.81%,shorten the task response time by 3.46% and the task migration time by 2.73%,and reduce the task SLA violation rate by 1.5%.
Distributed Sensor-Weapon-Target Assignment Algorithm for Ballistic Missile Defense Based on Contract Net Protocol
WANG Song, CHEN Gong
Computer Science. 2024, 51 (11A): 240900024-7.  doi:10.11896/jsjkx.240900024
The resource assignment algorithm is a key technology for realizing integrated air and missile defense.In order to solve the problem of sensor-weapon-target dynamic assignment in ballistic missile defense,a distributed assignment algorithm based on contract net protocol is proposed.Firstly,a formal model of the sensor-weapon-target dynamic assignment problem is constructed,which considers practical constraints such as the spatial capability of sensors and weapons,and the number of guidance channels and interceptor missiles.An objective function is designed to achieve the two main principles of earliest interception and maximum success probability in ballistic missile defense.Then,on the basis of the contract net protocol framework,we construct the dynamic process of sensor-weapon-target assignment,and design the bidding and awarding strategies of sensors and weapons according to their respective characteristics.In the bidding strategy of weapons,the replacement of assigned targets is considered,and a method of selecting the replacement target is proposed,which estimates the effectiveness loss caused by the rebidding of the replaced target.Computer simulation results show that the proposed algorithm can assign sensors and weapons dynamically.Compared to the assignment method of the traditional ballistic missile defense system,the proposed algorithm leads to earlier interception and greater success probability,and achieves a 43.7% improvement in effectiveness.
Study on Optimization of Long-distance Relay Communication and Computational Offloading Strategy Based on Self-powered UAVs
XUE Jianbin, TIAN Guiying, MA Yuling, SHAO Fei, WANG Tao
Computer Science. 2024, 51 (11A): 240300069-7.  doi:10.11896/jsjkx.240300069
Mobile edge computing(MEC) plays an important role in wireless subscriber services and significantly improves the efficiency of computing services.However,with the rapid growth of the number of terrestrial users,it becomes increasingly difficult for wireless devices to directly access MEC nodes.To address this challenge,this paper proposes an innovative communication system model that utilizes self-charging unmanned aerial vehicles(UAVs) to collaborate with terrestrial base stations,including MEC nodes and energy launching stations(LSs),aiming to enhance the performance of terrestrial wireless communication systems.The cooperative working mechanism between the UAV-MEC system,the LS,IoT devices,and the edge cloud(EC) is explored in depth.The power consumption of the UAV,the charging process of the LS to the UAV,and the conversion loss of the RF-DC signals are considered comprehensively,aiming to maximize the residual energy of the UAV after completing its mission while ensuring its continuous and stable operation.In addition,the UAV's hovering position,the allocation of communication and computational resources,and the decision of task segmentation are jointly optimized with the aim of minimizing the UAV's energy consumption while ensuring the overall performance of the wireless communication system.Since the problem is highly nonconvex,an efficient algorithm based on successive convex approximation is proposed to obtain a suboptimal solution.Extensive simulation experiments verify that the proposed scheme significantly outperforms the baseline schemes.
Scheduling Jobs with Multiple Deadlines in Cloud
LIU Zhimin, CHEN Jianer
Computer Science. 2024, 51 (11A): 240100120-7.  doi:10.11896/jsjkx.240100120
With the increasing impact of big data on people's lives,the demand for data storage and computation continues to grow,and the emergence of cloud computing effectively meets this demand.In cloud computing systems with high real-time requirements,resource requests from clients are regarded as 2-stage jobs with deadlines and certain profits,while cloud servers are seen as 2-stage machines.The deadlines for different resource requests are usually different,and if the cloud center can complete the request before the deadline,it can obtain the corresponding profit.Existing 2-stage job scheduling algorithms that aim to maximize revenue assume a common deadline constraint,whereas in reality,different resource requests may have varying deadlines.Motivated by the needs of research and applications in cloud computing and data centers,we build a mathematical model for job scheduling in cloud computing.We study the problem of scheduling 2-stage jobs with multiple deadlines on multiple 2-stage machines.Let k be the number of deadlines of the jobs.When k is a constant,a polynomial-time approximation algorithm with approximation ratio(3k+∊) is provided.When the number of machines is a fixed constant,the approximation ratio is further improved to(k+∊),where ∊ > 0 is an arbitrary constant.Therefore,when k is a constant,the problem has a constant-ratio polynomial-time approximation algorithm.In the case where T-processing time is greater than R-processing time,a pseudo-polynomial time approximation algorithm with approximation ratio 2 is presented,further improving the approximation ratio.
Adaptive Fingerprint Subspace Matching WiFi Location Algorithm
CHEN Lijiu, WANG Ke, LI Peng, ZHANG Zhengpeng, DENG Ganlin, ZHANG Zhisheng
Computer Science. 2024, 51 (11A): 231000172-6.  doi:10.11896/jsjkx.231000172
In traditional wireless fidelity(WiFi)fingerprint matching algorithms,factors such as remote proximity points caused by signal fluctuation and the occlusion of access point(AP)signals by objects in the environment will seriously affect the positioning accuracy.To solve this problem,this paper proposes an adaptive fingerprint subspace matching positioning algorithm.According to the combination of different APs,the fingerprint database and the test fingerprint are divided into subspaces.In each subspace,the difference between Euclidean distances is used to set the optimal critical value of performance,and the nearest K reference points are selected.The weighted K-nearest neighbor method is used for coarse positioning to eliminate the error caused by remote neighboring points.Finally,the estimated value of coarse position in each subspace is integrated,and the average filter is used for precise positioning.Experimental results show that,compared with the traditional WiFi fingerprint matching algorithm,the proposed algorithm effectively reduces the impact of remote proximity points and AP occlusion on the positioning accuracy,enhances the constraint of AP on different positions,and improves the accuracy and robustness of the WiFi positioning system.
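The pipeline described above (split by AP combination, coarse weighted K-nearest-neighbor estimate per subspace, then averaging) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the adaptive distance threshold is omitted and a fixed K is used instead, and the names `wknn_locate`, `subspace_locate` and `ap_subsets` are hypothetical.

```python
import numpy as np

def wknn_locate(fingerprints, positions, rss, k=3, eps=1e-6):
    """Coarse position: inverse-distance-weighted mean of the K nearest reference points."""
    d = np.linalg.norm(fingerprints - rss, axis=1)   # Euclidean distance in signal space
    nearest = np.argsort(d)[:k]                      # indices of the K closest fingerprints
    w = 1.0 / (d[nearest] + eps)                     # closer fingerprints get larger weights
    return (w[:, None] * positions[nearest]).sum(axis=0) / w.sum()

def subspace_locate(fingerprints, positions, rss, ap_subsets, k=3):
    """Run WKNN in each AP subspace, then average the coarse estimates.

    Assumption: a plain mean stands in for the averaging filter, and the
    adaptive threshold on distance differences is replaced by a fixed K.
    """
    estimates = []
    for subset in ap_subsets:
        idx = list(subset)                           # columns (APs) that form this subspace
        estimates.append(wknn_locate(fingerprints[:, idx], positions, rss[idx], k))
    return np.mean(estimates, axis=0)
```

Because each subspace ignores some APs, an AP whose signal is occluded at test time only corrupts the estimates of the subspaces that contain it, which is the intuition behind combining several subspace estimates.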
Study on Dynamic Redundancy Mechanism of Time Sensitive Networks Based on Segmented Frame Copy and Elimination
ZHANG Hao, GUO Oufan, ZHOU Feifei, MA Tao, HE Yingli, YAO Subin
Computer Science. 2024, 51 (11A): 240300085-7.  doi:10.11896/jsjkx.240300085
To address the issue of how to achieve reliable transmission of replicated frames in the IEEE 802.1CB protocol for time sensitive networks,as well as the waste of network resources caused by end-to-end reliability protection,this paper proposes a dynamic redundancy mechanism for time sensitive networks based on segmented frame replication and elimination.This mechanism utilizes a reliability probability model to deploy different redundant paths for each data flow based on its priority,and utilizes the idea of segmented protection to compress network redundancy while ensuring high reliability of data transmission.The algorithm first filters the client streams based on their priority,providing redundancy only for data streams with priority greater than or equal to 4.Secondly,a genetic algorithm is used to calculate the optimal main path between the source node and the destination node for data transmission,and the reliability probability model is used to determine whether the expected reliability has been achieved.If not,segmented frame replication and elimination methods are used to determine the redundant paths and the number of FERE-NODES that need to be deployed.Finally,through continuous iteration and updating,the optimal solution for deploying FERE-CODE and the optimal redundant path strategy are obtained.Simulation experiments on the NeSTiNg platform show that,compared with the shortest path algorithm and the minimum cost algorithm based on Lagrangian relaxation delay constraint(DCLC),the proposed redundancy algorithm reduces packet loss rates by 0.15% and 0.23% respectively,and reduces average latency by 9.33% and 7.35% respectively.In comparison with two end-to-end redundancy mechanisms,ETE-FRER and ONE-FRER,the proposed redundancy algorithm reduces bandwidth consumption by 35.0% and 12.4% respectively under the 99.999% reliability requirement,verifying that this algorithm can effectively reduce network redundancy consumption while ensuring high network reliability.
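The benefit of protecting only a weak segment instead of the whole path can be illustrated with a small reliability calculation. The abstract does not spell out its reliability probability model, so the sketch below assumes independent link failures and uses the textbook series/parallel formulas; the function names and the example link reliabilities are hypothetical. Only the priority threshold of 4 comes from the abstract.

```python
from math import prod

def needs_redundancy(priority, threshold=4):
    """Only streams with priority >= 4 receive redundancy, as stated in the abstract."""
    return priority >= threshold

def series(reliabilities):
    """A chain works only if every element works (independence assumed)."""
    return prod(reliabilities)

def parallel(reliabilities):
    """Replicated sub-paths fail only if all copies fail (independence assumed)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

def segmented_path(segments):
    """Each segment is a list of alternative sub-path reliabilities; segments are in series."""
    return series(parallel(alternatives) for alternatives in segments)

# Hypothetical path with one weak middle link, first unprotected, then with
# that single segment duplicated.
unprotected = series([0.999, 0.95, 0.999])
protected = segmented_path([[0.999], [0.95, 0.95], [0.999]])
```

In this toy example, duplicating only the weak middle link raises the path reliability from about 0.948 to about 0.9955, without replicating frames over the two links that were already reliable.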
Computer Software & Architecture
Implementation of Retargeting CompCert Trusted Compiler for Loongson Processors
HU Shaoru, WANG Juanwei, WANG Shengyuan
Computer Science. 2024, 51 (11A): 240200115-9.  doi:10.11896/jsjkx.240200115
CompCert is a well-known trustworthy C compiler that is highly credible and has been widely used in research and development work in academia and industry in recent years.Using Coq,an interactive proof assistant,CompCert proves that the target assembly code it produces preserves the semantics of the source code.The CompCert compiler now supports multiple target machine architectures,but there is currently no version specifically designed for domestically developed processors,such as the Loongson processor architecture(LoongArch).Retargeting CompCert to domestic processors such as Loongson is of great benefit to the development of safety-critical software in China.This paper analyzes the design and structure of the CompCert compiler backend and the characteristics of LoongArch,revises the backend of the CompCert compiler so that it generates assembly code suitable for running on the Loongson processor,and presents the work on the different modules.The revised CompCert compiler,retargeted to Loongson processors,has performance competitive with GCC at optimization level 1,and can meet the needs of various scenarios.
Automatic Mixing Precision Optimization for Matrix Multiplication Calculation
HE Haotian, ZHOU Bei, GUO Shaozhong, ZHANG Zuoyan, HAO Jiangwei, JI Liguang, XU Jinchen
Computer Science. 2024, 51 (11A): 240300057-10.  doi:10.11896/jsjkx.240300057
Mixed-precision optimization greatly improves the performance of matrix multiplication,but it introduces errors compared with high-precision matrix multiplication.To effectively reduce these errors,this paper implements AMAO,an automatic mixed-precision tool for matrix multiplication.Starting from a basic mixed-precision computation that multiplies in low precision and adds in high precision,the tool uses a precision optimization algorithm based on iteration space partitioning to split the original computation into two parts according to a certain proportion:one part uses high-precision computation and the other uses the basic mixed-precision computation.A tool that automatically generates mixed-precision code is implemented according to this algorithm.Experiments show that,compared with the mixed-precision tool AGMMMPC,the performance of the mixed-precision code generated by AMAO is reduced by 5.90% on average,while the accuracy is improved by 49.31% on average.
Style-oriented Software Architecture Evolution Path Generation Method
ZHONG Linhui, YANG Chaoyi, XIA Zihao, HUANG Qixuan, QU Qiaoqiao, LI Fangyun, SUN Wenbin
Computer Science. 2024, 51 (11A): 240100130-9.  doi:10.11896/jsjkx.240100130
Software architecture style is a generalization of the common structure of software,and the architecture style of software is usually closely related to its structural characteristics.By evolving to a certain style,the structural characteristics of the software become more evident.Traditional software architecture style evolution methods require manual construction of the target software architecture when building the evolution path and thus lack automation support;moreover,no measurement method for software architecture style has been proposed.Therefore,this paper takes orthogonal software architecture style as an example and proposes a software architecture style evolution path generation method that combines genetic algorithm and planning domain definition language(PDDL).This method proposes a genetic mutation operator based on semantic similarity and a measurement method for orthogonal software architecture style,and defines the mapping rules between software architecture and PDDL.Experiments show that the proposed genetic mutation operator improves the convergence efficiency of the algorithm in the early stage,and after the orthogonal software architecture style evolution is completed,the software is improved in terms of change cost,orthogonal style distance and McCabe measurement.
Compilation Optimization and Implementation of High-order Cryptographic Operators on FPGA
PEI Xue, WEI Shuai, SHAO Yangxue, YU Hong, GE Chenyang
Computer Science. 2024, 51 (11A): 231200184-11.  doi:10.11896/jsjkx.231200184
Aiming at the different compilation requirements of cryptographic algorithms,a method of abstracting cryptographic operators at different granularities is proposed.This method addresses the issue of rapid and efficient deployment of high-order cryptographic operators on FPGAs through compilation optimization and mapping of operators at different granularities.Hotspot operators are abstracted from cryptographic algorithms to construct an operator library.Multi-level compilation optimization is used to optimize and deploy cryptographic algorithms.Data tensorization and register optimization methods are employed to enhance the deployment and computation efficiency of high-order cryptographic operators on the VTA hardware architecture.Experimental results show that the execution efficiency using tensorization and register optimization methods is 32 times higher than the original compilation and deployment methods,and approximately 34 times higher than OpenCL.Additionally,the constructed operator library allows for the rapid development and implementation of cryptographic algorithms.
Reconfigurable Computing System for Parallel Implementation of SVM Training Based on FPGA
PENG Weidong, GUO Wei, WEI Lin
Computer Science. 2024, 51 (11A): 231100120-7.  doi:10.11896/jsjkx.231100120
To address the problems of high computational complexity and long training time faced by support vector machines when dealing with large-scale datasets,a reconfigurable computing system for parallel SVM training based on FPGA is designed.The hardware resource consumption and acceleration performance under different quantization methods are analyzed.By utilizing the stochastic gradient descent method for SVM training,the dimensions to be solved are associated with the sample dimensions,significantly reducing computational complexity compared to traditional quadratic programming-based methods.Additionally,a specialized parallel computing structure is designed using FPGA-based reconfigurable hardware platform to accelerate the SVM training process.The entire system is jointly simulated in software and hardware.Simulation results on four public datasets show that the overall model prediction accuracy exceeds 90%.During the training phase,compared to software implementation using the same algorithm,the proposed hardware implementation reduces the processing time for a single sample by at least two orders of magnitude under floating-point representation.Under fixed-point representation,the processing time for a single sample is reduced by up to three orders of magnitude.Compared to the hardware implementation based on quadratic programming problem solving,the processing speed for a single sample is improved by up to 394 times.
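Why stochastic gradient descent ties the number of unknowns to the sample dimension can be seen in a minimal software sketch. It is not the FPGA design described above: the hinge-loss formulation, the fixed learning rate `lr`, the regularization weight `lam` and the function name `sgd_svm` are assumptions chosen for illustration. The solver keeps only a weight vector with one entry per feature plus a bias, whereas a quadratic-programming dual would carry one variable per training sample.

```python
import numpy as np

def sgd_svm(X, y, lam=0.01, lr=0.01, epochs=50, seed=0):
    """Train a linear SVM (labels in {-1, +1}) by SGD on the L2-regularized hinge loss."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0                  # d + 1 unknowns, independent of n
    for _ in range(epochs):
        for i in rng.permutation(n):         # visit one sample at a time, in random order
            margin = y[i] * (X[i] @ w + b)
            w *= 1.0 - lr * lam              # shrink step from the L2 regularizer
            if margin < 1.0:                 # hinge loss is active for this sample
                w += lr * y[i] * X[i]
                b += lr * y[i]
    return w, b
```

Each update touches a single sample and costs a number of multiply-accumulate operations proportional to the feature dimension, which is the kind of regular inner loop that lends itself to a parallel hardware datapath.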
Study on Information Enhancement Method of API Structural Pattern Based on EBRCG
ZHONG Linhui, ZHU Yanxia, HUANG Qixuan, QU Qiaoqiao, XIA Zihao, ZHENG Yi
Computer Science. 2024, 51 (11A): 230900121-10.  doi:10.11896/jsjkx.230900121
A method for enhancing API structural pattern information is proposed in response to issues such as lack of structural information and high redundancy in API call modes.The method is based on the extended branch-reserving call graph(EBRCG),which is used to represent method structural information in Java open source project source code.In the EBRCG,API call statements,branch statements(which treat if statements and all loop statements as branch statements),switch-case multi-branch statements,and exception statements are considered.The EBRCG pruning algorithm is proposed to obtain code structures for specific API call modes.Additionally,clustering and sorting methods are used to filter multiple code structure information for API call modes,and representative API call mode code structures are selected.To validate the effectiveness of this method,three sets of experiments are compared with the TextRank method.The results show that the proposed method can effectively obtain code structures for API call modes,more accurately describing API usage than the TextRank method.This method has certain research significance and provides a reference for software developers.
Multi Level Parallel Computing for SW26010 Discontinuous Galerkin Finite Element Algorithm
WANG Xiaozhong, ZHANG Zuyu
Computer Science. 2024, 51 (11A): 240700055-5.  doi:10.11896/jsjkx.240700055
The discontinuous Galerkin finite element method(DGM) is a high-precision numerical solution algorithm.Aiming at the problems of low efficiency and high computational complexity of DGM parallel computing in electromagnetic engineering applications,a parallel DGM algorithm based on the SW26010 platform is proposed.The parallel optimization of the DGM algorithm is achieved through region decomposition,data structure reconstruction,kernel parallel computing of hotspot functions,computation and communication overlap,and kernel buffering optimization techniques.Experimental results show that,compared with the DGM parallel algorithm based on MPI process level,the proposed algorithm achieves an average speedup of 46.8.
Information Security
Overview of Mnemonic Password Creation Policies
CHEN Jiamin, JIANG Huiping
Computer Science. 2024, 51 (11A): 240300100-11.  doi:10.11896/jsjkx.240300100
Password authentication is the most common authentication method today because it is simple and easy to deploy.As algorithms for password guessing attacks continue to improve,the requirement for strong passwords is also increasing.Strong passwords,while improving security,are often difficult to memorize,whereas easy-to-remember passwords are vulnerable to cracking,making it a challenge to choose passwords that are both strong and easy to remember.As the number of accounts per user continues to grow,so does the number of passphrases that need to be memorized,placing a noticeable strain on human memory and making it necessary to find ways to generate strong passphrases that are easy to remember.Over the past two decades,many researchers have proposed strategies for creating mnemonic passphrases based on different mnemonic tools.Therefore,a review of existing mnemonic password creation strategies is conducted.Firstly,the background of password creation and the strength of passwords are summarized.Secondly,according to the characteristics of the mnemonic tools,the strategies are categorized into four types:sentence-based,word-based,keyboard-based and other special types,and each type is reviewed in depth.Finally,the strategies for creating mnemonic passphrases are summarized,and future research directions and development trends are pointed out.
Proactive Defense Technology in Cyber Security:Strategies,Methods and Challenges
HU Hongchao, SUI Jiaqi, ZHANG Shuai, TONG Yu
Computer Science. 2024, 51 (11A): 231100132-13.  doi:10.11896/jsjkx.231100132
Emerging technologies like artificial intelligence(AI),cloud computing,big data,and the Internet of Things(IoT) are developing quickly,making cybersecurity a vital issue.There is a clear asymmetry between cyberspace defense and attack,as the more sophisticated cyberattacks are beyond the reach of conventional defense strategies like intrusion detection,vulnerability scanning,virus detection,authentication,access control,etc.To counteract this state of passive vulnerability,which is “easy to attack but hard to defend”,academics have been actively pushing the study and creation of proactive defense technologies.Three such technologies(moving target defense,deception defense,and mimic defense) are maturing and developing quickly.Unfortunately,there is currently a dearth of literature that systematically summarizes these three mainstream proactive defense technologies;additionally,there is no analysis of their advantages and disadvantages,nor a horizontal comparison.This work fills this gap by conducting a thorough and methodical review of the research findings on the three proactive defense technologies.Initially,the concepts,techniques,and methods of the three proactive defense technologies are presented in turn,and the current research findings are classified by research topic.Subsequently,a horizontal comparison of the three proactive defense technologies is conducted to examine their shared and unique characteristics,benefits and drawbacks,and potential synergies and complementarities that could improve their overall protection efficacy.Lastly,the challenges and potential directions of the three proactive defense technologies are discussed.
Overview of Attribute-based Searchable Encryption
YAN Li, YIN Tian, LIU Peishun, FENG Hongxin, WANG Gaozhou, ZHANG Wenbin, HU Hailin, PAN Fading
Computer Science. 2024, 51 (11A): 231100137-12.  doi:10.11896/jsjkx.231100137
With the advent of the big data era,the size and complexity of data continue to increase,which makes the requirement for data privacy and security increasingly urgent.However,traditional encryption methods cannot meet the demand for efficient searching in large-scale datasets.To address this problem,searchable encryption introduces trapdoor functions and other cryptographic techniques that allow searching in encrypted data without decrypting the entire dataset.However,searchable encryption alone still cannot meet the complex data access control needs in the real world.Therefore,researchers have introduced the concept of attribute-based encryption into searchable encryption,resulting in attribute-based searchable encryption.This approach aims to achieve efficient search by attributes in encrypted datasets.Attribute-based searchable encryption has a wide range of applications in the fields of privacy protection,data sharing and cloud computing.In this paper,we describe the development trends in terms of enhancing privacy protection,improving computational efficiency,and increasing flexibility.We also present the related schemes involved.In terms of enhancing privacy protection,we discuss techniques such as policy hiding,permission management,and security enhancement.The current methods for optimizing efficiency primarily involve outsourcing computation,online/offline encryption mechanisms,and index structure optimization,among others.Additionally,the improvement of the attribute-based searchable encryption scheme in terms of access policy expression ability and search capability is discussed.This paper also introduces several common application areas and summarizes the relevant schemes proposed by researchers.Finally,it discusses the challenges and future directions of attribute-based searchable encryption.
Study on Identity Authentication Scheme of Alliance Chain Based on Multi-level Commitment Protocol
SUN Min, LI Xinyu, ZHANG Xin
Computer Science. 2024, 51 (11A): 240200079-7.  doi:10.11896/jsjkx.240200079
As existing schemes only support coarse-grained attribute protection policies in differentiated privacy protection scenarios,an identity authentication privacy protection scheme based on multi-level commitment protocol(Iascb-Mcp) is proposed in this paper,which aims to allow users to selectively disclose or keep secret their attribute information according to requirements,so as to meet the protection requirements in different privacy scenarios.The scheme realizes the protection of user attributes through a multi-level commitment structure.First,each user attribute is assigned a privacy level,and the corresponding commitment protocol is designed according to the privacy level.Secondly,different authentication methods are adopted for user attributes of different privacy levels,and zero-knowledge proof is used to ensure that the user's high-privacy attributes can still be effectively authenticated without being exposed.Finally,the Iascb-Mcp scheme is used to construct a system based on alliance chain authentication,which solves the privacy authentication of off-chain user attributes and the security of transactions between different groups on the chain.The results of security analysis and experiment show that other users cannot obtain the high-privacy attributes of the prover in the authentication process.Compared with the group signature scheme,the authentication time of Iascb-Mcp is reduced to 1 s to 3 s.Compared with the two-ring signature scheme,the newly generated proof file is about one-tenth of the size of the original file.
Intelligent Penetration Path Based on Improved PPO Algorithm
WANG Ziyang, WANG Jia, XIONG Mingliang, WANG Wentao
Computer Science. 2024, 51 (11A): 231200165-6.  doi:10.11896/jsjkx.231200165
Penetration path planning is the first step of penetration testing and is important for intelligent penetration testing.Existing studies on penetration path planning usually model penetration testing as a fully observable process,which cannot accurately describe actual penetration testing with partial observability.With the wide application of reinforcement learning in penetration testing,this paper models penetration testing as a partially observable Markov decision process to simulate practical penetration testing accurately.In general,the fully connected layers of the policy network and evaluation network in PPO cannot extract features effectively in penetration testing with partial observability.This paper proposes an improved PPO algorithm,RPPO,which integrates fully connected layers and long short-term memory(LSTM) in the policy network and evaluation network.In addition,a new objective function update is designed to improve the robustness and convergence.Experimental results show that the proposed RPPO converges faster than the A2C,PPO and NDSPI-DQN algorithms:the number of iterations to convergence is reduced by 21.21%,28.64% and 22.85% respectively.Meanwhile,RPPO gains about 66.01%,58.61% and 132.64% more cumulative reward respectively,which makes it more suitable for larger-scale network environments with more than fifty hosts.
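For readers unfamiliar with the baseline that RPPO modifies, the clipped surrogate objective of standard PPO can be sketched as below. This is the well-known textbook PPO objective, not the new objective function of RPPO, whose form the abstract does not give; the function name `ppo_clip_objective` and the default clipping range `eps=0.2` are assumptions.

```python
import numpy as np

def ppo_clip_objective(new_logp, old_logp, adv, eps=0.2):
    """Standard PPO clipped surrogate (to be maximized), averaged over samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; adv: advantage estimates.
    """
    ratio = np.exp(np.asarray(new_logp) - np.asarray(old_logp))   # probability ratio
    adv = np.asarray(adv)
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)                # keep the ratio near 1
    return float(np.minimum(ratio * adv, clipped * adv).mean())   # pessimistic bound
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that collected the data. In a recurrent variant, the log-probabilities would come from a network that carries LSTM state across time steps, while the objective itself keeps this form.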
Study on Malicious Traffic Classification Algorithm Based on CNN Combined with BiGRU
YANG Yongping, WANG Siting
Computer Science. 2024, 51 (11A): 231100106-9.  doi:10.11896/jsjkx.231100106
Network intrusion detection is an important network security technology,malicious traffic recognition and classification is the basis of network intrusion detection.In the current network environment,port detection technology,deep packet detection technology,and feature engineering machine learning algorithm detection technology for malicious traffic identification and classification have failed or are not easy to implement.This paper proposes a malicious traffic recognition classification algorithm model CNNBiGRU,which combines convolutional neural network and bidirectional gated recurrent unit.CNNBiGRU uses convolutional neural network CNN to extract network flow structure features and spatial features,and uses bidirectional gated recurrent unit BiGRU to extract sequence features,which is consistent with the characteristics of network flow with both spatial structure and sequence features.Tests and model optimization and parameter selection are performed on the CIC-IDS2017 dataset.The experimental results show that the proposed algorithm has certain advantages in classification effect and no feature engineering is required compared with the classical machine learning algorithm,and also has better recognition effect compared with the single-neural network algorithm.Compared with the fusion neural network algorithm,it maintains the same high detection result and has a little advantage in the number of learning iterations under the same accuracy target measurement.
Multi-party Co-governance Prevention Strategy for Horizontal Federated Learning Backdoors
XU Wentao, WANG Binjun, ZHU Lixin, WANG Hanxu, GONG Ying
Computer Science. 2024, 51 (11A): 240100176-9.  doi:10.11896/jsjkx.240100176
Federated learning is susceptible to backdoor attacks based on model replacement.In response to the poor performance of current backdoor detection methods,a multi-party co-governance prevention strategy is proposed.The aim is to establish a co-governance mechanism between the federated learning center server and the client,so as to effectively detect and prevent backdoors in the model without compromising data privacy and main task performance.This strategy covers shallow backdoor scanning,deep backdoor detection,and model repair,all of which are completed by the client in collaboration with the central server.Among them,shallow backdoor scanning is a lightweight real-time backdoor detection scheme that does not significantly increase time overhead.This scheme captures abnormal changes in the aggregated model parameters by the client and reports them to the central server.When the number of reports reaches the set threshold,the central server initiates deep backdoor detection,and each client pauses the federated learning process for deep detection to determine whether the neurons in the model are affected by backdoor attacks and exhibit abnormalities.If there are anomalies,each client adopts a method of concatenating a benign model and an attacked model to restore the model to a benign state,and submits the results of deep backdoor detection and model repair plans to the central server.It is up to the central server to decide the final repair plan,thereby thoroughly clearing the backdoor.Experimental results show that this strategy can effectively detect and remove backdoors in the federated learning model,ensuring the safe operation of horizontal federated learning.
IoT Devices Identification Method Based on Weighted Feature Fusion
CAO Weikang, LIN Honggang
Computer Science. 2024, 51 (11A): 240100137-9.  doi:10.11896/jsjkx.240100137
IoT device identification plays an extremely important role in the field of device management and network security,which not only helps administrators review network assets in a timely manner,but also correlates device information with potential vulnerability information to discover potential security risks in a timely manner.The current IoT device identification methods do not make full use of the characteristics of IoT devices,and it is difficult to identify devices with fewer samples in the case of unbalanced samples.To solve the above problems,this paper proposes a weighted feature fusion based method for IoT device recognition.A parallel structure of TextCNN-BiLSTM_Attention is designed to extract the local features and context features of the application layer service information of networked devices respectively.A weighted feature fusion algorithm is proposed to fuse the features extracted from different models.Finally,multi-layer perceptron is used to recognize the device.Experimental results show that the proposed method can extract the features of networked devices more comprehensively,identify devices with fewer samples under the condition of data imbalance,and the macro average precision rate is improved by 2.6%~12.85% compared with the existing methods,which has good characterization and generalization abilities,and is superior to multi-model methods such as CNN_LSTM in recognition efficiency.
Airborne Software Provable Data Possession for Cloud Storage
YUE Meng, WEN Cheng, HONG Xueting, YAN Simin
Computer Science. 2024, 51 (11A): 240400040-10.  doi:10.11896/jsjkx.240400040
With the increasing number of civil aviation airborne software,the traditional software distribution methods face the problems of low efficiency,high cost and poor security.In order to improve the distribution efficiency of airborne software,we combine cloud storage with airborne software and propose an airborne software storage architecture based on Cloud-P2P,which realizes distributed cloud storage of airborne software and airborne software sharing.On this basis,a provable data possession is proposed,which reduces the risk of complicity by binding the logo to the public key,and completes the integrity verification of the airborne software through sampling audit,reducing the verification cost.Security analysis shows that this scheme is unforgeable and resistant to replay attacks,and proves the correctness of the data-holding proof protocol.Compared with existing data integrity auditing schemes,the computational overhead is reduced by 10% and the communication overhead is reduced by 20%.This research has practical implications for ensuring efficient and secure distribution of airborne software.
Robust Federated Learning Algorithm Based on Multi-feature Detection and Adaptive Weight Adjustment
WANG Chundong, ZHAO Liyang, ZHANG Boyu, ZHAO Yongxin
Computer Science. 2024, 51 (11A): 231100072-10.  doi:10.11896/jsjkx.231100072
The federated learning paradigm is designed to preserve privacy by enabling multiple clients to collaboratively train a global model without compromising the original training data.However,due to the lack of direct access to local training data and monitoring capabilities during the training process,federated learning is vulnerable to various Byzantine attacks,including data poisoning and model tampering attacks.These malicious activities aim at disrupting the federated learning model training process and degrading its performance.While several studies have proposed various aggregation algorithms to address this issue,they predominantly concentrate on single Byzantine attack scenarios,often overlooking the threats associated with hybrid Byzantine attacks that can manifest in real-world environments.To address this issue,inspired by the principle of water purifiers,we propose an innovative multi-feature detection and adaptive dynamic weighting allocation algorithm called FL-Sieve for identifying Byzantine clients,aiming to filter out malicious clients through multi-level screening.Firstly,the algorithm assesses feature similarity between clients through angular range similarity and model boundary metric,generates a similarity matrix and calculates the similarity score.Then,it performs clustering to ensure that nodes with similar features are grouped together.Subsequently,it employs predefined rules to filter potential benign clients.Finally,it intelligently allocates weights based on the trustworthiness of each client,further enhancing the defense mechanisms and system robustness.To evaluate the performance of the FL-Sieve algorithm,experiments are conducted using three datasets:MNIST,Fashion-MNIST,and CIFAR-10.The experiments consider scenarios with both non-IID data distribution and hybrid Byzantine attack situations.The number of hybrid Byzantine clients increases from 20% to 49% to simulate large-scale hybrid Byzantine client attacks.Additionally,the performance of the FL-Sieve algorithm is tested in both IID and non-IID data distribution,as well as in single attack scenarios.The experimental results demonstrate that FL-Sieve effectively withstands Byzantine attacks in various scenarios,maintaining high main task accuracy even under the challenging condition of 49% hybrid Byzantine client attacks.In comparison,several existing classical algorithms exhibit varying degrees of failure,underscoring the significant advantages of the FL-Sieve algorithm.
Intrusion Detection Model Based on Combinatorial Optimization of Improved Pigeon Swarm Algorithm
WANG Chundong, LEI Jiebin
Computer Science. 2024, 51 (11A): 231100054-7.  doi:10.11896/jsjkx.231100054
Intrusion detection,as a security defense technique to protect the network from attacks,plays an important role in the field of network security.Researchers have proposed different network intrusion detection models using machine learning techniques.However,the problems of feature redundancy and machine learning parameter optimization are still challenges for intrusion detection systems.Existing studies consider the two as independent problems and optimize them separately.However,the machine learning parameters are closely related to the features in the training data,and changes in the feature set are likely to cause changes in the optimal machine learning parameters.To address this problem,an intrusion detection method based on combined optimization of improved pigeon flocking algorithm(ICOPIO) is proposed.It can simultaneously achieve feature screening and machine learning parameter optimization,avoiding the interference of human parameter settings,reducing the influence of redundant and irrelevant features,and further improving the performance of the intrusion detection model.In addition,Spark is used to parallelize ICOPIO to improve its efficiency.Finally,two intrusion detection standard datasets,NSL-KDD and UNSW-NB15,are used to evaluate the model.Compared with several existing related methods,the proposed model achieves the best results in the evaluation metrics of TPR,FPR,and average accuracy,which also shows that ICOPIO has good scalability.
Bank Transaction Fraud Detection Method Based on Graph Neural Network
QIN Zhongpiao, ZHOU Yatong, LI Zhe
Computer Science. 2024, 51 (11A): 240200024-8.  doi:10.11896/jsjkx.240200024
With the rapid development of electronic payments,the fraud problem is increasing.Limited by rule and feature engineering,traditional fraud detection methods have difficulty capturing complex transaction patterns.Conversely,graph-based methods often downplay the significance of feature engineering while highlighting the relational aspect of the data.In addition,few studies have examined the application of graph methods in the field of fraud detection for specific bank transaction data.To address this problem,this paper proposes an end-to-end telecom fraud detection method,namely a fraud detection method for banking transactions based on graph neural networks.The proposed method designs a feature engineering for graph models and trains it using a fusion model.Specifically,oversampling and node weighting are used to address the unbalanced dataset.Next,a user transaction graph model is built using an adaptive similarity edge and node degree weight fusion technique to mine potential correlation information between transaction nodes.Furthermore,model fusion is employed to merge local and global variables to overcome the constraints of separate classifiers.Experimental results on Guangxi Yulin Bank transaction data show that the proposed model improves the F1 score,recall rate,and AUC by 1.65%,1.36% and 4.2%,respectively,compared to GraphSAGE.The model also achieves a reduction of approximately 80% in training time.In comparison to other mainstream detection algorithms,it exhibits higher detection accuracy.
Study on Open Set Based Intrusion Detection Method
WANG Chundong, ZHANG Jiakai
Computer Science. 2024, 51 (11A): 231000033-6.  doi:10.11896/jsjkx.231000033
Intrusion detection is an important task in network security,which aims to detect anomalous behaviors and potential attacks.In recent years,deep learning methods have made great breakthroughs in intrusion detection tasks.However,with the rapid development of the Internet industry in recent years,new types of attacks are increasing,and deep learning methods tend to give a prediction result in a known category with high confidence when faced with a new type of category in testing,resulting in the inability to recognize unknown attacks.Based on this,this paper proposes an open set identification method based on uncertainty modeling,i.e.,MC-Dropout is applied to deep learning classifiers to capture uncertainty and thus obtain high-quality prediction probabilities.This open set identification method is not only able to classify known categories,but also able to discriminate unknown categories.The proposed method is validated on the CICIDS2017 dataset,and is able to achieve the detection of unknown categories,and has a certain degree of sophistication compared with other existing methods,and achieves the best performance in all the metrics compared with the benchmark model,which can be effectively applied to the real-world network environment.
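The uncertainty-based rejection idea in this abstract can be illustrated with a minimal sketch.It is not the authors' implementation:the tiny linear classifier,the function names,the dropout rate and the entropy threshold are all assumptions chosen for illustration.Dropout stays active at test time,the class probabilities of many stochastic passes are averaged,and a sample whose predictive entropy exceeds the threshold is labelled unknown.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_forward(x, weights, drop_p, rng):
    """One pass of a toy linear classifier with dropout kept ON at test time."""
    keep = 1.0 - drop_p
    # Inverted dropout: zero each input with probability drop_p, rescale the rest.
    masked = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    logits = [sum(w * v for w, v in zip(row, masked)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(x, weights, drop_p=0.5, passes=200, seed=0):
    """Average class probabilities over several stochastic passes (MC-Dropout)."""
    rng = random.Random(seed)
    mean = [0.0] * len(weights)
    for _ in range(passes):
        probs = stochastic_forward(x, weights, drop_p, rng)
        mean = [m + p / passes for m, p in zip(mean, probs)]
    # Predictive entropy of the averaged distribution is the uncertainty score.
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return mean, entropy


def classify_open_set(x, weights, entropy_threshold, **kwargs):
    """Return a known class index, or "unknown" when uncertainty is too high."""
    mean, entropy = mc_dropout_predict(x, weights, **kwargs)
    if entropy > entropy_threshold:
        return "unknown"
    return max(range(len(mean)), key=mean.__getitem__)
```

In practice the threshold would be tuned on held-out data from the known classes;the value is left to the caller here because the abstract does not state one.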
Threat Assessment of Air Traffic Control Information System Based on Knowledge Graph
GU Zhaojun, YANG Wen, SUI He, LI Zhiping
Computer Science. 2024, 51 (11A): 240200052-11.  doi:10.11896/jsjkx.240200052
With the development of intelligent and open air traffic control information system,the risk exposure is gradually increasing.Threat assessment is an important means to effectively assess the vulnerability and security risk of air traffic control information system.However,most of the previous threat assessment models have two limitations.On the one hand,they usually only focus on the explicit correlation of threat information,which leads to the potential attack path being ignored or not accurately analyzed.On the other hand,the factors taken into account in the quantification of threats are rough and out of line with the actual system environment,resulting in the threat severity not being consistent with the actual situation.Therefore,an air traffic control information system threat assessment model based on knowledge graph is proposed.This paper extends the scope of knowledge graph ontology model to key concepts such as asset security attributes,mitigation measures and compromised assets,fully integrates multi-source threat data such as assets,attacks and vulnerabilities to build security knowledge graph,and designs logical reasoning rules to make up for the limitation of description ability of knowledge graph.An attack path recognition algorithm based on breadth-first strategy combined with inference rules is proposed to extract more comprehensive and accurate attack paths and attack relationships.A fine-grained threat quantification method is proposed based on the actual operating environment of the system,considering the external exposure degree of assets,physical protection and network protection.Experiments show that this evaluation model can help to identify potential attack paths formed by the joint exploitation of multiple vulnerabilities in air traffic control information system,and prioritize attack responses according to threat quantification,which can effectively improve the efficiency of network security defense.
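The breadth-first part of the attack path recognition step can be sketched as follows.This is only an illustration under simplifying assumptions:the graph is a plain adjacency dictionary with hypothetical node names,and the inference rules that the paper combines with the search are not modelled.

```python
from collections import deque


def attack_paths(graph, entry, target, max_hops=6):
    """Enumerate loop-free paths from entry to target, shortest first (BFS).

    graph maps each asset to the assets an attacker could reach from it.
    """
    paths = []
    queue = deque([[entry]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # stop extending paths that already used max_hops edges
        for nxt in graph.get(node, ()):
            if nxt not in path:  # never revisit an asset on the same path
                queue.append(path + [nxt])
    return paths


# Hypothetical toy topology, not taken from the paper.
toy_graph = {
    "internet": ["web", "vpn"],
    "web": ["db"],
    "vpn": ["web", "db"],
    "db": [],
}
for p in attack_paths(toy_graph, "internet", "db"):
    print(" -> ".join(p))
```

Because the queue is processed in first-in,first-out order,shorter paths are reported before longer ones,which is convenient when responses are prioritized by path length.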
Multimodal Fusion Based Dynamic Malware Detection
LI Jianqiu, LIU Wanping, HUANG Dong, ZHANG Qiong
Computer Science. 2024, 51 (11A): 240200098-7.  doi:10.11896/jsjkx.240200098
In recent years,the number of new types of malware has been increasing rapidly,and traditional signature-based malware detection methods are ineffective in the face of these emerging threats.Therefore,there is an urgent need to develop new detection methods.As a solution,a novel approach based on multimodal dynamic malware detection is proposed.The method utilizes API call sequences as features,mapping these API features into multimodal information,and employs two distinct neural network models to process the multimodal information,thereby obtaining detection outcomes.By testing the proposed method on multiple public datasets,a detection accuracy of up to 99.98% is achieved.Experiments demonstrate that the proposed method exhibits high accuracy and generalization capability.Because this method does not require any disassembly operations,it can detect malware that uses packing techniques,effectively enhancing the robustness of the detection method.
ANP-BP Based Executive Heterogeneity Quantification Method in Mimicry Defense
ZHAO Jia, GU Liang, WU Yao, DU Feng
Computer Science. 2024, 51 (11A): 231000005-6.  doi:10.11896/jsjkx.231000005
Mimicry defense technology based on dynamic heterogeneous redundancy framework is an active defense technology,which uses characteristics such as non-similarity and redundancy to block or disrupt network attacks to improve system reliability and security.The key to improve the security benefits of mimicry defense is to maximize the heterogeneity among executives.This paper proposes a quantitative method of executive heterogeneity based on network analytic hierarchy process(ANP)and back propagation of error(BP).By collecting and analyzing different influencing factors of heterogeneity,this method establishes a multi-dimensional feature matrix.The ANP method comprehensively considers the interdependence between various dimensions and assigns weights to features of different dimensions.At the same time,BP neural network is used to solve the problem that ANP method is too subjective.The isomerism evaluation model based on ANP-BP can quickly,accurately and effectively screen out the most influential factors of isomerism,and provide scientific basis and technical suggestions for the isomerism evaluation of mimicry defense executive.
Interdiscipline & Application
Study on Analysis and Prediction Method of Small Sample Aircraft Production Quality Deviation Data
WANG Luhang, ZHANG Dongdong, LU Hu, LI Rupeng, GE Xiaoli
Computer Science. 2024, 51 (11A): 240300123-8.  doi:10.11896/jsjkx.240300123
With the development of modern industrial capabilities and the demand for increased precision in aircraft,the analysis and control of aircraft production quality have become a focal point for major aerospace enterprises.At the current stage,traditional analytical methods face challenges in accurately constructing deviation analysis models for aircraft production due to inherent features such as limited reference sample data,significant uncertainties,non-linearity,and multi-level assembly deviations.Therefore,this paper focuses on the deviations in the aircraft production process and systematically explores methods for analyzing and predicting aircraft production quality discrepancies.Firstly,this paper analyzes the deviation relationships among various components,identifying the key component with the greatest impact on total deviation based on principal component analysis,which pinpoints the focus for predictive targeting.Subsequently,starting from actual production data resembling a normal distribution,this paper pays special attention to key components,enabling the prediction,generation,and validation of deviation data based on the normal cloud model,which yields aircraft production quality deviation data and their memberships for a greater variety of samples,alleviating the issue of “small sample” and evaluating the predictive model through k-fold cross-validation.Finally,a cooperative predictive model for assembly deviation fluctuation intervals based on the improved grey forecasting model and multi-source data fusion is established.The alleviation of the “small sample” issue enhances the precision and scientific nature of interval prediction.With reference to tolerance data,the model predicts the interval range in which aircraft production quality deviations fall,providing guidance for actual production and the formulation of tolerance correction mechanisms.
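The data generation step based on the normal cloud model can be illustrated with the standard forward normal cloud generator.The sketch below is an assumption about how such augmentation is typically done,not the paper's code;the parameter values in the usage lines are invented.Each cloud drop is a generated deviation value together with its membership degree,derived from the expectation Ex,entropy En and hyper-entropy He.

```python
import math
import random


def normal_cloud_drops(ex, en, he, n, seed=0):
    """Forward normal cloud generator.

    For each drop: En' ~ N(En, He^2), x ~ N(Ex, En'^2),
    membership mu = exp(-(x - Ex)^2 / (2 * En'^2)).
    Returns a list of (x, mu) pairs.
    """
    rng = random.Random(seed)
    drops = []
    while len(drops) < n:
        en_prime = rng.gauss(en, he)
        if en_prime == 0.0:
            continue  # avoid division by zero; resample this drop
        x = rng.gauss(ex, abs(en_prime))
        mu = math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops


# Hypothetical deviation statistics (in mm), for illustration only.
for x, mu in normal_cloud_drops(ex=0.20, en=0.05, he=0.005, n=5):
    print(f"deviation={x:.4f}  membership={mu:.3f}")
```

In a small-sample setting,Ex,En and He would first be estimated from the few measured deviations,and the generated drops would then enlarge the training set.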
Radar Emitter Target Dynamic Threat Assessment Based on Combining Weighting-TOPSIS Method
ZHANG Yinling, SHANG Tao, LI Zhaokun
Computer Science. 2024, 51 (11A): 231000038-7.  doi:10.11896/jsjkx.231000038
Current methods for determining index weights in radar emitter target threat assessment are incomplete and non-dynamic.A dynamic and combining weighting assessment method based on analytic hierarchy process(AHP),entropy weight method(EWM) and technique for order preference by similarity to ideal solution(TOPSIS) model is proposed.First,an index system which includes platform attributes and radar attributes is proposed.Second,the threat subordinating degree functions of qualitative and quantitative attributes are established.Third,attributes are weighted by subjective and objective weights based on AHP and EWM.Fourth,the Poisson reverse distribution is introduced for fusing decision information at multiple times.Finally,the threat assessment is obtained based on TOPSIS model.The simulation results show that the proposed combining weighting dynamic assessment method is more comprehensive,effective,reasonable and practical compared with purely subjective weighting or objective weighting and single-moment method.It can reflect the multi-target threat ranking well and provide a strong basis for the combat command decision.
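The weighting and ranking steps can be sketched in a few lines.This is a simplified illustration,not the paper's procedure:all indices are assumed to be benefit-type values in (0,1],the subjective weights stand in for an AHP result,the multiplicative combination rule is an assumption,and the multi-time fusion step is omitted.

```python
import math


def entropy_weights(matrix):
    """Objective weights from the entropy weight method (rows = targets)."""
    n, m = len(matrix), len(matrix[0])
    k = 1.0 / math.log(n)
    divergence = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        entropy = -k * sum((v / total) * math.log(v / total) for v in col if v > 0)
        divergence.append(1.0 - entropy)
    s = sum(divergence)
    return [d / s for d in divergence]


def combine_weights(subjective, objective):
    """Multiplicative combination of two weight vectors (an assumed rule)."""
    prod = [a * b for a, b in zip(subjective, objective)]
    s = sum(prod)
    return [p / s for p in prod]


def topsis(matrix, weights):
    """Relative closeness of each target to the ideal solution."""
    m = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    v = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    best = [max(r[j] for r in v) for j in range(m)]
    worst = [min(r[j] for r in v) for j in range(m)]
    scores = []
    for r in v:
        d_best, d_worst = math.dist(r, best), math.dist(r, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical threat membership values for three targets and three indices.
threat = [[0.9, 0.8, 0.7], [0.4, 0.5, 0.6], [0.1, 0.2, 0.3]]
w = combine_weights([0.5, 0.3, 0.2], entropy_weights(threat))
print(topsis(threat, w))
```

A higher closeness score means a higher threat rank.Cost-type indices would need to be inverted before this sketch is applied.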
Study on Trust Evaluation System Based on Trusted Platform Control Module
HUANG Jianhui, ZHANG Jiangjiang, SHEN Changxiang, ZHANG Jianbiao, WANG Liang
Computer Science. 2024, 51 (11A): 240200109-6.  doi:10.11896/jsjkx.240200109
The existing trust assessment is based on computer software scanning or trust modules that are achieved through local reporting or remote network authentication,which solves the trust measurement guarantee for the construction process and running status of the local execution environment.However,from the perspective of network applications,there are still systemic security risks.This paper proposes a network node trust evaluation method that adds implementation within the trusted platform control module(TPCM) to address this issue.This method achieves a fast and reliable trust evaluation system under a dual architecture(computing+defense) through the TPCM of defense units,and the evaluated trust values are stored and maintained through TPCM.This scheme not only avoids device forgery after being attacked,but also frees up CPU computing resources.This paper studies a network node trust evaluation system based on TPCM support to achieve a systematic evaluation of the credibility of lightweight computer network platform nodes,ensuring the safe and reliable operation of the network.
Evolutionary CatBoost Based Housing Price Prediction Model
WANG Chengzhang, BAI Xiaoming, TANG Wenying, CHEN Shuhan
Computer Science. 2024, 51 (11A): 240300180-5.  doi:10.11896/jsjkx.240300180
Genetic programming algorithm uses function transformation to map the space formed by the original variables to a new feature space,and optimizes the objective function through genetic operator operations.There are many factors that affect housing price fluctuations,and each influencing factor exhibits a complex nonlinear relationship with housing prices.This paper proposes an evolutionary CatBoost algorithm based housing price prediction model.Various factor variables that affect housing price fluctuations are encoded as terminal variables of the genetic programming algorithm.CatBoost algorithm is employed as the base learner to construct a fitness function,and reasonable genetic operators are designed according to the characteristics of housing price prediction.The objective function is optimized in the feature space after function mapping to improve the performance of the prediction model.Experimental results show that the prediction performance of evolutionary CatBoost algorithm based housing price prediction model is superior to that of traditional prediction models based on random forest algorithm,support vector machine algorithm,adaptive enhancement algorithm,extreme gradient enhancement algorithm,etc.It can predict housing prices more accurately than the rivals under the same conditions.
Air Quality Fuzzy Cognitive Map Forecasting Based on Niche Genetic Algorithm
HAN Huijian, LIU Kexin, LIN Xue
Computer Science. 2024, 51 (11A): 240300120-6.  doi:10.11896/jsjkx.240300120
Industrialization has led to the rapid growth of the global economy,but it has also made environmental pollution more and more serious.Air pollution has become a worldwide hot topic.In this paper,an air quality fuzzy cognitive map forecasting method based on niche genetic algorithm is proposed.This method represents the relationship between air pollutants and the air quality index by using a fuzzy cognitive map,and brings the training result closer to the global optimum by using a modified niche genetic algorithm.The air quality data from 2015 to 2021 is used to train the model,and the model is tested on the 2022 data.The results indicate that,compared to the traditional genetic algorithm and BP neural network,the proposed method has higher prediction accuracy and better generalization performance,which proves its effectiveness.
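How a fuzzy cognitive map produces a forecast once its weights are known can be shown with the usual update rule A_i(t+1) = f(sum_j w_ji A_j(t)) and a sigmoid f.The sketch below covers inference only;the niche genetic algorithm that learns the weight matrix is not modelled,and the concept names and weight values are invented for illustration.

```python
import math


def sigmoid(x, lam=1.0):
    """Squashing function that keeps concept activations in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lam * x))


def fcm_step(state, weights, lam=1.0):
    """One update: weights[j][i] is the influence of concept j on concept i."""
    n = len(state)
    return [
        sigmoid(sum(weights[j][i] * state[j] for j in range(n)), lam)
        for i in range(n)
    ]


def fcm_run(state, weights, steps=50, tol=1e-6, lam=1.0):
    """Iterate until the activations stop changing or steps run out."""
    for _ in range(steps):
        nxt = fcm_step(state, weights, lam)
        if max(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    return state


# Hypothetical concepts: [PM2.5, NO2, AQI]; the weights are made up.
w = [
    [0.0, 0.0, 0.8],  # PM2.5 -> AQI
    [0.0, 0.0, 0.5],  # NO2   -> AQI
    [0.0, 0.0, 0.0],
]
print(fcm_step([0.6, 0.4, 0.5], w))
```

A learning algorithm such as the one in the paper would search for the entries of the weight matrix that minimize the error between such predictions and the observed air quality data.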
Robot Performance Teaching Demonstration System Based on Imitation Learning
ZHAO Yufei, JIN Cong, LIU Xiaoyu, WANG Jie, ZHU Yonggui, LI Bo
Computer Science. 2024, 51 (11A): 240300063-5.  doi:10.11896/jsjkx.240300063
In recent years,imitation learning has been widely applied in the field of robotics,demonstrating significant potential.At the same time,the application of intelligent systems in the field of education is becoming more and more diversified,and the reasonable application of robots in teaching can improve the teaching effect.If robots can instruct certain professional skills,such as playing musical instruments,it could offer significant convenience for both students and human teachers.Imitation learning is particularly suitable for highly specialized and technically demanding tasks,such as violin performance.However,the introduction of expert demonstrations into the process of dynamic movement primitives(DMP),especially regarding the ambiguity issues like uncertainties in string-changing angles,poses a prominent challenge.Traditional methods of measuring string-changing angles,such as physical measurements,exhibit substantial errors and lack generalization.To address this issue,a new model named fuzzy dynamic movement primitive for teaching(T-FDMP) is proposed.The model is constructed based on Type-2 fuzzy model and principal component analysis(PCA).It utilizes the features obtained from PCA,specifically the bowing angle,as input for the membership functions(string angles) and simultaneously builds a professional-level music performance behavior database.Bionic experimental results demonstrate that our T-FDMP model can precisely control the robot for violin performance.Furthermore,it opens up new research directions for imitation learning in other highly specialized and technical domains.
Recognition Method of Online Classroom Interaction Based on Learner State
RAO Yi, YUAN Bochuan, YUAN Yubo
Computer Science. 2024, 51 (11A): 231200133-9.  doi:10.11896/jsjkx.231200133
With the widespread application of artificial intelligence in the field of education,online classrooms have become a highly convenient and efficient mode of modern education.However,effectively managing the learning status of students in the classroom has become an important challenge in education management.In light of this,a method for recognizing learner interaction states in online classrooms is proposed.First,the online classroom data source is divided into video data and audio data.Based on video data,a multidimensional set of interaction state features is constructed,including learner upper body posture features,facial expression features,and facial features.Based on audio data,classroom response state features of the learners are built.Next,using a feature selection algorithm to select key features,a binary classification model is constructed to achieve precise recognition of students′ classroom interaction states using Bayesian optimization.Finally,an overall classroom assessment model is designed to provide comprehensive classroom assessment results for teachers and optimize teaching strategies.The accuracy of the learner interaction state recognition algorithm in this single-student classroom exceeds 93%,as validated on a self-constructed online classroom video data set.
Study on Multi-wing Transient Chaotic Systems and Finite Time Synchronization
YANG Yang, BAI Yulong, LI Yan, HUO Tingting
Computer Science. 2024, 51 (11A): 240300056-6.  doi:10.11896/jsjkx.240300056
A three-dimensional chaotic system that can generate multi-wing attractors is constructed.Through the analysis of the phase portraits,Lyapunov exponent spectra,bifurcation diagrams,and complexity,it is found that the system exhibits complex dynamical characteristics.By studying multiple sets of parameters,it is discovered that the system exhibits diverse multi-wing attractors,and the topological structure of the attractors changes from four-wing to double-wing,and then back to four-wing.In addition,the system exhibits coexistence of transients and attractors.Multisim is used to perform circuit simulations of the system,and the experimental results are consistent with the numerical analysis,verifying the feasibility of the chaotic system implementation.Finally,based on the finite time theory,a synchronization controller is designed to realize the finite time synchronization of different structures,which provides a good basis for chaotic secure communication.
Reliable Power Data Scheduling Scheme Based on Blockchain
MA Junwei, PAN Xiukui, WANG Yuqi, WU Jian, DU Feng
Computer Science. 2024, 51 (11A): 231100178-8.  doi:10.11896/jsjkx.231100178
The rapid development of the intelligent Internet of Things (IoT) has enabled efficient aggregation of electrical information resources in the power grid. The immutability, transparency, and high availability of blockchain technology enhance the security and efficiency of shared information. With the opening of the energy and electricity market, the demand for power resource integration, load regulation, and optimized allocation has become increasingly urgent. During the information gathering stage, electricity data can be collected through intelligent devices in the power grid. However, in the electricity data dispatch stage, there are barriers to information sharing and threats of false information, which seriously affect dispatch efficiency. In this paper, a reliable power data dispatch scheme based on blockchain is proposed. The scheme uses blockchain to achieve information sharing in dispatching, and designs an off-chain access mechanism for intelligent terminal devices that is applicable to electricity dispatch scenarios. It designs a power data publication method based on data reliability assessment and a multi-strategy dispatch model based on utility theory to ensure the reliability of on-chain data and keep data dispatch risks controllable. Furthermore, it designs a trust update calculation method that combines dynamic and static evaluation to quantify user dispatching behaviors on the blockchain. The effectiveness of the proposed scheme is validated through simulation experiments on dispatch success rate, total system revenue, and other indicators.
Research on Microgrid Energy Dispatch Based on Distributed Fixed-time Time-varying Algorithm
YANG Shuai, DAI Xiangguang, XU Shuying, ZHANG Liangliang
Computer Science. 2024, 51 (11A): 240200108-6.  doi:10.11896/jsjkx.240200108
Energy optimization dispatch within a microgrid aims to minimize generation costs by finding the optimal generation strategy for each device. This paper establishes a microgrid model based on multiple intelligent agents, fully considering the dynamic nature of the total load in the microgrid as it varies over time. To minimize generation costs while accounting for time-varying loads, a distributed fixed-time time-varying algorithm is designed. The objective function of the optimization problem is defined as the sum of all local convex objective functions, subject to equality constraints. The stability and convergence of the algorithm are proved by constructing a Lyapunov function, which supports the reliability of the algorithm in practical applications. Numerical simulation experiments demonstrate that the proposed algorithm effectively solves the energy optimization dispatch problem within the microgrid. This provides a tool for microgrid management and supports the sustainable development of energy systems. By minimizing generation costs, the microgrid can efficiently meet constantly changing load demands, thereby enhancing the economic efficiency and sustainability of the system. The research provides insights for the intelligent management of microgrids and the design of future energy systems.
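The problem form described in the abstract above can be sketched as follows. The notation is an assumption made for illustration and does not come from the paper: N agents, local convex cost f_i, generation level P_i, and time-varying total load D(t). A supply-demand balance is used here as one common example of an equality constraint.

```latex
\min_{P_1,\dots,P_N} \; \sum_{i=1}^{N} f_i(P_i)
\quad \text{s.t.} \quad \sum_{i=1}^{N} P_i = D(t)
```

Because D(t) changes over time, the minimizer is itself a trajectory, which is why a time-varying algorithm is needed rather than a one-off static solve.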
Real-time Collaborative Pricing Mechanism of Between Vehicle and Power Grid Based on Bi-level Optimization
WANG Qiong, LU Yue, LIU Shun, LI Qingtao, LIU Yang, WANG Hongbiao, LIU Weiliang
Computer Science. 2024, 51 (11A): 240300013-7.  doi:10.11896/jsjkx.240300013
Due to imperfect competition and incomplete information about the behavior of electric vehicles, as well as the nonlinearity and uncertainty of power systems, modeling and solving real-time pricing problems is highly complex. Existing solutions typically model this as a constrained optimization problem, assuming that the utility function, which is a quantitative representation of the electricity network's economic benefits, is known to the network operators. This overlooks the incomplete information prevalent in actual scenarios. To overcome this limitation, this paper proposes a real-time pricing mechanism for the vehicle-to-grid problem based on a bi-level optimization approach under the condition of unknown utility function parameters. It also considers the power flow equation to reflect the distributed grid's real-time load, so that the mechanism more accurately reflects real market dynamics. In this bi-level model, the upper level represents the optimization problem of the market operators, aiming to maximize their own welfare and minimize the load of the distributed grid. The lower level represents the optimization problem of electric vehicles, aiming to maximize their profits or minimize their cost. In comparative simulation experiments against fixed pricing and peak-valley pricing methods, the proposed mechanism improves the profit of both the grid and the vehicles while reducing the load on the power grid.
Prediction of Spatial and Temporal Distribution of Electric Vehicle Charging Loads Based on Joint Data and Modeling Drive
GU Wei, DUAN Jing, ZHANG Dong, HAO Xiaowei, XUE Honglin, AN Yi, DUAN Jie
Computer Science. 2024, 51 (11A): 231100110-6.  doi:10.11896/jsjkx.231100110
To address the low accuracy of real-time origin-destination (OD) prediction in current research on electric vehicle (EV) charging load prediction, and to account for the influence of road information on users' charging behavior choices, a joint data-driven and model-driven approach is used. On the data-driven side, a combination of long short-term memory (LSTM) networks and graph convolutional networks (GCN) is used to analyze existing charging load data and predict the EV OD, based on the spatial and temporal characteristics of the OD matrix of EV trips. On the model-driven side, road information is taken into consideration to predict the OD in real time. Based on comprehensive consideration of traffic network composition, ambient temperature, real-time traffic flow, and other factors, a driving behavior model of EV users is established, covering dynamic traffic information, mileage energy consumption of EVs on the various road segments in the city, and user path planning. An improved A* algorithm is used to plan driving paths between the starting and ending points of EVs in accordance with the users' choices, so as to simulate the driving behavior of EV users. Finally, path planning tests and charging demand prediction tests for different types of EVs are completed under different application scenarios. The results show that the spatial and temporal distribution characteristics of charging demand are consistent with the objective demand.
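As an illustration of the path-planning step mentioned in the abstract above, here is a minimal sketch of plain A* on a small weighted road graph. The abstract does not say what the paper's improvements to A* are, so none are reproduced here. The graph, coordinates, and function name are hypothetical, and the straight-line heuristic assumes every edge cost is at least the straight-line distance between its endpoints.

```python
import heapq
import math


def a_star(graph, coords, start, goal):
    """Plain A* (not the paper's improved variant).

    graph:  {node: {neighbor: edge_cost}}, hypothetical road network
    coords: {node: (x, y)}, used only for the straight-line heuristic
    Returns (total_cost, path), or (inf, []) if the goal is unreachable.
    """
    def h(n):
        (x1, y1), (x2, y2) = coords[n], coords[goal]
        return math.hypot(x1 - x2, y1 - y2)

    best = {start: 0.0}          # cheapest known cost from start to each node
    parent = {}                  # back-pointers for path reconstruction
    heap = [(h(start), start)]   # priority = cost so far + heuristic
    closed = set()
    while heap:
        _, node = heapq.heappop(heap)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return best[goal], path[::-1]
        if node in closed:
            continue
        closed.add(node)
        for nxt, w in graph.get(node, {}).items():
            g = best[node] + w
            if g < best.get(nxt, math.inf):
                best[nxt] = g
                parent[nxt] = node
                heapq.heappush(heap, (g + h(nxt), nxt))
    return math.inf, []


# Hypothetical four-node network; edge costs could stand for travel energy or time.
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
graph = {
    "A": {"B": 1.0, "C": 2.5},
    "B": {"C": 1.0, "D": 3.0},
    "C": {"D": 1.0},
}
cost, path = a_star(graph, coords, "A", "D")
```

In a setting like the one the abstract describes, the edge costs would presumably be updated from dynamic traffic information and per-segment energy consumption before each planning call.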
Data Exchange and Decision Optimization for Intelligent Maintenance of Xinjiang Ship Locks
DING Guangming, ZHAO Yuzhong, ZHENG Yong
Computer Science. 2024, 51 (11A): 240800116-4.  doi:10.11896/jsjkx.240800116
Due to the huge scale and complex operating environment of cascade shipping hub engineering, its maintenance tasks still face problems such as insufficient detection perception, limited detection and maintenance time, and the need to improve the level of structural evaluation and maintenance decision-making. Based on these ship lock operation and maintenance problems and data management requirements, and taking the Jiangxi Xinjiang cascade shipping junction as the research object, this paper studies and establishes the overall technical framework of maintenance data for intelligent ship locks, implements the interactive design of intelligent ship lock maintenance data, and optimizes the construction of functions such as intelligent monitoring and management, automated deployment of equipment and personnel, operation and maintenance services, and decision-making management. It realizes the automated configuration, monitoring, and early warning of various application systems and the release of operation and maintenance services, and deeply integrates operation and maintenance data, forming a professional data mining platform and a visual presentation of maintenance data. Through the data-driven model, ship lock operation and maintenance management and service capabilities are further improved, and the work provides a reference for research on maintenance systems of other similar projects.
Reinforcement Learning Algorithm for Charging/Discharging Control of Electric Vehicles Considering Battery Loss
LU Yue, WANG Qiong, LIU Shun, LI Qingtao, LIU Yang, WANG Hongbiao
Computer Science. 2024, 51 (11A): 231200147-7.  doi:10.11896/jsjkx.231200147
With the gradual increase in the number of electric vehicles, their integration has a significant impact on the load of the power grid. In this context, V2G/G2V technology is widely believed to play an important role in power grid management. Taking the charging and discharging control algorithm of electric vehicles as the research object, a deep reinforcement learning algorithm based on Soft Actor-Critic (SAC) is introduced. In response to dynamic changes in the grid load sequence, the charging/discharging rate of different vehicles is controlled to maximize the benefits for users under different electricity prices. In addition, to address the issue of increased battery loss during the charging and discharging process, a battery loss prediction model based on a physical hybrid neural network (PHNN) is introduced. The charging/discharging process is modeled as a Markov decision process. By integrating the PHNN model into the charging and discharging control of electric vehicles, a new reward function is constructed to accurately quantify the cost of battery loss. Based on the SAC algorithm, this reward function is used to learn the optimal charging and discharging strategy. Experimental results show that this algorithm can effectively regulate the charging and discharging behavior of vehicles, play a regulatory role in the power network, and reduce the loss of battery life during the charging and discharging process, further protecting the economic interests of users.
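The idea of a reward that trades electricity cost against battery loss, as described in the abstract above, can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's reward function. The function names, the sign convention, and the linear throughput-based `battery_loss_cost` are hypothetical, and the latter is only a stand-in for the PHNN prediction model.

```python
def battery_loss_cost(power_kw, dt_h, cost_per_kwh_throughput=0.05):
    """Hypothetical stand-in for the PHNN battery loss model.

    Assumes loss cost is proportional to the energy moved through the
    battery, regardless of direction. The coefficient is illustrative.
    """
    return cost_per_kwh_throughput * abs(power_kw) * dt_h


def reward(power_kw, price_per_kwh, dt_h=1.0, loss_model=battery_loss_cost):
    """One-step reward for a single vehicle (assumed form).

    power_kw > 0: charging from the grid (the user pays for energy)
    power_kw < 0: discharging to the grid (the user is paid)
    """
    energy_cost = price_per_kwh * power_kw * dt_h
    return -energy_cost - loss_model(power_kw, dt_h)


# Discharging 10 kW for one hour at a high price versus charging at a low price.
r_discharge = reward(-10.0, 0.5)
r_charge = reward(10.0, 0.2)
```

Passing the loss model in as a parameter keeps the reward independent of how battery loss is estimated, so a learned predictor could replace the linear stand-in without changing the rest of the control loop.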