Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 51 Issue 6A, 16 June 2024
  
Contents
CONTENTS
Computer Science. 2024, 51 (6A): 0-0. 
Artificial Intelligence
Visual Bibliometric Analysis of Knowledge Graph
HE Jing, ZHAO Rui, ZHANG Hengshuo
Computer Science. 2024, 51 (6A): 230500123-10.  doi:10.11896/jsjkx.230500123
With the continuous development of the networked society, people have placed higher demands on information retrieval, and the emergence and development of knowledge graphs provide support for it. Research on knowledge graphs has therefore gradually attracted scholarly attention, and research on their integration with various fields has also increased. To gain insight into the research progress and future development trends of knowledge graphs, this paper uses CiteSpace to visually analyze knowledge graph research in the CNKI and Web of Science (WOS) databases, and sorts out the documents from 2013 to 2022 according to annual publication counts, institution co-occurrence, author co-occurrence, keyword co-occurrence, keyword clustering and burst words. Deep learning, artificial intelligence, bibliometrics and visualization in Chinese research, and social network analysis, task analysis, data mining and multi-agent systems in foreign-language research, are selected as research hotspots for keyword review. The study finds that, although knowledge graph research is at this stage developing in a comprehensive and in-depth way, Chinese research shows weak linkage, weak stability and a narrow research scope, which can be improved in subsequent research.
Lightweighting Methods for Neural Network Models:A Review
GAO Yang, CAO Yangjie, DUAN Pengsong
Computer Science. 2024, 51 (6A): 230600137-11.  doi:10.11896/jsjkx.230600137
In recent years, owing to their strong feature extraction capability, neural network models have been increasingly widely used across industries and have achieved good results. However, with growing data volumes and the pursuit of high accuracy, the parameter size and network complexity of neural network models have increased dramatically, inflating computation, storage and other resource overheads and making deployment in resource-constrained scenarios extremely challenging. How to make models lightweight without affecting their performance, and thus reduce training and deployment costs, has therefore become a current research hotspot. This paper summarizes and analyzes typical model lightweighting methods from two aspects, complex model compression and lightweight model design, so as to clarify the development of model compression technology. Complex model compression techniques are summarized in five aspects: model pruning, model quantization, low-rank decomposition, knowledge distillation and hybrid approaches. Lightweight model design is sorted out in three aspects: spatial convolution design, shifted convolution design and neural architecture search.
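As an illustration of one of the compression techniques named above, model quantization, the following is a minimal sketch of uniform 8-bit quantization of a weight vector. It is not taken from the reviewed paper; the function names `quantize_uint8` and `dequantize` are hypothetical, and real frameworks typically quantize per tensor or per channel with calibrated ranges.

```python
def quantize_uint8(values):
    """Map floats to integers in [0, 255] with a uniform step (assumed scheme)."""
    lo, hi = min(values), max(values)
    # Step size between adjacent integer levels; fall back to 1.0 for constant input.
    scale = (hi - lo) / 255 or 1.0
    quantized = [round((v - lo) / scale) for v in values]
    return quantized, scale, lo


def dequantize(quantized, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + scale * q for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, lo = quantize_uint8(weights)
restored = dequantize(codes, scale, lo)
```

Each restored weight differs from the original by at most half a quantization step, which is the accuracy cost paid for storing one byte per weight instead of a 32-bit float.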
Survey of Multi-agent Deep Reinforcement Learning Based on Value Function Factorization
GAO Yuzhao, NIE Yiming
Computer Science. 2024, 51 (6A): 230300170-9.  doi:10.11896/jsjkx.230300170
Multi-agent deep reinforcement learning extends deep reinforcement learning to multi-agent problems. Within it, methods based on value function factorization have achieved better performance and are currently a hotspot for research and application. This paper introduces the main principles and framework of multi-agent deep reinforcement learning based on value function factorization. Drawing on recent related research, it summarizes three research hotspots: improving the fitting ability of the mixing network, improving convergence, and improving the scalability of algorithms. The causes of these three problems are analyzed in terms of algorithm constraints, environmental complexity and neural network limitations. Existing research is classified according to the problems addressed and the methods used, the common points of similar methods are summarized, and the advantages and disadvantages of different methods are analyzed. Finally, the application of multi-agent deep reinforcement learning based on value function factorization in two hot fields, network node control and unmanned formation control, is described.
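To make the idea of value function factorization concrete, here is a minimal sketch of the simplest additive form (as in VDN), where the joint value is the sum of per-agent values. It is an illustration only, not code from the surveyed work; the function names are hypothetical and the Q-values are made-up numbers. The point it shows is that, under an additive factorization, each agent acting greedily on its own values yields the same joint action as a centralized search.

```python
from itertools import product


def q_total(per_agent_q, joint_action):
    """Additive factorization: joint value is the sum of per-agent values."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))


def decentralized_greedy(per_agent_q):
    """Each agent picks its own best action independently."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)


def centralized_greedy(per_agent_q):
    """Brute-force search over all joint actions (exponential in agent count)."""
    joint_actions = product(*(range(len(q)) for q in per_agent_q))
    return max(joint_actions, key=lambda ja: q_total(per_agent_q, ja))


# Two agents with 2 and 3 actions respectively (illustrative values).
per_agent_q = [[0.1, 0.9], [0.5, 0.2, 0.7]]
```

Mixing networks generalize the plain sum to richer monotonic combinations, which is where the fitting-ability question mentioned in the abstract arises.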
Review of Point Cloud Semantic Segmentation Based on Graph Convolutional Neural Networks
HUANG Haixin, CAI Mingqi, WANG Yuyao
Computer Science. 2024, 51 (6A): 230400196-7.  doi:10.11896/jsjkx.230400196
As point clouds are widely utilized in various fields such as autonomous driving, map making, and mining measurement, there is a growing interest in this data representation that contains rich information. Point cloud semantic segmentation, as an important means of point cloud data processing, has attracted wide attention due to its high research value and application prospects. Due to the characteristics of permutation invariance and rotation invariance in point clouds, traditional convolutional neural networks cannot directly process irregular point cloud data, but graph convolutional neural networks can use graph convolution operators to directly extract point cloud features. Therefore, this paper provides a detailed review of recent point cloud segmentation methods based on graph convolution. The methods are further divided according to the type of graph convolution, and representative algorithms in each category are introduced and analyzed, summarizing the research ideas and advantages and disadvantages of each method. Then, some mainstream point cloud datasets and evaluation metrics in the field of point cloud semantic segmentation are introduced, and the experimental results of the mentioned segmentation methods are compared. Finally, the development direction of various methods is discussed.
Recent Progress on Machine Translation Based on Pre-trained Language Models
YANG Binxia, LUO Xudong, SUN Kaili
Computer Science. 2024, 51 (6A): 230700112-8.  doi:10.11896/jsjkx.230700112
Natural language processing (NLP) involves many important topics, one of which is machine translation (MT). Pre-trained language models (PLMs), such as BERT and GPT, are state-of-the-art approaches for various NLP tasks including MT. Therefore, many researchers use PLMs to solve MT problems. To push the research forward, this paper provides an overview of recent advances in this field, including the main research questions and solutions based on various PLMs. We compare the motivations, commonalities, differences and limitations of these solutions, and summarise the datasets commonly used to train such MT models, as well as the metrics used to evaluate them. Finally, further research directions are discussed.
Weighted Double Q-Learning Algorithm Based on Softmax
ZHONG Yuang, YUAN Weiwei, GUAN Donghai
Computer Science. 2024, 51 (6A): 230600235-5.  doi:10.11896/jsjkx.230600235
As a branch of machine learning, reinforcement learning is used to describe and solve the problem of agents maximizing returns by learning strategies while interacting with the environment. Q-Learning, a classical model-free reinforcement learning method, suffers from maximization bias caused by overestimation and performs poorly when there is noise in the environment. Double Q-Learning (DQL) solves the problem of overestimation but at the same time introduces underestimation. To address the over- and underestimation in the above algorithms, a weighted Q-Learning algorithm based on softmax is proposed. Combined with DQL, a new weighted double Q-Learning algorithm based on softmax (WDQL-Softmax) is then proposed. This algorithm is based on the construction of weighted dual estimators, which perform softmax operations on the expected values of the samples to obtain weights. The weights are used to estimate the action value, effectively balancing overestimation and underestimation of the action value and bringing the estimated value closer to the theoretical value. Experimental results show that, in a discrete action space, compared with the Q-Learning algorithm, the double Q-Learning algorithm and the weighted double Q-Learning algorithm, the weighted double Q-Learning algorithm based on softmax converges faster and has a smaller error between the estimated value and the theoretical value.
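The abstract does not give the update rule, so the following is only one plausible reading, offered as a sketch: softmax weights are computed from one estimator's action values and applied to the other estimator's values, replacing the hard max of Q-Learning and the hard argmax of double Q-Learning. The function names, the `temperature` parameter and the tabular dictionary layout are all assumptions made for illustration, not the authors' WDQL-Softmax formulation.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_weighted_value(q_select, q_evaluate, temperature=1.0):
    """Weights come from one estimator, values from the other (assumed scheme)."""
    weights = softmax(q_select, temperature)
    return sum(w * q for w, q in zip(weights, q_evaluate))


def update(q_a, q_b, state, action, reward, next_state,
           alpha=0.1, gamma=0.95, temperature=1.0):
    """One tabular update of estimator q_a using q_b for evaluation."""
    next_value = softmax_weighted_value(q_a[next_state], q_b[next_state], temperature)
    target = reward + gamma * next_value
    q_a[state][action] += alpha * (target - q_a[state][action])


# Tiny illustrative tables: two states, two actions each.
q_a = {0: [0.0, 0.0], 1: [1.0, 1.0]}
q_b = {0: [0.0, 0.0], 1: [2.0, 4.0]}
update(q_a, q_b, state=0, action=0, reward=1.0, next_state=1)
```

Because the weighted value always lies between the smallest and largest evaluated action value, it sits between the pessimistic and optimistic extremes; a low temperature moves it toward the double Q-Learning target and a high one toward a plain average.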
Three Layer Knowledge Graph Architecture for Industrial Digital Twins
TANG Xin, SUN Yufei, WANG Yujue, SHI Min, ZHU Dengming
Computer Science. 2024, 51 (6A): 230400153-6.  doi:10.11896/jsjkx.230400153
As digitalization and intelligence continue to develop in the industrial field, enterprises face challenges in improving production efficiency, reducing production costs, optimizing production processes, and achieving real-time monitoring. Digital twin technology has received widespread attention as an effective solution. However, constructing an industrial digital twin involves difficulties in data acquisition and integration, model construction and updating, and real-time performance and accuracy. To address these issues, this paper proposes a concept-instance-module structure design method based on a digital twin knowledge graph. The proposed digital twin knowledge graph model adopts a three-layer concept-instance-module architecture. The concept layer establishes a comprehensive and organic knowledge network through the knowledge graph. The instance layer performs digital modeling to reproduce theoretical parameters realistically. The knowledge module layer integrates the knowledge of the previous two layers to form functional modules for comprehensive monitoring and control. This model can provide more accurate and detailed modeling and analysis of industrial processing knowledge, helping enterprises to achieve advanced application functions such as digital modeling, accurate simulation, predictive analysis, and anomaly detection.
Study on Matching Design of Ship Engine and Propeller Based on Improved Moth-Flame Optimization Algorithm
CHEN Zhenlin, LUO Liang, ZHENG Long, JI Shengchen, CHEN Shunhuai
Computer Science. 2024, 51 (6A): 230500157-9.  doi:10.11896/jsjkx.230500157
This paper develops an improved moth-flame optimization (IMFO) algorithm for the ship propeller-matching problem, which comprehensively considers propeller efficiency, cavitation, and strength for two existing ships as calculation examples. Genetic algorithm (GA) and the original moth-flame optimization (MFO) algorithm are used as comparison algorithms to analyze the performance of the IMFO-assisted propeller-matching task. Numerical experiment results show that the convergence time of the IMFO algorithm in solving the propeller-matching problem is reduced by 44.24% and 54.14% compared to the GA algorithm in the two examples, and by 23.9% and 23.12% compared to the MFO algorithm, respectively. In addition, in terms of solution accuracy, the IMFO algorithm is slightly better than the GA and MFO algorithms in calculation example 1. In calculation example 2, the IMFO algorithm is improved by 3.66% compared to the GA algorithm and by 0.98% compared to the MFO algorithm. Finally, by visualizing the feasible solution space of the two examples, the performance of the IMFO algorithm is further discussed. The above results demonstrate that the IMFO algorithm has strong global search capability and is competitive and robust in solving the propeller-matching problem.
Path Planning for Mobile Robots Based on Modified Adaptive Ant Colony Optimization Algorithm
WEI Shuxin, WANG Qunjing, LI Guoli, XU Jiazi, WEN Yan
Computer Science. 2024, 51 (6A): 230500145-9.  doi:10.11896/jsjkx.230500145
To address the slow convergence, low efficiency and tendency to fall into local optima of traditional ant colony optimization (ACO), a new ACO variant is proposed. First, a new heuristic mechanism with directional information is introduced to add directional guidance in the iterative process, which further improves the convergence speed of the algorithm. Second, an improved heuristic function is proposed to make the search more goal-directed and reduce the number of turns in the path. Then, an improved state transfer probability rule is introduced to improve search efficiency and increase population diversity. In addition, a new method of unevenly distributing the initial pheromone concentration is proposed to avoid blind search. The new ACO variant is called the modified adaptive ant colony optimization algorithm (MAACO). To verify the effectiveness of the proposed MAACO, a series of experiments is conducted against seven other existing algorithms on three different obstacle distribution patterns. In all simulation experiments, the proposed MAACO generates the shortest path with zero standard deviation and achieves the minimum number of turns within the minimum number of convergence generations. Across the three experiments, the average reduction in the number of turns compared to the best available results is two, with a typical reduction of 22.2%. Experimental results demonstrate the advantages of MAACO in reducing path length, reducing the number of turns and increasing convergence speed, as well as its usefulness and efficiency in path planning.
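For context, the state transfer probability rule that such variants modify can be sketched as follows. This is the classical ACO rule, in which a candidate cell is chosen with probability proportional to pheromone raised to `alpha` times heuristic desirability raised to `beta`; it is not the improved MAACO rule, whose exact form the abstract does not state. The function names and default exponents are assumptions for illustration.

```python
import random


def transition_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Classical ACO rule: p_j proportional to tau_j**alpha * eta_j**beta."""
    scores = [(t ** alpha) * (h ** beta) for t, h in zip(pheromone, heuristic)]
    total = sum(scores)
    return [s / total for s in scores]


def choose_next(candidates, probabilities, rng=random):
    """Roulette-wheel selection of the next node."""
    return rng.choices(candidates, weights=probabilities, k=1)[0]


# Two candidate cells with equal pheromone; the second is heuristically twice as good.
probs = transition_probabilities(pheromone=[1.0, 1.0], heuristic=[1.0, 2.0])
next_cell = choose_next(["left", "right"], probs, random.Random(0))
```

With `beta=2.0` the second cell is four times as likely to be chosen, which shows how the heuristic term steers the search when pheromone is still uniform.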
Speaker Verification Network Based on Multi-scale Convolutional Encoder
LIU Xiaohu, CHEN Defu, LI Jun, ZHOU Xuwen, HU Shan, ZHOU Hao
Computer Science. 2024, 51 (6A): 230700083-6.  doi:10.11896/jsjkx.230700083
Speaker verification is an effective biometric authentication method, and the quality of speaker embedding features largely determines the performance of speaker verification systems. Recently, the Transformer model has shown great potential in automatic speech recognition, but it struggles to extract effective speaker embedding features because its traditional self-attention mechanism is weak at local feature extraction. As a result, the performance of the Transformer model in speaker verification can hardly surpass that of earlier convolutional network-based models. To improve the Transformer’s ability to extract local features, this paper proposes a new self-attention mechanism for the Transformer encoder, called the multi-scale convolutional self-attention encoder (MCAE). By using convolution operations of different sizes to extract multi-time-scale information and by fusing features in the time and frequency domains, it enables the model to obtain a richer representation of local features, and such an encoder design is more effective for speaker verification. Experiments show that the proposed method achieves better overall performance on three publicly available test sets. The MCAE is also more lightweight than the conventional Transformer encoder, which makes the model easier to deploy in applications.
DRSTN:Deep Residual Soft Thresholding Network
CAO Yan, ZHU Zhenfeng
Computer Science. 2024, 51 (6A): 230400112-7.  doi:10.11896/jsjkx.230400112
When neural network models such as deep residual networks are used to classify images, important features lost during feature extraction degrade the classification performance of the model. The black-box problem brought about by the “end-to-end” learning mode of neural networks can also limit their application and development in many fields. In addition, neural network models often require longer training time than traditional methods. To improve the classification performance and training efficiency of deep residual networks, this paper introduces the model transfer method and the soft thresholding method, proposes the deep residual soft thresholding network (DRSTN), and fine-tunes the network structure to generate different versions of DRSTN. The performance of the DRSTN networks benefits from the organic integration of three aspects: 1) The feature extraction of the network is visualized with the gradient-weighted class activation mapping (Grad-CAM) method, and the networks to be further optimized are selected based on the visualization results. 2) With model transfer, researchers do not need to build a model from scratch and can directly optimize existing models, which saves a lot of training time. 3) Soft thresholding, as a nonlinear transformation layer, is embedded into the deep residual network architecture to eliminate irrelevant features in samples. Experimental results show that, under the same training conditions, the classification accuracy of the DRSTN_KS(3*3)_RB(2:2:2) network on the CIFAR-10 dataset is 15.5%, 8.8% and 10.9% higher than that of the SKNet-18, ResNet18 and ConvNeXt_tiny networks, respectively. The network also generalizes to a certain degree: it can be rapidly transferred to the MNIST and Fashion MNIST datasets, where the classification accuracy reaches 99.06% and 93.15%, respectively.
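The soft thresholding operation mentioned in point 3 has a standard definition, sketched below for a single feature value: values whose magnitude is at most the threshold are set to zero, and the rest are shrunk toward zero by the threshold. How DRSTN chooses the threshold is not stated in the abstract, so `tau` is passed in as a fixed number here, which is an assumption for illustration.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; zero out anything within [-tau, tau]."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


# Illustrative feature vector: small entries are treated as irrelevant.
features = [3.0, -0.4, 0.2, -2.5]
denoised = [soft_threshold(v, tau=1.0) for v in features]
```

Unlike ReLU, the operation keeps large negative activations (shifted toward zero) and removes only the near-zero band, which is why it is used as a feature-denoising nonlinearity.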
Artificial Hummingbird Algorithm Based on Multi-strategy Improvement
LI Zhen, FENG Feng
Computer Science. 2024, 51 (6A): 230500079-9.  doi:10.11896/jsjkx.230500079
To address the insufficient global exploration capability and slow convergence of the artificial hummingbird algorithm (AHA) in the iterative process, a multi-strategy improved artificial hummingbird algorithm (IAHA) is proposed. Firstly, a strategy combining the Tent chaotic sequence and reverse learning is used to initialize the population, which generates high-quality initial populations and lays a foundation for global optimization. Secondly, the Levy flight strategy is introduced in the foraging stage to enhance the global search ability, enabling the algorithm to quickly escape from local optima and accelerating convergence. Finally, the simplex method is introduced to process the poorer-quality individuals before each iteration ends, improving the local optimization ability of the algorithm. The IAHA is compared with 4 basic algorithms, 3 single-improvement-stage artificial hummingbird algorithms, and 2 existing improved artificial hummingbird algorithms. Simulation experiments as well as Wilcoxon rank sum tests are performed on 8 benchmark test functions to evaluate the performance of IAHA and to analyze its time complexity. Experimental results show that IAHA converges faster and has better global optimization capability and better overall performance than the compared algorithms.
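The initialization step can be sketched as follows, under several assumptions: the Tent map is taken in its skewed form with breakpoint `a = 0.7` (the abstract does not give the parameter), "reverse learning" is read as opposition-based learning where the opposite of `v` in `[lower, upper]` is `lower + upper - v`, and the better half of the combined candidates is kept. The function names and the sphere fitness function are hypothetical.

```python
def tent_map(x, a=0.7):
    """Skewed Tent map on [0, 1]; a is an assumed breakpoint."""
    return x / a if x < a else (1.0 - x) / (1.0 - a)


def init_population(size, dim, lower, upper, fitness, seed=0.37):
    """Chaotic points plus their opposites, keeping the `size` fittest (minimization)."""
    x = seed
    candidates = []
    for _ in range(size):
        point = []
        for _ in range(dim):
            x = tent_map(x)
            point.append(lower + x * (upper - lower))
        candidates.append(point)
        # Opposite point: reflect each coordinate through the centre of the range.
        candidates.append([lower + upper - v for v in point])
    candidates.sort(key=fitness)
    return candidates[:size]


def sphere(point):
    """Simple benchmark: sum of squares, minimum at the origin."""
    return sum(v * v for v in point)


population = init_population(size=5, dim=3, lower=-10.0, upper=10.0, fitness=sphere)
```

Evaluating both a point and its opposite doubles the initial coverage of the search space at the cost of one extra fitness evaluation per individual.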
Application of Subject Enhanced Cascade Binary Pointer Tagging Framework in Chinese Medical Entity and Relation Extraction
JIANG Zhihan, ZAN Hongying, ZHANG Li
Computer Science. 2024, 51 (6A): 230800179-6.  doi:10.11896/jsjkx.230800179
With the rapid advancement of China’s biomedical industry, the volume of Chinese medical texts is escalating at a rapid pace. Extracting valuable information from these texts can ease the learning curve for practitioners. To tackle the challenge of entity relation extraction in the realm of Chinese medicine, a series of models based on bidirectional LSTM have been previously proposed. However, to overcome the training speed bottleneck inherent to bidirectional LSTM, this study introduces the cascade binary pointer network framework to the Chinese medical field. To address the framework’s weak capability in identifying main entities and the gradient issues arising from reusing the coding layer, this paper introduces a main entity enhancement module and employs conditional layer normalization. The result is the subject enhanced cascade binary pointer tagging framework (SE-CAS), tailored for Chinese medical text. The subject enhancement module accurately identifies valid subjects detected by the subject recognition module and rectifies erroneously identified entities. Furthermore, the conditional layer normalization method replaces the simplistic addition between word embeddings and subject embeddings found in the original model. Experimental results demonstrate that the proposed model achieves a 5.73% enhancement in F1 measure on the CMeIE dataset. The ablation study confirms the incremental impact of each module, and these improvements exhibit a cumulative effect.
Study on Tibetan Short Text Classification Based on DAN and FastText
LI Guo, CHEN Chen, YANG Jing, QUN Nuo
Computer Science. 2024, 51 (6A): 230700064-5.  doi:10.11896/jsjkx.230700064
As Tibetan information continues to be integrated into social life, more and more Tibetan short text data is available on online platforms. To address the low classification performance of traditional methods on Tibetan short texts, a Tibetan short text classification model based on DAN-FastText is proposed. The model uses the FastText network to perform unsupervised training on a large-scale Tibetan corpus to obtain a pre-trained set of Tibetan syllable vectors, uses this set to convert Tibetan short texts into syllable vectors, feeds the syllable vectors into the deep averaging network (DAN) and fuses the sentence vector features trained by the FastText network at the output stage, and finally completes the classification through a fully connected layer and a softmax layer. On the publicly available Tibetan News Classification Corpus (TNCC) news headline dataset, the Macro-F1 is 64.53%, which is 2.81% higher than that of the TiBERT model and 6.14% higher than that of the GCN model, showing that the fusion model classifies Tibetan short texts more effectively.
Text Classification Based on Invariant Graph Convolutional Neural Networks
HUANG Rui, XU Ji
Computer Science. 2024, 51 (6A): 230900018-5.  doi:10.11896/jsjkx.230900018
Text classification is a basic and important task in natural language processing, and graph neural networks have been applied to this task in recent years. However, graph representation learning with graph neural networks does not generalize well to new words in text classification tasks. It is also generally assumed that training and testing data come from the same distribution, which often does not hold in reality. To overcome these problems, this paper proposes Invariant-GCN, a GCN-based method for text classification. First, a separate graph is built for each document, and GCN is used to learn fine-grained word representations from its local structure, which can effectively generate embeddings for words not seen in new documents; the word nodes are then merged into document embeddings. Next, the expected subgraph that maximally retains the information shared within a class is extracted, and this subgraph is used for learning that is not affected by distribution changes. Finally, text classification is completed by a graph classification method. On four benchmark datasets, Invariant-GCN is compared with five classification methods, and the experimental results show that it performs well on text classification.
Named Entity Recognition Approach of Judicial Documents Based on Transformer
WANG Yingjie, ZHANG Chengye, BAI Fengbo, WANG Zumin
Computer Science. 2024, 51 (6A): 230500164-9.  doi:10.11896/jsjkx.230500164
Named entity recognition is one of the key tasks in natural language processing and is the foundation of downstream tasks. At present, there are relatively few research results in the judicial field, and many problems remain to be solved in the informatization and intelligent transformation of the judicial system. Compared with texts in other fields, judicial documents are highly specialized and have few corpus resources, which leads to poor recognition results for existing approaches. The research is therefore carried out from the following three aspects. Firstly, a multi-label hierarchical iterative annotation method (ML-HIA) is proposed, which can automatically annotate the original judicial documents and effectively improve entity recognition on judicial documents. Secondly, a feature mixed Transformer (FM-Transformer) neural network model, which makes full use of the deep features of the inherent attributes of Chinese characters, is proposed to identify named entities in judicial documents. Finally, the proposed method and model are compared with other neural network models. The proposed annotation method can accurately accomplish the task of judicial document annotation. Compared with other models, the proposed model shows a large improvement on the general dataset and achieves good results on the judicial datasets.
TCM Named Entity Recognition Model Combining BERT Model and Lexical Enhancement
LI Minzhe, YIN Jibin
Computer Science. 2024, 51 (6A): 230900030-6.  doi:10.11896/jsjkx.230900030
Research on TCM named entity recognition is scarce; most existing work is based on Chinese medical cases and does not perform well on TCM case texts. In view of the dense named entities and fuzzy entity boundaries in TCM cases, this paper proposes a TCM named entity recognition method, LEBERT-BILSTM-CRF, which combines lexical enhancement with a pre-trained model. The method is optimized from the perspective of fusing lexical enhancement with the pre-trained model: vocabulary information is input into the BERT model for feature learning, so as to delimit word class boundaries and distinguish word class attributes, and thereby improve the accuracy of named entity recognition on TCM medical cases. Experiments show that, when ten entity types are identified on the TCM case dataset constructed in this paper, the overall accuracy rate, recall rate and F1 of the LEBERT-BILSTM-CRF model are 88.69%, 87.4% and 88.1%, respectively, higher than those of common named entity recognition models such as BERT-CRF and LEBERT-CRF.
Chinese Medical Named Entity Recognition with Label Knowledge
YIN Baosheng, ZHOU Peng
Computer Science. 2024, 51 (6A): 230500203-7.  doi:10.11896/jsjkx.230500203
Named entity recognition in the medical field is an important research topic in information extraction. Its training data mainly comes from unstructured texts such as clinical trial data, health records and electronic medical records. However, labeling these data requires professionals and consumes a lot of manpower, material resources and time. In the absence of large-scale medical training data, named entity recognition models in the medical field are prone to recognition errors. To solve this problem, this paper proposes a Chinese medical named entity recognition method that integrates label knowledge: after the interpretation of each text label is obtained from a professional domain dictionary, the text, label and label interpretation are encoded separately and fused by an adaptive fusion mechanism, which effectively balances the information flow of the feature extraction module and the semantic enhancement module and thereby improves model performance. The core idea is that medical entity labels are obtained by summarizing a large amount of medical data, and a label interpretation is the result of scientifically explaining the label. The model incorporates this rich prior knowledge from the medical field so that it understands the semantics of medical-domain entities more accurately and recognizes them better. Experimental results show that the method achieves improvements of 0.71%, 0.53% and 1.17% over three baseline models on the Chinese medical entity extraction dataset (CMeEE-V2), and provides an effective method for entity recognition in small-sample scenarios.
Knowledge Reasoning Model Combining HousE with Attention Mechanism
ZHU Yuliang, LIU Juntao, RAO Ziyun, ZHANG Yi, CAO Wanhua
Computer Science. 2024, 51 (6A): 230600209-8.  doi:10.11896/jsjkx.230600209
Knowledge reasoning is a technique proposed to solve the problem of incomplete knowledge graphs and has developed continuously in recent years. To address the low accuracy, poor interpretability and weak applicability of knowledge reasoning, a knowledge reasoning model called Att-HousE, which combines HousE with an attention mechanism, is proposed. It consists of a rule generator with an attention mechanism and a rule predictor with HousE. The rule generator generates the rules required for reasoning and passes them to the predictor, which updates itself and then produces scores for the different rules. The generator and predictor are then continuously trained and optimized by the EM algorithm. Specifically, the model is an improvement on RNNLogic. The attention mechanism can select more noteworthy relationships as rules, improving the accuracy of the model. HousE is more flexible in handling complex relationships and is suitable for modeling multilateral relationships. Experimental results on public datasets indicate that the MRR of Att-HousE is 6.3% higher than that of RNNLogic on reasoning tasks on FB15K-237. On the sparse dataset WN18RR, the Hits@10 of Att-HousE is 2.7% higher than that of RNNLogic. This demonstrates that introducing HousE and the attention mechanism allows multilateral relationships to be captured and formed more comprehensively, which improves the accuracy of knowledge reasoning.
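The two metrics reported above, MRR and Hits@10, have standard definitions over the rank assigned to the correct entity in each test query. A minimal sketch follows; the ranks are made-up numbers for illustration, not results from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (rank 1 is best)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Ranks of the correct entity for four hypothetical link-prediction queries.
ranks = [1, 2, 4, 20]
mrr = mean_reciprocal_rank(ranks)
hits10 = hits_at_k(ranks, k=10)
```

MRR rewards placing the correct answer near the very top, while Hits@10 only checks whether it appears in the first ten results, so the two can move differently between models.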
Personalized Dialogue Response Generation Combined with Conversation State Information
GUI Haitao, WANG Zhongqing
Computer Science. 2024, 51 (6A): 230800055-7.  doi:10.11896/jsjkx.230800055
Despite the significant achievements in personalized response generation models, existing studies have not adequately considered the impact of dialogue state information on personalized dialogue responses. To address this issue, this paper proposes a self-supervised dialogue response generation model that incorporates dialogue state to effectively generate personalized replies based on pre-trained generative models. Firstly, we integrate the dialogue state into a situational comedy dataset to enhance the model’s contextual understanding. Secondly, we employ self-supervised training techniques to imbue the pre-trained language generation model with unique dialogue text features and employ various masking strategies to combine dialogue text and dialogue state, further enhancing model performance. Lastly, leveraging historical dialogues, we utilize the self-supervised generative model to produce personalized responses. Experimental results on a self-collected situational comedy dataset demonstrate that the dialogue response generation model incorporating dialogue state outperforms several strong baselines across multiple metrics, thus validating the effectiveness of incorporating dialogue state in personalized response generation models.
Construction Method of Domain Sentiment Lexicon Based on Improved TF-IDF and BERT
JIANG Haoda, ZHAO Chunlei, CHEN Han, WANG Chundong
Computer Science. 2024, 51 (6A): 230800011-9.  doi:10.11896/jsjkx.230800011
The construction of a domain sentiment lexicon is the foundation of domain text sentiment analysis. Existing methods for constructing domain sentiment lexicons suffer from high redundancy among the selected candidate sentiment words, inaccurate judgment of sentiment polarity, and high domain dependency. To improve the domain specificity of the selected candidate sentiment words and the accuracy of judging the polarity of domain sentiment words, a domain sentiment lexicon construction method based on improved term frequency-inverse document frequency (TF-IDF) and BERT is proposed. This method improves the TF-IDF algorithm in the phase of selecting domain candidate sentiment words. The latent Dirichlet allocation (LDA) algorithm is combined with the improved TF-IDF algorithm to perform domain corrections, which improves the domain specificity of the selected candidate sentiment words. In the polarity judgment stage, the semantic orientation pointwise mutual information (SO-PMI) algorithm is combined with BERT. By fine-tuning the BERT classification model with domain sentiment words, the accuracy of judging the sentiment polarity of domain candidate sentiment words is improved. Experiments are conducted on user comment datasets from different domains. The results show that this method improves the quality of the constructed domain sentiment lexicon, and that the F1 values of the lexicons it constructs, when used for text sentiment analysis in the automotive and mobile phone domains, reach 78.02% and 88.35%, respectively.
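For readers unfamiliar with SO-PMI, the following is a minimal sketch of the plain algorithm only, not of the paper's combination with BERT. It assumes document-level co-occurrence, treats a zero co-occurrence count as a PMI of 0 (a simplification), and uses a hypothetical toy corpus and seed words.

```python
import math


def pmi(w1, w2, docs):
    """Pointwise mutual information of two words over a list of token sets."""
    n = len(docs)
    p1 = sum(1 for d in docs if w1 in d) / n
    p2 = sum(1 for d in docs if w2 in d) / n
    p12 = sum(1 for d in docs if w1 in d and w2 in d) / n
    if p1 == 0 or p2 == 0 or p12 == 0:
        return 0.0  # simplification: no evidence counts as neutral
    return math.log2(p12 / (p1 * p2))


def so_pmi(word, pos_seeds, neg_seeds, docs):
    """Semantic orientation: association with positive seeds minus negative seeds."""
    pos = sum(pmi(word, s, docs) for s in pos_seeds)
    neg = sum(pmi(word, s, docs) for s in neg_seeds)
    return pos - neg


# Hypothetical toy corpus of car reviews, one token set per review.
docs = [
    {"smooth", "good"},
    {"smooth", "good", "quiet"},
    {"noisy", "bad"},
    {"noisy", "bad", "smooth"},
]
for candidate in ("smooth", "noisy"):
    score = so_pmi(candidate, ["good"], ["bad"], docs)
    print(candidate, "positive" if score > 0 else "negative", round(score, 3))
```

A positive score marks a candidate word as leaning positive and a negative score as leaning negative. In practice the seed lists and the co-occurrence window would be chosen per domain.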
Text Emotional Analysis Model Fusing Theme Characteristics
YANG Junzhe, SONG Ying, CHEN Yifei
Computer Science. 2024, 51 (6A): 230600111-8.  doi:10.11896/jsjkx.230600111
With the rapid development of large-scale language models, how to reduce the number of model parameters while preserving model performance has become an important challenge in natural language processing. However, existing parameter compression techniques often struggle to balance the stability and generalization ability of the model. To this end, this paper proposes a new framework for sentiment analysis that integrates topic features, aiming to use topic information to enhance the model’s ability to judge text sentiment polarity. Specifically, a method combining LDA and K-means is used to extract the topic features of the text, which are concatenated with word embeddings as a fixed-dimensional vector to obtain a new word vector representation. Sentence-level representation vectors are then constructed by average pooling and fed into a fully connected layer for sentiment classification. To verify the effectiveness of the proposed model, comparative experiments with multiple benchmark algorithms are carried out on public sentiment analysis datasets. Experimental results show that the proposed model significantly outperforms ALBERT on multiple datasets, with accuracy increasing by about 3.5%, and that it maintains high stability and generalization ability with only a small increase in the number of parameters.
Remote Template Detection Algorithm and Its Application in Protein Structure Prediction
LIANG Fang, XU Xuyao, ZHAO Kailong, ZHAO Xuanfeng, ZHANG Guijun
Computer Science. 2024, 51 (6A): 230600225-7.  doi:10.11896/jsjkx.230600225
In the development from traditional force field-driven protein structure prediction to current data-driven AI structure modeling, protein structure template detection is a key module, and detecting high-precision remote templates is important for improving the accuracy of structure prediction. In this paper, a remote homology template detection algorithm, ASEalign, based on adaptive eigenvector extraction is proposed. Firstly, a deep learning technique with multi-feature information fusion is used to predict protein contact maps. Then, a multi-dimensional feature scoring function is designed to fuse contact maps, secondary structures, sequence profile-profile alignment and solvent accessibility, and template alignment is performed using eigenvalues and eigenvectors adaptively extracted from the contact map matrix. Finally, the detected high-quality templates are input to AlphaFold2 for structural modeling. Results on a test set of 135 proteins indicate that, compared to HHsearch, ASEalign improves the accuracy by 11.5%. Meanwhile, the accuracy of the resulting modeled structures is better than that of AlphaFold2.
Study on Pre-training Tasks for Multi-document Summarization
DING Yi, WANG Zhongqing
Computer Science. 2024, 51 (6A): 230300160-8.  doi:10.11896/jsjkx.230300160
News summarization aims to quickly and accurately extract a concise summary from complex news text. This paper studies multi-document summarization based on pre-trained language models, focusing on how training methods combined with pre-training tasks improve model performance and strengthen information exchange between multiple documents, so as to generate more comprehensive and concise summaries. For the combined pre-training tasks, this paper conducts comparative experiments on the baseline model and on the content, quantity, and order of pre-training tasks, explores and identifies effective pre-training tasks, summarizes specific methods for strengthening information exchange between documents, and proposes a concise and efficient pre-training process. Training and testing on a public multi-document news dataset show that the content, quantity, and order of the pre-training tasks each improve the ROUGE value to some extent, and that the specific pre-training combination obtained by integrating these three findings increases the ROUGE value significantly.
Cross-lingual Text Topic Discovery Based on Ensemble Learning
LI Shuai, YU Juan, WU Shaocheng
Computer Science. 2024, 51 (6A): 230300201-8.  doi:10.11896/jsjkx.230300201
Cross-lingual text topic discovery is an important research direction in cross-lingual text mining, and it has high application value for the cross-lingual analysis and organization of text data. A cross-lingual text topic discovery method, BCL-LDA (Bagging, cross-lingual word embedding with LDA), which improves the LDA topic model with Bagging and cross-lingual word embedding, is proposed to mine key information from multilingual text. The method first combines the Bagging ensemble learning idea with the LDA topic model to generate a set of mixed-language subtopics. It then uses cross-lingual word embedding and the K-means algorithm to cluster and group the mixed subtopics. Finally, the TF-IDF algorithm is used to filter and sort the topic words. Chinese-German and Chinese-French topic discovery experiments show that this method performs well in terms of topic coherence and diversity, and can extract bilingual topics that are more semantically relevant, coherent and diverse.
Study on Multi-strategy Improved Salp Swarm Algorithm for Path Planning Problem
ZHAO Hongwei, DONG Changlin, DING Bingru, CHAI Hailong, PAN Zhiwei
Computer Science. 2024, 51 (6A): 230600083-9.  doi:10.11896/jsjkx.230600083
To find the optimal path for mobile robots, a salp swarm algorithm called BAGSSA (adaptive salp swarm algorithm with scale-free of BA network and golden sine algorithm) is proposed, which combines a scale-free network, an adaptive inertia weight and a golden sine algorithm mutation strategy. First, a scale-free topology network is generated to map the relationships among followers, which enhances the global optimization ability of the algorithm. An adaptive inertia weight is introduced for the followers so that the overall distribution of the population adjusts spontaneously, which enhances the local optimization ability. The mutation of the golden sine algorithm is then applied to further improve the accuracy of the solution. Secondly, in simulations on 12 benchmark functions, the experimental data show that BAGSSA is better than the standard SSA and other swarm intelligence algorithms in terms of average value, standard deviation, Wilcoxon test and convergence curve, and that it has higher optimization accuracy and convergence speed. Finally, BAGSSA is applied to the path planning problem of mobile robots, and simulation experiments are carried out in two test environments. Simulation results show that the improved salp swarm algorithm finds better paths than other algorithms, and has certain theoretical and practical application value.
Study on Hypernymy Recognition Based on Combined Training of Attention Mechanism and Prompt Learning
BAI Yu, WANG Xinzhe
Computer Science. 2024, 51 (6A): 230700226-5.  doi:10.11896/jsjkx.230700226
Hypernymy between patent terms is an important semantic relationship. Identifying hypernymy between terms in patent text plays an important role in patent retrieval, query expansion, knowledge graph construction and other fields. However, because of the diversity of patent fields and the complexity of language expression, the task of identifying hypernymy between terms still faces many challenges. This paper proposes a method for recognizing hypernymy between terms that integrates prompt learning and an attention mechanism. The method is based on a distantly supervised framework and integrates the shortest dependency path between terms into the prompt template as an auxiliary feature. A graph neural network is used to integrate the common information between terms into the joint training process of prompt learning and the attention mechanism. Experimental results on the patent text test dataset show that the AUC and F1 value of our method reach 94.94% and 89.33%, respectively, which are 3.82% and 3.17% higher than those of the PARE model. This method effectively removes noise from the dataset annotated with distant supervision, avoids the mismatch between the training objective of the masked language model and downstream tasks, and makes full use of the linguistic knowledge in the pre-trained language model.
Document-level Relation Extraction Integrating Evidence Sentence Extraction
AN Xiankua, XIAO Rong, YANG Xiao
Computer Science. 2024, 51 (6A): 230800081-6.  doi:10.11896/jsjkx.230800081
As a crucial task in natural language processing, document-level relation extraction aims to accurately extract semantic relationships between entities from lengthy documents. Traditional document-level relation extraction methods typically take the entire document as input. In reality, however, humans can predict relationships between entity pairs from only a portion of the document, referred to as evidence sentences. Many existing methods have started to use evidence sentences, but they face challenges such as incomplete evidence retrieval and difficulty in fully leveraging the advantages of these evidence sentences. To address this issue, we introduce a more efficient and accurate evidence sentence selection method, which extracts evidence sentences by fusing formula-based and sentence-deletion-based approaches. We integrate the evidence extraction with the training and inference processes, directing the document-level relation extraction model to focus more on crucial sentences while still recognizing comprehensive information within the document. Experimental results demonstrate that the improved model outperforms existing models on public datasets.
New Solution for Traveling Salesman Problem Based on Graph Convolution and Attention Neural Network
WEI Niannian, HAN Shuguang
Computer Science. 2024, 51 (6A): 230700222-8.  doi:10.11896/jsjkx.230700222
The traveling salesman problem is a classic combinatorial optimization problem. To solve the problem quickly, a learned branching rule is designed, based on a deep learning model composed of a graph embedding network, a graph convolutional neural network, an attention neural network and a multi-layer perceptron, and the traditional branch and bound algorithm is modified to improve its performance. The model is trained with supervision on traveling salesman problem instances of 15 cities, and instances of 10, 15, 20, 25 and 30 cities are tested on the SCIP solver. We find that the solution time of the branch and bound algorithm based on the learned branching rule is -0.0022s, 0.0178s, 1.7643s, 2.3074s, and 2.0538s faster than that of the algorithm based on traditional branching rules, respectively. Therefore, selecting branching variables with graph neural networks is effective in improving traditional branching rules and generalizes well to traveling salesman problem instances larger than those used in training.
Similarity Measure Between Picture Fuzzy Sets and Its Application in Pattern Recognition
GAO Jianlei, LUO Minxia
Computer Science. 2024, 51 (6A): 230500153-5.  doi:10.11896/jsjkx.230500153
Picture fuzzy sets can represent information that is fuzzy, uncertain, and inconsistent. A similarity measure quantifies the degree of similarity between two objects. This paper studies the similarity measure between picture fuzzy sets. Considering the information difference among the three membership degrees of picture fuzzy sets, a new similarity measure is constructed based on the exponential function. The proposed similarity measure not only satisfies the axiomatic definition of a similarity measure, but also yields reasonable computational results in practical applications. We apply the proposed similarity measure to pattern recognition and compare it with some existing similarity measures on examples. The results show that the proposed similarity measure overcomes the shortcomings of some existing similarity measures while producing reasonable results.
Electra Based Chinese Event Detection Model with Dependency Syntax Tree
YIN Baosheng, KONG Weiyi
Computer Science. 2024, 51 (6A): 230600158-6.  doi:10.11896/jsjkx.230600158
Event detection is an important research direction in information extraction. Existing event detection models are limited by the training objectives of language models and can only acquire the dependency relationships between words passively, so the models pay more attention to unrelated components during training, resulting in wrong detection results. Previous studies show that fully understanding contextual information is crucial for deep learning-based event detection techniques. In this paper, we introduce the KVMN network to capture the dependencies between words and enhance the semantic features of words, and a gating mechanism is used to weight these features. Then, to address the problem of the model making wrong decisions, negative samples are added to the input, and different levels of noise are added to different samples, so that the model can learn a better embedding representation, effectively improving its ability to generalize to unknown samples. Finally, experimental results on the public dataset LEVEN show that this method is superior to existing methods and achieves an F1 score of 93.43%.
Study on Genetic Algorithm of Course Scheduling Based on Deep Reinforcement Learning
XU Haitao, CHENG Haiyan, TONG Mingwen
Computer Science. 2024, 51 (6A): 230600062-8.  doi:10.11896/jsjkx.230600062
Course scheduling is a routine and important part of teaching activities. The traditional manual course scheduling method is time-consuming, laborious, and prone to errors, and cannot meet the needs of large-scale course scheduling. The classical course scheduling genetic algorithm, in turn, converges too quickly, and its scheduling efficiency decreases as the number of constraints increases. To address the problems of existing course scheduling genetic algorithms, a self-learning course scheduling genetic algorithm (GA-DRL) based on deep reinforcement learning is proposed. GA-DRL uses the Q-learning algorithm to adaptively adjust the crossover and mutation parameters, which enhances the search ability of the genetic algorithm. By modeling dynamic parameter adjustment as a Markov decision process (MDP), the state set of the fitness function is analyzed, and the overall performance of the population is evaluated comprehensively. At the same time, the deep Q-network (DQN) algorithm is introduced into the scheduling problem to cope with the large number of population states and the large amount of Q-table data. Experimental results show that GA-DRL improves accuracy and optimization ability compared with the classical course scheduling genetic algorithm and an improved genetic algorithm. The proposed algorithm can also be applied to problems such as examination room arrangement, cinema seating and airline route planning.
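The idea of letting Q-learning pick genetic-algorithm parameters can be sketched with a tabular update. This is a minimal illustration under assumptions, not the GA-DRL algorithm itself: the candidate (crossover, mutation) pairs, the two population states, and the reward are all hypothetical.

```python
import random

# Hypothetical candidate (crossover rate, mutation rate) pairs the agent can pick.
ACTIONS = [(0.6, 0.01), (0.8, 0.05), (0.9, 0.10)]


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy choice of an action index for the given population state."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q_table[state]
    return max(range(len(ACTIONS)), key=lambda a: row[a])


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update."""
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])


# Two hypothetical population states, e.g. "low diversity" and "high diversity".
q = [[0.0] * len(ACTIONS) for _ in range(2)]
# Suppose action 1 in state 0 improved the best fitness, giving reward 1.0.
q_update(q, state=0, action=1, reward=1.0, next_state=1)
best = choose_action(q, state=0, epsilon=0.0, rng=random.Random(0))
print(ACTIONS[best])
```

When the number of population states grows, the table becomes impractical, which is the motivation the abstract gives for replacing it with a deep Q-network.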
Dual Direction Vectors-based Large-scale Multi-objective Evolutionary Algorithm
HAN Lijun, WANG Peng, LI Ruixu, LIU Zhongyao
Computer Science. 2024, 51 (6A): 230700155-11.  doi:10.11896/jsjkx.230700155
The decision space of large-scale multi-objective optimization problems can reach hundreds of dimensions. It is extremely challenging to achieve fast convergence in such a huge search space while efficiently maintaining the diversity of the population. To address these problems, a dual direction vectors-based large-scale multi-objective evolutionary algorithm (DDLE) is proposed in this paper. The main idea of the algorithm is to use two different types of direction vectors to guide population evolution and improve search efficiency. First, a convergent direction vector generation strategy is designed to improve the convergence speed of the algorithm. Second, a diversity direction vector generation strategy is introduced to enhance the diversity of the population. Finally, an adaptive environment-based selection operator is proposed to dynamically balance convergence and diversity during population evolution. To verify the performance of DDLE, it is compared with five state-of-the-art algorithms in experiments on 72 large-scale benchmark test problems. Experimental results show that DDLE has a significant advantage over the compared algorithms in solving large-scale multi-objective optimization problems.
Bidirectional Neural Probabilistic Transducer for Process Text Entity Recognition
LI Ruiting, WANG Peiyan, WANG Libang, YANG Danqingxin
Computer Science. 2024, 51 (6A): 230700206-8.  doi:10.11896/jsjkx.230700206
Process text entity recognition aims to recognize entities such as parts, materials, attributes and attribute values in texts generated by or associated with the manufacturing process of products. In most recent domain-specific entity recognition tasks, including the process domain, prior knowledge in the form of dictionaries or rules is used to adjust neural network model results or to generate pre-recognized features that are incorporated into the model. However, these methods do not truly integrate domain entity recognition knowledge with neural network models. Furthermore, adding domain knowledge does not reduce the training cost of the model, which still needs a large amount of labeled data. To address these challenges, this paper proposes a bidirectional neural probabilistic transducer (Bi-NPT) for process text entity recognition. This approach models the domain-specific prior knowledge for process text entity recognition as regular rules, and then converts these rules into a parameterized probabilistic finite state transducer. As a result, the model carries entity recognition prior knowledge before training while remaining trainable. By training on labeled data, the model acquires the ability to recognize entities not covered by the regular rules. Experimental results demonstrate that, without training, the proposed Bi-NPT performs comparably to regular rule-based entity recognition, suggesting that the untrained initial model already possesses entity recognition knowledge. Additionally, Bi-NPT outperforms other methods such as PER, Template-based BART and NNShot in few-shot scenarios, and BiLSTM and TENER in rich-resource scenarios.
Method for Entity Relation Extraction Based on Heterogeneous Graph Neural Networks and Text Semantic Enhancement
PENG Bo, LI Yaodong, GONG Xianfu, LI Hao
Computer Science. 2024, 51 (6A): 230700071-5.  doi:10.11896/jsjkx.230700071
In the era of information technology, extracting structured information from massive natural language texts has become a research hotspot. The complex knowledge in the power system needs to be organized by constructing a knowledge graph, and entity relation extraction is the upstream information extraction task, whose completeness directly affects the effectiveness of the knowledge graph. With the continuous development of deep learning, research on using deep learning techniques for entity relation extraction has gradually been carried out and has achieved good results. However, problems remain, such as the incomplete use of text semantics. To address these issues, this paper proposes an entity relation extraction method based on a heterogeneous graph neural network and text semantic enhancement. The method uses word nodes and relation nodes to learn semantic features, and obtains the initial features of the two types of nodes through BERT and pre-training tasks, respectively. It iterates over a multi-layer graph network structure, and in each layer the two types of nodes interact through information passing with a multi-head attention mechanism. In experimental comparisons with other models on two public datasets, the model achieves the expected effect and generally outperforms other entity relation extraction models in various scenarios.
Comparative Study on Improved Tuna Swarm Optimization Algorithm Based on Chaotic Mapping
YIN Ping, TAN Guoge, SONG Wei, XIE Taotao, JIANG Jianbiao, SONG Hongyuan
Computer Science. 2024, 51 (6A): 230600082-10.  doi:10.11896/jsjkx.230600082
As the current standard platform for cloud resource management, Kubernetes has various shortcomings in its default scheduling mechanism, so improved methods based on swarm intelligence optimization algorithms are commonly adopted for pod scheduling. Tuna swarm optimization (TSO) is selected as the basic algorithm in this paper. Exploiting the ergodicity, randomness and other characteristics of chaos, a population initialization scheme based on chaotic mapping is proposed to address common problems of swarm intelligence optimization algorithms, such as sensitivity to initial values and premature convergence in later iterations. Various chaotic maps commonly used in current research, such as the Tent and Logistic maps, are each used to initialize the tuna swarm in order to improve the diversity of the initial population. Numerical experiments compare the results of the improved tuna swarm optimization algorithms based on the different chaotic maps. The results show that population initialization based on chaotic maps can effectively improve the convergence speed and calculation accuracy of the original TSO algorithm.
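Chaotic population initialization of the kind described above can be sketched as follows. This is a minimal illustration under assumptions: the logistic map uses the common parameter mu = 4, the tent map uses a skewed breakpoint a = 0.7 (chosen here to avoid the floating-point collapse of the symmetric tent map), and the seed value and search bounds are hypothetical. It is not the paper's exact scheme.

```python
def logistic_map(x, mu=4.0):
    """Logistic map on [0, 1]; mu = 4 gives fully chaotic behaviour."""
    return mu * x * (1.0 - x)


def tent_map(x, a=0.7):
    """Skewed tent map on [0, 1] with breakpoint a (an assumed value)."""
    return x / a if x < a else (1.0 - x) / (1.0 - a)


def chaotic_population(n, dim, lower, upper, chaos=logistic_map, seed=0.37):
    """Build n individuals of dimension dim by iterating a chaotic map."""
    x = seed
    population = []
    for _ in range(n):
        individual = []
        for _ in range(dim):
            x = chaos(x)  # next value of the chaotic sequence in [0, 1]
            individual.append(lower + x * (upper - lower))  # scale to bounds
        population.append(individual)
    return population


swarm = chaotic_population(n=5, dim=3, lower=-10.0, upper=10.0, chaos=tent_map)
for tuna in swarm:
    print([round(v, 3) for v in tuna])
```

Swapping the `chaos` argument is enough to compare maps, which mirrors the comparative setup the abstract describes.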
Image Processing & Multimedia Technology
Research Progress of Underwater Image Processing Based on Deep Learning
ZHANG Tianchi, LIU Yuxuan
Computer Science. 2024, 51 (6A): 230400107-12.  doi:10.11896/jsjkx.230400107
With the development of artificial intelligence and underwater equipment, autonomous underwater vehicles can conveniently obtain underwater images. Underwater images are essential for exploring and developing the ocean. However, because of the complex underwater imaging environment, the acquired images are of low quality, with low contrast, blurring, and color distortion, making it difficult to meet the requirements of underwater production activities. In recent years, the development of deep learning-based underwater image processing methods and quality evaluation metrics has received much attention from scholars. Although there have been some reviews of deep learning-based underwater image processing methods, they suffer from incomplete summarization and a lack of the latest research results. Therefore, this paper first analyzes the causes of underwater image degradation, identifies the issues that processing must address, and classifies underwater image processing methods according to the principles and characteristics of the various algorithms. Secondly, the latest research results on deep learning-based underwater image processing are analyzed, and the main features of the various algorithms are summarized. Then, existing publicly available underwater image datasets and both mainstream and recent learning-based underwater image quality evaluation metrics are described in detail, and traditional algorithms and deep learning-based underwater image processing methods are compared and analyzed experimentally. Finally, some unresolved issues in the field of underwater image processing are analyzed and summarized, and future development directions are discussed.
ConvNeXt Feature Extraction Study for Image Data
YANG Pengyue, WANG Feng, WEI Wei
Computer Science. 2024, 51 (6A): 230500196-7.  doi:10.11896/jsjkx.230500196
Convolutional neural networks have achieved many results in computer vision tasks, in both object detection and segmentation, which depend on the extracted feature information. Problems such as ambiguous data and varying object shapes pose great challenges for feature extraction. The traditional convolutional structure can only learn contextual information from neighboring spatial locations of the feature map and cannot extract global information, while models such as the self-attention mechanism, although they have a larger receptive field and establish global dependencies, fall short because of their high computational complexity and their need for large amounts of data. Therefore, this paper proposes a model combining CNN and LSTM, which can better incorporate the global information of image data while enlarging the local receptive field. It uses the backbone network ConvNeXt-T as the base model, handles varying object shapes by concatenating convolutional kernels of different sizes to fuse multi-scale features, and aggregates bidirectional long short-term memory networks in both the horizontal and vertical directions, with a focus on the interaction between global and local information. Experiments are conducted on the publicly accessible CIFAR-10, CIFAR-100, and Tiny ImageNet datasets for image classification, and the accuracy of the proposed network improves by 3.18%, 2.91%, and 1.03% on the three datasets, respectively, compared to the base model ConvNeXt-T. Experiments demonstrate that the improved ConvNeXt-T network is a substantial improvement over the base model in terms of parameter count and accuracy, and can extract more effective feature information.
Classification of Multiscale Steel Microstructure Images Based on Incremental Learning
ZENG Peiyi
Computer Science. 2024, 51 (6A): 230500180-8.  doi:10.11896/jsjkx.230500180
The mechanical properties of steels are closely related to their microstructures, so it is important to identify the microstructures of steels. The magnification of steel micrographs varies greatly, and the morphology of the same microstructure differs at different magnifications, so classifying a continuously expanding multi-scale steel microstructure dataset is difficult. In this paper, VGG16 and a self-organizing incremental neural network (SOINN) are combined to build an incremental learning-based classification model for a multiscale steel microstructure dataset. In addition, the cross entropy loss based on center distance (CELCD) and a cross train strategy are proposed. Combined with cross train, CELCD and anchor loss are used to solve the problem of “catastrophic forgetting” and achieve incremental learning and efficient classification of steel micrographs. The classification accuracy and “forgotten degree” of the model are compared. Experimental results show that after incremental learning, the classification accuracy of the proposed method is only 14.02% lower than before incremental learning, reaching 80.49% on the old data, which is only 5.49% below the upper bound and superior to other incremental learning methods.
Thai Speech Synthesis Based on Cross-language Transfer Learning and Joint Training
ZHANG Xinrui, YANG Jian, WANG Zhan
Computer Science. 2024, 51 (6A): 230500174-7.  doi:10.11896/jsjkx.230500174
With the rapid development of deep learning and neural networks, end-to-end speech synthesis systems based on deep neural networks have become mainstream because of their excellent performance. However, there has been little research on Thai speech synthesis in recent years, mainly because of the scarcity of large-scale Thai datasets and the special spelling of the language. This paper studies low-resource Thai speech synthesis based on the FastSpeech2 acoustic model and the StyleMelGAN vocoder. To address the problems of the baseline system, three improvements are proposed to further raise the quality of synthesized Thai speech. (1) Under the guidance of Thai language experts and drawing on Thai linguistics, a Thai G2P model is designed to handle the special spelling in Thai text. (2) Based on the phonemes, represented in the international phonetic alphabet, that the designed Thai G2P model produces, languages with similar phoneme input units and rich datasets are selected for cross-language transfer learning to compensate for insufficient Thai training data. (3) Joint training of FastSpeech2 and the StyleMelGAN vocoder is used to solve the problem of acoustic feature mismatch. To verify the effectiveness of the proposed methods, this paper examines the attention alignment map, the objective MCD evaluation and the subjective MOS evaluation. Experimental results show that the Thai G2P model designed in this paper yields better alignment and thus more accurate phoneme durations, and that the system combining the Thai G2P model, joint training and transfer learning has the best speech synthesis quality. The MCD and MOS scores of its synthesized speech are 7.43 ± 0.82 and 4.53 points, significantly better than the 9.47 ± 0.54 and 1.14 points of the baseline system.
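The objective metric mentioned above, mel-cepstral distortion (MCD), can be sketched with its commonly used definition: (10 / ln 10) times the square root of twice the summed squared difference of mel-cepstral coefficients, averaged over frames. The sketch assumes the reference and synthesized frames are already time-aligned and that the energy coefficient has been dropped; the paper's exact configuration is not stated, and the sample vectors are hypothetical.

```python
import math


def mcd_frame(ref, syn):
    """MCD in dB for one pair of aligned mel-cepstral coefficient vectors."""
    squared_diff = sum((r - s) ** 2 for r, s in zip(ref, syn))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * squared_diff)


def mcd(ref_frames, syn_frames):
    """Average MCD over frames that are assumed to be time-aligned already."""
    scores = [mcd_frame(r, s) for r, s in zip(ref_frames, syn_frames)]
    return sum(scores) / len(scores)


# Hypothetical two-frame example with two coefficients per frame.
ref = [[1.0, 0.5], [0.2, 0.1]]
syn = [[1.0, 0.5], [0.2, 1.1]]
print(f"MCD = {mcd(ref, syn):.2f} dB")
```

Lower MCD means the synthesized spectrum is closer to the reference, which is why the drop from 9.47 to 7.43 is reported as an improvement.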
Classification and Detection Algorithm of Ground-based Cloud Images Based on Multi-scale Features
SUN Jifei, JIA Kebin
Computer Science. 2024, 51 (6A): 230400041-6.  doi:10.11896/jsjkx.230400041
Abstract PDF(2957KB) ( 312 )   
References | Related Articles | Metrics
Clouds contribute significantly to climate change in addition to having a short-term impact on local temperatures. Ground-based observation is used to study local cloud details because it can capture cloud images at high temporal and spatial resolution. Research on automatic identification of ground-based clouds focuses primarily on two areas: cloud classification and cloud detection. Traditionally, the two are regarded as separate and unrelated tasks. Cloud classification is independent of segmentation, and most segmentation techniques focus on binary segmentation. This makes it difficult to segment regions by cloud type when a cloud image contains multiple classes of clouds. To address this problem, this paper proposes a deep learning-based semantic segmentation method that combines the two tasks. First, it constructs the ground-based cloud image semantic segmentation (GBCSS) dataset, which contains 3000 cloud images covering 11 types. All images are resized to a square format of 256×256 pixels. Then, an improved scheme based on U-shaped neural networks is designed as the semantic segmentation model for ground-based cloud images. A pyramid pooling module is incorporated to extract and aggregate image features at different scales, which improves the network's ability to obtain global information. The resulting network, UNet-PPM, achieves 91.5% pixel accuracy on average on the test set after being trained and evaluated on GBCSS. The proposed method outperforms U-Net, Deeplabv3+, DANet and BiSeNetv2 in terms of pixel accuracy. Experimental results show that the pyramid pooling module contributes substantially to extracting cloud contour features and restraining overfitting. This work shows the feasibility of applying semantic segmentation to automatic cloud image observation.
Few-shot Images Classification Based on Clustering Optimization Learning
SU Ruqi, BIAN Xiong, ZHU Songhao
Computer Science. 2024, 51 (6A): 230300227-7.  doi:10.11896/jsjkx.230300227
Abstract PDF(3123KB) ( 310 )   
References | Related Articles | Metrics
The goal of few-shot image classification is to classify new image categories after training on a small labeled dataset. However, this goal is difficult to achieve under existing conditions. Current few-shot learning methods therefore mainly draw on the idea of transfer learning, whose core is to construct prior knowledge through situational (episodic) meta-training so that unknown new tasks can be solved. However, recent research shows that learning an embedding model with strong feature representation is simpler and more effective than complex few-shot learning methods. Inspired by this, this paper proposes a novel few-shot image classification method based on direct clustering optimization learning. The proposed method first uses the internal feature structure of the sample data to build a comprehensive representation of each category, and then optimizes the center of each category to form a more distinctive feature representation, thus effectively increasing the feature differences between categories. Extensive experimental results demonstrate that the proposed image classification method based on clustering optimization learning can effectively improve classification accuracy under various training conditions.
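The abstract describes representing each category by a center and then optimizing those centers. The abstract does not give the actual optimization, so the following is only an illustrative sketch under a strong assumption: centers start as the mean of the labeled support features and are refined by one k-means-style step that absorbs the nearest unlabeled query features. All function names and the toy data are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of equally long feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_label(centers, feature):
    """Label of the center closest to `feature` (squared Euclidean distance)."""
    return min(centers, key=lambda label: squared_distance(centers[label], feature))

def refine_centers(support, queries, steps=1):
    """Start from support means, then re-estimate each center using the
    queries currently assigned to it (an assumed k-means-style refinement)."""
    centers = {label: mean_vector(vs) for label, vs in support.items()}
    for _ in range(steps):
        groups = {label: list(vs) for label, vs in support.items()}
        for query in queries:
            groups[nearest_label(centers, query)].append(query)
        centers = {label: mean_vector(vs) for label, vs in groups.items()}
    return centers

# Toy 2-D features: one labeled example per class, two unlabeled queries.
support = {"a": [[0.0, 0.0]], "b": [[4.0, 4.0]]}
queries = [[1.0, 1.0], [3.0, 3.0]]
centers = refine_centers(support, queries)
predicted = nearest_label(centers, [1.5, 1.5])
```

In this toy case each center moves toward the query assigned to it, which is the kind of refinement the sketch assumes the center optimization performs.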
Lightweight Image Semantic Segmentation Based on Attention Mechanism and Densely Adjacent Prediction
WANG Guogang, DONG Zhihao
Computer Science. 2024, 51 (6A): 230300204-8.  doi:10.11896/jsjkx.230300204
Abstract PDF(3847KB) ( 291 )   
References | Related Articles | Metrics
A lightweight image semantic segmentation algorithm based on an attention mechanism and densely adjacent prediction is proposed to address three shortcomings of the DeepLabv3+ algorithm: the atrous spatial pyramid pooling module has difficulty highlighting important channel features, the computational complexity is high, and the high-level semantic feature map generated by the decoder lacks sufficient detail. The lightweight MobileNetV2 is used as the backbone network to reduce model parameters. After multi-scale information is extracted by the channel atrous spatial pyramid pooling, each channel of the feature map is weighted to reinforce the learning of important channel features. Moreover, densely adjacent prediction is used to combine high-level and low-level features and thereby refine the segmentation results. Experiments on the PASCAL VOC 2012 augmented dataset show that both the mean intersection over union and the mean pixel accuracy of the proposed method are higher than those of state-of-the-art algorithms. Compared with DeepLabv3+, the number of parameters and the amount of computation are reduced by 184.82×10^6 and 90.83 GFLOPs, respectively. The proposed algorithm not only improves segmentation accuracy but also reduces computational cost relative to the baseline algorithm.
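The two evaluation metrics named in the abstract, mean intersection over union and mean pixel accuracy, can both be read off a class confusion matrix. The following is a minimal sketch using the standard definitions; the function name and the toy matrix are hypothetical, and classes that never occur are simply skipped, which is an assumption about how empty classes are handled.

```python
def mean_iou_and_pixel_accuracy(confusion):
    """confusion[i][j] = number of pixels of true class i predicted as class j.

    Returns (mean IoU, mean per-class pixel accuracy).
    """
    n = len(confusion)
    ious, accuracies = [], []
    for c in range(n):
        true_pos = confusion[c][c]
        false_neg = sum(confusion[c]) - true_pos
        false_pos = sum(confusion[r][c] for r in range(n)) - true_pos
        if true_pos + false_pos + false_neg > 0:
            ious.append(true_pos / (true_pos + false_pos + false_neg))
        if true_pos + false_neg > 0:
            accuracies.append(true_pos / (true_pos + false_neg))
    return sum(ious) / len(ious), sum(accuracies) / len(accuracies)

# Toy two-class example with 10 pixels.
miou, mpa = mean_iou_and_pixel_accuracy([[3, 1], [1, 5]])
```

Here class 0 has IoU 3/5 and class 1 has IoU 5/7, so the mean IoU is their average.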
Mural Inpainting Based on Fast Fourier Convolution and Feature Pruning Coordinate Attention
ZHANG Le, YU Ying, GE Hao
Computer Science. 2024, 51 (6A): 230400083-9.  doi:10.11896/jsjkx.230400083
Abstract PDF(10264KB) ( 320 )   
References | Related Articles | Metrics
To address the high cost of manually inpainting ancient murals that have suffered cracks, peeling and other damage from varying degrees of natural weathering, a generative adversarial network based on fast Fourier convolution and coordinate attention is proposed. Most existing methods for mural inpainting have complex frameworks that consume a lot of computing power and produce results that are inaccurate and of low quality. The proposed method takes the damaged mural image and mask as inputs to the network. They are then passed through an encoder and a residual module for feature inference to determine plausible content for the damaged area. During training, a discriminator designed for inpainting tasks performs adversarial training until the desired inpainting effect is achieved. The feature inference portion of the proposed model consists of a residual block containing gate-controlled residual connections, six fast Fourier convolution modules, and an improved coordinate attention module for feature pruning. It has a large receptive field and the ability to extract rich features, which addresses the poor inpainting results of current methods. Experimental results on a self-made dataset show that the proposed algorithm not only has a simpler structure but also outperforms several classic inpainting methods. It can therefore be applied to the inpainting of ancient murals and can significantly reduce manual labor costs.
Improved vnet Model for 3D Liver CT Image Segmentation
YANG Shuqi, HAN Junling, KANG Xiaodong, YANG Jingyi, GUO Hongyang, LI Bo
Computer Science. 2024, 51 (6A): 230400038-6.  doi:10.11896/jsjkx.230400038
Abstract PDF(3921KB) ( 373 )   
References | Related Articles | Metrics
Segmentation of 3D medical images is an important step in radiotherapy planning. In clinical practice, computed tomography is widely used for 3D medical image segmentation of the liver and liver tumours. Due to the complex edge structure and texture features of the liver, liver segmentation remains a challenging task. To address this problem, an improved vnet model for accurate segmentation of 3D liver CT images is proposed. Firstly, the liver CT images are truncated by HU value and resampled to complete the preprocessing of the 3D dataset. Meanwhile, the convolution kernels in the vnet encoder and decoder are replaced with an SG module, which combines depthwise convolution and pointwise convolution, to reduce the number of parameters in the network model. Comparative experiments with the vnet model show that the proposed method is generally superior on the liver segmentation dataset, with a Dice coefficient of 94.93%, an improvement of 3.49% over the vnet model, while greatly reducing the number of model parameters. The method also shows good robustness and achieves superior segmentation results on the MSD spleen segmentation dataset and the COVID-19 dataset.
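Two quantities in the abstract lend themselves to a short worked example: the Dice coefficient used for evaluation, and the parameter saving from replacing a standard 3D convolution with a depthwise plus pointwise pair. The sketch below uses the standard formulas; the function names, the channel counts and the kernel size of 5 are illustrative assumptions, and the internal layout of the SG module itself is not given in the abstract.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * intersection / total

def conv3d_params(c_in, c_out, k):
    """Weights of a standard k×k×k 3D convolution (bias ignored)."""
    return k ** 3 * c_in * c_out

def separable_conv3d_params(c_in, c_out, k):
    """Depthwise k×k×k convolution followed by a 1×1×1 pointwise convolution."""
    return k ** 3 * c_in + c_in * c_out

dice = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
standard = conv3d_params(16, 32, 5)
separable = separable_conv3d_params(16, 32, 5)
```

With these illustrative sizes the separable layer needs 2512 weights instead of 64000, which shows where a large parameter reduction can come from.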
Rice Defect Segmentation Based on Dual-stream Convolutional Neural Networks
WU Yibo, HAO Yingguang, WANG Hongyu
Computer Science. 2024, 51 (6A): 230600107-8.  doi:10.11896/jsjkx.230600107
Abstract PDF(4290KB) ( 285 )   
References | Related Articles | Metrics
Currently, fine-grained assessment of rice quality cannot be achieved due to the lack of related work on fine-grained detection of rice defects. Traditional rice quality assessment is based on a rough classification of defect presence or absence. To address the problem of pixel-level classification of rice defects, a deep learning-based rice defect segmentation model is proposed. The model uses an improved DoubleU-Net network as the main architecture, which consists of two parts, NETWORK1 and NETWORK2. NETWORK1 is based on a modified U-shaped network structure of VGG-19, while NETWORK2 is based on a modified U-shaped network structure of Swin Transformer. The two parts are concatenated, and the advantages of CNN local information extraction and Transformer global information extraction are integrated to better capture the contextual information of images. In addition, multiple loss functions are used, including weighted binary cross-entropy loss, weighted intersection-over-union loss, and an intelligent loss network that does not require training, to improve the stability of model training and further improve segmentation accuracy. The proposed model is trained and tested on a densely annotated rice defect dataset, and achieves better segmentation performance than other methods, with robustness and good generalization ability.
Fine-grained Colon Pathology Images Classification Based on Heterogeneous Ensemble Learning with Multi-distance Measures
LIANG Meiyan, FAN Yingying, WANG Lin
Computer Science. 2024, 51 (6A): 230400043-7.  doi:10.11896/jsjkx.230400043
Abstract PDF(3407KB) ( 294 )   
References | Related Articles | Metrics
Fine-grained classification of colon pathology images is of great significance for both symptomatic treatment and prognosis assessment. However, the histopathological subtype images of the colon are extremely similar in morphology, which makes it challenging for manual methods to obtain high-precision predictions. Computer-aided diagnosis methods based on a single model also suffer from predictive bias in histological subtyping. Therefore, a fine-grained classification algorithm based on heterogeneous ensemble learning with multi-distance measures is proposed to predict the microsatellite state of colon pathology images. This method ensembles the predictions of the base learners by measuring the distance between the output confidence scores and the labels in latent space using cosine distance, Manhattan distance and Euclidean distance, respectively. These distances are then used to improve the overall decision performance of the model. The results show that the classification accuracy, precision, recall and F-1 score can reach 94% in the fine-grained classification, which provides a new perspective for subtype classification of pathological images.
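The abstract names three distances between base-learner confidence scores and labels but does not say how they are combined. The sketch below is therefore only one plausible reading, stated as an assumption: each candidate class is represented by its one-hot label vector, the three distances from every learner's confidence vector to that vector are summed without weighting, and the class with the smallest total is chosen. All names and the toy scores are hypothetical.

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)  # assumes non-zero vectors

def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

DISTANCES = (cosine_distance, manhattan_distance, euclidean_distance)

def ensemble_predict(scores_per_learner):
    """Pick the class whose one-hot label is closest, summed over all
    learners and all three distances (unweighted sum is an assumption)."""
    n_classes = len(scores_per_learner[0])
    totals = []
    for c in range(n_classes):
        one_hot = [1.0 if i == c else 0.0 for i in range(n_classes)]
        totals.append(sum(dist(scores, one_hot)
                          for scores in scores_per_learner
                          for dist in DISTANCES))
    return min(range(n_classes), key=totals.__getitem__)

# Three hypothetical base learners voting on a two-class problem.
choice = ensemble_predict([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]])
```

Two of the three learners lean towards class 0 and one of them is very confident, so the summed distances favour class 0 in this toy case.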
UMGN: An Infrared and Visible Image Fusion Network Based on Unsupervised Significance Mask Guidance
LI Dongyang, NIE Rencan, PAN Linna, LI He
Computer Science. 2024, 51 (6A): 230600170-5.  doi:10.11896/jsjkx.230600170
Abstract PDF(3975KB) ( 319 )   
References | Related Articles | Metrics
In challenging shooting environments, it is difficult to capture clear and detailed texture information and thermal radiation information using a single infrared or visible image. Infrared and visible image fusion, however, preserves both the thermal radiation information in infrared images and the texture details in visible light images. Many existing methods directly generate fused images in the fusion process, ignore the estimation of the pixel-level weight contribution of the source images, and emphasize the learning between different source images. For this reason, an infrared and visible image fusion network based on unsupervised significance mask guidance (UMGN) is proposed, which uses a DenseNet structure to extract comprehensive features from the source images. It produces a weight estimation probability to evaluate the contribution of each source image to the fused image. Since infrared and visible images lack ground truth, it is difficult to use supervised learning. UMGN also introduces a significance mask to help the network focus on learning the thermal radiation information of infrared images and the texture information of visible light images. A weighted fidelity term and a gradient loss are also introduced in the training process to prevent gradient degradation. Extensive comparative experiments with other advanced methods demonstrate the superiority and effectiveness of the proposed UMGN method.
MCC-based Back-end Optimization Method and Its Application in ORB-SLAM2
WANG Ting, CHENG Lan, XU Xinying, YAN Gaowei, REN Mifeng, ZHANG Zhe
Computer Science. 2024, 51 (6A): 230600081-7.  doi:10.11896/jsjkx.230600081
Abstract PDF(4037KB) ( 255 )   
References | Related Articles | Metrics
Autonomous localization and environment awareness are prerequisites for robots to accomplish complex tasks, and visual simultaneous localization and mapping (VSLAM) technology is an effective solution. In VSLAM, sensor errors, environmental noise and similar factors affect the localization and mapping accuracy, resulting in cumulative errors. Back-end optimization plays a key role in eliminating the accumulated error in VSLAM. Existing back-end optimization algorithms usually assume Gaussian noise and are based on the mean squared error (MSE) criterion. However, due to the non-convex nature of images and the non-Gaussian noise generated in real scenes, the Gaussian noise assumption does not always hold, which degrades the performance of existing algorithms in real scenes. In view of this, a back-end optimization method based on the maximum correntropy criterion (MCC) is proposed, exploiting the advantage of MCC in dealing with non-Gaussian noise, and the proposed method is applied to the ORB-SLAM2 framework to test its localization and mapping accuracy. Finally, experiments are conducted on the EuRoC and KITTI public datasets, and the experimental results show that the proposed method outperforms both the Huber-based back-end optimization algorithm in the original ORB-SLAM2 and the Cauchy-based back-end optimization algorithm on the majority of sequences, both indoor and outdoor.
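To illustrate why a correntropy-based criterion tolerates non-Gaussian noise better than a squared-error one, the sketch below compares the per-residual weight implied by a Gaussian-kernel correntropy objective with the Huber weight, and uses the former in a tiny iteratively reweighted estimate of a scalar location. This is a generic illustration, not the back-end of ORB-SLAM2 or the paper's algorithm; the function names, kernel width and samples are assumptions.

```python
import math

def correntropy_weight(residual, sigma):
    """Weight from a Gaussian-kernel correntropy objective: exp(-e^2 / (2 sigma^2))."""
    return math.exp(-residual ** 2 / (2.0 * sigma ** 2))

def huber_weight(residual, delta):
    """Equivalent reweighting factor of the Huber loss."""
    magnitude = abs(residual)
    return 1.0 if magnitude <= delta else delta / magnitude

def mcc_location(samples, sigma, iterations=20):
    """Iteratively reweighted mean; starts from the upper median."""
    estimate = sorted(samples)[len(samples) // 2]
    for _ in range(iterations):
        weights = [correntropy_weight(x - estimate, sigma) for x in samples]
        estimate = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return estimate

# Five consistent measurements and one gross outlier.
samples = [1.0, 1.1, 0.9, 1.05, 0.95, 50.0]
robust = mcc_location(samples, sigma=1.0)
plain_mean = sum(samples) / len(samples)
```

The correntropy weight decays exponentially with the squared residual, so the outlier at 50 contributes essentially nothing, whereas the Huber weight only decays as 1/|e| and the plain mean is pulled far from 1.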
Intelligent Diagnosis of Brain Tumor with MRI Based on Ensemble Learning
LI Xinrui, ZHANG Yanfang, KANG Xiaodong, LI Bo, HAN Junling
Computer Science. 2024, 51 (6A): 230600043-7.  doi:10.11896/jsjkx.230600043
Abstract PDF(3249KB) ( 326 )   
References | Related Articles | Metrics
Brain tumors are high-risk diseases caused by cancerous changes in the internal tissues of the brain, and timely diagnosis is crucial for their treatment and prognosis. At present, different network models have different classification effects, and a single network model can hardly achieve outstanding performance on multiple evaluation indicators. This paper proposes Treer-Net, an ensemble learning model with strong classification ability that combines TransFG, ResNet50, EfficientNet B4, EfficientNet B7 and ResNeXt101 through the weighted average combination strategy of ensemble learning. The model is trained to complete classification tasks on publicly available brain tumor MRI datasets for binary, three-class and four-class classification. Experimental data and results show that the accuracy, precision, recall and AUC of the Treer-Net model on the three classification datasets of brain tumors reach up to 99.15%, 99.16%, 99.15% and 99.87%, respectively. Comparative analysis verifies that the ensemble learning method in this paper has advantages in accuracy and speed, and is more suitable for clinical auxiliary diagnosis of brain tumors.
Classification Model of Heart Sounds in Pulmonary Hypertension Based on Time-Frequency Fusion Features
WANG Yanlin, SUN Jing, YANG Hongbo, GUO Tao, PAN Jiahua, WANG Weilian
Computer Science. 2024, 51 (6A): 230800091-7.  doi:10.11896/jsjkx.230800091
Abstract PDF(3051KB) ( 314 )   
References | Related Articles | Metrics
Pulmonary hypertension associated with congenital heart disease has a high mortality rate, and early screening and identification are particularly important for a cure. At present, diagnosis is made by right heart catheterization, which is an invasive examination that is not easy to use in screening and carries high risk and high cost. Therefore, a non-invasive and convenient identification method is urgently needed. In this paper, a time-frequency fusion heart sound classification model is established. First, the heart sound signal is preprocessed. Then the signal is converted, and the dynamic time-frequency characteristics are obtained by using the fusion filter bank. Finally, the obtained fusion feature parameters are input into the TabPFN network for classification and recognition. Experimental results indicate that the algorithm achieves average accuracy, precision, sensitivity, specificity and F1 scores of 92.21%, 92.15%, 92.15%, 96.11% and 92.14%, respectively, in classifying normal, CHD-PAH and CHD. It is important for the early screening and identification of pulmonary hypertension associated with congenital heart disease.
Medical Image Reversible Contrast Enhancement Based on Adaptive Histogram Equalization
TAN Peng, OU Bo
Computer Science. 2024, 51 (6A): 230700124-7.  doi:10.11896/jsjkx.230700124
Abstract PDF(3561KB) ( 319 )   
References | Related Articles | Metrics
At present, some reversible data hiding algorithms perform histogram-equalization-like data hiding to achieve a contrast enhancement effect for the image. The advantage is that such algorithms are easy to design and implement. However, they lack an optimization objective function and cannot determine suitable parameters to optimize the reversible contrast enhancement. As a result, they may suffer from insufficient or excessive enhancement. In order to improve the reversible contrast enhancement effect after data embedding, this paper proposes a reversible data hiding algorithm for medical images combined with contrast enhancement based on adaptive histogram equalization. In the proposed method, reversible data embedding is implemented by using prediction-error expansion. The objective function of adaptive histogram equalization is designed to optimize the prediction-error histogram modification and determine the optimal embedding positions, taking into account both low-distortion embedding and better contrast enhancement. Experimental results show that, compared with other methods, the proposed method further improves contrast enhancement after reversible data embedding, and therefore improves target identification in medical images.
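Prediction-error expansion, which the abstract uses for the reversible embedding, can be illustrated on a single row of pixels. The sketch below is the simplest textbook form under stated assumptions: the predictor is the left neighbour, every prediction error is expanded (no histogram shifting or adaptive position selection as in the paper), and pixel overflow is ignored. Function names and sample values are hypothetical.

```python
def pee_embed(pixels, bits):
    """Embed one bit per pixel (except the first) via e' = 2e + b."""
    if len(bits) != len(pixels) - 1:
        raise ValueError("need exactly one bit per pixel after the first")
    marked = [pixels[0]]  # first pixel is kept as the reference
    for i in range(1, len(pixels)):
        prediction = pixels[i - 1]           # predictor: original left neighbour
        error = pixels[i] - prediction
        marked.append(prediction + 2 * error + bits[i - 1])
    return marked

def pee_extract(marked):
    """Recover the original pixels and the embedded bits, left to right."""
    restored = [marked[0]]
    bits = []
    for i in range(1, len(marked)):
        expanded = marked[i] - restored[i - 1]  # uses the already restored neighbour
        bits.append(expanded % 2)
        restored.append(restored[i - 1] + expanded // 2)
    return restored, bits

original = [100, 102, 101, 101]
payload = [1, 0, 1]
marked = pee_embed(original, payload)
restored, recovered = pee_extract(marked)
```

Because decoding restores each pixel before it is used as the next predictor, both the payload and the original row come back exactly, which is what makes the scheme reversible.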
Object Tracking of Structured SVM Based on DIoU Loss and Smoothness Constraints
SUN Ziwen, YUAN Guanglin, LI Congli, QIN Xiaoyan, ZHU Hong
Computer Science. 2024, 51 (6A): 230700113-8.  doi:10.11896/jsjkx.230700113
Abstract PDF(5877KB) ( 283 )   
References | Related Articles | Metrics
Object tracking based on structured support vector machines has attracted wide attention because of its excellent performance. However, existing methods suffer from imprecise loss functions and model drift. To solve these two problems, firstly, a structured SVM model is proposed based on DIoU loss and smoothness constraints. Secondly, the DIoU function and the L2 norm of the difference between w_t and w_{t-1} are used as the loss function and the smoothness constraint of the model, respectively. Thirdly, the algorithm for the proposed model is designed with the dual coordinate descent principle. Finally, a multi-scale object tracking method is implemented with the proposed structured SVM on the basis of DIoU loss and smoothness constraints. The proposed object tracking method is experimentally validated on the OTB100 and VOT-ST2021 datasets, and the experimental results show that the tracking success rate of Scale-DCSSVM on OTB100 is 1.1% higher than that of DeepSRDCF, and its EAO on VOT-ST2021 is 1.2% higher than that of E.T.Track. The proposed object tracking method has superior performance.
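The two ingredients named in the abstract can be written down directly. The sketch below computes DIoU for axis-aligned boxes using its usual definition (IoU minus the squared centre distance divided by the squared diagonal of the smallest enclosing box) and the L2 norm of the difference between consecutive weight vectors. How these terms enter the structured SVM objective is not shown here; the function names and the box format (x1, y1, x2, y2) are assumptions.

```python
import math

def diou(box_a, box_b):
    """Distance-IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    iou = inter / (area_a + area_b - inter)
    # Squared distance between the two box centres.
    dx = (box_a[0] + box_a[2]) / 2.0 - (box_b[0] + box_b[2]) / 2.0
    dy = (box_a[1] + box_a[3]) / 2.0 - (box_b[1] + box_b[3]) / 2.0
    # Squared diagonal of the smallest box enclosing both.
    enc_w = max(box_a[2], box_b[2]) - min(box_a[0], box_b[0])
    enc_h = max(box_a[3], box_b[3]) - min(box_a[1], box_b[1])
    return iou - (dx * dx + dy * dy) / (enc_w * enc_w + enc_h * enc_h)

def smoothness_penalty(w_t, w_prev):
    """L2 norm of the difference between consecutive weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w_t, w_prev)))

overlap = diou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
penalty = smoothness_penalty([1.0, 2.0], [1.0, 0.0])
```

Unlike plain IoU, DIoU still distinguishes between candidate boxes that do not overlap the target at all, because the centre-distance term keeps changing with position.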
Remote Sensing Image Fusion Combining Multi-scale Convolution Blocks and Dense Convolution Blocks
HOU Linhao, LIU Fan
Computer Science. 2024, 51 (6A): 230400110-6.  doi:10.11896/jsjkx.230400110
Abstract PDF(4347KB) ( 311 )   
References | Related Articles | Metrics
The aim of remote sensing image fusion is to obtain high-resolution multispectral images with the same spectral resolution as multispectral images and the same spatial resolution as panchromatic images. Although deep learning has achieved remarkable results in remote sensing image fusion, the network cannot fully extract the rich spatial information in the image due to the limitation of the deep model network, which leads to a lack of spatial information in the fused image and low quality of the fusion result. Therefore, this paper introduces multi-scale blocks, where image features at different scales can be learned by convolutional kernels of different sizes, thus increasing the richness of the extracted features. Dense convolutional blocks are then introduced to achieve feature reuse through dense connections, reducing the loss of shallow feature information when the network is deep. In the feature fusion stage, the proposed method uses feature maps from different levels of the network as input to the feature fusion layer to improve the quality of the fused images. Comparison experiments are performed with six fusion algorithms on the GE1 and QB datasets, and the experimental results show that the fused images of the proposed method retain spatial and spectral information better, and outperform the comparison methods in both subjective and objective evaluations.
Occluded Video Instance Segmentation Method Based on Feature Fusion of Tracking and Detection in Time Sequence
ZHENG Shenhai, GAO Xi, LIU Pengwei, LI Weisheng
Computer Science. 2024, 51 (6A): 230600186-6.  doi:10.11896/jsjkx.230600186
Abstract PDF(3416KB) ( 347 )   
References | Related Articles | Metrics
Video instance segmentation is a visual task that has emerged in recent years, which introduces temporal characteristics on the basis of image instance segmentation. It aims to simultaneously segment objects in each frame and achieve inter-frame object tracking. A large amount of video data has been generated with the rapid development of mobile Internet and artificial intelligence. However, due to shooting angles, rapid motion and partial occlusion, objects in videos often split or blur, posing significant challenges in accurately segmenting targets from video data and in processing and analyzing them. Literature review and experiments show that existing video instance segmentation methods perform poorly in occluded situations. In response to these issues, this paper proposes an improved occluded video instance segmentation algorithm, which improves segmentation performance by integrating the temporal features of Transformer with those of tracking and detection. To enhance the network's ability to learn spatial position information, the algorithm introduces the time dimension into the Transformer network and considers the interdependence and mutual promotion between object detection, tracking and segmentation in videos. A fusion tracking module and a detection temporal feature module that can effectively aggregate the tracking offset of objects in videos are proposed, improving the performance of object segmentation in occluded environments. The effectiveness of the proposed method is verified through experiments on the OVIS and YouTube VIS datasets. Compared with the current benchmark method, the proposed method exhibits better segmentation accuracy, further demonstrating its superiority.
Multiple Attention-guided Mechanisms for Ultrasound Breast Cancer Tumor Image Segmentation
GUO Hongyang, CHENG Qian, KANG Xiaodong, YANG Jingyi, YANG Shuqi, LI Fang, ZHANG Rui
Computer Science. 2024, 51 (6A): 230500004-6.  doi:10.11896/jsjkx.230500004
Abstract PDF(3702KB) ( 330 )   
References | Related Articles | Metrics
Traditional U-Net approaches to ultrasound breast image segmentation suffer from problems such as a single prediction scale and information loss. To solve these problems, a multi-attention-guided U-Net ultrasound image segmentation method for breast tumors is proposed. Firstly, multiple SE attention modules are introduced into the encoding structure of U-Net to extract multi-level semantic information from the input breast tumor images, which guides the encoder to focus on the features of the breast tumor and reduces the interference caused by redundant background information. Secondly, a feature fusion processing module is designed to perform complex semantic feature fusion on the feature maps from the encoder. Finally, in the decoder part, a pyramid structure is added to capture global spatial information and improve the multi-scale feature extraction ability of the model for tumor images, so as to improve the expression ability and segmentation performance of the whole network. The proposed method is evaluated on a breast tumor image dataset, and the results show that, compared with other improved U-Net strategies, the proposed method has better accuracy and robustness.
Study on Monocular Vision Vehicle Ranging Based on Lower Edge of Detection Frame
LIU Hongli, WANG Yulin, SHAO Lei, LI Ji
Computer Science. 2024, 51 (6A): 231000077-6.  doi:10.11896/jsjkx.231000077
Abstract PDF(4259KB) ( 318 )   
References | Related Articles | Metrics
Vehicle ranging is an active research direction in the driving field. Because the accuracy of traditional ranging methods is affected by the size of the model and by the X-axis offset of the vehicle in front, a vehicle ranging model based on the center point of the lower edge of the detection frame is proposed. The model uses a monocular vision camera and a vehicle detection algorithm to obtain the position of the vehicle in front, and builds the ranging model from the coordinates of the center point of the lower edge of the vehicle detection frame and the pitch angle at which the camera is installed, which removes the error caused by the size of the model. A trigonometric model is constructed to account for the X-axis component of the preceding vehicle relative to the experimental vehicle, and the method for determining the safe distance to the preceding vehicle is optimized and improved. In addition, the ratio λ of the abscissa of the center point of the rear rectangular frame to the width of the vehicle's bounding rectangle is defined, and different cases are discussed according to the value of λ, so that the model better matches the needs of practical scenes. An inverse perspective transformation model based on the key points of ranging is proposed to reduce the ranging error. Experiments show that the ranging accuracy of the improved model is not affected by the size of the model and takes into account the X-axis component of the front vehicle position, and that its ranging error is about 1.5% lower than that of the traditional ranging model, a significant improvement in ranging accuracy.
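The geometry behind ranging from the lower edge of a detection frame can be sketched with the standard flat-road pinhole model. This is not the paper's model (its λ-based cases and inverse perspective step are not reproduced); it only shows how a pixel on the road surface, the camera height and the pitch angle give a forward distance and an X-axis offset. The function name, parameter names and calibration values are assumptions.

```python
import math

def ground_position(u, v, fx, fy, cx, cy, cam_height, pitch):
    """Forward distance and lateral offset (metres) of the road point seen
    at pixel (u, v), assuming a flat road and a camera pitched down by
    `pitch` radians at height `cam_height` metres."""
    x_norm = (u - cx) / fx
    y_norm = (v - cy) / fy
    # Vertical component of the viewing ray; must point below the horizon.
    down = y_norm * math.cos(pitch) + math.sin(pitch)
    if down <= 0.0:
        raise ValueError("pixel is at or above the horizon")
    scale = cam_height / down
    forward = scale * (math.cos(pitch) - y_norm * math.sin(pitch))
    lateral = scale * x_norm
    return forward, lateral

# Hypothetical calibration: 1000-pixel focal length, principal point (640, 500).
# (740, 600) stands for the centre of the lower edge of a detection frame.
forward, lateral = ground_position(740.0, 600.0, 1000.0, 1000.0,
                                   640.0, 500.0, cam_height=1.5, pitch=0.0)
```

The forward distance is equivalent to cam_height / tan(pitch + atan((v - cy) / fy)), so it depends only on where the lower edge touches the road and not on the apparent size of the vehicle.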
Gaussian Enhancement Module for Reinforcing High-frequency Details in Camera Model Identification
HUANG Yuanhang, BIAN Shan, WANG Chuntao
Computer Science. 2024, 51 (6A): 230700125-5.  doi:10.11896/jsjkx.230700125
Abstract PDF(2364KB) ( 286 )   
References | Related Articles | Metrics
In multimedia forensics, a high-pass filter is one of the pre-processing layers commonly used by convolutional neural networks to suppress the impact of image content and highlight only high-frequency features. However, other useful information containing forgery traces is removed indiscriminately at the same time. To address this issue, this paper proposes a simple yet effective Gaussian enhancement module (GEM) to extract "extended" high-frequency features, namely, to reinforce high-frequency details while maintaining the original feature strength. The GEM comprises two successive low-pass Gaussian filters that produce a blurred version of the feature map, from which the corresponding extended high-frequency residual is obtained. It can adaptively strengthen fragile and subtle low-level forgery features and also prevent feature attenuation. Experiments are conducted on a camera model identification dataset by plugging the module into several mainstream backbone networks, indicating that it supports "plug and play" and is independent of the specific network architecture. The proposed GEM brings a significant improvement in both the performance and the robustness of networks, with only a slight increase in model complexity.
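The idea of blurring twice, taking the residual and keeping the original signal strength can be shown on a one-dimensional signal. The sketch below is an assumption-laden illustration: it uses a fixed 3-tap kernel [1, 2, 1]/4 with edge replication and adds the residual back to the input, whereas the actual module operates on feature maps and its kernel and combination rule are not specified in the abstract. Function names are hypothetical.

```python
def gaussian_blur(signal):
    """One pass of a 3-tap [1, 2, 1] / 4 low-pass filter with edge replication."""
    n = len(signal)
    blurred = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        blurred.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return blurred

def gaussian_enhance(signal):
    """Input plus its high-frequency residual after two successive blurs."""
    twice_blurred = gaussian_blur(gaussian_blur(signal))
    return [x + (x - b) for x, b in zip(signal, twice_blurred)]

flat = gaussian_enhance([5.0, 5.0, 5.0, 5.0])
edge = gaussian_enhance([0.0, 0.0, 1.0, 1.0])
```

A constant signal passes through unchanged, while the step edge is sharpened on both sides, so the low-frequency content is kept and only the detail is reinforced.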
Small Object Detection for Fish Based on SPD-Conv and NAM Attention Module
CHEN Yuzhang, WANG Shiqi, ZHOU Wen, ZHOU Wanting
Computer Science. 2024, 51 (6A): 230500176-7.  doi:10.11896/jsjkx.230500176
Abstract PDF(5404KB) ( 345 )   
References | Related Articles | Metrics
In order to solve the problems of low image resolution caused by the degradation of the underwater imaging environment and low detection accuracy caused by small fish targets, an improved YOLOv7 detection algorithm combining the SPD-Conv structure and the NAM attention mechanism is proposed. Firstly, the space-to-depth (SPD) structure is used to improve the head network, replacing the original strided convolution structure in the network, which retains more fine-grained information, improves the efficiency of feature learning, and improves the detection performance of the network on low-resolution images. Then, the normalization-based attention module (NAM) is introduced into the network, the module integration method of CBAM is adopted, and the BN scaling factor is used to calculate the attention weight, which suppresses insignificant features and improves the accuracy of small target detection. Finally, to counter underwater imaging degradation, the image to be detected is preprocessed by deconvolution, which reduces the impact of degradation factors on detection. Experimental results show that on the WildFish dataset the overall accuracy of the model reaches 97.2%, which is 7.6% higher than that of the YOLOv7 algorithm, with the accuracy rate increased by 8.5% and the recall rate increased by 9.8%. Compared with the Efficientdet, SSD, YOLOv5 and YOLOv8 algorithms, the accuracy of the proposed model is improved by 12.6%, 17.8%, 4% and 2.9%, respectively. The overall accuracy of the model reaches 80.5%, which is 18.4%, 11.6%, 6.9%, 2.0% and 2.7% higher than that of Efficientdet, SSD, YOLOv5, YOLOv7 and YOLOv8, respectively, which can meet the needs of underwater fish identification.
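Two building blocks mentioned in the abstract are easy to show on toy data: the space-to-depth rearrangement that replaces a strided convolution without discarding pixels, and channel attention weights derived from batch-normalization scaling factors. The sketch below works on plain nested lists for a single-channel image; the function names, channel ordering and use of absolute values are assumptions, and the convolution that follows space-to-depth in SPD-Conv is omitted.

```python
def space_to_depth(image, block=2):
    """Rearrange an H×W image into block*block channels of size
    (H/block)×(W/block); channel order is (row offset, column offset)."""
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image size must be divisible by the block size")
    return [[[image[y][x] for x in range(dx, width, block)]
             for y in range(dy, height, block)]
            for dy in range(block) for dx in range(block)]

def nam_channel_weights(bn_scales):
    """Normalize BN scaling factors into channel weights: |g_i| / sum |g_j|."""
    total = sum(abs(g) for g in bn_scales)
    return [abs(g) / total for g in bn_scales]

image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
channels = space_to_depth(image)
weights = nam_channel_weights([1.0, -3.0])
```

Every input pixel appears in exactly one output channel, so the spatial resolution is halved without the information loss of a stride-2 convolution, which is the property that matters for small, low-resolution targets.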
Multi Feature Fusion for Road Panoramic Driving Detection Based on YOLOP-L
LYU Jialu, ZHOU Li, JU Yongfeng
Computer Science. 2024, 51 (6A): 230700185-8.  doi:10.11896/jsjkx.230700185
Abstract PDF(4238KB) ( 365 )   
References | Related Articles | Metrics
In recent years, traffic image detection from the driver’s perspective has become an important research direction in the field of transportation, and extracting various features such as vehicles, roads, and traffic signs has become an urgent task for understanding the diversity of road information. Previous studies have made significant progress in feature extraction for single-class object detection. However, these studies cannot be applied well to the detection of other, significantly different features, and the accuracy of individual feature detection is lost during fusion training. In response to the diverse and complex road information within the driver’s field of view, this paper proposes YOLOP-L, a detection model based on multi-feature fusion training. It can jointly train on multiple different types of traffic targets while preserving the accuracy of the individual detection tasks. The results indicate that YOLOP-L can effectively solve the problems of insufficient detection accuracy and missing segmentation in complex scenes on the challenging BDD100K dataset, improving the accuracy and robustness of vehicle recognition, lane line detection, and joint training of road driving areas. Comparative experiments show that YOLOP-L runs faster than the original YOLOP network. The recall rate increases by 2.2% in the vehicle target detection task. In the lane detection task, the accuracy improves by 2.8%, and the IoU of the lane line decreases by 2.45% compared to the HybridNets network but increases by 1.95% compared to the YOLOP network. In the driving area segmentation task, the overall detection performance improves by 1.1%.
Study on Super-resolution Image Reconstruction Using Residual Feature Aggregation Network Based on Attention Mechanism
SUN Yang, DING Jianwei, ZHANG Qi, WEI Huiwen, TIAN Bowen
Computer Science. 2024, 51 (6A): 230600039-6.  doi:10.11896/jsjkx.230600039
To address the problem of the local effect of the output features of cascaded residual blocks in single image super-resolution algorithms, a residual feature aggregation network combined with an attention mechanism is proposed. The network aggregates the features of different levels output by each residual block through skip connections to the end of the residual group, which achieves sufficient feature extraction and reuse, expands the receptive field of the network and enhances the expressive ability of the features. Meanwhile, to improve the spatial correlation of feature information, an enhanced spatial attention mechanism is introduced to improve the performance of the residual blocks. Extensive experiments demonstrate that the proposed model achieves good super-resolution performance. Compared with state-of-the-art methods such as RCAN, SAN, and HAN, the proposed method is effective in the task of ×4 super-resolution. On five benchmark datasets, the method achieves average improvements over RCAN, SAN, and HAN of 0.07 dB, 0.06 dB, and 0.006 dB in peak signal-to-noise ratio, and of 0.0012, 0.0011, and 0.0008 in structural similarity index, respectively. The reconstructed images exhibit a notable increase in quality, with more abundant details. These results verify the efficacy of the proposed method.
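As a reading aid for the decibel figures quoted in this abstract, the sketch below computes peak signal-to-noise ratio from its standard definition, PSNR = 10 log10(MAX^2 / MSE). The helper name `psnr` and the tiny two-pixel images are illustrative assumptions, not data from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_error = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for ref_pixel, rec_pixel in zip(ref_row, rec_row):
            squared_error += (ref_pixel - rec_pixel) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy images (assumed values): each pixel is off by one gray level, so MSE = 1.
example_psnr = psnr([[100, 100]], [[101, 99]])

# A gain of 0.07 dB corresponds to this ratio of new MSE to old MSE.
mse_ratio_for_gain = 10 ** (-0.07 / 10)
```

Because the scale is logarithmic, a gain of 0.07 dB corresponds to a mean squared error that is roughly 1.6% lower.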
Object Detection with Receptive Field Expansion and Multi-branch Aggregation
QUE Yue, GAN Menghan, LIU Zhiwei
Computer Science. 2024, 51 (6A): 230600151-6.  doi:10.11896/jsjkx.230600151
Object detection aims to achieve accurate recognition and localization of objects in images and is an important research area in computer vision. Deep learning-based object detection has made great progress, but there are still shortcomings. The semantic information brought by large down-sampling coefficients is beneficial to image classification, but the down-sampling process inevitably brings information loss, resulting in insufficient model feature extraction and thus a decrease in detection accuracy. To address these problems, this paper proposes a receptive field enhancement and multi-branch aggregation network for object detection. First, the receptive field enhancement module is designed to expand the receptive field of the backbone network. This module can acquire object context cues and can alleviate the problem of object information loss during down-sampling because it does not change the feature spatial resolution. Then, in order to take full advantage of the locality of convolutional neural networks and the long-range feature dependency of the self-attention mechanism, a receptive field expanding composite backbone network is constructed to retain local features as well as to improve the global feature perception capability of the model. Finally, a multi-branch aggregation detection head network is proposed to form information flow between three prediction branches and fuse feature information between branches to improve the detection capability of the model. Validation experiments are carried out on the MS COCO dataset, and the results show that the average accuracy of the proposed model is better than that of many mainstream object detection models.
SAR Image Target Recognition Based on Cross Domain Few Shot Learning
SHI Songhao, WANG Xiaodan, YANG Chunxiao, WANG Yifei
Computer Science. 2024, 51 (6A): 230800136-7.  doi:10.11896/jsjkx.230800136
Due to the difficulty in acquiring SAR images and the scarce number of samples available for research, solving the SAR image target recognition problem under few-shot conditions has become a community-recognized challenge. With the development of deep learning in the field of computer vision, a variety of few-shot image classification methods have been derived, so a cross-domain few-shot learning paradigm is considered to solve the few-shot SAR image target recognition problem. Concretely, domain-specific feature extractors are first trained in multiple source domains, and a generalized feature extractor is obtained by knowledge distillation. In this stage, the centered kernel alignment method is used to map the extracted features to a higher dimensional space, so as to better distinguish the nonlinear similarity between the original features. Then the target domain image features are extracted by the generalized feature extractor obtained in the previous stage. Finally, a prototypical network approach is used to predict the class of the sample. Experiments show that the method obtains 88.61% accuracy while reducing the model parameters, which provides a new method for solving the target recognition problem of SAR images with scarce samples.
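The final prototype-based prediction step in this abstract can be sketched as follows. This is a generic nearest-prototype classifier under stated assumptions, not the paper's code: the two-dimensional embeddings and the class labels `"tank"` and `"truck"` are invented for illustration, and Euclidean distance is assumed as the metric.

```python
import math


def class_prototypes(support_set):
    """Average the support embeddings of each class (label -> list of vectors)."""
    prototypes = {}
    for label, embeddings in support_set.items():
        dimension = len(embeddings[0])
        prototypes[label] = [
            sum(vector[i] for vector in embeddings) / len(embeddings)
            for i in range(dimension)
        ]
    return prototypes


def predict(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical 2-D embeddings, as if produced by the generalized feature extractor.
support = {
    "tank": [[1.0, 0.0], [0.8, 0.2]],
    "truck": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = class_prototypes(support)
label = predict([0.7, 0.3], prototypes)
```

With only a handful of labeled samples per class, averaging them into a single prototype keeps the classifier free of trainable parameters, which suits the few-shot setting described above.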
Denoising Autoencoders Based on Lossy Compress Coding
YUAN Zhen, LIU Jinfeng
Computer Science. 2024, 51 (6A): 230400172-7.  doi:10.11896/jsjkx.230400172
The performance of image preprocessing algorithms is directly related to the effect of image post-processing, such as image segmentation, target detection, and edge extraction. In order to obtain high-quality digital images, image noise reduction has become an essential preliminary step. Image noise reduction aims to maintain the integrity of the original information (i.e., the main features) as much as possible while removing the useless information in the signal. To this end, this paper proposes a convolutional autoencoder (AE) denoising model based on lossy compression coding. According to the principle of maximal coding rate reduction (MCR2), a new loss function is designed to replace the mean squared error (MSE) loss commonly used in mainstream deep learning algorithms, improving the robustness and adaptability of the model. The model first processes the noisy image through an encoder to obtain the hidden variables, and then decodes them using a decoder to remove the noise and obtain the reconstructed image. Next, keeping the encoder unchanged, the reconstructed image is fed into the encoder so that the encoder continues to learn and produces the reconstructed hidden variables. Finally, the error between the reconstructed image and the original image is measured indirectly by calculating the distance between the hidden variables and the reconstructed hidden variables, which is used as the convergence cost for model training. The proposed model is validated extensively on the thumbnails128x128 and CBSD68 datasets, and the experimental results show that the autoencoder framework (AE-MCR2) performs well under different types of noise (Gaussian, Bernoulli, and Poisson) and has some interpretability.
Ships Detection in Remote Sensing Images Based on Improved FCOS
CHEN Tianpeng, HU Jianwen
Computer Science. 2024, 51 (6A): 230700166-7.  doi:10.11896/jsjkx.230700166
Ships in remote sensing images are arranged in arbitrary directions. General target detection algorithms based on deep learning use horizontal bounding boxes to locate objects, which enclose a large amount of background when detecting ships, so their ship detection performance is poor. An improved ship detection algorithm based on the fully convolutional one-stage (FCOS) object detection network is proposed. Taking FCOS as the baseline, an offset regression branch is added to the detection head, and a rotated bounding box is generated by shifting the upper midpoint and the right midpoint of the horizontal bounding box. Ships usually have a high aspect ratio, and the angle deviation between the predicted bounding box and the real bounding box has a great influence on the intersection over union (IoU), which damages the detection accuracy of the model. To solve this problem, a weighting factor related to the aspect ratio of ships is introduced to calculate the offset loss, so that targets with a large aspect ratio obtain a relatively large offset loss. The proposed method and several mainstream rotated target detection algorithms are tested on the HRSC2016 dataset. The results show that the average accuracy of the proposed method is 89.00% and the detection speed is 19.8 FPS. Compared with anchor-free algorithms of the same type, the proposed method has superior detection speed and accuracy.
Fast Algorithm for Affine Motion Estimation Based on Statistical Analysis
ZHONG Yucheng, HUANG Xiaofeng, NIU Weihong, CUI Yan
Computer Science. 2024, 51 (6A): 230400081-8.  doi:10.11896/jsjkx.230400081
To reduce the computational complexity of the new generation video coding standard, versatile video coding (VVC), a fast affine motion estimation (AME) calculation method based on statistical analysis is proposed. In the proposed method, of the three motion vector (MV) accuracies, we first discard the integer-pixel and 1/16-pixel accuracies and retain only the 1/4-pixel accuracy. Secondly, we model the relationship between the number of iterations and the quantization parameter (QP), slice type, and coding unit (CU) size to obtain an adaptive formula for reducing the number of iterations in AME. Then, the four integer pixels in the four corners of the CU in the fine granularity search (FGS) algorithm are replaced by two diagonal sub-pixels. Finally, the sum of absolute transformed difference (SATD) cost is used to replace the rate distortion optimization (RDO) cost. Experimental results show that, compared with the H.266/VVC reference software VTM-10.0, the proposed algorithm saves 8.34% and 8.83% of time in the low delay B (LDB) and random access (RA) configurations, while the performance loss is only 0.10% and 0.12%, respectively.
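The SATD cost named in this abstract is conventionally computed by applying a Hadamard transform to the prediction residual and summing the absolute coefficients. The sketch below shows that idea for a 4 x 4 block. It is a generic, unnormalised illustration and not the VTM code: the helper names are assumptions, and real encoders apply their own scaling and block sizes.

```python
HADAMARD_4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul(a, b):
    """Multiply two small matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def satd_4x4(original, predicted):
    """Unnormalised sum of absolute Hadamard-transformed differences of a 4x4 block."""
    residual = [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, predicted)
    ]
    transformed = matmul(matmul(HADAMARD_4, residual), transpose(HADAMARD_4))
    return sum(abs(value) for row in transformed for value in row)


flat_block = [[10] * 4 for _ in range(4)]
shifted_block = [[9] * 4 for _ in range(4)]
cost = satd_4x4(flat_block, shifted_block)
```

Unlike a full rate distortion optimization cost, this measure needs no entropy coding or reconstruction, which is why substituting it can save encoding time.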
Ship Detection and Recognition of Optical Remote Sensing Images for Embedded Platform
HE Xinyu, LU Chenxin, FENG Shuyi, OUYANG Shangrong, MU Wentao
Computer Science. 2024, 51 (6A): 230700117-7.  doi:10.11896/jsjkx.230700117
The construction of a maritime power is a current strategic direction for China’s vigorous development. In response to the low detection and classification rates and slow operation speed of existing deep learning-based algorithms for ship target detection and classification in remote sensing images on embedded platforms, this paper proposes an improved Mix-YOLO network model based on the Cambricon-MLU220 embedded platform. The model uses the YOLOv7-tiny network as its basic framework. Firstly, the MobileNet series network module is introduced to partially replace the feature extraction network, reducing the number of network parameters. Then, the ULSAM attention mechanism is introduced to enhance the network’s learning and classification ability, reducing the false alarm rate. Finally, to make the improvement in detection speed more pronounced on the embedded platform, the network model is implemented by splitting large modules into small modules. Experimental results show that, relative to the original YOLOv7-tiny network, the Mix-YOLO algorithm reduces the number of parameters and the amount of computation by 39.70% and 29.70%, respectively. The processing frame rate is increased from 97.27 fps to 120.88 fps, and the accuracy is improved by 7.7%. The model can achieve real-time detection and recognition of ship targets in remote sensing images.
Modality Fusion Strategy Research Based on Multimodal Video Classification Task
WANG Yifan, ZHANG Xuefang
Computer Science. 2024, 51 (6A): 230300212-5.  doi:10.11896/jsjkx.230300212
Despite the success of AI-related technologies in many fields, they usually simulate only one type of human perception, which means that they are limited to processing information from a single modality. Extracting features from multiple modalities and fusing them effectively is important for developing general AI. In this paper, a comparative study of multimodal information fusion strategies is conducted on a video classification task using an encoder-decoder architecture. The strategies compared are early feature fusion, which fuses the feature encodings of the modalities; late decision fusion, which fuses the prediction results of each modality; and a combination of both. This paper also compares two ways to involve the audio modality in fusion: encoding the audio directly into features before fusion, or converting it to text by speech-to-text and fusing it in text form. Experiments show that decision fusion of the prediction results of the text and audio modalities alone with those of the fused features of the other two modalities can further improve classification accuracy under the experimental approach of this study. Moreover, converting speech into text by automatic speech recognition (ASR) makes fuller use of the semantic information it contains.
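Late decision fusion, as contrasted with early feature fusion in this abstract, can be illustrated with a weighted average of per-branch class probabilities. The sketch below is an assumption-laden example: the branch names, the probability values and the equal default weights are hypothetical, and the paper may combine predictions differently.

```python
def late_fusion(branch_probabilities, weights=None):
    """Fuse class-probability vectors from several branches by weighted averaging."""
    names = list(branch_probabilities)
    if weights is None:
        weights = {name: 1.0 for name in names}  # equal weights assumed by default
    total_weight = sum(weights[name] for name in names)
    num_classes = len(branch_probabilities[names[0]])
    return [
        sum(weights[name] * branch_probabilities[name][c] for name in names) / total_weight
        for c in range(num_classes)
    ]


# Hypothetical softmax outputs over three video classes.
predictions = {
    "fused_features": [0.50, 0.30, 0.20],
    "text_only": [0.20, 0.60, 0.20],
    "audio_only": [0.30, 0.50, 0.20],
}
fused = late_fusion(predictions)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In this toy case the fused-feature branch alone would pick class 0, while the two single-modality branches shift the fused decision to class 1, which is the kind of correction decision fusion is meant to provide.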
Person Re-identification Method Based on Multi-scale Local Feature Fusion
WU Lei, WANG Hairui, ZHU Guifu, ZHAO Jianghe
Computer Science. 2024, 51 (6A): 230300236-6.  doi:10.11896/jsjkx.230300236
To address the problems of feature misalignment, neglect of semantic correlation between adjacent regions, background clutter and low training efficiency when extracting pedestrian features in existing person re-identification methods, a multi-scale local feature fusion method is proposed. Firstly, a spatial transformer network is introduced to perform adaptive affine transformation on the image to align pedestrian spatial features. Secondly, the feature maps of different scales are segmented horizontally, and the adjacent local blocks are spliced in different ways to make up for the loss of correlation information between adjacent blocks caused by cutting. Then, the correlation between global features and local features is mined. At the same time, random erasing is applied to the dataset to prevent the model from overfitting, and a variety of loss functions are used to train the network model to improve the intra-class compactness and inter-class diversity of the model. Finally, experiments are carried out on the Market-1501 and DukeMTMC-ReID datasets, where Rank-1 reaches 95.0% and 88.8% and mAP reaches 89.2% and 78.9%, respectively. The results show that the proposed method can extract more discriminative pedestrian features.
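The horizontal segmentation and splicing of adjacent local blocks described in this abstract can be pictured with a small sketch. The abstract does not specify the splicing schemes, so the version below simply joins each stripe with its lower neighbour; the function names and the 6-row feature map are assumptions for illustration only.

```python
def horizontal_stripes(feature_map, num_stripes):
    """Cut a feature map (list of rows) into equal-height horizontal stripes."""
    height = len(feature_map)
    if height % num_stripes:
        raise ValueError("height must be divisible by num_stripes")
    stripe_height = height // num_stripes
    return [
        feature_map[i * stripe_height:(i + 1) * stripe_height]
        for i in range(num_stripes)
    ]


def splice_adjacent(stripes):
    """Join each stripe with the next one so neighbouring regions share a block."""
    return [stripes[i] + stripes[i + 1] for i in range(len(stripes) - 1)]


# Toy 6 x 2 feature map whose values equal the row index.
feature_map = [[row_index] * 2 for row_index in range(6)]
stripes = horizontal_stripes(feature_map, num_stripes=3)
spliced = splice_adjacent(stripes)
```

The spliced blocks overlap, so a body part that straddles a cut line still appears whole in at least one block.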
Mark Line Image Enhancement Method in Complex Illumination Environment
WU Jing, FAN Shaosheng, HU Chengyang
Computer Science. 2024, 51 (6A): 230300187-5.  doi:10.11896/jsjkx.230300187
In the process of driving, autonomous vehicles need to recognize road sign lines to ensure that they stay in the lane, and substation inspection robots likewise rely on recognizing road sign lines for accurate inspection. However, under complex lighting, road sign line information is difficult to extract accurately, and traditional image enhancement methods cannot produce a good enhancement effect on all road sign line images. This paper therefore proposes a road sign line image enhancement method for complex lighting environments. Luminance images in the HSV color space are processed in layers according to their luminance difference. Images with high luminance difference are enhanced by adaptive gamma correction. For images with low luminance difference, histogram conical stretching is first used to expand the gray-level range, and adaptive gamma correction is then used to increase the image contrast. Experimental results show that the algorithm can effectively solve the road sign line recognition problems caused by low illumination, exposure and other complex lighting conditions, and is an effective image enhancement method.
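The abstract does not state how the gamma value is adapted, so the following sketch uses one common heuristic as an assumption: choose gamma so that the mean luminance maps to mid-gray (0.5). The function name `adaptive_gamma` and the sample values are hypothetical; luminance is assumed to be the V channel scaled to the range 0 to 1.

```python
import math


def adaptive_gamma(luminance):
    """Gamma-correct a luminance image in [0, 1] so its mean moves to about 0.5.

    The rule gamma = log(0.5) / log(mean) is an assumed heuristic, not the paper's formula.
    """
    pixels = [p for row in luminance for p in row]
    mean = sum(pixels) / len(pixels)
    mean = min(max(mean, 1e-6), 1.0 - 1e-6)  # keep log(mean) finite and non-zero
    gamma = math.log(0.5) / math.log(mean)
    corrected = [[p ** gamma for p in row] for row in luminance]
    return corrected, gamma


dark_image = [[0.25, 0.25], [0.25, 0.25]]
brightened, gamma = adaptive_gamma(dark_image)
```

For a dark image the resulting gamma is below 1, which brightens shadows; for an overexposed image it is above 1, which darkens highlights, so one rule covers both lighting extremes.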
Direction-aware Pyramidal Aggregation Network for Road Centerline Extraction
ZHANG Xiaoqing, WANG Qingwang, QU Xin, SHEN Shiquan, WU Changyi, LIU Ju
Computer Science. 2024, 51 (6A): 230400101-7.  doi:10.11896/jsjkx.230400101
As an abstract class, road centerlines have no explicit features, which causes models to fail to extract them accurately. To address this problem, this paper models road centerline extraction as a semantic segmentation task and proposes a direction-aware pyramidal aggregation network (DAPANet) based on the spatial linear structure of road centerlines. Firstly, to capture the spatial distribution characteristics and structural features of road centerlines, this paper designs the direction-aware module (DAM), which extracts road centerline features using four direction-aware layers on each of the four final output layers of the backbone network (ResNet18). Then, it designs the pyramid aggregation module (PAM) to fuse the structural features extracted from the four layers and obtain a more robust road centerline feature. Experiments are conducted on real data collected from a UAV platform, and the results show that the proposed DAPANet achieves an mIoU of 84.7% and a Precision of 98.6%, with the IoU of the road centerline reaching 77.28%, outperforming other advanced methods and demonstrating the effectiveness of the proposed method.
Container Lock Hole Recognition Algorithm Based on Lightweight YOLOv5s
LI Yuanxin, GUO Zhongfeng, YANG Junlin
Computer Science. 2024, 51 (6A): 230900021-6.  doi:10.11896/jsjkx.230900021
In order to improve the efficiency of container lock hole recognition and reduce the number of algorithm parameters and the model size, a container lock hole recognition algorithm based on lightweight YOLOv5s is proposed. The algorithm replaces the Backbone feature extraction network of YOLOv5s with the lightweight neural network model MobileNetV3 and further optimizes the feature fusion structure of the neck, which reduces the number of parameters and the amount of computation of the model and improves the detection speed. The accuracy and efficiency of detection are improved by introducing a SimAM attention layer. After the model is reconstructed with different improvement methods, training and testing are carried out on a self-built container lock hole dataset, and a comparison test is carried out with the improved YOLOv5s. The results show that the size of the improved model is only 2.4 MB, the average detection time per image is 5.1 ms, and the average detection accuracy is 97.3%. Compared with the original target detection model, the model size is reduced by 82.8% and the detection speed is increased by 39%, showing strong real-time performance while maintaining high detection accuracy.
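The SimAM attention layer named in this abstract is parameter-free: it gates each activation by a sigmoid of an inverse energy term computed from that channel's mean and variance. The sketch below follows the commonly published formulation for a single channel. The function name, the regulariser value `lam=1e-4` and the toy 2 x 2 channel are assumptions for illustration, and a real layer would run on tensors inside the network.

```python
import math


def simam(channel, lam=1e-4):
    """Apply SimAM-style gating to one channel (nested list with at least two pixels)."""
    pixels = [p for row in channel for p in row]
    n = len(pixels) - 1
    mean = sum(pixels) / len(pixels)
    squared_dev = [[(p - mean) ** 2 for p in row] for row in channel]
    variance = sum(d for row in squared_dev for d in row) / n
    output = []
    for row, dev_row in zip(channel, squared_dev):
        out_row = []
        for p, d in zip(row, dev_row):
            # Pixels far from the channel mean receive a larger inverse energy.
            inverse_energy = d / (4.0 * (variance + lam)) + 0.5
            gate = 1.0 / (1.0 + math.exp(-inverse_energy))  # sigmoid
            out_row.append(p * gate)
        output.append(out_row)
    return output


channel = [[1.0, 1.0], [1.0, 9.0]]
attended = simam(channel)
```

Because the gate is computed from statistics alone, adding the layer leaves the parameter count unchanged, which fits the lightweight design goal stated above.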
Study on Intelligent Defect Recognition Algorithm of Aerial Insulator Image
DAI Yongdong, JIN Yang, DAI Yufan, FU Jing, WANG Maofei, LIU Xi
Computer Science. 2024, 51 (6A): 230700172-5.  doi:10.11896/jsjkx.230700172
Since power line insulator defects can easily lead to transmission system failures, it is critical to study defect detection algorithms. Traditional detection methods can only accurately locate insulators and detect faults with sufficient prerequisite knowledge, low interference, or under specific conditions. For automatically locating insulators and detecting insulator defects in UAV aerial images, we propose a novel deep convolutional neural network (CNN) architecture that not only locates insulators but also detects insulator defects. The architecture is divided into two modules: the first module, for insulator localization, is responsible for detecting all insulators in the image, and the second module, for insulator defect detection, is responsible for detecting all insulator defects in the image. A CNN with a region proposal network (RPN) is used to convert insulator defect detection into a two-level object detection problem. Finally, we perform experiments using real datasets and obtain defect detection accuracy and recall rates of 91.2% and 95.6%, respectively, satisfying the robustness and accuracy requirements.
Steel Defect Detection Based on Improved YOLOv7
HUANG Haixin, WU Di
Computer Science. 2024, 51 (6A): 230800018-5.  doi:10.11896/jsjkx.230800018
Steel surface defect detection is very important in actual production. In order to detect defects accurately, this paper designs a steel surface defect detection model based on improved YOLOv7. Firstly, the Ghost module is introduced into the backbone network structure to enhance the ability of the model to extract features and identify small features while reducing the number of model parameters. Secondly, the attention mechanism is embedded in the pooling module. Finally, the loss function is improved by introducing EIOU, which better optimizes the YOLOv7 network model and handles sample imbalance better, thereby improving optimization. Experimental results show that, compared with the original model, the mAP of the proposed model increases by 4.2% to 76.9%. The model can meet the needs of accurate detection and identification of steel surface defects.
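The EIOU loss mentioned in this abstract is generally defined as 1 - IoU plus three penalty terms: the squared center distance, the squared width difference and the squared height difference, each normalised by the corresponding dimension of the smallest enclosing box. The sketch below implements that published definition for axis-aligned boxes given as `(x1, y1, x2, y2)`; the function name and sample boxes are assumptions, and boxes are assumed to have positive area.

```python
def eiou_loss(pred, target):
    """EIoU loss between two boxes in (x1, y1, x2, y2) format with positive area."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)

    center_dist_sq = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    width_diff_sq = ((px2 - px1) - (tx2 - tx1)) ** 2
    height_diff_sq = ((py2 - py1) - (ty2 - ty1)) ** 2

    return (
        1.0 - iou
        + center_dist_sq / (enclose_w ** 2 + enclose_h ** 2)
        + width_diff_sq / enclose_w ** 2
        + height_diff_sq / enclose_h ** 2
    )


loss_shifted = eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
loss_perfect = eiou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))
```

Penalising width and height errors separately gives a useful gradient even when two boxes share the same aspect ratio but differ in size.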
Single Stage Unsupervised Visible-infrared Person Re-identification
LOU Ren, HE Renqiang, ZHAO Sanyuan, HAO Xin, ZHOU Yueqi, WANG Xinyuan, LI Fangfang
Computer Science. 2024, 51 (6A): 230600138-7.  doi:10.11896/jsjkx.230600138
Unsupervised visible-infrared multi-modal person re-identification can alleviate the need for extensive manual labeling in intelligent surveillance scenarios. Common approaches use multi-stage models to process the data of different modalities separately. This paper proposes an effective single-stage unsupervised cross-modal person re-identification method, and designs a clustering algorithm based on a confidence factor and a cross-modal feature processing method based on graph embedding to solve the unlabeled problem and the cross-modal problem, respectively. Experimental results show that, compared with existing algorithms, the proposed algorithm achieves an improvement of at least 7% at r=1.
Study on Defect Detection Algorithm of Transmission Line in Complex Background
WU Chunming, WANG Tiaojun
Computer Science. 2024, 51 (6A): 230500178-6.  doi:10.11896/jsjkx.230500178
Regular inspection of transmission lines is of great significance for ensuring the safe and stable operation of power systems. To address the complex backgrounds of transmission line aerial images, large changes in target scale and the many small targets, a transmission line target detection algorithm based on YOLOv5s is proposed. The algorithm adopts a feature refinement module to optimize tiny target features, embeds the SimAM attention module in the network to optimize feature extraction by means of unified weights derived from energy functions, and finally introduces the NWD loss function to weaken the sensitivity of the model to small target position deviation and improve its ability to recognize and detect small targets. Experimental results show that the average detection accuracy of the model for transmission line targets is as high as 98.8%, which is 1.2% higher than that of the benchmark model.
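The NWD (normalized Wasserstein distance) measure referred to in this abstract models each box as a 2-D Gaussian and compares the Gaussians rather than the box overlap. The sketch below follows the commonly published closed form for boxes given as `(cx, cy, w, h)`. The normalising constant `12.8` is an assumed, dataset-dependent value, and the sample boxes are hypothetical.

```python
import math


def nwd(box_a, box_b, constant=12.8):
    """Normalized Wasserstein distance similarity between two (cx, cy, w, h) boxes."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    wasserstein = math.sqrt(
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-wasserstein / constant)


def iou_center_format(box_a, box_b):
    """Plain IoU for the same box format, shown only for comparison."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter = max(0.0, min(ax2, bx2) - max(ax1, bx1)) * max(0.0, min(ay2, by2) - max(ay1, by1))
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union


tiny_box = (10.0, 10.0, 4.0, 4.0)
shifted_box = (11.0, 10.0, 4.0, 4.0)  # one-pixel horizontal shift
similarity = nwd(tiny_box, shifted_box)
overlap = iou_center_format(tiny_box, shifted_box)
nwd_loss = 1.0 - similarity
```

For this 4 x 4 box, a one-pixel shift drops IoU to 0.6 while the NWD similarity stays above 0.9, which illustrates the reduced sensitivity to small position deviations described above.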
Detection Method for Workers’ Illegal Operation Behavior in Packaging Workshop of Cigarette Factory
LIU Heng, LIN Hongyu, WU Tao
Computer Science. 2024, 51 (6A): 230700123-8.  doi:10.11896/jsjkx.230700123
Small object detection has always been a difficult point in the field of object detection. In response to the high mounting position of cameras in cigarette factory packaging rooms, the low accuracy of small object detection, and the low overall detection accuracy, an improved YOLOv8n object detection algorithm, YOLOv8n FIAL, is proposed. Firstly, the C2fg module, which adds a channel rearrangement mechanism, is used to replace the original C2f module to improve feature learning ability. The adaptive channel feature fusion module is used to replace the Concat operation in the Neck section of the YOLOv8n algorithm, making feature fusion more comprehensive. Then, a small target detection layer is added to improve the accuracy of small target detection and reduce the missed detection rate. Finally, the Focal EIOU loss function is used to replace the original CIOU loss function, which balances the training instances by addressing the problem that high-quality anchor boxes with a large overlap with the real box are far fewer than low-quality anchor boxes. Experimental results show that on the self-made dataset of cigarette factory worker violation operations, the proposed YOLOv8n FIAL detection method improves the overall average accuracy by 7.6% compared to the original YOLOv8n method. The average accuracy improvement is largest for the three types of small targets, namely mouth and nose, handheld phone, and clothing collar, with increases of 8.3%, 8%, and 9.6%, respectively. On the public dataset VOC2007, the YOLOv8n FIAL algorithm improves the overall average accuracy by 1.6% compared to the YOLOv8n algorithm.
Iron Ore Image Classification Method Based on Improved Efficientnetv2
LYU Yiming, WANG Jiyang
Computer Science. 2024, 51 (6A): 230600212-6.  doi:10.11896/jsjkx.230600212
With the rapid development of the world today and the construction of a variety of high-rise buildings, the demand for iron and steel is increasing, and the demand for iron ore is also rising year by year. Because the iron ore industry exploits non-renewable resources, it is extremely important to classify iron ore and improve its utilization efficiency. In order to improve the classification speed and accuracy of iron ore, an iron ore image classification method based on a convolutional neural network and an attention mechanism is proposed. It does not need to manually extract features from the input images. Through the deep learning model framework, it makes up for the shortcomings of traditional image processing algorithms, realizes accurate and efficient classification of iron ore, and can better identify various types of iron ore. It has good classification performance and accuracy for the three basic types of iron ore. Experiments show that the accuracy of the proposed method on the dataset reaches 87.46%. Compared with other algorithm models, the model training time is shorter and the performance is better. Using deep learning methods to deploy automated iron ore classification models is of great significance to social development.
Residual Dense Convolutional Autoencoder for High Noise Image Denoising
ZHANG Jie, LU Miaoxin, LI Jiakang, XU Dayong, HUANG Wenxiao, SHI Xiaoping
Computer Science. 2024, 51 (6A): 230400073-7.  doi:10.11896/jsjkx.230400073
In the field of high noise image denoising, traditional convolutional autoencoders face challenges in extracting meaningful deep feature information, resulting in poor image reconstruction quality. To address this issue and improve the reconstruction quality of high noise images, this paper proposes a residual dense convolutional autoencoder network model. The model first uses convolutional operations instead of pooling operations to improve the representation of high noise images. Moreover, a three-stage dense residual network structure is designed for effective image feature mining during the encoding and decoding stages. Finally, an optimised loss function is designed to further improve the quality of the reconstructed images. Experimental results show that the denoising method presented in this paper is capable of reconstructing high quality images from high noise images while preserving more detailed feature information, which confirms the effectiveness of the algorithm in image denoising. The proposed method effectively addresses the challenge of denoising high noise images and has significant practical value.
Complex Environment License Plate Recognition Algorithm Based on Improved Image Enhancement and CNN
YANG Xiuzhang, WU Shuai, REN Tianshu, LIAO Wenjing, XIANG Meiyu, YU Xiaomin, LIU Jianyi, CHEN Dengjian
Computer Science. 2024, 51 (6A): 220200162-7.  doi:10.11896/jsjkx.220200162
Traditional image recognition and deep learning models have difficulty detecting license plates in complex environments. Their scene applicability and accuracy are low, which seriously threatens traffic safety and affects the development of intelligent transportation. This paper proposes a complex environment license plate recognition algorithm based on improved image enhancement and CNN. First, after calculating the average gray value of the target image, we use the ACE algorithm and the dark channel prior dehazing algorithm to perform image enhancement on the license plate dataset in complex environments. Then, a license plate area localization algorithm that combines the key features of color and the peak is proposed, effectively locating the license plate area in eight core steps in a complex environment. Finally, a five-layer convolutional neural network model is constructed to recognize the license plate characters. Experimental results show that the proposed algorithm can effectively identify the license plates of vehicles in complex environments. The precision of the algorithm’s license plate area localization in complex environments is 86.04%, the recall is 82.60%, and the F1-score is 84.29%. The F1-score of the proposed algorithm is 47.29% higher than that of the traditional image processing algorithm, 24.73% higher than the SSD algorithm, 26.37% higher than the YOLO algorithm and 17.15% higher than the YOLOv3 algorithm. At the same time, the time complexity of the proposed method is low, making it a lightweight license plate recognition method. It can also eliminate noise and realize license plate character recognition. Therefore, it has specific application prospects and practical value and provides a theoretical basis for intelligent transportation research.
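The reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below performs that check; the function name `f1_score` is an illustrative choice.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, in percent.
f1 = f1_score(86.04, 82.60)
```

The computed value agrees with the quoted 84.29% to within rounding of the inputs, so the three figures are mutually consistent.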
Conveyor Belt Defect Detection Network Combining Attention Mechanism with Line Laser Assistance
SONG Zhen, WANG Jiqiang, HOU Moyu, ZHAO Lin
Computer Science. 2024, 51 (6A): 230800115-6.  doi:10.11896/jsjkx.230800115
To address the problems of a wide variety of conveyor belt defects,a small proportion of defect feature pixels,and the low detection accuracy of traditional algorithms,random affine transformation is used to expand the sample dataset.The influence of the correlation between each channel and its contribution value on the model feature extraction is analyzed,and a channel correlation weighted attention mechanism is proposed.The correlation degree and contribution weight of each channel are calculated by correlation convolution and full connection,and the proportion of corresponding channel information is adjusted to improve the detection accuracy of the model.The influence of upsampling and convolution blocks on the size of the output feature map is analyzed.The feature convolution block and upsampling structure of the original feature pyramid are improved to enhance the feature extraction and defect detection ability of the algorithm for small targets.Finally,the test is conducted on the conveyor belt defect dataset.The results show that the improved algorithm model can effectively identify typical defect features such as foreign body insertion,breakage,and tearing of the conveyor belt.The recognition precision can reach 99.7%,the recall rate is increased to 99.5%,and the mean average precision is 99.5%.
Method for Lung Nodule Detection on CT Images Using Improved YOLOv5
WU Chunming, LIU Yali
Computer Science. 2024, 51 (6A): 230500019-6.  doi:10.11896/jsjkx.230500019
To address the problem of poor detection results of lung nodules in CT images by the YOLOv5 algorithm,an improved YOLOv5-based lung nodule detection method is proposed.The feature pyramid of the Neck part of the YOLOv5 network is replaced with a weighted bidirectional feature pyramid network.An efficient channel attention mechanism and a coordinate attention mechanism are added to the Backbone part of the YOLOv5 network.Experiments are conducted on the LIDC-IDRI dataset and the results show that the average detection accuracy is up to 80.2%,and the recall is up to 90.75%,so this method can effectively detect lung nodules.Compared with the YOLOv5 algorithm,the improved algorithm improves mAP by 7.7% and recall by 5.5%.
Clothing Image Segmentation Method Based on Deeplabv3+ Fused with Attention Mechanism
XIAO Yahui, ZHANG Zili, HU Xinrong, PENG Tao, ZHANG Jun
Computer Science. 2024, 51 (6A): 230900153-7.  doi:10.11896/jsjkx.230900153
Aiming at the problems of rough edge segmentation and low segmentation accuracy caused by color,texture,background and multi-object occlusion in clothing image segmentation,an image semantic segmentation method(FFDNet) based on Deeplabv3+ with attention mechanism is proposed.Firstly,the backbone network of the model uses the ResNet101 network.The feature-enhanced attention module(FEAM) is added at the end of it.The feature map is weighted along the channel and spatial dimensions to mine and enhance the feature information and optimize the segmentation edge to improve network clarity.Secondly,a feature align module(FAM) is introduced as a novel upsampling method to address the problem of segmentation errors and low efficiency caused by misalignment between features during the fusion of different scale features,so as to improve the accuracy and robustness of clothing image segmentation.Finally,the mean intersection over union of the proposed method reaches 55.2% and 79.4% on Deepfashion2 and PASCAL VOC2012,respectively.In terms of parameter size,the model only increases by 0.61MB compared to the original model on Deepfashion2.The segmentation performance of FFDNet is superior to the existing state-of-the-art network models,and it can effectively capture local image detail information and reduce pixel classification errors.
Detection of Pitting Defects on the Surface of Ball Screw Drive Based on Improved Deeplabv3+ Algorithm
LANG Lang, CHEN Xiaoqin, LIU Sha, ZHOU Qiang
Computer Science. 2024, 51 (6A): 240200058-6.  doi:10.11896/jsjkx.240200058
Aiming at the problems of complex background environments,small pitting defect targets,and difficulty in detection on the surface of ball screw drives,an improved Deeplabv3+ algorithm for segmenting surface defects of ball screw drives is proposed.This algorithm adopts Re2Net-50 to replace the backbone network of Deeplabv3+,significantly enhancing the ability to recognize small-sized defect targets.Additionally,by integrating feature pyramid networks(FPN) into the backbone network,the algorithm effectively extracts multi-scale information,thereby improving the precise localization of defect targets.Finally,the coordinate attention mechanism is introduced after the ASPP module of the Deeplabv3+ network,enhancing the model’s focus on spatial dimensions within the image and effectively capturing long-range spatial dependencies.Experimental results demonstrate that,compared to the original Deeplabv3+,the proposed algorithm shows a 4.38% improvement in the mean intersection over union(MIoU) metric,a 5.52% increase in accuracy,and a 2.74% rise in F1-score.Furthermore,when compared with other classic semantic segmentation algorithms,the proposed algorithm also exhibits certain superiority.
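For readers unfamiliar with the mean intersection over union(MIoU) metric quoted in this and the preceding segmentation abstract,the following is a minimal sketch of how it is commonly computed from a pixel-level confusion matrix.The function name `mean_iou` and the toy matrix are illustrative assumptions and are not taken from either paper.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true otherwise
        denom = tp + fp + fn
        if denom > 0:                                     # skip classes absent from both
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example (background vs. defect), values invented for illustration.
toy_confusion = [
    [50, 10],
    [5, 35],
]
print(f"mIoU on the toy matrix: {mean_iou(toy_confusion):.4f}")
```

Per-class IoU is the ratio of correctly labeled pixels to the union of predicted and true pixels for that class,so small defect classes weigh as much as the background in the mean.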
Facial Expression Recognition Integrating 3D Facial Dynamic Information and Optical Flow Information
ZHANG Huazhong, PAN Yuekai, TU Xiaoguang, LIU Jianhua, XU Luopeng, ZHOU Chao
Computer Science. 2024, 51 (6A): 230700210-7.  doi:10.11896/jsjkx.230700210
Facial expression recognition has achieved excellent results in static images,but when these methods are applied to videos or image sequences,their accuracy and robustness are often affected.Traditional methods cannot usually recognize facial expressions based on spatial information and optical flow information.However,this auxiliary recognition information is all two-dimensional,without considering that facial expression changes are a three-dimensional process.In order to fully mine the deep semantic information of facial expression recognition,this paper proposes a fusion expression recognition method based on the combination of 3D facial dynamic information and optical flow information.This method constructs a multi-stream convolutional neural network based on facial depth images,optical flow images,and RGB images,and integrates information from the three modalities for facial expression recognition.The proposed method has been fully validated on the CAER and RAVDESS datasets,and experimental results show that it outperforms current mainstream methods in facial expression recognition performance,which proves its effectiveness.
Improved YOLOV7 for Fall Detection
ZHAO Junjie, ZHOU Xiaojing, LI Jiaxing
Computer Science. 2024, 51 (6A): 230800039-6.  doi:10.11896/jsjkx.230800039
With the advent of the aging population,it is increasingly important for the elderly to be detected and treated in time after falling.In order to improve the detection accuracy and speed of the original YOLOV7,a series of improvements are made to YOLOV7,and a new YOLOV7 structure,namely the YOLOV7-CMJ structure,is proposed.Firstly,the collected pictures are processed,with some augmented by rotation,brightness adjustment and other preprocessing,and calibrated to obtain sample datasets.Secondly,the CBAM attention mechanism is introduced to enhance channel attention and spatial attention,thereby improving the accuracy of the model.Finally,the original PANet feature fusion in YOLOV7 is changed to MJPANet,that is,multi-beating sign fusion structure,and the previous Concat is replaced by weighting,so as to improve the YOLOV7-CMJ structure.Comparison with the original YOLOV7 shows that the accuracy of the improved algorithm is increased by 7.4%,the recall rate is increased by 7.1%,and the average accuracy is increased by 7.1%,which proves the effectiveness of the improved algorithm and better meets the requirements of fall detection.
Big Data & Data Science
CTGANBoost:Credit Fraud Detection Based on CTGAN and Boosting
ZHUO Peiyan, ZHANG Yaona, LIU Wei, LIU Zijin, SONG You
Computer Science. 2024, 51 (6A): 230600199-7.  doi:10.11896/jsjkx.230600199
In the financial industry,credit fraud detection is an important task,which can reduce a lot of economic losses for banks and consumer institutions.However,there are problems of class imbalance and overlapping features of positive and negative samples in credit data,which lead to low sensitivity of minority class recognition and low data discrimination.To address these problems,a CTGANBoost method is proposed for credit fraud detection.First,in each Boosting iteration of AdaBoost,the conditional tabular generative adversarial network(CTGAN) method based on class label information constraint is introduced to learn feature distribution for minority class data augmentation.Secondly,based on the enhanced dataset synthesized by CTGAN,a weight normalization method is designed to ensure that the distribution characteristics and relative weights of the original dataset are maintained during the sample weighting process.Experimental results on three open source datasets show that CTGANBoost outperforms other mainstream credit fraud detection methods,with AUC values increasing by 0.5%~2.0% and F1 values increasing by 0.6%~1.8%,which verifies the effectiveness and generalization ability of the CTGANBoost method.
Study on Client Selection Strategy and Dataset Partition in Federated Learning Based on Edge TB
ZHOU Tianyang, YANG Lei
Computer Science. 2024, 51 (6A): 230800046-6.  doi:10.11896/jsjkx.230800046
Federated learning is one of the applications of distributed machine learning in reality.In view of the heterogeneity in federated learning,based on the FedProx algorithm,this paper proposes a client selection strategy that preferentially selects the clients with large proximal terms.The effect is better than the common client selection strategy that selects the clients with large local loss values,and it can effectively improve the rate of convergence of the FedProx algorithm under heterogeneous data and systems,and improve the accuracy within limited aggregation times.According to the hypothesis of heterogeneous data in federated learning,a heterogeneous data partition process is designed,and a heterogeneous federated dataset based on a real image dataset is obtained as the experimental dataset.Using the open-source distributed machine learning framework Edge-TB as the experimental testing platform and the heterogeneously partitioned Cifar10 as the dataset,the experiment shows that,using the new client selection strategy,the accuracy of the improved FedProx algorithm improves by 14.96%,and the communication overhead is reduced by 6.3% compared to the original algorithm in a limited number of aggregation rounds.Compared with the SCAFFOLD algorithm,the accuracy is improved by 3.6%,communication overhead is reduced by 51.7%,and training time is reduced by 15.4%.
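The selection idea described above can be sketched as follows.This is an illustration under stated assumptions rather than the authors' implementation:it takes the proximal term to be the usual FedProx penalty (mu/2)*||w_local - w_global||^2,scores each client by that value,and keeps the k largest.The names `proximal_term` and `select_clients`,the flat weight lists,and the toy numbers are all hypothetical.

```python
def proximal_term(local_w, global_w, mu):
    """FedProx-style penalty (mu / 2) * ||w_local - w_global||^2 on flat weight lists."""
    return 0.5 * mu * sum((l - g) ** 2 for l, g in zip(local_w, global_w))


def select_clients(local_models, global_w, mu, k):
    """Return the ids of the k clients whose proximal term is largest.

    local_models maps a client id to that client's flattened local weights.
    """
    scores = {
        cid: proximal_term(weights, global_w, mu)
        for cid, weights in local_models.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy example: client "b" has drifted furthest from the global model.
global_weights = [0.0, 0.0]
clients = {
    "a": [1.0, 0.0],
    "b": [3.0, 4.0],
    "c": [0.1, 0.1],
}
print(select_clients(clients, global_weights, mu=0.1, k=2))
```

Ranking by the proximal term favors clients whose local models have moved furthest from the global model,which is one plausible reading of "large proximal terms";the paper may define or normalize the score differently.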
Study on Industrial Defect Augmentation Data Filtering Based on OOD Scores
YIN Xudong, CHEN Junyang, ZHOU Bo
Computer Science. 2024, 51 (6A): 230700111-7.  doi:10.11896/jsjkx.230700111
In deep learning-based industrial defect detection,data augmentation plays a crucial role in mitigating the scarcity of defect data.However,the effective selection of augmented data from a vast pool of candidates remains an unexplored area,hampering the performance enhancement of industrial detection models.To address this issue,this study focuses on the research of industrial defect augmentation data filtering based on out-of-distribution(OOD) scores.The proposed approach involves the generation of industrial enhancement data using the pix2pix network.Subsequently,OOD scores are computed using a deep ensemble-based scoring method,which facilitates the grouping of augmented data based on their OOD scores.Furthermore,the distribution of the augmented data is analyzed through dimensionality reduction and projection views.Finally,defect detection of the grouped augmented data is performed using object detection algorithms,while investigating the impact of the out-of-distribution degree on the quality of the augmented data through the accuracy gain of the object detection model.Experimental results demonstrate a substantial difference in the distribution between industrial defect augmented data with higher OOD scores and the training data.Incorporating this subset of augmented data for training data expansion enhances the generalization of the model and significantly improves the detection accuracy of the object detection algorithm.
Cancer Subtype Prediction Based on Similar Network Fusion Algorithm
ZHANG Xiaoxi, LI Dongxi
Computer Science. 2024, 51 (6A): 230500006-7.  doi:10.11896/jsjkx.230500006
Mining the interaction relationships between genes from gene expression data and constructing gene regulatory networks is one of the important research topics in bioinformatics.However,the currently popular neural networks only consider the interaction and association between genes in their architecture,and do not consider the interaction and association between patients.Therefore,a cancer subtype prediction model based on the fusion algorithm of weighted gene similarity network and sample similarity network,namely WGCSS,is proposed in this paper.In this method,the fusion of feature space and sample space information is realized,the interaction between genes and samples is considered,and the graph convolutional network is used for prediction.Aggregating information in two spaces will lead to a serious oversmoothing problem.Therefore,a residual layer is introduced in the model to alleviate the oversmoothing problem.This method can make the prediction of cancer subtypes more accurate by aggregating the data information in the two spaces.To verify the generalization performance of the method,datasets of invasive breast carcinoma(BRCA),glioblastoma multiforme(GBM),and LUNG(LUNG) are used for analysis,and the resulting high classification accuracy demonstrates the superiority of the method.Survival analysis is also performed on the three datasets,showing that the survival curves of the predicted cancer subtypes differ significantly in all three cancer datasets.
Attention-based Multi-scale Distillation Anomaly Detection
QIAO Hong, XING Hongjie
Computer Science. 2024, 51 (6A): 230300223-11.  doi:10.11896/jsjkx.230300223
In the anomaly detection method based on knowledge distillation,the teacher network is much larger than the student network,so that the obtained feature representations have different visual fields corresponding to the image at the same position.In order to solve this problem,the structure of the student network and the teacher network can be the same.However,in the testing phase,identical student and teacher networks will lead to too small a difference in their feature representations,which will affect the performance of anomaly detection.In order to solve this problem,ECA based multi-scale knowledge distillation anomaly detection(ECA-MSKDAD) is proposed,and a relative distance loss function is proposed based on data enhancement operation.The pre-trained network is used as the teacher network,and a network with the same structure as the teacher network is used as the student network.In the training stage,the data enhancement operation is adopted for the training samples to expand the scale of the training set,and the efficient channel attention(ECA) module is introduced into the student network to increase the difference between the teacher network and the student network,increase the reconstruction error of the abnormal data and improve the detection performance of the model.In addition,the relative distance loss function is used to transfer the relationship between data from the teacher network to the student network,and the network parameters of the student network are optimized.Experiments on MVTec AD show that compared with nine related methods,the proposed method achieves better performance in anomaly detection and anomaly localization.
Imbalanced Data Oversampling Method Based on Variance Transfer
ZHENG Yifan, WANG Maoning
Computer Science. 2024, 51 (6A): 230400198-6.  doi:10.11896/jsjkx.230400198
Resampling is an important method to solve imbalanced data classification problem.However,when the size of data set is very small,undersampling will lose important information of the data set,so oversampling is the research focus of imbalanced data classification.Although the existing oversampling methods partially solve the problem of imbalance between classes,they essentially do not introduce additional information to minority class,and there is still a risk of overfitting.To solve these problems,VTO,an oversampling method based on variance migration of the majority class,is proposed in this paper.In this method,a shift vector is extracted from majority class,and the feature weight matrix of the minority class and the majority class is used for adjustment.Furthermore,the shift vectors filtered by the confidence conditions are superimposed to the center of the minority class,so as to introduce the majority class variance in the generation process of new minority class samples,then enrich the minority class feature space.In order to verify the effectiveness of the proposed algorithm,decision tree is used as classification model to train on 6 KEEL data sets.Compared with SMOTEENN and other over-sampling methods,with F-score and PR-AUC values as evaluation indexes,the results show that VTO is more advantageous in dealing with imbalanced data classification.
K-step Reachability Query Algorithm for Large Graphs
TONG Zhengnan, BU Tianming
Computer Science. 2024, 51 (6A): 230500031-10.  doi:10.11896/jsjkx.230500031
The k-step reachability query is used to answer whether there exists a path of length not exceeding k between two points in a given directed acyclic graph(DAG).To address the problems of large index size and low query processing efficiency of existing methods,this paper proposes a multiplicative index built on a large graph based on tree cover to improve index query efficiency,and combines the GRAIL algorithm and the improved FELINE algorithm for pruning the point pairs of inherently unreachable queries.The paper conducts experimental tests based on 19 real datasets and compares with existing algorithms in three metrics:index size,index time,and query time.The experimental results verify the efficiency of the proposed algorithm.
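To make the query definition concrete,here is a minimal index-free baseline that answers a k-step reachability query with a depth-bounded breadth-first search.It is not the tree-cover index or the GRAIL/FELINE pruning described in the abstract;the function name `k_step_reachable` and the adjacency-dictionary representation are assumptions made for illustration.

```python
from collections import deque


def k_step_reachable(adj, source, target, k):
    """Return True if target is reachable from source via a path of at most k edges.

    adj maps each vertex to an iterable of its out-neighbors.
    """
    if source == target:
        return True
    visited = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # edge budget exhausted along this branch
        for nxt in adj.get(node, ()):
            if nxt == target:
                return True
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, dist + 1))
    return False


# Toy DAG: a -> b -> c -> d, plus a shortcut a -> c.
dag = {"a": ["b", "c"], "b": ["c"], "c": ["d"], "d": []}
print(k_step_reachable(dag, "a", "d", 2))
print(k_step_reachable(dag, "a", "d", 1))
```

Each query here costs time proportional to the explored subgraph,which is the cost that index-based methods aim to avoid on large graphs.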
User Interest Recognition Method Incorporating Category Labels and Topic Information
KANG Zhiyong, LI Bicheng, LIN Huang
Computer Science. 2024, 51 (6A): 230500169-8.  doi:10.11896/jsjkx.230500169
The discovery of social media user interest is of great significance in information overload alleviation,personalized recommendation,and positive guidance of information dissemination.Existing research of interest recognition fails to consider the help of topic information and corresponding category labels information for model learning text features at the same time.Therefore,a user interest recognition method incorporating category labels and topic information is proposed.Firstly,semantic features of text and label sequences are extracted separately by using the BERT pre-trained model,BiLSTM model,and multi-head self-attention mechanism.Then,a label attention mechanism is introduced to make the model pay more attention to the words related to the text’s corresponding category label.Secondly,text topic features are obtained by using the LDA topic model and Word2Vec model.Subsequently,a gating mechanism is designed for feature fusion to enable the model to adaptively merge multiple features,thereby realizing text interest classification.Finally,the number of texts published by users in each interest category is counted,and the interest category with the highest count is determined as users’ interest recognition results.To verify the effectiveness of the proposed method,a Weibo users’ interest recognition dataset is constructed.Experimental results show that the model achieves optimal performance in Weibo text classification and user interest recognition tasks.
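The final aggregation step described above,counting a user's texts per interest category and taking the most frequent one,can be sketched as below.The per-text classifier is outside the scope of this sketch;the function name `user_interest` and the example labels are hypothetical.

```python
from collections import Counter


def user_interest(predicted_labels):
    """Return the most frequent interest category among a user's classified texts.

    predicted_labels is a list with one predicted category per text.
    Returns None when the user has no texts.
    """
    if not predicted_labels:
        return None
    counts = Counter(predicted_labels)
    return counts.most_common(1)[0][0]


# Hypothetical per-text predictions for one user.
labels = ["sports", "technology", "sports", "finance", "sports", "technology"]
print(user_interest(labels))
```

How ties between categories are broken is not stated in the abstract;this sketch simply returns the tied category that appears first in the input.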
Diversified Recommendation Based on Light Graph Convolution Networks and Implicit Feedback Enhancement
HUANG Chungan, WANG Guiping, WU Bo, BAI Xin
Computer Science. 2024, 51 (6A): 230900038-11.  doi:10.11896/jsjkx.230900038
In recent years,researchers have been striving to improve the accuracy of recommendation systems while ignoring the critical impact of diversity on user satisfaction.Most current diversified recommendation algorithms impose diversity constraints after the accuracy candidate list generated by traditional post-processing algorithms.However,this decoupled design consistently results in a sub-optimal system.Meanwhile,although the effectiveness of recommendation algorithms using graph convolution networks(GCN) in improving recommendation accuracy has been demonstrated,the applicability and diversity design for recommendation remain neglected.In addition,recommendation algorithms employing a single explicit user feedback of purchasing inevitably fall into “recommendation overload”.Therefore,an end-to-end diversified light graph convolution networks recommendation(DLGCRec) is proposed to overcome these drawbacks.Firstly,GCN is simplified to light graph convolution networks(LGCN) to be suitable for recommendation,and LGCN is utilized to push diversity upstream to the recommendation process of accuracy match.Then,in the sampling phase of LGCN,diversity-boosted negative sampling that introduces user implicit feedback is utilized to explore the user’s diversified preferences.Finally,a multi-layer feature fusion strategy is utilized to capture the complete feature embedding of the nodes to enhance the recommendation performance.Experimental results on real datasets validate the effectiveness of DLGCRec in recommendation and in enhancing diversity.Further ablation studies confirm that DLGCRec effectively mitigates the accuracy-diversity dilemma.
Hierarchical Traffic Flow Prediction Model Based on Graph Autoencoder and GRU Network
ZHAO Ziqi, YANG Bin, ZHANG Yuanguang
Computer Science. 2024, 51 (6A): 230400148-6.  doi:10.11896/jsjkx.230400148
Accurate traffic flow prediction information not only provides traffic administrator with a strong foundation for traffic decisions,but also eases congestion.In traffic flow forecasting tasks,obtaining valid spatiotemporal characteristics of the traffic flow is a prerequisite to ensure the effectiveness of the forecast.Most of the existing methods use data from future moments for supervised learning,and the extracted features have limitations.To address the problem that existing prediction models cannot fully exploit the spatiotemporal characteristics of traffic flows,this paper proposes a hierarchical traffic prediction model based on an improved graph autoencoder and gated recurrent unit.The graph attention autoencoder is first used to deeply explore the spatial characteristics of the traffic flow in an unsupervised manner,and then the gated recurrent unit is used to extract temporal features.The hierarchical structure uses separate training for learning spatio-temporal dependencies,aiming to capture the naturally existing spatial topological features of the road network and make it compatible with traffic flow prediction tasks at different time steps.Extensive experiments demonstrate that the proposed GAE-GRU model achieves excellent performance in traffic prediction tasks on different datasets,with MAE,RMSE and MAPE outperforming the baseline model.
Study on Three-level Short Video User Portrait Based on Improved Topic Model Method
HUANG Yumin, ZHAO Chanchan
Computer Science. 2024, 51 (6A): 230800093-7.  doi:10.11896/jsjkx.230800093
Aiming at the problem of how to quickly extract accurate user interests from massive short video data,user data and interactive data,a three-level label user portrait construction method based on topic model is proposed.Based on the topic construction method,the video topic words obtained by the fused LDA and GSDMM topic models are used as user interest expression vectors.Firstly,an LDA filter is built to eliminate the topic-independent text information by comparing the threshold,so as to reduce the scale of the text and reduce the influence of non-main corpus on the generation of interest expression vector.Then,the construction method of the feature word weight matrix combining semantic information and context information is proposed.The Bi-GRU neural network is used to calculate the context feature of the word vector as the context feature,and the word frequency weight calculated by the TF-IDF algorithm is used as the semantic feature.Context and semantic features are combined to expand the meaning of feature words.Finally,the GSDMM model with interest weight distribution is used to learn the feature vector weight matrix,and the user interest tag generation and the interest weight correction under the influence of different user preferences are realized.Experiments show that this method can represent user portraits more completely and accurately,which is better than the single topic construction method,and performs well in clustering effect.By constructing a complete user portrait,the user’s pain points can be accurately grasped,so as to provide services for subsequent personalized recommendation.
Ontology-driven Study on Information Structuring of Aeronautical Information Tables
LAI Xin, LI Sining, LIANG Changsheng, ZHANG Hengyan
Computer Science. 2024, 51 (6A): 230800150-7.  doi:10.11896/jsjkx.230800150
The aeronautical information publication(AIP) is the main carrier recommended by ICAO to present aeronautical information of all countries,in which a large amount of aeronautical data and aeronautical operation restriction information exists in the form of table information.In order to achieve intelligent querying of AIP and to facilitate the extraction and utilization of static data within it,it is necessary to perform feature extraction and structural processing on the tabular information within AIP.In this paper,an ontology-driven structured extraction method for aeronautical information tabular data is proposed,taking tabular data in AIP as the research object.Firstly,the ontology framework of aeronautical information is constructed to realize a unified and standardized description of domain knowledge.Secondly,the layout structure of form documents is studied and preprocessed using Document AI,and the feature entity extraction is verified and analyzed using random forest algorithm and conditional random field model(CRF).Experimental results show that the proposed method can effectively extract the feature entities in AIP,and provide reference for the in-depth mining of static data in the field of aeronautical information.
RM-RT2NI:A Recommendation Model with Review Timeliness and Trusted Neighbor Influence
HAN Zhigeng, ZHOU Ting, CHEN Geng, FU Chunshuo, CHEN Jian
Computer Science. 2024, 51 (6A): 230800160-7.  doi:10.11896/jsjkx.230800160
While recommendation models based on matrix factorization can handle high-dimensional rating data,they are prone to challenges posed by data sparsity in ratings.Recommendation models that incorporate both ratings and reviews alleviate the sparsity issue by incorporating latent user preferences and item attribute information embedded in reviews.However,these models often neglect the review timeliness and the trusted neighbor influence during feature extraction,resulting in limited acquisition of comprehensive user and item characteristics.In order to enhance accuracy further,a novel recommendation model named RM-RT2NI is proposed,which integrates the review timeliness and the trusted neighbor influence.Built upon the rating matrix,this model employs matrix factorization to extract shallow features representing user preferences and item attributes.It employs cloud modeling,a refined user similarity assessment model,and a newly constructed credibility assessment model to capture the trusted neighbor influence.Leveraging the textual content of reviews,BERT is utilized to obtain latent representations of individual reviews.Bi-directional GRU is employed to capture inter-review relationships,while an attention mechanism incorporating timeliness is introduced to evaluate the timeliness contribution of each review,thus deriving deep features for users and items.Subsequently,the shallow and deep user features,along with the credibility-enhanced neighboring influence features,are fused to form comprehensive user representations.Similarly,shallow and deep item features are merged with this fused representation to generate comprehensive item representations.These representations are then fed into a fully connected neural network to predict user-item ratings.Experimental evaluation is conducted on five publicly available datasets.The results demonstrate that,in comparison to seven baseline models,RM-RT2NI exhibits superior rating prediction accuracy,yielding an average RMSE reduction of 3.0657%.
Study on Communication Simulation of Online Hot Search Topics Based on SEIR Model
YIN Yanyan, WANG Keke, TIAN Jiaojiao, LI Mo, XUE Yaxin, LU Chunyu, ZHAO Yunpeng
Computer Science. 2024, 51 (6A): 230500107-6.  doi:10.11896/jsjkx.230500107
Online hot search topics have the phenomenon of dissemination and diffusion.The current research on online hot search topics mainly focuses on the evaluation of communication effects,prediction of communication trends,social impact evaluation and public opinion guidance,while the research on online hot search topics fails to reveal the impact of communication dynamics parameters on the communication process.In this paper,the SEIR model is used to construct the dynamic model of the online hot search topic propagation,and the influence of the network average degree,the distrust probability,the immediate transmission probability after contact,the infection rate,the cure rate and the recurrence rate on the model are analyzed.
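As background for the model named above,the following is a minimal sketch of the textbook SEIR compartment model integrated with a forward Euler step.It is not the paper's propagation model:the network average degree,distrust probability and recurrence rate discussed in the abstract are not represented,and the function name `simulate_seir` and all parameter values are illustrative assumptions.

```python
def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, days, dt=0.1):
    """Integrate the classic SEIR equations with forward Euler steps.

    beta:  transmission (infection) rate
    sigma: rate at which exposed users become active spreaders
    gamma: recovery ("cure") rate
    Returns a list of (S, E, I, R) tuples, one per time step.
    """
    n = s0 + e0 + i0 + r0
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    history = [(s, e, i, r)]
    steps = int(round(days / dt))
    for _ in range(steps):
        new_exposed = beta * s * i / n * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical scenario: 1000 users, 10 of them initially spreading the topic.
trajectory = simulate_seir(990, 0, 10, 0, beta=0.5, sigma=0.2, gamma=0.1, days=60)
peak_spreaders = max(state[2] for state in trajectory)
print(f"Peak number of active spreaders: {peak_spreaders:.1f}")
```

Varying `beta`,`sigma` and `gamma` in such a simulation is the simplest way to see how each dynamics parameter shifts the height and timing of the propagation peak.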
Improved K-means Photovoltaic Energy Data Cleaning Method Based on Autoencoder
PENG Bo, LI Yaodong, GONG Xianfu
Computer Science. 2024, 51 (6A): 230700070-5.  doi:10.11896/jsjkx.230700070
The development of smart grids has brought about a massive amount of energy data, and data quality is the foundation for tasks such as data value mining. However, during the collection and transmission of large-scale photovoltaic energy data from multiple sources, abnormal data are inevitable, so data cleaning is required. Currently, traditional statistical machine learning-based data cleaning models have certain limitations. This paper proposes an improved K-means clustering model based on the Transformer autoencoder structure for energy big data cleaning. It adaptively determines the number of clusters using the elbow method and utilizes autoencoder networks to compress and reconstruct data within clusters, thereby detecting and recovering abnormal data. Additionally, the proposed model employs the multi-head attention mechanism of Transformer to learn the relevant features among the data, enhancing the screening capability for abnormal data. Experimental results on a publicly available photovoltaic power generation dataset demonstrate that, compared to other methods, the proposed model achieves better performance in detecting abnormal data, with a screening accuracy of over 96%. Moreover, it is capable of recovering abnormal data to a certain extent, providing effective support for the application of energy big data.
Network & Communication
Research Progress of Anomaly Detection in IaaS Cloud Operation Driven by Deep Learning
SI Jia, LIANG Jianfeng, XIE Shuo, DENG Yingjun
Computer Science. 2024, 51 (6A): 230400016-8.  doi:10.11896/jsjkx.230400016
Anomaly detection is an important task in the operation and maintenance of IaaS cloud systems. Through early warning and intervention, serious accidents such as system crashes can be effectively avoided. However, compared to traditional data centers, IaaS cloud systems have large-scale computing nodes, complex node topology, a large volume of monitoring data, and a lack of data labels, which bring new challenges for IaaS cloud anomaly detection. Starting from the technical framework of deep learning, this paper analyzes the difficulties faced by anomaly detection, and summarizes common anomaly detection algorithms and related technologies in IaaS cloud systems. This paper investigates deep learning driven solutions for two typical problems: node anomalies and system anomalies. For node anomalies, detection algorithms driven by temporal data are studied for time-dependent data. For system anomalies, detection algorithms driven by graph data in network topology modeling are investigated. Finally, new issues and challenges in data-driven anomaly detection in IaaS cloud systems are put forward.
Deep Reinforcement Learning Based Thermal Awareness Energy Consumption Optimization Method for Data Centers
LI Danyang, WU Liangji, LIU Hui, JIANG Jingqing
Computer Science. 2024, 51 (6A): 230500109-8.  doi:10.11896/jsjkx.230500109
With the continuous expansion of the scale of data centers, the problems of high energy consumption, high operating costs and environmental pollution are becoming more and more serious, which seriously affects the sustainability of data centers. Most data center energy consumption optimization methods concentrate tasks on as few servers as possible, so as to reduce computing energy consumption. However, this often leads to data center hotspots and increases cooling energy consumption. To solve this problem, this paper first models the data center, and formulates the total energy consumption optimization problem of the data center as a task scheduling problem with the constraint that no data center hotspots are generated. This paper proposes a task scheduling method based on deep reinforcement learning for data centers, and uses reward shaping to optimize the method so as to reduce the total energy consumption of data centers without generating hotspots. Finally, experiments are carried out with a simulation environment and real data center load trace data. The simulation results show that the proposed method reduces the total energy consumption of the data center more than other existing scheduling methods, by up to 25.5%. In addition, the proposed optimization method generates no hotspots, which further proves its superiority.
Study on Key Platform of Edge Computing Server Based on ARM Architecture
LIU Dong, WANG Ruijin, ZHAO Yanjun, MA Chaoyang, YUAN Haonan
Computer Science. 2024, 51 (6A): 230600119-8.  doi:10.11896/jsjkx.230600119
Existing edge computing servers cannot meet the requirements of security, stability, reliability and strong universality, and cannot support the computing tasks of edge computing scenarios. To address these problems, this paper designs and implements, for the first time, an iBMC software and hardware architecture suitable for edge computing scenarios, based on the Changhong Tiangong edge computing server TG225B1, which uses the ARM-based Kunpeng 920 chip. The architecture adopts a domestic ARM hardware base, supports edge gateway hardware management, and realizes an adaptive interaction framework for multiple industrial control protocols. After testing its compliance, function, performance, ease of use, maintainability, reliability and compatibility, it is concluded that the iBMC architecture can better meet the requirements of edge computing servers.
Delay and Energy-aware Task Offloading Approach for Orbit Edge Computing
WANG Zhongxiao, PENG Qinglan, SUN Ruoxiao, XU Xifeng, ZHENG Wanbo, XIA Yunni
Computer Science. 2024, 51 (6A): 240100188-9.  doi:10.11896/jsjkx.240100188
The rapid growth of smart devices around the world has created a huge demand for computing resources to sink to the edge, giving rise to the edge computing paradigm. At the same time, the demand for computing power in remote areas where computing resources are scarce has driven the concept of orbit edge computing (OEC). In the OEC scenario, users in remote areas can offload computing tasks to edge servers deployed on LEO satellites for processing and execution through the communication link between ground station and satellite and the communication links between satellites in the constellation, so that satellite computing resources provide low-latency and high-reliability services for users in remote areas. However, the satellite computing power in the OEC scenario is constrained by the limited load and solar energy conversion efficiency, and the available time slots are limited by the highly dynamic satellite-ground connection caused by LEO satellites circling the earth, so OEC faces the challenges of scarce computational resources and limited available communication time. Therefore, excellent task offloading decision algorithms are needed to ensure the efficient operation of OEC systems. However, most current task offloading approaches for the OEC scenario fail to take both the delay cost and the energy cost of processing tasks into account, and traditional approaches also lack consideration of task diversity. To address the above problems, an adaptive large neighborhood search-based task offloading method for orbit edge computing, OEC-ALNS, is proposed, which takes the task processing cost weighted by task type as the optimization objective, and consists of destruction and repair operators based on the minimization of latency. Experimental results based on a Walker Delta LEO satellite constellation and real computing task data show that, compared with the traditional OEC-TA (OEC task allocation) approach, the proposed OEC-ALNS approach achieves up to a 42.22% reduction in the weighted task processing cost and up to a 42.46% reduction in the average latency cost in OEC scenarios with heterogeneous multiple task sets.
WiCare: Non-contact Fall Monitoring Model for Elderly in Toilet
DUAN Pengsong, DIAO Xianguang, ZHANG Dalong, CAO Yangjie, LIU Guangyi, KONG Jinsheng
Computer Science. 2024, 51 (6A): 230700044-8.  doi:10.11896/jsjkx.230700044
Falls of elderly people in the bathroom pose a risk of serious harm because rescue is often not timely. Therefore, efficient and rapid monitoring of falls in the toilet is of great significance. Current fall monitoring methods based on Wi-Fi perception are greatly affected by noise and suffer from insufficient feature extraction and limited monitoring accuracy. To address these issues, a non-contact toilet fall monitoring model, WiCare, is proposed, which integrates a convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM), and a self-attention mechanism. Firstly, the amplitude is extracted from the original CSI data as the basic data. Secondly, multi-level discrete wavelet transform and soft threshold processing are used to reduce the noise in the perception data. Then, the perception data is reconstructed in multiple dimensions to characterize fall behavior more accurately. Finally, WiCare is used to extract effective features from the perception data, thereby monitoring fall behavior in the toilet. Experimental results show that the accuracy of WiCare in monitoring fall behavior in the home bathroom environment is 99.41%. Compared with other similar models, WiCare has high recognition accuracy, low model complexity, and stronger generalization ability.
Reliability-aware VNF Instance Placement in Edge Computing
LIANG Jingyu, MA Bowen, HUANG Jiwei
Computer Science. 2024, 51 (6A): 230500064-6.  doi:10.11896/jsjkx.230500064
Mobile edge computing (MEC) has emerged as a promising computing paradigm to resolve the conflict between the growing number of latency-sensitive applications and user demands and the constrained computing resources. Virtual network functions (VNFs) are deployed in the edge environment to provide users with more efficient and scalable service function chains (SFCs) that satisfy their requests. Unreliable service or serious service failure in the process of providing service may lead to great loss to users, so the network service provider must ensure constant and reliable service. Considering the reliability of edge servers, a gated recurrent unit (GRU) supported by the compute unified device architecture (CUDA) is used to predict the availability of VNFs, and VNF instances are backed up in advance according to the prediction results, avoiding the excessive cost caused by over-redundant backups. Since the storage resources of the servers are limited, a VNF instance availability placement (RVP) algorithm is proposed to optimize the cost of service providers. Finally, performance evaluation is performed, and the experimental results show the excellence of the proposed RVP algorithm.
Method for Homologous Spectrum Monitoring Data Identification Based on Spectrum SIFT
LU Dongsheng, LONG Hua
Computer Science. 2024, 51 (6A): 230300177-7.  doi:10.11896/jsjkx.230300177
With the popularity of various radio applications, the different kinds of monitoring data produced in ultra-short wave monitoring are susceptible to non-homologous signals of the same or adjacent frequency within a limited space. It is impossible to determine whether signals are homologous merely from the frequency spectrum data in conventional monitoring, so the data obtained from different monitoring stations lack correlation and the data analysis results may be misleading, even affecting work efficiency. Based on the experience of manual monitoring, this paper attempts to analyze the frequency spectrum and time-frequency spectrum with computer vision technology, and introduces an angle threshold to improve the feature point matching mode of the SIFT algorithm in combination with the spectrum characteristics, so as to meet the needs of radio monitoring data analysis. Meanwhile, this paper puts forward a method that uses the Kappa coefficient, on the premise of the matching rate of image feature point detection, to comprehensively evaluate the consistency of the homologous determination results of the two kinds of spectra. Through experimental simulation and case validation, the Kappa of the homologous result is 0.7605, which is highly consistent. The proposed method can improve work efficiency in practice, and has operational feasibility and practical significance.
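The consistency measure mentioned in this abstract can be sketched as follows, assuming that "Kappa" means Cohen's kappa between two sets of homologous/non-homologous decisions (the abstract does not say which variant is used). The function name `cohen_kappa` and the sample labels are hypothetical.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both sources agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the two sources labelled independently.
    categories = set(labels_a) | set(labels_b)
    p_expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if p_expected == 1.0:
        return 1.0  # both sources used one identical label for every item
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical decisions (1 = homologous, 0 = not) from the two kinds of spectra.
    from_frequency_spectrum = [1, 1, 1, 0, 0, 0, 1, 0]
    from_time_frequency_spectrum = [1, 1, 0, 0, 0, 0, 1, 1]
    print(cohen_kappa(from_frequency_spectrum, from_time_frequency_spectrum))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance.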
Dynamic Spectrum Allocation Strategy for Cognitive Radio Based on Game Theory
TENG Zhijun, ZHANG Ailing, FU Yushan
Computer Science. 2024, 51 (6A): 230500138-5.  doi:10.11896/jsjkx.230500138
To address the low system revenue and unsatisfactory spectrum utilization of spectrum allocation in wireless networks, an interference price is introduced to control the interference caused by cognitive users' transmission power, a spectrum leasing model is established, and a dynamic spectrum allocation strategy under a non-cooperative game is proposed to improve spectrum utilization and system revenue. The utility function under the non-cooperative game is constructed, the Nash equilibrium solution is derived, and the utility weight factor is determined after weighing the network utility. Experimental results show that the optimal transmission power of the proposed algorithm is small, the spectrum utilization rate is high, and better system benefits can be obtained.
Optimal Station Layout Method for 3D Indoor Positioning System Based on Error Probability
GU Yutai, ZHAO Jingyi, YANG Teng, CHEN Chong
Computer Science. 2024, 51 (6A): 230700148-7.  doi:10.11896/jsjkx.230700148
With the continuous development of intelligent and automation technology, the application scenarios of indoor positioning technology are becoming more and more extensive, and how to improve the accuracy and reliability of indoor positioning systems has always been a hot research issue. Optimizing the base station layout to improve the overall positioning accuracy of the system is one of the existing optimization methods for indoor positioning systems. For this purpose, existing studies generally adopt methods from other similar fields. Of the two main methods, the dilution of precision used to evaluate satellite conjugation ignores the distance between the base station and the tag, while the error probability used as the evaluation factor for the accuracy of impact points in missile and inertial navigation systems does not consider the influence of the geometric structure of the base stations. In this regard, an accuracy evaluation model based on dilution of precision and error probability is proposed for deriving the optimal base station layout of a 3D indoor positioning system. It can be well applied to indoor positioning because it considers the influence of both distance and geometric structure on indoor positioning accuracy. The proposed method achieves excellent results in both program simulation and actual positioning experiments. In the program simulation, the average positioning error of the optimal base station layout system is reduced by about 14.38% compared with that of the traditional four-top-angle positioning system, and in the actual positioning experiments, the positioning accuracy and effectiveness of the optimal layout are significantly improved. Experimental results verify the accuracy and practicality of the accuracy evaluation model. The proposed optimal base station layout method has high application value and universality in the field of 3D indoor positioning, and can effectively improve the accuracy and effectiveness of positioning systems.
Computer Software & Architecture
Design and Implementation of Hot-swappable Plugin for Enterprise API Gateway
WANG Shengyi
Computer Science. 2024, 51 (6A): 230800176-7.  doi:10.11896/jsjkx.230800176
In order to solve the problems of weak scalability and the inability to hot update in traditional API gateways under the microservice architecture, this paper studies and analyzes the scalability of API gateways and introduces a hot-swappable mechanism to realize hot-swappable plugins for enterprise API gateways. At the same time, a hot-swappable plugin solution for enterprise API gateways is proposed. Experimental results show that the proposed solution does not affect the overall performance of the API gateway or the stability of business functions when supporting hot update of gateway plugins. At present, the enterprise API gateway has been applied in dozens of large enterprises, providing more than 30 kinds of hot-swappable plugins such as identity authentication, traffic and rate limiting, protocol conversion, and request rewriting. The enterprise API gateway completely solves the problems of the original API gateway being unable to be hot updated or hot deployed and being difficult to expand, reducing repeated development work by 40% and saving 30% of operation and maintenance costs. It provides a useful reference for the further development and application of enterprise API gateways, and new ideas for building efficient, secure, and scalable enterprise API gateways.
Dynamic Analysis Method for Memory Safety of Multithreaded C Programs
YAN Rui, CHEN Zhe
Computer Science. 2024, 51 (6A): 230900115-6.  doi:10.11896/jsjkx.230900115
As software becomes increasingly complex and requires higher levels of concurrency, more and more multithreaded programs are emerging. At the same time, the C language lacks the ability to check memory safety, which may lead to more hidden vulnerabilities in programs implemented in C. Therefore, memory safety detection for multithreaded C programs is particularly important. At present, the most cutting-edge and reliable technology for detecting memory safety violations is dynamic analysis, but the tools for detecting memory safety in multithreaded C programs are still far from perfect. Therefore, this paper proposes a pointer-based dynamic analysis technology, and combines lock-free technology and source code instrumentation to implement the tool Movec for detecting the memory safety of multithreaded C programs. Experiments on a professional test set verify that this tool is effective in detecting memory safety violations in multithreaded C programs and has excellent performance.
Optimum Proposal to secGear Based on Skiplist
TANG Xin, DI Nongyu, YANG Hao, LIU Xin
Computer Science. 2024, 51 (6A): 230700030-5.  doi:10.11896/jsjkx.230700030
Confidential computing has been an important method for protecting cloud computing security since it was proposed. It can provide an isolated trusted execution environment (TEE) for user space on a computing platform to ensure the confidentiality and integrity of critical user code and data. However, current mainstream confidential computing technology has performance bottlenecks such as slow I/O. Therefore, how to improve the performance of confidential computing has become a research hotspot. Existing research does not consider the data itself, and thus does not work well in complex practical scenarios. A skiplist data structure that can organize and manage data efficiently in the TEE is proposed to optimize the operational efficiency of confidential computing and reduce the overhead of processing data in the TEE. Finally, comparison experiments are conducted using secGear, which show that, compared with a red-black tree, the skiplist improves the efficiency of confidential computing by 13.5%, 10.5% and 1.9% for insertion, deletion and search respectively, and shows obvious improvement for random insertion compared with a list. This shows that the proposal can improve the operational efficiency of confidential computing and is practical.
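For readers unfamiliar with the data structure named in this abstract, the following is a minimal, generic skiplist sketch supporting insertion, deletion and search. It is an ordinary in-memory Python illustration, not the paper's TEE or secGear implementation; the class name `SkipList` and its parameters (`max_level`, `p`, `seed`) are assumptions made for the example.

```python
import random


class _Node:
    __slots__ = ("key", "forward")

    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # forward[i] is the next node on level i


class SkipList:
    """Ordered set of comparable keys with expected O(log n) operations."""

    def __init__(self, max_level=16, p=0.5, seed=None):
        self.max_level = max_level
        self.p = p
        self.level = 1                        # number of levels currently in use
        self.head = _Node(None, max_level)    # sentinel that precedes every key
        self._rng = random.Random(seed)

    def _random_level(self):
        lvl = 1
        while lvl < self.max_level and self._rng.random() < self.p:
            lvl += 1
        return lvl

    def _predecessors(self, key):
        # update[i] is the last node on level i whose key is smaller than `key`.
        update = [self.head] * self.max_level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def search(self, key):
        node = self._predecessors(key)[0].forward[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = self._predecessors(key)
        nxt = update[0].forward[0]
        if nxt is not None and nxt.key == key:
            return False                      # key already present
        lvl = self._random_level()
        self.level = max(self.level, lvl)     # unused levels already point at head
        node = _Node(key, lvl)
        for i in range(lvl):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node
        return True

    def delete(self, key):
        update = self._predecessors(key)
        node = update[0].forward[0]
        if node is None or node.key != key:
            return False
        for i in range(len(node.forward)):
            if update[i].forward[i] is node:
                update[i].forward[i] = node.forward[i]
        while self.level > 1 and self.head.forward[self.level - 1] is None:
            self.level -= 1                   # drop levels that became empty
        return True

    def __iter__(self):
        node = self.head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]
```

Unlike a red-black tree, a skiplist needs no rotations or recolouring on insertion and deletion, which is one commonly cited reason for its simpler update path.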
Buggy File Identification Based on Recommendation Lists
WANG Zhaodan, ZOU Weiqin, LIU Wenjie
Computer Science. 2024, 51 (6A): 230600088-8.  doi:10.11896/jsjkx.230600088
Bug localization is a key step in bug fixing but also a tedious software activity. Existing static bug localization techniques typically treat bug localization as a search task, generating for each bug report a list of recommended files in descending order of their relevance to the bug. However, developers still need to manually review each file to find the ones that are actually buggy, which increases the time and cost of localization. To solve this problem, this paper proposes a solution. Firstly, state-of-the-art information-retrieval-based (IR-based) bug localization techniques are run to obtain an initial recommendation list of buggy files. Then, three domain features are proposed according to the characteristics of the problem, and a machine learning model is built on these three features to identify the truly buggy files in the list. Preliminary experiments verify that the proposed approach is reasonable and actionable in practice. Experiments are carried out on four open source projects with 2558 bugs (ZooKeeper, OpenJPA, Tomcat, AspectJ), and the results show that the approach achieves 72.6%~80.7% prediction accuracy in identifying the buggy code files in the initially recommended list. At the same time, we explore the three feature subsets and the importance of each feature in predicting the truly buggy files, and find that the feature describing the relationship between the bug report and the source code is more important.
Time Cost Model and Optimal Configuration Method for GPU Parallel Computation of Matrix Multiplication
LEI Chao, LIU Jiang, SONG Jiawen
Computer Science. 2024, 51 (6A): 230300200-8.  doi:10.11896/jsjkx.230300200
Horizontal matrix & vertical matrix multiplication (HVM) is one of the fundamental calculations in scientific computing and engineering, as it largely affects the computational efficiency of higher-level algorithms. GPU parallel computing has become one of the mainstream parallel computing methods, and its underlying design makes it highly suitable for large-scale multiplication calculations. Numerous studies have focused on designing matrix structures and optimizing matrix multiplication using GPU parallel computing frameworks. However, there has been a lack of GPU parallel algorithms and optimization methods specifically targeting HVM. Furthermore, the configuration of GPU kernel functions directly affects computational efficiency, but studies on the optimal configuration of kernel functions have been extremely limited, typically requiring researchers to set them heuristically based on the specific computational characteristics of the algorithm. This paper designs a parallel HVM algorithm, PHVM, based on the GPU's thread and memory model. The numerical experimental results show that when the horizontal dimension of the left matrix is much larger than the vertical dimension, PHVM significantly outperforms the general matrix multiplication in the NVIDIA cuBLAS library. Furthermore, this paper establishes a theoretical model of PHVM runtime based on GPU hardware parameters for choosing the optimal kernel function configuration. The numerical experimental results indicate that this theoretical model accurately reflects how the PHVM runtime changes with the kernel function configuration (grid size and thread block size).
Test Input Prioritization Approach Based on DNN Model Output Differences
ZHU Jin, TAO Chuanqi, GUO Hongjing
Computer Science. 2024, 51 (6A): 230600121-8.  doi:10.11896/jsjkx.230600121
Deep neural network (DNN) testing requires a large amount of test data to ensure the quality of DNNs. However, most test inputs lack annotation information, and annotating test inputs is costly. Therefore, to reduce annotation costs, researchers have proposed test input prioritization approaches that select high-priority test inputs for annotation. However, most prioritization methods are limited in certain scenarios, such as having difficulty in filtering out misclassified inputs with high confidence. To address the above challenges, this paper applies differential testing technology to test input prioritization and proposes a test input prioritization method based on DNN model output differences (DeepDiff). DeepDiff first constructs a contrast model that has the same functionality as the original model, then calculates the output differences of the test inputs between the original model and the contrast model, and finally assigns higher priority to the test inputs with larger output differences. For empirical evidence, we conduct a study on four widely used datasets and the corresponding eight DNN models. Experimental results demonstrate that DeepDiff is 13.06% higher on average in effectiveness compared to the baseline approaches on the original test set and 39.69% higher on the mixed test set.
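The ranking step described in this abstract can be sketched as below. The abstract does not state how the output difference is measured, so the L1 distance between the two models' output vectors is used here purely as an assumption; the function names and the sample probability vectors are hypothetical, and the construction of the contrast model is not shown.

```python
def output_difference(original_output, contrast_output):
    """L1 distance between two output vectors (an assumed difference measure)."""
    if len(original_output) != len(contrast_output):
        raise ValueError("output vectors must have the same length")
    return sum(abs(p - q) for p, q in zip(original_output, contrast_output))


def prioritize_by_output_difference(original_outputs, contrast_outputs):
    """Return test input indices, largest output difference first."""
    if len(original_outputs) != len(contrast_outputs):
        raise ValueError("both models must be evaluated on the same inputs")
    scores = [
        output_difference(p, q) for p, q in zip(original_outputs, contrast_outputs)
    ]
    # Inputs on which the two models disagree most are annotated first.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical softmax outputs of the original and contrast models on 3 inputs.
    original = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
    contrast = [[0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
    print(prioritize_by_output_difference(original, contrast))
```

Because the score depends on disagreement rather than on the original model's confidence, an input that the original model misclassifies with high confidence can still rank highly if the contrast model disagrees.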
FPGA Efficient Scalability Optimization of Dilithium
YAN Yunfei, LI Bin, WEI Yuanxin, ZHANG Bolin, MA Tianyi, ZHOU Qinglei
Computer Science. 2024, 51 (6A): 230800138-9.  doi:10.11896/jsjkx.230800138
To improve the operational efficiency of Dilithium in practical applications, an efficient field programmable gate array (FPGA) implementation of the Dilithium algorithm is proposed. Optimization is carried out in several aspects. The Karatsuba-Ofman algorithm (KOA) is combined with a fast modular reduction algorithm to create a fast modular multiplication unit, which optimizes the extensive polynomial multiplication implemented through the number theoretic transform (NTT). Multiple RAM accesses are employed for polynomial coefficient operations, and a coefficient reading strategy tailored to the characteristics of the Dilithium algorithm is designed to achieve rapid and accurate reading of polynomial coefficients from RAM. For the sampling and hashing tasks in the scheme, the characteristics of the SHAKE algorithm series are analyzed, leading to a low-latency and scalable Keccak hardware architecture that can execute different SHAKE algorithms based on the input signal. Experimental results demonstrate that the working frequency of the proposed implementation is increased by 60.7%~131.9%, while balancing hardware resource consumption and execution efficiency.
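The Karatsuba-Ofman idea mentioned in this abstract replaces four half-width multiplications with three. The sketch below is a software illustration only, not the paper's hardware design: it uses Python integers, and it substitutes the plain `%` operator for the fast modular reduction, whose details the abstract does not give. The modulus 8380417 is the prime used by Dilithium; the function names and the base-case cutoff are assumptions.

```python
Q = 8380417  # Dilithium modulus, 2**23 - 2**13 + 1


def karatsuba(x, y):
    """Multiply two non-negative integers using three recursive sub-products."""
    if x < 16 or y < 16:
        return x * y                          # small operands: multiply directly
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask          # x = x_hi * 2**half + x_lo
    y_hi, y_lo = y >> half, y & mask          # y = y_hi * 2**half + y_lo
    z0 = karatsuba(x_lo, y_lo)
    z2 = karatsuba(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) - z0 - z2 equals x_lo*y_hi + x_hi*y_lo.
    z1 = karatsuba(x_lo + x_hi, y_lo + y_hi) - z0 - z2
    return (z2 << (2 * half)) + (z1 << half) + z0


def mod_mul(a, b, q=Q):
    """Modular product; `%` stands in for a dedicated fast reduction step."""
    return karatsuba(a, b) % q


if __name__ == "__main__":
    print(mod_mul(1234567, 7654321))
```

In hardware the saving matters because multipliers are far more expensive than the extra additions and subtractions the method introduces.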
Bug Report Severity Prediction Based on Fine-tuned Embedding Model with Domain Knowledge
CHEN Bingting, ZOU Weiqin, CAI Biyu, LIU Wenjie
Computer Science. 2024, 51 (6A): 230400068-7.  doi:10.11896/jsjkx.230400068
Accurately predicting the severity of bug reports is crucial for assigning them efficiently and helping developers detect and fix software bugs in time. However, existing severity prediction methods based on traditional information retrieval or general pre-trained models have limited prediction accuracy because they neglect context semantics or bug report characteristics. To address this problem, this paper proposes a severity prediction method based on domain knowledge fine-tuning. A BERT pre-trained model that can fully consider the semantic context of text is used, and the model is fine-tuned with bug report data to learn relevant domain knowledge. The fine-tuned BERT model is then used to extract semantic features of bug reports, and a support vector machine is employed to construct a severity prediction model. Experimental results on 15 projects, including Mozilla, Eclipse, and Apache, demonstrate that compared with traditional information retrieval methods, the proposed method can improve the accuracy, recall, and F1 score by 4.5%~22.0%, 3.0%~22.0%, and 4.0%~22.0%, respectively. Compared with the general BERT model, the fine-tuned BERT model can improve the accuracy, recall, and F1 score by 2.0%~5.1%, 1.9%~5.1%, and 1.8%~5.0%, respectively.
Soft Real-time Cloud Service Request Scheduling and Multiserver System Configuration for Profit Optimization
WANG Tian, SHEN Wei, ZHANG Gongxuan, XU Linli, WANG Zhen, YUN Yu
Computer Science. 2024, 51 (6A): 230900099-10.  doi:10.11896/jsjkx.230900099
In cloud computing, there has been considerable research on multi-server systems based on the continuous innovation of multi-core technology. Establishing multi-server systems to provide cloud services to users and optimizing cloud service profitability is a hot topic in cloud computing. Research on these issues drives the continuous development of cloud computing technology. However, existing studies on multi-server systems either focus on optimizing cloud service profitability through the configuration of multi-server computing resources, neglecting the schedulability of cloud service requests themselves, or concentrate on developing service request scheduling strategies to improve cloud service profitability while overlooking the dynamic scalability of multi-server systems. Moreover, when cloud service scheduling and multi-server configuration are optimized jointly to enhance cloud service profitability, the complexity of the problem increases exponentially. Therefore, it is essential to design a cloud service scheduling and multi-server configuration method for providers targeting soft real-time cloud service requests. Besides, existing research on configuring multi-server systems often overlooks transient faults in processing cloud service requests. Numerous studies have demonstrated that soft real-time tasks can be affected by transient faults, leading to variations in the execution results of service requests and impacting cloud service profitability. In this study, we focus on soft real-time cloud service requests and develop a depth-search-based grey wolf algorithm to jointly optimize cloud service request scheduling and multi-server configuration, considering the prevalent computational performance heterogeneity of server resources in cloud environments, with the aim of maximizing cloud service profitability. Finally, extensive experiments validate the effectiveness of the proposed method. The empirical results demonstrate that, compared with the existing benchmark methods, the cloud service profits obtained by the proposed method increase by an average of 6.83%.
Fuzz Testing Method of Binary Code Based on Deep Reinforcement Learning
WANG Shuanqi, ZHAO Jianxin, LIU Chi, WU Wei, LIU Zhao
Computer Science. 2024, 51 (6A): 230800078-7.  doi:10.11896/jsjkx.230800078
Vulnerability mining is the main research direction in the field of computer software security, in which fuzz testing is an important dynamic mining method. To address the time consumption and low efficiency of fuzz testing caused by the large volume of assembly code, a novel binary code vulnerability mining technology based on deep reinforcement learning is proposed. The fuzz testing process is modeled as a multi-step Markov decision process suited to reinforcement learning. The selection of the fuzz testing mutation strategy is optimized dynamically by building a deep reinforcement learning model. Then, a binary code fuzz testing platform based on deep reinforcement learning is designed and built, which uses AFL to implement the fuzz testing environment, and the Keras RL2 library and OpenAI Gym framework to implement the deep reinforcement learning algorithm and the reinforcement learning environment. Finally, the effectiveness and applicability of the proposed method and testing platform are verified through experimental analysis. Experimental results show that the deep reinforcement learning model can help the fuzz testing process quickly cover more paths and expose more vulnerabilities and defects, and significantly improves the efficiency of binary code vulnerability mining and localization.
Information Security
Study on Data Security Framework Based on Identity and Blockchain Integration
ZHU Jun, ZHANG Guoyin, WAN Jingjing
Computer Science. 2024, 51 (6A): 230400056-5.  doi:10.11896/jsjkx.230400056
Abstract PDF(4244KB) ( 309 )   
References | Related Articles | Metrics
The industrial Internet identification resolution system has become an important new infrastructure supporting industrial digital transformation. In light of current data security issues, this paper reviews the identification resolution architecture and, based on the distributed topology of blockchain and its information security features, proposes a data security framework that integrates identity and blockchain. The resulting collaborative framework is a data security framework that integrates supervision. The paper focuses on whole-process security across data collection, storage, transmission, sharing, rights confirmation and transaction. A security index system for monitoring identification data is then proposed along the dimensions of confidentiality, integrity, availability, traceability and so on. Finally, it is proposed that data security protection should shift from protecting data assets at fixed locations to protecting the processing of business system data, from focusing on attack behavior to focusing on the data life cycle, and from preventing and patching loopholes to data management.
Overview of Unmanned Aerial Vehicle Systems Security
WANG Zhen, ZHOU Chao, FAN Yongwen, SHI Pengfei
Computer Science. 2024, 51 (6A): 230800086-6.  doi:10.11896/jsjkx.230800086
Abstract PDF(2039KB) ( 390 )   
References | Related Articles | Metrics
In recent years, with their increasing popularity, unmanned aerial vehicles (UAVs) have shown enormous potential in industries such as military, agriculture, transportation, film, supply chain, and surveillance. Despite the conveniences that UAVs provide, security incidents involving UAVs keep emerging. Malicious parties may attack UAVs and use them for life-threatening activities, so governments around the world have begun to regulate their use. UAVs require an intelligent and automated defense mechanism to ensure the safety of humans, property, and the UAV itself, and protecting the UAV operating system is an important part of preventing intrusion attacks. This paper first gives a brief introduction to the structure of UAVs, then studies the security of existing operating systems for consumer and commercial UAVs, and finally investigates various security issues of the UAV operating system and possible solutions.
Survey on Application of Searchable Attribute-based Encryption Technology Based on Blockchain
LAN Yajie, MA Ziqiang, CHEN Jiali, MIAO Li, XU Xin
Computer Science. 2024, 51 (6A): 230800016-14.  doi:10.11896/jsjkx.230800016
Abstract PDF(2165KB) ( 540 )   
References | Related Articles | Metrics
With the vigorous development of information sharing, data privacy and security problems have become increasingly prominent, which has spurred the rapid development of blockchain technology and searchable attribute-based encryption technology. As a decentralized and immutable technology, blockchain ensures the security and integrity of searched data, while searchable attribute-based encryption effectively prevents unauthorized users from issuing queries. However, as data grows in size and complexity, problems arise such as low retrieval efficiency, complicated verification of query results, and difficult distribution of attribute permissions. In view of these problems, this paper first summarizes the research status of blockchain-based searchable encryption, blockchain-based attribute-based encryption, and blockchain-based searchable attribute-based encryption. Secondly, the advantages and emphases of the three are compared and analyzed. Finally, the paper focuses on the application of blockchain-based searchable attribute-based encryption to keyword retrieval, attribute permission management and data integrity verification, as well as the problems and challenges it faces. It aims to provide technical support for more secure, efficient and decentralized data storage and sharing.
Survey of Application of Differential Privacy in Edge Computing
SUN Jianming, ZHAO Mengxin
Computer Science. 2024, 51 (6A): 230700089-9.  doi:10.11896/jsjkx.230700089
Abstract PDF(3553KB) ( 352 )   
References | Related Articles | Metrics
To address the latency and bandwidth limitations of the traditional cloud computing model and to cope with the demands of the Internet of Things and the big data era, edge computing has emerged and gained widespread attention. In the edge computing environment, the privacy of user data has become an important research hotspot. Differential privacy, which has a solid mathematical foundation, has been widely combined with edge computing as an effective privacy-preserving technique to address weak privacy protection and high computational cost. This paper first introduces the problems brought about by the development of the Internet, followed by the basic concepts, features and components of edge computing, and outlines its advantages over traditional cloud computing. The basic concepts and principles of differential privacy are then outlined, followed by a detailed description of its three perturbation methods and common implementation mechanisms. Finally, the research on the application of differential privacy in edge computing is reviewed and future research directions are pointed out. In conclusion, applying differential privacy techniques to edge computing scenarios is an effective means of protecting privacy while sharing data.
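As an illustration of one common implementation mechanism of differential privacy, the following is a minimal sketch of the Laplace mechanism. It is not taken from the surveyed paper; the function names and the example query are assumptions made for illustration, and the noise scale follows the standard rule scale = sensitivity / epsilon.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release a numeric query result under epsilon-differential privacy.

    `sensitivity` is the maximum change of the query when one record
    changes; a smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_value + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical usage: an edge node reports a noisy count of 100 devices.
rng = random.Random(0)
noisy_count = laplace_mechanism(100, sensitivity=1, epsilon=0.5, rng=rng)
```

A counting query has sensitivity 1, so with epsilon = 0.5 the noise scale is 2; averaging many such releases recovers the true value, which is why repeated queries consume privacy budget.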
Study on Kcore-GCN Anti-fraud Algorithm Fusing Multi-source Graph Features
LIU Wei, SONG You, ZHUO Peiyan, WU Weiqiang, LIAN Xin
Computer Science. 2024, 51 (6A): 230600040-7.  doi:10.11896/jsjkx.230600040
Abstract PDF(2214KB) ( 267 )   
References | Related Articles | Metrics
Financial fraud has had many negative impacts on society, and a variety of AI-based financial anti-fraud algorithms have been applied to practical anti-fraud business scenarios with good results. These algorithms detect fraud from the perspective of individual users, from the perspective of the topological relationships between nodes and the network, or by learning graph embedding representations of nodes. Each is limited by its starting perspective and cannot perform a complete fraud detection analysis. To address these problems, this paper designs a Kcore graph convolutional neural network anti-fraud algorithm based on the fusion of multi-source graph features. The innovation of the algorithm is that it efficiently mines both node-level and global network-level topological relationships to build a wide-field feature system, and completes the propagation and aggregation of deep graph structure features through a graph convolutional neural network based on the Kcore algorithm, finally detecting fraud risk. Experimental results show that the method substantially improves the evaluation metrics compared with related machine learning and graph neural network algorithms, including a 12% improvement in AUC over the LightGBM algorithm and a 6% improvement in AUC over the GCN algorithm.
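The Kcore algorithm mentioned above assigns each node a core number: the largest k such that the node belongs to a subgraph in which every node has degree at least k. The following is a minimal sketch of the standard peeling procedure, not the paper's implementation; the toy graph and the function name are assumptions for illustration.

```python
def core_numbers(adjacency):
    """Return the core number of every node in an undirected graph.

    `adjacency` maps each node to the set of its neighbours.
    Nodes are peeled in order of their current degree.
    """
    degree = {node: len(neigh) for node, neigh in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Remove the node with the smallest remaining degree.
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Hypothetical toy graph: a triangle (a, b, c) with a pendant node d.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
cores = core_numbers(graph)
```

In an anti-fraud setting, core numbers of this kind could serve as one global topological feature per node; how the paper combines them with graph convolution is not specified in the abstract.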
Federated Learning Scheme Based on Differential Privacy
SUN Min, DING Xining, CHENG Qian
Computer Science. 2024, 51 (6A): 230600211-6.  doi:10.11896/jsjkx.230600211
Abstract PDF(3117KB) ( 424 )   
References | Related Articles | Metrics
One of the characteristics of federated learning is that the server being trained does not directly contact the data,so federated learning itself has the characteristics of protecting data security.However,research shows that federated learning has privacy leakage problems in local data training and central model aggregation.Differential privacy is a noise augmentation technique that adds appropriate noise to prevent an attacker from distinguishing user information.We study a hybrid noise adding algorithm based on local and central differential privacy(LCDP-FL),which can provide local or hybrid differential privacy protection for each client according to its different weights and privacy requirements.It’s shown that the algorithm can provide users with the privacy they need with minimal computational overhead.The algorithm is tested on the MNIST dataset and CIFAR-10 dataset,and compared with local differential privacy(LDP-FL) and central differential privacy(CDP-FL) algorithms,and the results show that the hybrid algorithm has improved accuracy,loss rate and privacy security,and its algorithm performance is the best.
Differential Privacy Federated Learning Method Based on Knowledge Distillation
TAN Zhiwen, XU Ruzhi, WANG Naiyu, LUO Dan
Computer Science. 2024, 51 (6A): 230600002-8.  doi:10.11896/jsjkx.230600002
Abstract PDF(3282KB) ( 347 )   
References | Related Articles | Metrics
Differential privacy, as a privacy protection method, has been widely applied in federated learning. Existing research on applying differential privacy to federated learning fails to consider either unlabeled public data or the difference in data volume between clients, which limits its use in real-world scenarios. This paper proposes a differential privacy federated learning method based on knowledge distillation, which introduces unlabeled public datasets and accounts for the differences in data volume between clients, and designs a dedicated differential privacy scheme for this scenario. Firstly, the clients are grouped into “large data clients” and “general clients” according to data size. A teacher model is trained on the data of the large data clients and adds pseudo labels to the public dataset. Then, the public dataset acts as a “special client” that conducts federated training jointly with the “general clients”. Differential privacy is adopted to protect client data; because only the labels of the special client's data involve privacy, the special client is allocated a larger privacy budget in federated training than general clients. The total privacy budget is limited: the budget for the federated training stage is set to a fixed value, and the budget for the pseudo-labeling stage is adjusted according to each client’s privacy needs and the parallel composition property of the privacy budget. Experiments on the MNIST and SVHN datasets show that, under the same privacy budget consumption, the trained model achieves higher accuracy than traditional methods. The scheme is scalable, and its flexible privacy budget allocation enables it to meet complex privacy needs.
DUWe:Dynamic Unknown Word Embedding Approach for Web Anomaly Detection
WANG Li, CHEN Gang, XIA Mingshan, HU Hao
Computer Science. 2024, 51 (6A): 230300191-5.  doi:10.11896/jsjkx.230300191
Abstract PDF(2207KB) ( 271 )   
References | Related Articles | Metrics
When existing word embedding methods based on deep learning models are used to detect Web anomalies, words that do not appear in the corpus, known as out-of-vocabulary (OOV) words, are usually marked as unknown and given a zero or random vector as input to the deep model for training, without considering the context of the unknown word in the web request. During code development, programmers often design request paths according to a certain pattern to make the code more readable, which usually makes web requests semantically related. Considering that web requests follow certain request patterns and that their semantics are correlated through these patterns, this paper proposes DUWe, a dynamic unknown word embedding method based on Word2vec, which assigns representations to unknown words by inference from the word context. Evaluation on the CSIC-2010 and WAF datasets shows that adding the unknown word embedding method performs better than word2vec feature extraction alone. The accuracy, precision, recall and F1-score are improved, and the maximum reduction in training time is 1.14 times.
Design and Implementation of SNMPv3 Security Mechanism Based on National Security SM3 and SM4 Algorithms
TIAN Hao, WANG Chao
Computer Science. 2024, 51 (6A): 230500209-7.  doi:10.11896/jsjkx.230500209
Abstract PDF(3244KB) ( 363 )   
References | Related Articles | Metrics
With the rapid development of network technology and the increasing popularity of 5G, the number of devices accessing the network is increasing exponentially, the network structure is becoming increasingly complex, and malicious network attacks are frequent. Managing large numbers of complex network devices securely and efficiently is becoming a new challenge for network management. Compared with v1 and v2, SNMPv3 adds a user-based security model that provides security services such as data confidentiality, integrity, and anti-replay. However, SNMPv3 still has problems: the strength of its default authentication and encryption algorithms is not high, and its cryptographic algorithms do not fully support the national standard commercial cryptographic algorithms. Based on an analysis of the existing security mechanism of the SNMPv3 protocol, this paper proposes an optimization scheme for these problems built on the user-based security model. It embeds the SM3 and SM4 national security algorithms into the SNMPv3 security mechanism and designs the HMAC-SM3-192 authentication protocol and the PRIV-CBC-SM4 encryption protocol for SNMP on top of them. Without significantly increasing the response time, the scheme improves resistance to security threats such as forgery, information tampering and information leakage during SNMP message transmission, and thus strengthens the security of the SNMP protocol.
Improving Transferability of Adversarial Samples Through Laplacian Smoothing Gradient
LI Wenting, XIAO Rong, YANG Xiao
Computer Science. 2024, 51 (6A): 230800025-6.  doi:10.11896/jsjkx.230800025
Abstract PDF(1915KB) ( 340 )   
References | Related Articles | Metrics
Deep neural networks are vulnerable to adversarial sample attacks due to the fragility of the model structure. Existing adversarial sample generation methods achieve a high white-box attack success rate, but their transferability is limited when attacking other DNN models. To improve the success rate of black-box transfer attacks, this paper proposes a transfer-based adversarial attack method using a Laplacian-smoothed gradient, which builds on gradient-based black-box transfer attacks. Laplacian smoothing is first applied to the gradient of the input image, and the smoothed gradient is then fed into a gradient-based attack method for further calculation, with the aim of improving the transferability of adversarial samples between different models. The advantage of Laplacian smoothing is that it effectively reduces the impact of noise and outliers on the data, thus improving the reliability and stability of the data. Evaluation on multiple models shows that the approach further improves the transfer success of adversarial samples, with the best transfer success rate 2% higher than that of the baseline attack method. The results show that this method is significant for enhancing the transfer performance of adversarial attack algorithms and provides a new idea for further research and application.
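To make the idea of smoothing a gradient with a Laplacian concrete, the following is a minimal one-dimensional sketch. It assumes the common formulation that solves (I + sigma * L) s = g, where L is the periodic graph Laplacian; the abstract does not state the paper's exact operator, and the function name, sigma value and example gradient are assumptions for illustration.

```python
import cmath
import math


def laplacian_smooth(grad, sigma=1.0):
    """Solve (I + sigma * L) s = grad for a 1-D periodic signal.

    L is the circulant graph Laplacian (2 on the diagonal, -1 on both
    neighbours), so it is diagonal in the Fourier basis with
    eigenvalues 2 - 2*cos(2*pi*k/n).
    """
    n = len(grad)
    # Forward discrete Fourier transform (O(n^2), fine for a sketch).
    spectrum = [
        sum(grad[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
    # Damp each frequency; high frequencies are attenuated the most.
    damped = [
        spectrum[k] / (1.0 + sigma * (2.0 - 2.0 * math.cos(2.0 * math.pi * k / n)))
        for k in range(n)
    ]
    # Inverse transform; the result is real up to rounding error.
    return [
        sum(damped[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)).real / n
        for j in range(n)
    ]


# Hypothetical spiky gradient: one large entry surrounded by zeros.
gradient = [0.0, 0.0, 4.0, 0.0, 0.0, 0.0]
smoothed = laplacian_smooth(gradient, sigma=1.0)
```

The smoothed vector keeps the same sum as the input but spreads the spike over its neighbours. In a gradient-based attack, a smoothed gradient of this kind would replace the raw gradient before the sign or step computation.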
Study on Smart Grid AMI Intrusion Detection Method Based on Federated Learning
LIU Dongqi, ZHANG Qiong, LIANG Haolan, ZHANG Zidong, ZENG Xiangjun
Computer Science. 2024, 51 (6A): 230700077-8.  doi:10.11896/jsjkx.230700077
Abstract PDF(3522KB) ( 367 )   
References | Related Articles | Metrics
Advanced metering infrastructure (AMI) is a key link in building the smart grid and the ubiquitous electric IoT. With massive terminal access and heterogeneous communication network components, the risk of network attacks on AMI is greatly increased. To address the problems of traditional AMI intrusion detection methods, such as excessive computing pressure on the main station, weak disaster resistance and insufficient recognition accuracy, an AMI intrusion detection method based on federated learning is proposed. Firstly, a federated learning intrusion detection model for AMI is constructed, and the federated learning framework is integrated into the model. Then, a lightweight intrusion detection algorithm that integrates decision trees on the edge side is designed, and a cross-platform cloud-edge collaborative joint training method is proposed to realize cross-platform experience sharing and improve intrusion detection performance. Finally, simulation results on the NSL-KDD dataset show that, compared with centralized and federated learning intrusion detection models based on neural networks, the proposed method reaches an accuracy of 99.76% with a false positive rate of only 0.17%. At the same time, the detection time is reduced and the communication efficiency is improved. The method also ensures that data does not leave the local area, reducing the risk of data privacy disclosure.
Malicious Attack Detection in Recommendation Systems Combining Graph Convolutional Neural Networks and Ensemble Methods
LIU Hui, JI Ke, CHEN Zhenxiang, SUN Runyuan, MA Kun, WU Jun
Computer Science. 2024, 51 (6A): 230700003-9.  doi:10.11896/jsjkx.230700003
Abstract PDF(3320KB) ( 404 )   
References | Related Articles | Metrics
Recommendation systems are widely used on Internet platforms such as e-commerce, social media, and information sharing, where they effectively solve the problem of information overload. However, these platforms are open to all Internet users, so unscrupulous users can exploit system design flaws to manipulate rating data through malicious interference and deliberate attacks, which affects the recommendation results and seriously jeopardizes the security of recommendation services. Most existing methods detect shilling attacks with manually constructed features extracted from rating data. Such methods are difficult to adapt to more complex co-visitation injection attacks, and manually constructed features are time-consuming to build and lack discriminative capability. In addition, attack behavior is far rarer than normal behavior, which brings imbalanced data problems to traditional detection methods. Therefore, this paper stacks multilayer graph convolutional neural networks to learn, end to end, multi-order interaction information between users and items and obtain user and item embeddings as attack detection features. Convolutional neural networks are used as base classifiers for deep behavior feature extraction and are combined with ensemble methods to detect attacks. Experimental results on real datasets show that, compared with popular malicious attack detection methods for recommendation systems, the method detects co-visitation injection attacks better and overcomes the imbalanced data problem to a certain extent.
Scheme for Maximizing Secure Communication Capacity in UAV-assisted Edge Computing Networks
XUE Jianbin, DOU Jun, WANG Tao, MA Yuling
Computer Science. 2024, 51 (6A): 230800032-7.  doi:10.11896/jsjkx.230800032
Abstract PDF(3137KB) ( 375 )   
References | Related Articles | Metrics
To address the problem that user information is easily leaked in UAV-assisted mobile edge computing systems, a secure communication scheme based on non-orthogonal multiple access (NOMA) is proposed. While guaranteeing the minimum secure computation requirement of each ground user, the scheme maximizes the average secure computation capability of the system by jointly optimizing the channel coefficients, the transmit power, the computation frequency of the central processing unit, the local computation and the UAV trajectory. Because of the uncertainty of the eavesdropper location, the coupling of multiple variables and the non-convexity of the problem, successive convex approximation and the block coordinate descent method are used to solve it. Simulation results show that the proposed scheme outperforms the benchmark scheme in terms of system secure computation performance.
Dual-color Image Encryption System with Improved Lifting-like Scheme
WANG Bin, LI Haixiao, CHEN Rongrong
Computer Science. 2024, 51 (6A): 230500007-11.  doi:10.11896/jsjkx.230500007
Abstract PDF(7911KB) ( 337 )   
References | Related Articles | Metrics
Image information security faces severe challenges today, and image encryption is one of the most effective means of addressing them. Because the lifting scheme offers fast encryption and decryption and good security in image encryption, more and more encryption systems based on lifting schemes have been proposed. This paper proposes a dual-color image encryption system with an improved lifting-like scheme. First, each color image is divided into three channels: the R, G and B channels. The resulting channel images are then regarded as the six faces of a Rubik’s Cube, and the rotation of the cube is controlled by a random sequence to scramble and encrypt the images. Secondly, to make the whole system more secure, the update and prediction functions of the improved lifting-like scheme are replaced by a perceptron-like network (PLN). Compared with the original simple linear functions, the PLN involves more complex calculations and is less predictable. The encrypted image obtained by the proposed structure has higher encryption quality, so the image information is better diffused to each pixel. Experimental results show that the system resists various attacks well and has high security. It is also very sensitive to plain images and keys, so it can be applied to practical image encryption.
Data Security Management Scheme Based on Editable Medical Consortium Chain
TAN Jingqi, XUE Lingyan, HUANG Haiping, CHEN Long, LI Yixuan
Computer Science. 2024, 51 (6A): 240400056-8.  doi:10.11896/jsjkx.240400056
Abstract PDF(3460KB) ( 361 )   
References | Related Articles | Metrics
Secure management of medical data is crucial for protecting patient privacy and for conducting medical treatment and related research effectively. However, most existing medical data management solutions suffer from low transparency and inadequate sharing, and fail to secure patient privacy. By designing an editable blockchain model with a “main-side” chain structure, a secure and accountable data management scheme for healthcare consortium chains is proposed. The main chain stores current information about the patient’s electronic medical record, while the side chain records historical proofs of modification, which makes data management transparent and traceable. By introducing a chameleon hash function, secret sharing and verifiable signature technology, the scheme ensures that only verified system administrators of healthcare organizations can construct the complete trapdoor, which guarantees the correctness and security of the medical data modification process. Security analysis and simulation experiments show that the proposed scheme performs well in computational overhead, communication load and storage efficiency, which demonstrates its effectiveness and feasibility.
Face Anti-spoofing Method with Adversarial Robustness
WANG Chundong, LI Quan, FU Haoran, HAO Qingbo
Computer Science. 2024, 51 (6A): 230400022-7.  doi:10.11896/jsjkx.230400022
Abstract PDF(2791KB) ( 321 )   
References | Related Articles | Metrics
Existing face anti-spoofing methods based on deep neural networks perform excellently, but they are very weak when facing adversarial examples. To solve this problem, the capsule network (CapsNet) is introduced to build an adversarially robust method called FAS-CapsNet. The capsule structure and reconstruction mechanism of CapsNet are used to retain the correlation between features and to filter adversarial perturbations out of images. The Retinex algorithm is used to enhance illumination features, which reflect the difference in reflection properties between skin and planar media. This increases the between-class distance of live and spoof faces and disrupts the adversarial perturbation patterns in images, improving the accuracy and robustness of FAS-CapsNet. Experiments on CASIA-SURF show that the spoofing detection accuracy of FAS-CapsNet is 87.344%, while the highest accuracy among the comparison models is 78.917%, which demonstrates that FAS-CapsNet can solve general face anti-spoofing problems. Two adversarial datasets are further generated from the CASIA-SURF validation set to verify the robustness of each model. The accuracy of FAS-CapsNet on the two datasets is 84.552% and 79.042% respectively, relative decreases of 3.197% and 9.505% compared with the previous result. The highest accuracy of the comparison models on the adversarial datasets is 74.938% and 41.667% respectively, relative decreases of 5.042% and 47.201% compared with conventional detection. This shows that FAS-CapsNet is significantly more robust under adversarial attacks.
Study on Optimization of Abnormal Traffic Detection Model Based on Machine Learning
CHEN Xiangxiao, CUI Xin, DU Qin, TANG Haoyao
Computer Science. 2024, 51 (6A): 230700051-5.  doi:10.11896/jsjkx.230700051
Abstract PDF(2092KB) ( 410 )   
References | Related Articles | Metrics
Abnormal traffic detection methods in software defined networks (SDN) have practical problems such as high false alarm rates. In response to abnormal traffic attacks in the network, researchers have started to explore machine learning methods for abnormal traffic detection. However, machine learning methods face the challenges of large datasets and high data dimensionality, which affect their efficiency and accuracy and therefore call for dimensionality reduction. Principal component analysis (PCA), as a dimensionality reduction algorithm based on linear transformation, has certain limitations and cannot effectively estimate the principal components. To overcome this challenge, this paper proposes an improved dimensionality reduction algorithm, C-means Gaussian kernel principal component analysis (CGKPCA), which extends PCA with a non-linear transformation capability. The paper also improves the classification model by proposing an improved stacking model, SVMS (support vector machine stacking). To validate the effectiveness of the proposed algorithms, experiments are conducted on the open source datasets KDDCUP99 and UNSW-NB15. The test results indicate that the proposed binary classification detection model is significantly ahead of other models in terms of performance metrics.
New Design of Redactable Consortium Blockchain Scheme Based on Multi-user Chameleon Hash
KANG Zhong, WANG Maoning, MA Xiaowen, DUAN Meijiao
Computer Science. 2024, 51 (6A): 230600004-6.  doi:10.11896/jsjkx.230600004
Abstract PDF(2224KB) ( 295 )   
References | Related Articles | Metrics
Because it lacks supervision strategies, may contain suspicious or harmful information, and does not allow data to be modified once uploaded to the chain, the existing blockchain architecture is likely to become a lawless space for low-cost cybercrime, which limits its usability. The redactable blockchain is considered an effective way to solve this problem, but how to combine this concept with the advantages of the consortium blockchain remains an unresolved technical problem. To this end, this paper puts forward a new cryptographic scheme that extends the concept of chameleon hash functions to multi-user scenarios by introducing a group key, and that addresses the centralization of modification rights caused by a single user holding the whole trapdoor key. On this basis, a consortium-oriented redactable blockchain scheme is proposed, which adopts a two-stage request-verification model to complete a modification. Under the generic model and the random oracle model, and based on the discrete logarithm assumption, the scheme is proved to be collision-free and multi-user secure. Simulation experiments and comparative analysis also demonstrate the effectiveness and usability of the scheme.
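For readers unfamiliar with chameleon hashes, the following is a minimal single-user sketch of a classic discrete-logarithm construction, in which the hash is g^m * h^r mod p and the trapdoor holder can find a new r for any new message. It does not reproduce the paper's multi-user, group-key scheme; the toy parameters and function names are assumptions for illustration and are far too small to be secure.

```python
# Toy parameters (assumptions for illustration, NOT secure sizes).
P = 2039  # safe prime, P = 2 * Q + 1
Q = 1019  # prime order of the subgroup
G = 4     # a square modulo P, so it generates the order-Q subgroup


def public_key(trapdoor):
    """Derive the public key h = G^trapdoor mod P."""
    return pow(G, trapdoor, P)


def chameleon_hash(message, randomness, pub):
    """Compute G^message * pub^randomness mod P."""
    return (pow(G, message % Q, P) * pow(pub, randomness % Q, P)) % P


def find_collision(message, randomness, new_message, trapdoor):
    """Use the trapdoor to find randomness that keeps the hash unchanged.

    Solves message + trapdoor * randomness
         = new_message + trapdoor * new_randomness  (mod Q).
    """
    inverse = pow(trapdoor, -1, Q)  # modular inverse, Python 3.8+
    return (randomness + (message - new_message) * inverse) % Q


# Hypothetical usage: rewrite block content 42 as 99 without changing the hash.
trapdoor = 123
pub = public_key(trapdoor)
old_digest = chameleon_hash(42, 7, pub)
new_randomness = find_collision(42, 7, 99, trapdoor)
new_digest = chameleon_hash(99, new_randomness, pub)
```

Because the digest is unchanged, the hash links of later blocks stay valid after the edit. Anyone holding the whole trapdoor can do this alone, which is the centralization issue the multi-user design targets.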
Operational Consistency Model Based on Consortium Blockchain for Inter-organizational Data Exchange
GENG Qian, CHUAI Ziang, JIN Jian
Computer Science. 2024, 51 (6A): 230800145-9.  doi:10.11896/jsjkx.230800145
Abstract PDF(2960KB) ( 278 )   
References | Related Articles | Metrics
In the context of inter-organizational data sharing and exchange,maintaining strong operational consistency is a crucial technical foundation to achieve effective data synchronization.Based on blockchain technology,a model to enhance the operational consistency for inter-organizational data exchange is proposed.The model leverages a consortium blockchain as a write-ahead log,enabling databases to backtrack incomplete synchronizations.Within the consortium blockchain,a pointer structure is implemented to reduce the time required for backtracking.Additionally,data structures and consensus algorithms are designed for on-chain storage of requests.An access control mechanism is set up to verify user identities,ensuring data security and user privacy.Finally,simulation experiments are conducted to validate the reliability of the proposed model and explore the impact of a series of parameters on different evaluation metrics.The results indicate that under various parameter settings,the proposed model ensures strong operational consistency for inter-organizational data exchange.Furthermore,compared to existing baselines,it exhibits higher request execution efficiency.Moreover,the pointer structure further enhances the model’s performance,especially when there is a large amount of data stored on the chain.
Study on Fingerprint Recognition Algorithm for Fairness in Federated Learning
WANG Chenzhuo, LU Yanrong, SHEN Jian
Computer Science. 2024, 51 (6A): 230800043-9.  doi:10.11896/jsjkx.230800043
Abstract PDF(3240KB) ( 328 )   
References | Related Articles | Metrics
Most existing fingerprint recognition methods rely on machine learning,which neglects the privacy and heterogeneity of the data when training on massive databases,resulting in user information leakage and reduced recognition accuracy.To cooperatively optimize model accuracy under privacy protection,this paper proposes a novel fingerprint recognition algorithm based on federated learning,termed federated learning-fingerprint recognition(Fed-FR).Firstly,the algorithm iteratively aggregates parameters from each terminal through federated learning,thereby improving the performance of the global model.Secondly,sparse representation theory is applied to low-quality fingerprint image denoising to enhance the texture structure of the fingerprint.Thirdly,in response to the allocation inequity issue caused by client heterogeneity,this paper proposes a client scheduling strategy based on reservoir sampling.Finally,experimental results on three real-world databases show that Fed-FR significantly outperforms local learning by 5.32% and federated average by 8.56%,approaching the accuracy of centralized learning.The results demonstrate the effectiveness of Fed-FR in privacy protection,accuracy evaluation,and scalability.This study demonstrates for the first time the feasibility of combining federated learning with fingerprint recognition,enhancing the security and scalability of fingerprint recognition algorithms,and providing a reference for the application of federated learning in biometric technologies.
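Reservoir sampling, on which the client scheduling strategy above is based, selects k items uniformly at random from a stream of unknown length in a single pass. The following is a minimal sketch of the classic Algorithm R; how the paper adapts it to client heterogeneity is not described in the abstract, and the client names and function name are assumptions for illustration.

```python
import random


def reservoir_sample(stream, k, rng):
    """Return k items chosen uniformly at random from an iterable.

    The first k items fill the reservoir; item i (0-based, i >= k)
    then replaces a random slot with probability k / (i + 1).
    """
    reservoir = []
    for index, item in enumerate(stream):
        if index < k:
            reservoir.append(item)
        else:
            slot = rng.randint(0, index)  # inclusive on both ends
            if slot < k:
                reservoir[slot] = item
    return reservoir


# Hypothetical usage: pick 10 of 100 clients for one training round.
clients = (f"client_{i}" for i in range(100))
selected = reservoir_sample(clients, 10, random.Random(42))
```

Every client has the same probability of being selected in a round, regardless of its position in the stream, which is the fairness property that makes the technique attractive for scheduling.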
Multimedia Harmful Information Recognition Method Based on Two-stage Algorithm
SHI Xiaosu, LI Xin, JIAN Ling, NI Huajian
Computer Science. 2024, 51 (6A): 231000052-6.  doi:10.11896/jsjkx.231000052
In the application scenarios of Internet content security supervision and combating and rectifying Internet crimes,existing multimedia harmful information identification methods generally suffer from low computational efficiency,an inability to accurately identify local sensitive information,and identification capabilities limited to a single type of cyber crime.To solve these problems,this paper proposes a multimedia harmful information recognition model based on a two-stage algorithm.The method performs information filtering and content detection in separate stages,and splits the tasks of scene recognition and element target detection.The first stage uses EfficientNet-B2 to build a high-throughput pre-filter model that quickly filters out 80% of images and short videos with normal content.In the second stage,three modules with different network structures are built based on the Meal-V2,Faster RCNN,and NetVLAD networks to meet the recognition requirements of multi-dimensional scenes and multi-feature elements.The results show that the model’s computing efficiency reaches 57FPS(frames per second) on the T4 card,and the recognition accuracy and recall rate of multimedia harmful information exceed 97%.Compared with traditional models,the recognition accuracy on the NPDI dataset and the self-built test dataset increases by 3.09% and 19.26% respectively.
Quantum Circuit Optimization of Camellia Cryptographic Algorithm S-box
LYU Yi, LUO Qingbin, LI Qiang, ZHENG Yuanmeng
Computer Science. 2024, 51 (6A): 230900051-6.  doi:10.11896/jsjkx.230900051
The S-box is an important nonlinear component of the Camellia cryptographic algorithm.In this paper,Toffoli,CNOT and NOT gates are used to construct the quantum circuit of the Camellia S-box.To reduce the computational complexity,according to the algebraic expression of the S-box,the multiplicative inversion in the finite field GF(2^8) is mapped isomorphically to an operation in the composite field GF((2^4)^2),and finally the quantum circuit of the Camellia S-box is synthesized.For optimization,the affine matrix,the isomorphic matrix and a group of matrices corresponding to CNOT gates are first multiplied and then synthesized;the quantum circuit of multiplicative inversion in GF((2^4)^2) is optimized using the DORCIS tool,and the quantum circuit of the matrix operations is optimized using the W-Type algorithm.The resulting quantum circuit of the S-box uses only 20 qubits,52 Toffoli gates,178 CNOT gates,and 13 NOT gates,with a Toffoli depth of 40 and a circuit depth of 130.The correctness of the quantum circuit is verified with IBM’s Aer simulator.Compared with existing results,the quantum resources used in this paper are further reduced.
Fine Grained Security Access Control Mechanism Based on Blockchain
TIAN Hongliang, XIAN Mingjie, GE Ping
Computer Science. 2024, 51 (6A): 230400080-7.  doi:10.11896/jsjkx.230400080
To solve the problems of huge data scale,poor access security and poor privacy security in the industrial IoT,a data security access control mechanism based on blockchain combined with a zero-knowledge token is proposed,and the InterPlanetary File System(IPFS) is used for off-chain storage to expand the storage capacity of the blockchain.A blockchain network is built and smart contracts are deployed on the Hyperledger Fabric platform to define a formal representation of the access process,achieving local and global access authorization in a more fine-grained model;the model and process of access control are also elaborated.Finally,the security and effectiveness of the model are compared and analyzed,and the latency of the blockchain network for access authorization is illustrated through experiments.The results show that the proposed mechanism offers security,effectiveness and low latency in IoT access control.
Airborne Software Audit Method Based on Trusted Implicit Third Party
YUE Meng, ZHU Shibo, HONG Xueting, DUAN Bingyan
Computer Science. 2024, 51 (6A): 230400088-6.  doi:10.11896/jsjkx.230400088
Distributed cloud storage technology provides a new distribution and storage method for the growing volume of airborne software.However,it also means that airlines lose direct control over the software,so the security of airborne software has become one of the issues of greatest concern to airlines.To improve the security of airborne software in the cloud storage environment,an airborne software audit method based on a trusted implicit third party is proposed.Trusted hardware deployed in the cloud performs the audit on behalf of users,which solves the problem that the third-party auditor is not completely trusted in publicly verifiable audit mechanisms.The audit results are recorded as logs for users to query,which reduces both users’ computing costs and their online time.Compared with other trusted implicit third party audit methods,the proposed method saves 10% of the time consumed in the audit calculation process.
Privacy Data Editing Mechanism Based on Distributed Chameleon Hash Function
HUANG Shoumeng, YANG Boxiong, YANG Ming
Computer Science. 2024, 51 (6A): 240100157-5.  doi:10.11896/jsjkx.240100157
With the widespread application of blockchain technology in various fields,data security and user privacy face many unknown threats and challenges.For illegal transaction data that maliciously carries user privacy or illegal attack code,a chameleon hash collision data editing mechanism(DecPRB) based on multi-party monitoring is designed using an attribute strategy and a chameleon hash algorithm.The DecPRB mechanism builds on the chameleon hash editing mechanism and is optimized with a trapdoor hash function that is easy to manage.By computing hash collisions,historical blockchain data can be edited,so that illegal data(especially private data or attack code) publicly available on the blockchain can be deleted.During this update and editing process,all modification permissions are jointly monitored by all nodes on the chain.Finally,security analysis shows that the DecPRB mechanism does not change the security attributes of the blockchain and has strong anti-attack capabilities.Simulation experiments verify the effectiveness of the DecPRB mechanism,which meets data security requirements.The DecPRB mechanism can effectively protect data security and privacy in complex distributed network environments,especially distributed cloud computing and blockchain systems,and contributes to the development of the digital economy.
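To make the idea of editing data by computing a hash collision concrete, the following Python sketch implements a textbook discrete-log chameleon hash, CH(m, r) = g^m · h^r mod p, and uses the trapdoor to find a second randomness value that keeps the hash unchanged for a new message. It is a toy under stated assumptions: the tiny parameters `P`, `Q`, `G`, the function names, and the integer messages are all hypothetical, and it is not the distributed DecPRB construction described in the abstract.

```python
# Toy parameters, far too small for real use (assumption for illustration only).
P = 2039  # safe prime, P = 2 * Q + 1
Q = 1019  # prime order of the subgroup of quadratic residues
G = 4     # quadratic residue other than 1, so it generates the order-Q subgroup


def keygen(trapdoor):
    """Return the public key h = G^trapdoor mod P for a trapdoor in [1, Q-1]."""
    return pow(G, trapdoor, P)


def chameleon_hash(h, m, r):
    """CH(m, r) = G^m * h^r mod P, with m and r reduced modulo Q."""
    return (pow(G, m % Q, P) * pow(h, r % Q, P)) % P


def find_collision(trapdoor, m, r, m_new):
    """Solve m + x*r = m_new + x*r_new (mod Q) for r_new using the trapdoor x."""
    return (r + (m - m_new) * pow(trapdoor, -1, Q)) % Q


# Hypothetical usage: replace message 42 with 99 without changing the hash.
x = 123
h = keygen(x)
m, r, m_new = 42, 777, 99
r_new = find_collision(x, m, r, m_new)
```

Without the trapdoor, finding such a collision is as hard as computing discrete logarithms in the group, which is why control over who holds (or jointly holds) the trapdoor determines who can edit. The modular inverse via `pow(x, -1, Q)` needs Python 3.8 or later.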
Forward and Backward Secure Dynamic Searchable Encryption Schemes Based on vORAM
SHAO Tong, LI Chuan, XUE Lei, LIU Yang, ZHAO Ning, CHEN Qing
Computer Science. 2024, 51 (6A): 230500098-9.  doi:10.11896/jsjkx.230500098
To solve the problem of keyword retrieval caused by encrypting and storing sensitive data on the cloud platform,a forward and backward secure dynamic searchable encryption scheme FBDSE-I is proposed by introducing a new oblivious data structure.By using the history-independence and secure deletion of the oblivious data structure,the FBDSE-I scheme realizes the direct deletion of keyword/file-identifier pairs,ensures the security of data updating,and simplifies the dynamic update process.Furthermore,an improved scheme,FBDSE-II,is proposed to achieve more efficient query operations.A map dictionary structure is used to decouple the oblivious primitives from the search results,so as to reduce the number of vORAM accesses in the query process.In addition,a formal security proof is given,showing that the FBDSE-I and FBDSE-II schemes satisfy Type-I and Type-III backward security respectively while ensuring forward security.Experimental results show that the FBDSE-I and FBDSE-II schemes are more efficient than forward and backward secure dynamic searchable encryption schemes at the same level.In particular,the larger the scale of the data sets,the more significant the advantage becomes.
Study on Cryptographic Verification of Distributed Federated Learning for Internet of Things
ZANG Hongrui, YANG Tingting, LIU Hongbo, MA Kai
Computer Science. 2024, 51 (6A): 230700217-5.  doi:10.11896/jsjkx.230700217
Artificial intelligence is combined with the Internet of Things(IoT) to enhance application usage.In the IoT,data sharing can improve the quality of applications,but it also brings data security issues,such as data leakage and the inability to verify data during the sharing process.This paper proposes a scheme combining distributed federated learning with blockchain and encryption verification to protect the privacy and effectiveness of shared data in the IoT.First,federated learning and blockchain are used to transform the direct sharing of original data into the sharing of encrypted model parameters in the IoT.Next,this paper proposes an encryption verification method to verify and select on-chain parameters during the model aggregation stage.Finally,the proposed method is compared with other methods,and experimental results show that it can effectively ensure data privacy and achieve verification of encrypted data,ensuring the accuracy of the final model and providing guarantees for high-quality data sharing in the IoT.
Interdiscipline & Application
Mathematical Principles of Machine Thinking
ZHU Ping, ZOU Weiming, LYU Pohua, SHI Jin, JIANG Xuetao, MA Yirong
Computer Science. 2024, 51 (6A): 230900104-8.  doi:10.11896/jsjkx.230900104
Constructing machine thinking mechanisms that are understandable to humans is the ultimate goal of this paper.Mathematics is an important thinking tool with which humans describe the state and operating laws of the objective world,and it is also a tool that enables machines to solve problems automatically,run interpretably,and generate intermediate steps.The language describing the objective world is diverse in form,huge in scale,and sparse in features.Its semantic representation,semantic accumulation,semantic analysis,and the implementation of machine thinking mechanisms are all progressively clarified and refined through use cases.In the field of automatic human-like solving of elementary mathematics application problems,machine thinking mainly relies on basic mathematical concepts and their underlying computation theories,including sets,proportions(fractions),inequality relationships,enumeration,and data induction and derivation(trend discrimination).Taking the gradual semantic accumulation and recognition of set elements and their proportions as an example,this paper discusses the application of mathematical principles in machine thinking systems from the perspective of system implementation.Finally,an example is presented to illustrate the complete process and intermediate steps by which a machine automatically solves a specific elementary mathematics application problem in a human-like manner.The methods and prospects of using mathematical tools such as inequalities,enumeration,the number axis,coordinate systems,and mathematical induction and deduction in machine thinking are discussed.
Modeling and Analysis of Implementation Process for Civil Aircraft Certification Test Flight Based on Stochastic Petri Net
DENG Hannian, ZHOU Jie, YANG Bo, YI Lili, FU Guang, ZHOU Peng
Computer Science. 2024, 51 (6A): 230700050-6.  doi:10.11896/jsjkx.230700050
Certification test flight is an important activity for civil aircraft to obtain a type certificate,characterized by high cost and high risk.Studying the implementation process of certification test flight helps promote the orderly conduct of test flights,thereby reducing the test flight cycle and cost.Currently,research on the test flight process is limited to process description and qualitative analysis and lacks formal modeling and performance analysis,which hinders the examination of key links in the process.To solve these problems,the implementation process in the three stages of certification test flight is studied,and a simulation model of it is constructed using a stochastic Petri net(SPN).By establishing a Markov chain isomorphic to the model,a performance analysis of the implementation process is conducted to identify the time-consuming key links in the process.Furthermore,the impact of the implementation rates of key links on the average running time of the process is analyzed.Finally,the feasibility of the model and method is validated through a case study.The results show that the manufacturing compliance check and test flight data processing are the time-consuming key links in the process and should be the focus of process optimization.Compared with improving the speed of an individual link,enhancing the implementation rates of these two key links at the same time costs less and yields a greater improvement in process efficiency.
Forecasting Teleconsultation Demand Based on LSTM and Attention Mechanism
ZHAI Yunkai, QIAO Zhengwen, QIAO Yan
Computer Science. 2024, 51 (6A): 230800119-7.  doi:10.11896/jsjkx.230800119
To predict the demand for teleconsultation more accurately and improve the efficiency of resource allocation for teleconsultation,this paper introduces multiple linear regression(MLR) and an attention mechanism to optimize the long short-term memory(LSTM) network.Firstly,to account for the holiday effect in teleconsultation demand,holiday indicators are generated,and the indicators with high significance are selected as model inputs through multiple regression analysis.Then,the LSTM network learns the complex internal mapping relationships of the input indicators,and the attention mechanism assigns different weights to the indicators.Finally,the prediction results are output according to the weights and the LSTM hidden layer.Based on actual historical teleconsultation data from the National Telemedicine Center,this paper studies the predictive ability of MLR-Attention-LSTM and compares it with ARIMA,SVR,KNN,the BP neural network and the LSTM network.The results show that the improved LSTM model has the highest prediction accuracy.Furthermore,this paper explores the impact of holiday indicators on the performance of the model.The results show that including holiday indicators as inputs further improves the prediction accuracy of the model.This verifies the feasibility and applicability of MLR-Attention-LSTM and holiday-related input variables in teleconsultation demand prediction,and provides theoretical support and practical guidance for telemedicine centers.
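The weighting step performed by an attention mechanism can be sketched in a few lines. The Python example below is a generic illustration, not the MLR-Attention-LSTM model from the abstract: it turns raw relevance scores into softmax weights and forms a weighted sum of hidden-state vectors. The names `softmax` and `attend`, the scores, and the two-dimensional hidden states are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden_states, scores):
    """Return attention weights and the weighted sum of the hidden states."""
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical usage: two hidden states with equal scores get equal weight.
weights, context = attend([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In a trained model the scores would come from learned parameters, so inputs that matter more for the forecast (for example, a holiday indicator) would receive larger weights.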
Supply Chain Decisions Considering Supplier Loss Aversion and Financial Constraints
LI Liying, ZHOU Jun, WANG Min
Computer Science. 2024, 51 (6A): 230800134-7.  doi:10.11896/jsjkx.230800134
With increasingly competitive markets,it is particularly common for suppliers to be underfunded in the presence of yield uncertainty.On this basis,a Stackelberg game model is constructed with a risk-neutral manufacturer as the leader and a loss-averse supplier as the follower.The optimal production input quantity decision of the loss-averse supplier and the optimal loss-sharing decision of the manufacturer are derived under two modes,bank financing and manufacturer’s advance payment financing.The theoretical and numerical analyses show that,when the parameters take different values within a certain range,the supplier and the manufacturer may prefer different financing modes;in the advance payment financing mode,the more loss-averse the supplier is,the more conservative the production strategy she adopts,and the manufacturer then incentivizes the supplier to increase the production input quantity by bearing a larger share of the loss.
Optimization of Distributed Naval Warfare Supply Path Based on Relay Guarantee
ZENG Xiangbing, ZENG Bin, LI Houpu
Computer Science. 2024, 51 (6A): 230600113-9.  doi:10.11896/jsjkx.230600113
Distributed naval warfare is an inevitable trend in the development of future warfare patterns.Compared with the traditional naval warfare style,its combat deployment is more flexible and its combat area is more extensive.Under distributed naval warfare,the logistics support area is widely distributed,the support radius is large,and the degree of dispersion is high,so the traditional accompanying support approach struggles to adapt.A resupply mode that relies on relay support bases for progressive support and uses supply ship formations for material transfer is better suited to the characteristics of distributed naval warfare.Considering the great influence of complex and variable ocean weather on long-distance material delivery,this paper first simulates the effective wave height based on historical weather information for different geographical locations,then measures the influence of weather factors on ship speed and supply expenses based on the effective wave height,and finally triggers feedback between the optimization model and the simulation model to obtain the optimization scheme.The optimized solution is compared with the ideal optimization result obtained without triggering the simulation model,so as to verify the performance of different strategies and obtain an optimized solution that fits the actual application environment,thus improving the robustness of the optimal path and providing better decision support for resupply in distributed naval warfare.
Deep Learning Prediction of Stock Market Combining Media Information and Signal Decomposition
LIU Guang, YI Hong
Computer Science. 2024, 51 (6A): 230600102-12.  doi:10.11896/jsjkx.230600102
Accurate prediction of future returns and risks in the stock market not only helps rational investors to invest more reasonably and effectively,but also provides useful guidance for policy makers and investors.Using financial news headlines,this paper constructs an investor sentiment index that takes into account the cumulative effects of news,based on text analysis methods such as word embedding and machine learning.Taking the Shanghai Composite Index as an example,the empirical analysis decomposes the index’s fluctuation data into its inherent modes using the variational mode decomposition(VMD) method.Finally,the bidirectional gated recurrent unit(BiGRU) is introduced as a deep learning model for price fluctuation prediction.The results show that the investor sentiment index significantly affects the fluctuation of the Shanghai Composite Index,and that the influences of positive and negative emotions are asymmetric.Incorporating the investor sentiment indicators and performing signal decomposition effectively improves stock prediction performance,by up to 20%.In the benchmark scenario,the VMD-BiGRU models outperform multiple econometric models and machine learning models,with higher accuracy and effectiveness,and the overall performance of yield and volatility prediction is improved by more than 40%.When the model is extended to three stocks,Wuliangye,Industrial and Commercial Bank of China and IFLYTEK,it maintains the same stable and accurate prediction performance.
Literature Classification of Individual Reports of Adverse Drug Reactions Based on BERT and CNN
MENG Xiangfu, REN Quanying, YANG Dongshen, LI Keqian, YAO Keyu, ZHU Yan
Computer Science. 2024, 51 (6A): 230400049-6.  doi:10.11896/jsjkx.230400049
Clinically,deaths caused by adverse drug reactions and the sharp increase in hospitalization and outpatient expenses caused by improper drug use have become one of the main problems in safe and rational clinical drug use.At present,retrospective analysis and literature analysis of adverse drug reactions are mostly based on published literature.Academic literature is one of the important sources of data,so automatically processing it in batches is particularly important.Considering the unique expressions of traditional Chinese medicine texts,and based on BERT and its combined algorithms,comparative experiments on text classification techniques are conducted to establish an efficient and fast classification method for the literature of adverse drug reaction case reports,which then distinguishes the types of adverse drug reactions.Experimental results show that the classification accuracy of the BERT algorithm reaches 99.75%,so it can accurately and efficiently classify the reported literature on adverse drug reactions,which is of significant value for assisting medical treatment and constructing structured data from medical texts.
Study on Object Image Structure Algorithm of 3D Human Body Measurement System Based on Personalized Intelligent Customization of Clothing
WANG Yue, REN Jun, MA Fei, WU Long
Computer Science. 2024, 51 (6A): 230600233-5.  doi:10.11896/jsjkx.230600233
At present,the concept of personalized intelligent clothing customization in China is maturing.However,China still lacks practical equipment and technology for the automatic measurement of customer body size.The fundamental reason is that China’s clothing industry still lacks practical 3D human body measurement systems.Human body measurement has its own unique characteristics compared with the measurement of other industrial products.Based on an analysis of existing optical 3D human body scanning measurement systems,this paper addresses the common problems of complex structural design and large volume in existing laser 3D human body scanning measurement systems.The principle of a portable human body scanning measurement system is designed,and a mathematical model of the system is established.The algorithm adopts an active pitch rotation scanning measurement method,which enables the system to obtain a larger measurement range in the vertical direction and to obtain satisfactory 3D human point cloud data with the fewest devices,saving space and reducing volume.
Intelligent Fault Diagnosis Method for Rolling Bearing Based on SAMNV3
ZHANG Lanxin, XIANG Ling, LI Xianze, CHEN Jinpeng
Computer Science. 2024, 51 (6A): 230700167-6.  doi:10.11896/jsjkx.230700167
In order to accurately identify the fault categories of rolling bearings,which are essential components of mechanical equipment,this paper proposes a SAMNV3 intelligent fault diagnosis model for rolling bearings that integrates the self-attention(SA) mechanism and the lightweight network MobileNetV3.This model takes advantage of the adaptive weighting of the features by the self-attention mechanism and the small size of the lightweight network MobileNetV3 to achieve end-to-end rolling bearing intelligent fault diagnosis by directly inputting the original vibration signals from two different datasets into the SAMNV3 model for feature extraction and fault identification and classification.The results of the validation of the two different datasets show that the model has high accuracy and low computational complexity,which can effectively improve the accuracy and reliability of rolling bearing fault diagnosis.
Survivability Evaluation of National Defense Engineering Power System Grid Considering Multiple Attack Strategies
LI Fei, CHEN Tong
Computer Science. 2024, 51 (6A): 230700171-8.  doi:10.11896/jsjkx.230700171
The survivability evaluation of the national defense power grid is an important part of the military technical performance evaluation of the national defense power system.In this paper,a comprehensive multi-indicator evaluation system for grid survivability is established by considering several possible enemy attack strategies and analyzing various characteristics of the grid.The survivability evaluation is quantified in 3 dimensions:invulnerability,security,and recoverability.A simulation analysis is performed for a specific type of power grid.The simulation results show that the assessment method is effective and can provide a convenient approach for optimizing the national defense power grid.
Implementation and Application of Chinese Grammatical Error Diagnosis System Based on CRF
LI Bin, WANG Haochang
Computer Science. 2024, 51 (6A): 230900073-6.  doi:10.11896/jsjkx.230900073
With the improvement of China’s international influence and the worldwide status of Chinese,the number of foreigners who learn Chinese as a second language increases year by year,and Chinese has become one of the most popular languages in the world.Based on this,the research of Chinese grammatical error diagnosis has attracted much attention.This paper first summarizes the current research status from the definition of Chinese grammatical error diagnosis.Secondly,through the analysis of various Chinese grammatical error diagnosis methods,a Chinese grammatical error diagnosis system based on conditional random field (CRF) is constructed to explore the Chinese grammar automatic error detection system and its specific application process,so as to assist Chinese learners in improving their learning efficiency.Experimental results on the CGED2016 dataset show that the system performs well in the detection and identification levels and needs to be improved in the position level.
Adaptive Modification Turbulence Model for Flow Field of Aircraft Calculating in Three Dimensions
PENG Ge, XU Xinggui, LI Zhongwu, REN Weihe, LI Kang, ZHENG Guoxian, DENG Hongyan
Computer Science. 2024, 51 (6A): 230900053-9.  doi:10.11896/jsjkx.230900053
In view of the problems that fluid compressibility judgment relies on empirical knowledge and that confinement constraints are lacking in current three-dimensional simulation methods for the vehicle outer flow field,an adaptive modification turbulence model for calculating the three-dimensional vehicle outer flow field is proposed.The proposed algorithm adopts the density-weighted average N-S equations to calculate the flow field distribution of the compressible fluid according to a priori knowledge of the coefficient of variation of the vehicle outer flow field,uses two strategies,namely expansion-compression modification and surge indeterminate modification,to adaptively modify the compressibility constraints of the k-ε turbulence model,and uses a wall function to approximate the constraints in the near-wall region.Experimental results of three-dimensional modeling and visualization of the outer flow field distribution of a certain type of unmanned aerial vehicle(UAV) show that when the Mach number reaches 1.5,the compressibility modification algorithm has an obvious effect.The proposed adaptive compressibility modification model can effectively judge changes in gas compressibility and improve the accuracy of fluid distribution calculation.
Low-rank HOG Voice Detection Method for Short-wave Communication
BAI Jie, TIAN Ruili, REN Yifu, YUAN Jianxia
Computer Science. 2024, 51 (6A): 230600115-5.  doi:10.11896/jsjkx.230600115
The low accuracy of voice detection in noisy environment is an open challenge for short wave communication.The application of existing methods is limited,because it is difficult to reliably extract accurate and efficient voice features in the noise environment.To solve the above problem,a Low-rank histogram of oriented gradient(LHOG) voice detection method for short wave communication is proposed in this paper.Firstly,target audio source data is preprocessed to realize visual representation of voice information in noisy environment.Then,a low-rank structure is embedded in the HOG feature extractor to alleviate redundant information and reduce noise interference,so as to obtain accurate and efficient features.Finally,the common SVM classification model can be used to reliably distinguish voice from noise in noisy environment.The test results show that the accuracy of this method is 95.12%,the false positive rate is 0.96%,and false negative rate is 13.14%.Compared with the existing mainstream methods,the experiment shows that the average detection accuracy of this method is higher,and resource occupation is less.Therefore,this method can effectively improve the detection and control efficiency of short-wave communication.
Study and Verification on Few-shot Evaluation Methods for AI-based Quality Inspection in Production Lines
JIAO Ruodan, GAO Donghui, HUANG Yanhua, LIU Shuo, DUAN Xuanfei, WANG Rui, LIU Weidong
Computer Science. 2024, 51 (6A): 230700086-8.  doi:10.11896/jsjkx.230700086
With the advent of industry 4.0,the deep integration of manufacturing industry with artificial intelligence(AI) has become an important development trend.Industrial quality inspection has emerged as a significant breakthrough point.However,there is currently a lack of standardized methods for evaluating industrial quality inspection products in the industry.The performance of various quality inspection products is often opaque,making it difficult to optimize and scale up.In response to this situation,this paper proposes an AI-based industrial quality inspection algorithm evaluation method,which is suitable for the application needs of production lines in the industrial field.This method can evaluate AI-based industrial quality inspection products and their competitors in situations where the sample size is small and imbalanced.The evaluation method constructs a data set through cross-validation to avoid the problem of large evaluation result fluctuations caused by small and imbalanced data sets.It also uses gray box testing to avoid the subjectivity in evaluation results caused by a single source of data.Furthermore,it formulates relevant evaluation indicators based on the actual production needs of the production line,which can truly reflect the detection performance of quality inspection products in the production line application scenario.The proposed method is validated through benchmark evaluation of EL testing products for photovoltaic cells,demonstrating its feasibility and its ability to objectively reflect the true performance of various products.Finally,based on the analysis and comparison of the evaluation results,some suggestions are provided for the optimization of AI-based industrial quality inspection products.
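One common way to build cross-validation splits from a small, imbalanced sample is to stratify by class, so that every fold contains some of the rare class. The Python sketch below illustrates that idea only; the abstract does not say that stratification is what the authors use, and the function name `stratified_folds`, the labels, and the fold count are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Split sample indices into k folds, spreading each class round-robin."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the indices of one class across the folds like cards.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical usage: 16 good cells and 4 defective cells, 4 folds.
labels = ["ok"] * 16 + ["defect"] * 4
folds = stratified_folds(labels, k=4)
defects_per_fold = [sum(labels[i] == "defect" for i in fold) for fold in folds]
```

Each fold can then serve once as the evaluation set, and averaging the metric over folds damps the fluctuation that a single small test split would produce.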
Scheduling Optimization Method for Household Electricity Consumption Based on Improved Genetic Algorithm
HUANG Fei, LI Yongfu, GAO Yang, XIA Lei, LIAO Qinglong, DAI Jian, XIANG Hong
Computer Science. 2024, 51 (6A): 230600096-6.  doi:10.11896/jsjkx.230600096
To address insufficient electricity economy and comfort on the customer side during peak consumption periods, an optimization method for household electricity scheduling based on an improved genetic algorithm is proposed. The traditional genetic algorithm is improved, and electricity consumption behavior is optimized, by adopting different coding methods for different types of appliances instead of the single coding of the traditional genetic algorithm, and by using a fitness function with a penalty function to constrain the time required for each appliance’s electricity consumption task. The results show that the proposed algorithm effectively optimizes electricity load scheduling under a time-of-use tariff and provides customers with economical electricity consumption solutions at low complexity. It can thus effectively address the economy and comfort of power consumption during peak periods.
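A fitness function with a penalty term under a time-of-use tariff can be sketched as below. This is an illustration under stated assumptions, not the paper's formulation: the tariff values, the appliance fields (`power_kw`, `duration`, `earliest`, `latest`) and the penalty weight are all hypothetical, and the per-appliance coding schemes mentioned in the abstract are not modeled.

```python
def tariff(hour):
    """Hypothetical time-of-use price per kWh: peak 18:00-22:00, off-peak otherwise."""
    return 1.0 if 18 <= hour < 22 else 0.5


def penalized_cost(start_hours, appliances, penalty_weight=100.0):
    """Electricity cost of a schedule plus a penalty for time-window violations.

    Lower is better. A genetic algorithm would minimize this value (or
    maximize its negative) over candidate start-hour vectors.
    """
    cost = 0.0
    violation = 0
    for start, app in zip(start_hours, appliances):
        # Energy cost: one hour at a time, priced by the tariff of that hour.
        for h in range(start, start + app["duration"]):
            cost += app["power_kw"] * tariff(h % 24)
        # Penalty: hours by which the task leaves its allowed window.
        if start < app["earliest"]:
            violation += app["earliest"] - start
        end = start + app["duration"]
        if end > app["latest"]:
            violation += end - app["latest"]
    return cost + penalty_weight * violation


washer = {"power_kw": 1.0, "duration": 2, "earliest": 8, "latest": 23}
print(penalized_cost([19], [washer]))  # runs entirely in the peak period
print(penalized_cost([10], [washer]))  # runs entirely off-peak
```

The penalty weight is chosen large relative to typical costs so that schedules violating a time window are ranked below any feasible schedule.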
Domain-adversarial Statistical Enhancement for Cross-domain Fault Diagnosis
ZHU Yuhao, ZHANG Songzhao, ZHANG Yong
Computer Science. 2024, 51 (6A): 230700196-6.  doi:10.11896/jsjkx.230700196
Fault diagnosis is of great importance in ensuring the safe and stable operation of large-scale mechanical equipment. However, the collected data often have few or no labels, and the data distribution varies significantly across operating conditions. Traditional machine learning or fine-tuning methods are limited in feature extraction: they rely on a single pattern and a fixed perspective, which makes it difficult to align features of the same class from different domains. To address these issues, this paper proposes a domain-adversarial statistical enhancement-based cross-domain fault diagnosis method called DASEM. The method uses direct transfer deep learning techniques to enhance the representation of global statistical characteristics within the framework of domain adversarial learning. It also integrates these characteristics with local structural patterns by constructing a dual-path feature extractor. The balance between domain labels and data structures is used to describe the manifestation of domain adversarial learning, and the fault diagnosis results are output based on class labels. Experimental results on the bearing datasets from Case Western Reserve University and Jiangnan University demonstrate the effectiveness of DASEM, which achieves average accuracies of 94.90% and 93.15%, respectively, across various cross-domain tasks.
Adaptive Sliding Mode Disturbance Rejection Control for Cricket System of Improved NDO
YANG Liwei, LI Ping, XIA Guofeng, WANG Tao
Computer Science. 2024, 51 (6A): 230700229-5.  doi:10.11896/jsjkx.230700229
To address the challenge of achieving high-precision tracking in a ball and plate system subject to uncertain disturbances such as external loads, friction, and cross-coupling interference, this paper proposes an enhanced approach that combines an improved nonlinear disturbance observer (NDO) and a nonlinear information gain sliding mode control strategy. Utilizing error transformation techniques based on a prescribed performance function (PPF), we constrain the error within specified bounds, and introduce the improved NDO to estimate the uncertain disturbances accurately. We leverage the precise estimation to design a sliding mode controller with a corrective term on the sliding surface, which can reduce the impact of disturbances and enhance control precision. Furthermore, we employ nonlinear information gain to adaptively adjust the switching gain, so as to mitigate chattering effects. Simulation results demonstrate that the enhanced NDO effectively estimates disturbances, leading to improved dynamic performance. The proposed control approach exhibits higher control precision and disturbance rejection capabilities.
Performance Risk Prediction of Power Grid Material Suppliers Based on XGBoost
LI Jinxia, BIAN Huaxing, WEN Fuguo, HU Tianmu, QIN Shihan, WU Han, MA Hui
Computer Science. 2024, 51 (6A): 230400115-9.  doi:10.11896/jsjkx.230400115
The performance quality of power grid material suppliers underpins the safe and stable operation of the power grid. Because it involves many links and complex risk factors, current research on it is relatively scarce and remains at the level of theoretical analysis. To solve this problem, a supplier performance risk prediction model based on XGBoost is proposed. The model considers risk factors across the whole process and integrates internal supply chain operation, knowledge map, external eye inspection, epidemic situation and other data. It constructs 191 risk features through feature engineering for initial training and, after model optimization, retrains on 49 selected features, taking into account the requirements of prediction accuracy and feature interpretability in actual business. The SHAP method is used to explain the model. In comparison experiments against three other mainstream machine learning algorithms, its accuracy, precision and KS value reach 93.05%, 94.45% and 45.38%, respectively, which further verifies the feasibility and superiority of the XGBoost model in performance risk prediction. The prediction model can be applied to the power grid supply chain business to guide practical application.
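The KS value reported above is commonly computed as the largest gap between the cumulative true positive rate and the cumulative false positive rate over all score thresholds. The sketch below assumes that standard definition; the abstract does not spell out the formula, and the function name `ks_statistic` and the toy data are hypothetical.

```python
def ks_statistic(scores, labels):
    """Kolmogorov-Smirnov statistic for a binary risk classifier.

    scores: predicted risk scores (higher means riskier).
    labels: 1 for a risky supplier, 0 otherwise.
    Returns max |TPR - FPR| over all score thresholds.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all samples tied at this score before measuring the gap.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(ks_statistic(scores, labels))
```

A KS value of 0 means the score does not separate the two classes at all, while 1 means perfect separation.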
Multi-agent Based Bidding Strategy Model Considering Wind Power
HUANG Feihu, LI Peidong, PENG Jian, DONG Shilei, ZHAO Honglei, SONG Weiping, LI Qiang
Computer Science. 2024, 51 (6A): 230600179-8.  doi:10.11896/jsjkx.230600179
Against the background of the new power system, the pricing problem of new energy generators has been a research hotspot in the electricity spot market. Compared with traditional energy, wind power output is subject to more uncertain factors, which makes it challenging for wind power generators to find the optimal bid. To address this issue, this paper proposes a pricing strategy model for generators that takes wind power into account, based on the multi-agent reinforcement learning algorithm WoLF-PHC. In the model, the spot market includes wind power, thermal power, and hydropower, each generator is abstracted as an intelligent agent, and a stochastic constrained planning algorithm is used to model the profit function of the wind power agent. For the pricing strategy model of the agents, the D3QN algorithm is combined with the WoLF-PHC algorithm, which enables the model to handle complex state spaces when bidding. In addition, to model the interactive environment, a DDPM diffusion model is proposed to generate wind power output data and optimize the simulation of wind power clearing scenarios. Simulation experiments are carried out on a 3-node power simulation system. Experimental results show that the proposed wind power profit function modeling, WoLF-PHC improvement, wind power output generation, and other techniques are feasible, that they effectively solve the bidding pricing problem of wind power in the spot market, and that the model learns a better strategy after fewer iterations.
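For readers unfamiliar with WoLF-PHC ("Win or Learn Fast" policy hill climbing), the sketch below shows its policy update for a single state in tabular form. It is a simplified illustration only: the paper combines WoLF-PHC with D3QN, which is not modeled here, and the function name, step sizes and toy values are hypothetical.

```python
def wolf_phc_policy_step(q, policy, avg_policy, visits,
                         delta_win=0.01, delta_lose=0.04):
    """One WoLF-PHC policy update for a single state.

    q: action values for this state (e.g. one value per candidate bid).
    policy: current mixed strategy over the actions.
    avg_policy: running average of past policies.
    visits: number of updates already applied to this state.
    """
    n = len(q)
    visits += 1
    # Update the running average policy.
    avg_policy = [avg + (p - avg) / visits for avg, p in zip(avg_policy, policy)]

    # "Winning" means the current policy beats the average policy under q.
    expected = sum(p * v for p, v in zip(policy, q))
    expected_avg = sum(p * v for p, v in zip(avg_policy, q))
    delta = delta_win if expected > expected_avg else delta_lose

    # Hill climbing: shift probability mass toward the greedy action,
    # cautiously when winning and quickly when losing.
    best = max(range(n), key=lambda a: q[a])
    new_policy = list(policy)
    moved = 0.0
    for a in range(n):
        if a != best:
            step = min(new_policy[a], delta / (n - 1))
            new_policy[a] -= step
            moved += step
    new_policy[best] += moved
    return new_policy, avg_policy, visits


q_values = [1.0, 0.0, 0.0]
uniform = [1 / 3] * 3
policy, avg_policy, visits = wolf_phc_policy_step(q_values, uniform, uniform, 0)
print(policy)
```

The two step sizes are the core of the method: a small step when the agent is doing better than its historical average, and a larger one when it is doing worse.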
Research and Implementation of Urban Traffic Accident Risk Prediction in Dynamic Road Network
DONG Wanqing, ZHAO Zirong, LIAO Huimin, XIAO Hui, ZHANG Xiaoliang
Computer Science. 2024, 51 (6A): 230500118-10.  doi:10.11896/jsjkx.230500118
Traffic accident risk prediction through graph convolution networks is a research hotspot in the transportation field. However, existing research on using graph convolution networks for accident risk prediction lacks semantic adjacency in graph construction and cannot learn graph weights adaptively. To address these problems, a data-driven, multi-granularity and multi-view spatio-temporal topology graph is constructed from multi-source traffic big data to accurately model the spatio-temporal correlation and dependency in the traffic network. The nodes of the graph describe the traffic state comprehensively in both the temporal and spatial dimensions, while the edges represent the abstract adjacency relationship between roadways from both geographic and semantic perspectives. Then, a dynamic spatio-temporal graph network based on the spatio-temporal topology graph is designed to accurately predict roadway-level traffic accident risk. The model introduces spatial graph network layers with a multi-head attention mechanism to learn spatial correlations, while temporal learning units based on 1-D dilated convolution capture short-term dependencies and long-term periodicity. In large-scale experiments on real traffic data from the Beijing area, the method achieves a recall of 0.899 and an F1 score of 0.860. It also improves on mainstream methods in other indicators.
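The 1-D dilated convolution used by the temporal units can be illustrated with a few lines of plain Python. This is a sketch under assumptions: it uses a causal variant with zero padding on the left, which the abstract does not specify, and the function name and toy series are hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Causal 1-D dilated convolution with implicit zero padding on the left.

    Output at step t combines signal[t], signal[t - dilation],
    signal[t - 2 * dilation], ... weighted by kernel[0], kernel[1], ...
    A larger dilation widens the receptive field without adding weights.
    """
    out = []
    for t in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the series start count as zero
                total += weight * signal[idx]
        out.append(total)
    return out


series = [1, 2, 3, 4, 5]
print(dilated_conv1d(series, kernel=[1, 1], dilation=1))  # adjacent steps
print(dilated_conv1d(series, kernel=[1, 1], dilation=2))  # every second step
```

Stacking layers with growing dilation (for example 1, 2, 4) lets a short kernel see both recent steps and steps far in the past, which is one way to cover short-term dependencies and longer periodic patterns.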
Traffic Subarea Boundary Control Strategy Based on Nonlinear Traffic Flow Model
WANG Xiaolong
Computer Science. 2024, 51 (6A): 230900016-7.  doi:10.11896/jsjkx.230900016
Urban traffic flow has complex nonlinear dynamic characteristics, which cannot be accurately described by a simplified linear traffic flow model. Therefore, taking into account the influence of perturbation on subarea boundary control, this paper first establishes a nonlinear urban macroscopic traffic flow model with perturbation, so that the model better describes the operation of actual traffic flow. Second, a subarea boundary control strategy based on iterative learning control is designed by exploiting the periodic characteristics of urban traffic flow, and the convergence of the iterative learning control law is analyzed using the Lipschitz condition and partial derivatives. Finally, the effectiveness of the traffic subarea boundary control strategy based on the nonlinear traffic flow model is demonstrated by simulation cases.
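Iterative learning control exploits repetition: the control input of the next trial is the input of the current trial corrected by the current tracking error. The sketch below applies a P-type update to a toy linear scalar system, purely as an illustration. The plant, gain and reference are hypothetical, and the paper's nonlinear macroscopic traffic flow model and its control law are not reproduced.

```python
def run_trial(u, a=0.5, b=1.0):
    """Simulate x(t+1) = a*x(t) + b*u(t) from x(0) = 0 and return x(1..T)."""
    x = 0.0
    outputs = []
    for u_t in u:
        x = a * x + b * u_t
        outputs.append(x)
    return outputs


def ilc(reference, gain=0.8, iterations=30):
    """P-type iterative learning control over repeated trials.

    Update rule: u_next(t) = u(t) + gain * e(t+1), where e is the tracking
    error of the current trial. For this toy plant the error shrinks from
    trial to trial because |1 - gain * b| < 1.
    """
    u = [0.0] * len(reference)
    history = []
    for _ in range(iterations):
        y = run_trial(u)
        # error[t] is the error of the output that u[t] directly drives.
        error = [r - y_t for r, y_t in zip(reference, y)]
        history.append(max(abs(e) for e in error))
        u = [u_t + gain * e_t for u_t, e_t in zip(u, error)]
    return u, history


u, history = ilc([1.0] * 10)
print(history[0], history[-1])
```

In a traffic setting each trial would correspond to one period of the recurring flow pattern (for example one day), so the controller improves by reusing what it learned in the previous period.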