Started in January 1974 (Monthly)
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
ISSN 1002-137X
CN 50-1075/TP
CODEN JKIEBK
Current Issue
Volume 51 Issue 9, 15 September 2024
  
Contents
CONTENTS
Computer Science. 2024, 51 (9): 0-0. 
High Performance Computing
Review on the Development and Application of Checkpointing Technology in High-performance Computing
YAN Xiaoting, WANG Xiaoning, DONG Sheng, ZHAO Yining, XIAO Haili
Computer Science. 2024, 51 (9): 1-14.  doi:10.11896/jsjkx.231000220
As high-performance computers grow in size and complexity, the fault tolerance of applications becomes one of the key challenges facing exascale computing. Checkpointing is one of the main means of achieving application fault tolerance, enabling fault recovery by periodically saving the execution state of an application. This paper reviews the development and application of checkpointing techniques for high-performance computing. First, the development of checkpointing technology in the field of high-performance computing is surveyed. Then, system-level and application-level checkpointing work is described according to the level at which it operates, covering the mainstream tools, the available checkpointing techniques, and the application scenarios in which they are used. The application of checkpointing technology in four areas is then discussed: fault tolerance and resilience in parallel computing, scheduling and migration in HPC, FPGA debugging, and fault tolerance and faithful replay in deep learning. Finally, further research directions for checkpointing technology in the field of high-performance computing are proposed.
Optimizing Distributed GMRES Algorithm with Mixed Precision
GUO Shuaizhe, GAO Jianhua, JI Weixing
Computer Science. 2024, 51 (9): 15-22.  doi:10.11896/jsjkx.231000204
The generalized minimum residual (GMRES) method is an iterative method for solving sparse linear systems and is widely used in areas such as scientific and engineering computing. Exponential data growth has rapidly expanded the scale of problems solved by the GMRES algorithm. To support the solution of large-scale problems, researchers have implemented distributed GMRES algorithms on clusters. However, the current inter-node network still lags significantly behind intra-node fabrics in both bandwidth and latency, which greatly limits the performance of the distributed GMRES algorithm. This paper proposes a mixed-precision approach for optimizing the GMRES algorithm on GPU clusters, in which the transferred data are represented in a low-precision format, significantly reducing network traffic during inter-GPU communication. In addition, this paper proposes a balancing algorithm that dynamically adjusts the precision of the transferred data to achieve the desired residual. Experimental results show that the proposed method achieves an average speedup of 2.4×, and a further average speedup of 7.6× when combined with other optimizations.
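A minimal sketch of the core communication idea, assuming NumPy's float16 as a stand-in for the low-precision wire format; the function names and the scaling scheme are illustrative and not the paper's implementation:

```python
import numpy as np

def compress_for_transfer(v, wire_dtype=np.float16):
    """Scale and down-cast a vector before sending it to another rank/GPU."""
    scale = float(np.max(np.abs(v))) or 1.0      # avoid dividing by zero
    return (v / scale).astype(wire_dtype), scale

def decompress_after_transfer(payload, scale, work_dtype=np.float64):
    """Restore working precision on the receiving side."""
    return payload.astype(work_dtype) * scale

# Toy usage: data exchanged in one distributed GMRES step
v = np.random.rand(1_000_000)                    # 8 MB in float64
payload, s = compress_for_transfer(v)            # 2 MB on the wire
v_back = decompress_after_transfer(payload, s)
print(np.linalg.norm(v - v_back) / np.linalg.norm(v))   # precision lost in transit
```

A dynamic balancer in the spirit of the one described above would monitor the GMRES residual and fall back to a wider wire format whenever convergence stalls.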
Optimization of Atomic Kinetics Monte Carlo Program TensorKMC Based on Machine Learning Atomic Potential Functions
LIU Renyu, CHEN Xin, SHANG Honghui, ZHANG Yunquan
Computer Science. 2024, 51 (9): 23-30.  doi:10.11896/jsjkx.230400010
The nuclear reactor pressure vessel is a crucial component of a nuclear power plant, but it is susceptible to irradiation damage during use. This damage greatly affects its service life and poses a potential safety hazard. The atomic kinetics Monte Carlo (AKMC) method is an effective theoretical method for studying the irradiation damage of materials and can be combined with numerical simulation to study the microstructural evolution of pressure vessels. Since irradiated materials contain defects, the modeling of interatomic interactions must consider non-spherically-symmetric interactions. However, the TensorKMC method does not account for the angular interactions of atoms in its calculations. To address this issue, this paper proposes a fingerprint modeling method that includes angular interactions. It can be combined seamlessly with the triple encoding of TensorKMC, and the computation of angular fingerprints can be simplified by using multiple weights. We have implemented this method in the TensorKMC program. Test results show that the angular fingerprint has a significant impact on the accuracy of the potential function: the higher the maximum angular momentum, the more accurate the potential function, although the simulation time consumed by the program increases significantly. We also test activation functions for the atomic potential function of TensorKMC. The results show that the gradient-smooth Softplus and SquarePlus have a significant advantage over the ReLU used in the initial version of TensorKMC in fitting the high-dimensional potential surface. ReLU has a performance advantage at low maximum angular momentum, but as the maximum angular momentum increases, the choice of activation function has almost no effect on the overall simulation time. We therefore recommend using gradient-smooth activation functions in practical studies.
Heterogeneous Parallel Computing and Performance Optimization for DSMC/PIC Coupled Simulation Based on MPI+CUDA
LIN Yongzhen, XU Chuanfu, QIU Haozhong, WANG Qingsong, WANG Zhenghua, YANG Fuxiang, LI Jie
Computer Science. 2024, 51 (9): 31-39.  doi:10.11896/jsjkx.230300188
DSMC/PIC coupled simulation is an important high-performance computing application that demands efficient parallel computing for large-scale simulations. Due to the dynamic injection and migration of particles, DSMC/PIC coupled simulations based on MPI parallelism often suffer from large communication overheads and have difficulty achieving load balance. To address these issues, we design and implement an efficient MPI+CUDA heterogeneous parallel algorithm based on our self-developed DSMC/PIC simulation software. Combining the characteristics of the GPU architecture and the DSMC/PIC computation, we conduct a series of performance optimizations, including GPU memory access optimization, GPU thread workload optimization, CPU-GPU data transmission optimization, and DSMC/PIC data conflict optimization. We perform large-scale DSMC/PIC coupled heterogeneous parallel simulations on NVIDIA V100 and A100 GPUs in the Beijing Beilong Super Cloud HPC system for a pulsed vacuum arc plasma jet application with billions of particles. Compared with the original pure MPI parallelism, the GPU heterogeneous parallelism significantly reduces simulation time, with a speedup of 550% on two GPU cards compared with 192 CPU cores, while maintaining good strong scalability.
Study on High Performance Computing Container Checkpoint Technology Based on CRIU
CHEN Yiyang, WANG Xiaoning, YAN Xiaoting, LI Guanlong, ZHAO Yining, LU Shasha, XIAO Haili
Computer Science. 2024, 51 (9): 40-50.  doi:10.11896/jsjkx.231000221
Fault tolerance has always been a hot and difficult problem in the field of high-performance computing. Checkpointing is a common technical means of solving the fault tolerance problem: it can dump the state of running processes into files and later recover them. Containers provide strong resource isolation, which offers an ideal running environment and carrier for checkpointing and avoids the abnormalities caused by changes in environment and resources when a task is migrated to a different node. Therefore, combining containers with checkpointing can better support the research and implementation of task migration. This paper focuses on the design and optimization of a Singularity checkpointing scheme based on CRIU (Checkpoint/Restore In Userspace). Based on the characteristics of checkpointing in HPC container applications, effective solutions are given for the safe use of CRIU, migration performance optimization, and maintaining network state. The paper extends the checkpointing function of Singularity and implements the prototype tool Migrator to verify container migration performance, providing support for the subsequent implementation of HPC task migration.
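As background, the underlying CRIU mechanism that such a scheme builds on can be driven roughly as follows; this is a sketch with a placeholder PID and image directory, and it omits the namespace, TCP-connection, and external-mount options that real container and HPC workloads require:

```python
import subprocess

def checkpoint(pid: int, images_dir: str) -> None:
    # Dump the process tree rooted at `pid` into CRIU image files.
    subprocess.run(
        ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir,
         "--shell-job", "--leave-running"],
        check=True,
    )

def restore(images_dir: str) -> None:
    # Recreate the dumped process tree from the image files.
    subprocess.run(
        ["criu", "restore", "--images-dir", images_dir, "--shell-job"],
        check=True,
    )

# Hypothetical usage on a running job:
# checkpoint(12345, "./ckpt"); later, possibly on another node: restore("./ckpt")
```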
LU Parallel Decomposition Optimization Algorithm Based on Kunpeng Processor
XU He, ZHOU Tao, LI Peng, QIN Fangfang, JI Yimu
Computer Science. 2024, 51 (9): 51-58.  doi:10.11896/jsjkx.230900079
Scalable Linear Algebra PACKage (ScaLAPACK) is a parallel computing package for MIMD (multiple instruction, multiple data) parallel computers with distributed storage, and it is widely used in parallel application development based on linear algebra operations. However, during LU decomposition, the routines in the ScaLAPACK library are not communication-optimal and do not take full advantage of current parallel architectures. To solve these problems, a parallel LU factorization optimization algorithm (PLF) based on the Kunpeng processor is proposed to achieve load balancing and adapt to the domestic Kunpeng environment. PLF treats the data of the different partitions of different processes differently: part of each process's data is allocated to the root process for calculation, and after the calculation is completed, the root process distributes the data back to each sub-process, which helps to fully utilize CPU resources and achieve load balancing. Tests are performed on single-node Intel 9320R processors and Kunpeng 920 processors, using Intel MKL (Math Kernel Library) on the Intel platform and the PLF algorithm on the Kunpeng platform. Comparing the performance of solving equation systems of different scales on the two platforms shows that the Kunpeng platform has a significant advantage. In the case of NUMA-level processes and a single thread, the optimized computing efficiency reaches 4.35% on average at small scale, which is 215% higher than Intel's 1.38%; at medium scale the average reaches 4.24%, compared with 1.86% on the Intel platform, an increase of 118%; at large scale the average reaches 4.24%, compared with Intel's 1.99%, an increase of 113%.
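For reference, the panel/update structure that ScaLAPACK distributes over a process grid looks, on a single node and without pivoting, roughly like the sketch below; the block size and the omission of pivoting are simplifications, and this is not the PLF algorithm itself:

```python
import numpy as np

def blocked_lu(A, nb=64):
    """Right-looking blocked LU without pivoting, in place (A must be a float array)."""
    n = A.shape[0]
    for k in range(0, n, nb):
        e = min(k + nb, n)
        # 1. Unblocked LU of the diagonal block A[k:e, k:e].
        for j in range(k, e):
            A[j + 1:e, j] /= A[j, j]
            A[j + 1:e, j + 1:e] -= np.outer(A[j + 1:e, j], A[j, j + 1:e])
        if e == n:
            break
        # 2. Triangular solves for the row panel (U12) and column panel (L21).
        L11 = np.tril(A[k:e, k:e], -1) + np.eye(e - k)
        U11 = np.triu(A[k:e, k:e])
        A[k:e, e:] = np.linalg.solve(L11, A[k:e, e:])          # U12
        A[e:, k:e] = np.linalg.solve(U11.T, A[e:, k:e].T).T    # L21
        # 3. Trailing-matrix update, the dominant and most parallel cost.
        A[e:, e:] -= A[e:, k:e] @ A[k:e, e:]
    return A

A = np.random.rand(512, 512) + 512 * np.eye(512)   # diagonally dominant, safe without pivoting
LU = blocked_lu(A.copy())
L, U = np.tril(LU, -1) + np.eye(512), np.triu(LU)
print(np.allclose(L @ U, A))                        # True
```

In a distributed setting, step 1 is the panel factorization assigned to one process column while step 3 is spread over the whole grid; PLF's redistribution of part of each process's data to the root process targets exactly this load-balance problem.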
CPU Power Modeling Accuracy Improvement Method Based on Training Set Clustering Selection
LI Zekai, ZHONG Jiaqing, FENG Shaojun, CHEN Juan, DENG Rongyu, XU Tao, TAN Zhengyuan, ZHOU Kexing, ZHU Pengzhi, MA Zhaoyang
Computer Science. 2024, 51 (9): 59-70.  doi:10.11896/jsjkx.231100015
Building a high-precision and low-cost CPU power model is crucial for the power management and power optimization of computer systems. It is generally believed that the larger the training set, the higher the accuracy of the CPU power model. However, some studies have found that increasing the size of the training set does not necessarily improve the accuracy of power modeling and sometimes even decreases it. Therefore, it is necessary to screen the training set of the power model so that the accuracy of the CPU power model does not decrease while the low-cost target for model training is achieved. This paper proposes a clustering-based optimization algorithm for training set selection. It first converts PMC-based program features into a p-dimensional vector feature space through principal component analysis (PCA), then clusters the programs according to the optimal number of clusters found, and selects representative programs from each cluster. Finally, following the principle of selecting the strongest representative program within a single cluster and the fewest representative programs across clusters, a low-cost training set is obtained, significantly reducing training overhead without loss of modeling accuracy. The algorithm is evaluated on both x86 and ARM-based processor platforms using linear power modeling and neural network power modeling, and the experimental results validate its effectiveness, indicating a significant improvement in CPU power model accuracy.
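A hedged sketch of the selection step using scikit-learn; the feature matrix, number of principal components, and cluster count are placeholders, and the paper's criteria for the optimal cluster number and the "strongest" representative may differ:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def select_training_programs(pmc_features, n_components=5, n_clusters=8):
    """pmc_features: (n_programs, n_counters) matrix of PMC-based features."""
    # Project the PMC features into a low-dimensional space.
    reduced = PCA(n_components=n_components).fit_transform(pmc_features)
    # Cluster the programs and keep the one closest to each centroid.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(reduced)
    selected = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(reduced[members] - km.cluster_centers_[c], axis=1)
        selected.append(int(members[np.argmin(dists)]))
    return sorted(selected)   # indices of the representative training programs
```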
Padding Load: Load Reducing Cluster Resource Waste and Deep Learning Training Costs
DU Yu, YU Zishu, PENG Xiaohui, XU Zhiwei
Computer Science. 2024, 51 (9): 71-79.  doi:10.11896/jsjkx.231000222
In recent years, large-scale models have achieved remarkable success in domains such as bioinformatics, natural language processing, and computer vision. However, these models often require substantial computational resources during the training and inference stages, resulting in considerable computational costs. In addition, computing clusters experience imbalances between supply and demand, manifesting as low resource utilization and difficulties in task scheduling. To address this problem, the concept of Padding Load is introduced, which leverages idle computing resources for computational tasks. Resources allocated to Padding Load can be preempted by other tasks at any time, but they operate at a lower resource priority and therefore at relatively lower cost. PaddingTorch is a distributed deep learning training framework tailored for Padding Load. Using data from the Alibaba PAI cluster, job scheduling is simulated on four GPUs, specifically during peak task-switching intervals. PaddingTorch is employed to train a protein complex prediction model using the Padding Load approach. While the training duration is 2.8 times that of exclusive resource usage, training costs are reduced by 84% and GPU resource utilization increases by 25.8% during the periods when Padding Load is employed.
Simulation of Limited Entangled Quantum Fourier Transform Based on Matrix Product State
LIU Xiaonan, LIAN Demeng, DU Shuaiqi, LIU Zhengyu
Computer Science. 2024, 51 (9): 80-86.  doi:10.11896/jsjkx.230300215
Unlike classical computing, qubits in quantum computing can be in superposition, and entangled states can form between multiple qubits. Representing a quantum state composed of n qubits requires storing 2^n amplitudes, and this exponential memory cost makes large-scale quantum simulation difficult. Using the HIP-Clang language, based on a CPU+DCU heterogeneous programming model and representing the quantum state as a matrix product state, the quantum Fourier transform is simulated. By exploiting the characteristics of the matrix product state and analyzing the quantum Fourier transform circuit, unnecessary tensor contraction operations and orthogonalization constructions are reduced during the simulation. The tensor contractions in the simulation are analyzed, and the TTGT algorithm is used to perform them while exploiting the DCU's parallel processing capability to improve efficiency. The simulation results are analyzed, and their correctness is verified through the amplitude error and the results of a semi-classical Draper quantum adder. In terms of simulation scale, when the entanglement entropy of the quantum state is maximal, 16 GB of memory can simulate at most a 24-qubit quantum state, whereas when the entanglement of the quantum state is limited, a quantum Fourier transform of hundreds of qubits can be simulated.
Domain Analysis Based Approach to Obtain Identical Results on Varying Number of Processors for Structural Linear Static Software
TANG Dehong, YANG Hao, WEN Longfei, XU Zhengqiu
Computer Science. 2024, 51 (9): 87-95.  doi:10.11896/jsjkx.231100016
Obtaining identical results on varying numbers of processors is a prerequisite for the reliability of parallel CAE software. However, during the development of parallel CAE software, various types of faults that cause non-identical results are often introduced. These faults couple with each other to produce the final non-identical results and are concealed within the various levels of CAE software comprising numerous lines of code. This makes it challenging for parallel CAE software to obtain identical simulation results on varying numbers of processors. When applied to parallel CAE software, traditional approaches such as expert knowledge and fault localization are often characterized by coarse granularity, poor accuracy, high cost, or a lack of systematism. To address this issue, we propose an approach that combines domain analysis with expert knowledge and dataflow state comparison to obtain identical results on varying numbers of processors for structural linear static software. The approach can identify and fix faults that cause non-identical results in structural linear static software with high accuracy and low cost. Based on this approach, we develop a corresponding tool and apply it to identify and fix eight faults in SSTA, a structural linear static software program. This helps SSTA obtain strictly identical results on varying numbers of processors in more than ninety real simulation models and significantly reduces the time required to identify a fault from more than two days to several hours.
Simulation of Grover's Quantum Search Algorithm in “Songshan” Supercomputer System
DU Shuaiqi, LIU Xiaonan, LIAN Demeng, LIU Zhengyu
Computer Science. 2024, 51 (9): 96-102.  doi:10.11896/jsjkx.230600219
With its superposition and entanglement properties, quantum computing has a powerful parallel computing capability. However, current quantum computers cannot maintain stable superposition states of large numbers of qubits while performing quantum operations such as interference and entanglement. Therefore, the current approach to advancing quantum computing is to simulate it on classical computers. The Grover quantum search algorithm is designed for searching unsorted databases, reduces the time complexity to its square root, and can accelerate principal component analysis in machine learning. Studying and simulating the Grover algorithm can therefore promote the development of quantum computing combined with machine learning and lay the foundation for its application; this paper presents such a simulation of the Grover quantum search algorithm on the "Songshan" supercomputer system. Based on a study of the Grover quantum search algorithm, its quantum circuit is simulated. The Toffoli quantum gate is used to optimize the circuit, yielding a universal quantum circuit for the Grover algorithm while eliminating two auxiliary qubits. The experiments are based on the CPU+DCU heterogeneous system of the "Songshan" supercomputer, using a two-level parallel strategy of MPI multi-processing and HIP multi-threading. By adjusting the position of the auxiliary qubits in the quantum circuit, communication between MPI processes is reduced, and data-dependent quantum states are transmitted in fragments. Compared with the serial version, the parallelized simulation achieves a maximum speedup of 560.33×, realizing for the first time a simulation of the Grover quantum search algorithm at a scale of 31 qubits.
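To make the oracle-plus-diffusion structure concrete, a dense statevector simulation of Grover's iteration fits in a few lines of NumPy; this toy version has no MPI/HIP parallelism, and at the paper's 31-qubit scale such a dense vector would not fit in a single node's memory, which is exactly why the distributed simulation is needed:

```python
import numpy as np

def grover(n_qubits, marked, iterations=None):
    N = 2 ** n_qubits
    if iterations is None:
        iterations = int(np.floor(np.pi / 4 * np.sqrt(N)))   # optimal for one marked item
    state = np.full(N, 1 / np.sqrt(N))            # uniform superposition
    for _ in range(iterations):
        state[marked] *= -1                        # oracle: phase flip on the target
        state = 2 * state.mean() - state           # diffusion: inversion about the mean
    return state

amps = grover(n_qubits=10, marked=123)
print(abs(amps[123]) ** 2)                         # success probability close to 1
```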
g-Good-Neighbor Conditional Diagnosability and g-Extra Conditional Diagnosability of Hypercubes Under Symmetric PMC Model
TU Yuanjie, CHENG Baolei, WANG Yan, HAN Yuejuan, FAN Jianxi
Computer Science. 2024, 51 (9): 103-111.  doi:10.11896/jsjkx.230700007
Fault diagnosis plays a very important role in maintaining the reliability of multiprocessor systems, and diagnosability is an important measure of a system's diagnosis capability. Besides the traditional diagnosability, there are also conditional diagnosabilities such as the g-good-neighbor conditional diagnosability and the g-extra conditional diagnosability, where the g-good-neighbor conditional diagnosability is defined under the condition that every fault-free vertex has at least g fault-free neighbors, and the g-extra conditional diagnosability is defined under the condition that every fault-free component contains more than g vertices. Fault diagnosis is performed under a specific diagnosis model, such as the PMC model or the symmetric PMC model, where the symmetric PMC model is a new diagnosis model obtained by adding two assumptions to the PMC model. The n-dimensional hypercube has many excellent properties and has therefore been widely studied. At present, there are a number of diagnosability studies under the PMC model, but diagnosability studies under the symmetric PMC model are lacking. This paper first investigates the upper and lower bounds of the g-good-neighbor conditional diagnosability of hypercubes under the symmetric PMC model, with an upper bound of $2^{g+1}(n-g-1)+2^{g}-1$ when n≥4 and 0≤g≤n-4, and a lower bound of $(2n-2^{g+1}+1)2^{g-1}+(n-g)2^{g-1}-1$ when g≥0 and $n \geqslant \max\{g+4, 2^{g+1}-2^{-g}-g-1\}$. It also studies the upper and lower bounds of the g-extra conditional diagnosability of hypercubes under the symmetric PMC model: the upper bound is $2n(g+1)-5g-2C_{g}^{2}-2$ when n≥4 and 0≤g≤n-4, and the lower bound is $\frac{3}{2} n(g+1)-g-\frac{5}{2} C_{g+1}^{2}-1$ when n≥4 and $0 \leqslant g \leqslant \min \left\{n-4,\left\lfloor\frac{2}{3} n\right\rfloor\right\}$. Finally, the correctness of the relevant theoretical conclusions is verified by simulation experiments.
FP8 Quantization and Inference Memory Optimization Based on MLIR
XU Jinlong, GUI Zhonghua, LI Jia'nan, LI Yingying, HAN Lin
Computer Science. 2024, 51 (9): 112-120.  doi:10.11896/jsjkx.230900143
With the development of object detection models and language models, network models are becoming increasingly large. To better deploy models on end devices, model quantization is usually used to compress them. Existing model quantization strategies are mainly based on FP16, BF16, INT8, and other types. Among them, 8-bit data types are the most effective in reducing inference memory usage and deployment costs, but the INT8 type relies on specific calibration algorithms and cannot handle models with large dynamic ranges and many outliers well. The FP8 type can better fit the data distribution in neural networks and offers multiple formats that trade off dynamic range against precision. However, MLIR currently lacks support for FP8 quantization. To this end, an FP8 quantization simulation strategy based on MLIR is proposed, covering the FP8E4M3 and FP8E5M2 formats. By quantizing and simulating the operators in the network, the impact of the two formats on model inference accuracy is evaluated. A memory reuse strategy based on def-use chains is proposed to address redundant memory allocation in inference engines, further reducing peak memory usage during model inference. The typical Yolov5s and Resnet50 models are selected for testing and verification, and the results show that, compared with the existing INT8 quantization strategy, the FP8 quantization strategy maintains better model accuracy and does not rely on specific calibration algorithms, making deployment more convenient. In terms of model accuracy, the test cases achieve 55.5% and 77.8%, respectively. After memory reuse optimization, peak memory usage is reduced by about 15%~20%.
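A rough NumPy sketch of "fake-quantizing" a tensor to an FP8-like format with a configurable exponent/mantissa split; it ignores subnormals, NaN/Inf handling, and the exact E4M3/E5M2 saturation values, so it only approximates what an MLIR quantization-simulation pass would insert around each operator:

```python
import numpy as np

def fake_quantize_fp8(x, exp_bits=4, man_bits=3):
    """Round float32 values to roughly FP8 precision (E4M3 by default, E5M2 with 5/2)."""
    x = np.asarray(x, dtype=np.float32)
    sign = np.sign(x)
    mantissa, exponent = np.frexp(np.abs(x))        # |x| = mantissa * 2**exponent
    # Keep man_bits + 1 significant bits of the mantissa.
    step = 2.0 ** (man_bits + 1)
    mantissa = np.round(mantissa * step) / step
    # Crudely clamp the exponent to the target format's range.
    bias = 2 ** (exp_bits - 1) - 1
    exponent = np.clip(exponent, -bias + 1, bias + 1)
    return sign * np.ldexp(mantissa, exponent)

w = np.random.randn(4, 4).astype(np.float32)
print(fake_quantize_fp8(w, 4, 3))   # simulate FP8E4M3
print(fake_quantize_fp8(w, 5, 2))   # simulate FP8E5M2
```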
Computer Graphics & Multimedia
Correlation Filter Based on Low-rank and Context-aware for Visual Tracking
SU Yinqiang, WANG Xuan, WANG Chun, LI Chong, XU Fang
Computer Science. 2024, 51 (9): 121-128.  doi:10.11896/jsjkx.230700045
Discriminative correlation filter (DCF)-based visual tracking approaches have attracted remarkable attention due to their good trade-off between accuracy and robustness while running in real time. However, existing trackers still face model drift and even tracking failure under interference such as long-term occlusion, out-of-view motion, and out-of-plane rotation. To this end, we propose a low-rank and context-aware correlation filter (LR_CACF). Specifically, we directly integrate the target and its global context into the DCF framework during filter learning to better discriminate the target from its surroundings. Meanwhile, a low-rank constraint is imposed across frames to emphasize temporal smoothness, so that the learned filter is retained in a low-dimensional discriminative manifold to further improve tracking performance. The ADMM is then used to optimize the model efficiently. Moreover, to handle model corruption, a multimodal detection mechanism is used to identify anomalies in the response: when the feedback is unreliable, the filter stops training and expands the search region to recapture the target. Finally, extensive experiments on the OTB50, OTB100, and DTB70 datasets demonstrate that, compared with the baseline SAMF_CA, LR_CACF achieves gains of 6.9%, 4.0%, and 7.1% in DP, respectively, while the average AUC improves by 3.6%, 2.7%, and 5.4%, respectively. Attribute-based evaluation further shows that the proposed tracker is particularly adept at handling scenes with occlusion, out-of-view motion, out-of-plane rotation, low resolution, and fast motion.
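For context, the standard context-aware correlation filter objective that the SAMF_CA baseline builds on learns a filter $\mathbf{w}$ that responds strongly to the target patch $A_0$ while being suppressed on $k$ sampled context patches $A_i$; the notation here is generic rather than the paper's, and LR_CACF additionally ties the filters of neighboring frames together through the low-rank (temporal smoothness) constraint described above:

$\min_{\mathbf{w}} \|A_0\mathbf{w}-\mathbf{y}\|_2^2+\lambda_1\|\mathbf{w}\|_2^2+\lambda_2\sum_{i=1}^{k}\|A_i\mathbf{w}\|_2^2$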
Image Arbitrary Style Transfer via Artistic Aesthetic Enhancement
LI Xin, PU Yuanyuan, ZHAO Zhengpeng, LI Yupan, XU Dan
Computer Science. 2024, 51 (9): 129-139.  doi:10.11896/jsjkx.230800098
Current research has achieved remarkable success in universal style transfer, which can transfer arbitrary visual styles to content images. However, when evaluating arbitrary style transfer, considering only the retention of semantic structure and the diversity of style patterns is not comprehensive; artistic aesthetics should also be taken into account. Existing methods generally suffer from artistic-aesthetic unnaturalness, manifested as disharmonious patterns and obvious artifacts that make the stylized images easy to distinguish from real paintings. To solve this problem, a novel artistic aesthetic enhancement image arbitrary style transfer (AAEST) approach is proposed. Specifically, a multi-scale artistic aesthetic enhancement module is first designed to alleviate disharmonious patterns by extracting style image features at different scales. At the same time, an aesthetic-style attention module is designed, which uses a channel attention mechanism to adaptively match and enhance style features according to the global aesthetic channel distribution of the aesthetic features. Finally, a covariance transformation fusion module is proposed to transfer the second-order statistics of the enhanced style features to the corresponding content features, achieving aesthetic-enhanced style transfer while preserving the content structure. The effectiveness of the proposed modules and the added loss function is verified by qualitative comparison with four recent style transfer methods and by ablation experiments. In the comparison of five quantitative indicators, four achieve optimal scores. Experimental results show that the proposed method can generate more harmonious style-transferred images.
Change Detection in SAR Images Based on Evolutionary Multi-objective Clustering
ZHOU Yu, YANG Junling, DANG Kelin
Computer Science. 2024, 51 (9): 140-146.  doi:10.11896/jsjkx.230800014
Change detection in SAR images is a challenging task in the field of remote sensing, and maintaining a trade-off between robustness to noise and preservation of detail is an urgent problem: to better suppress speckle noise, most change detection methods inevitably lose image detail to some extent. To solve this problem, a multi-objective clustering algorithm based on MOEA/D is proposed for change detection in SAR images. The change detection problem is formulated as a multi-objective optimization problem: two conflicting objectives are constructed and then optimized simultaneously by the proposed multi-objective clustering algorithm. Finally, a set of change detection maps is obtained, from which users can choose the one that best satisfies their requirements. Experimental results on two SAR images show that the proposed method works well.
Few-shot Shadow Removal Method for Text Recognition
WANG Jiahui, PENG Guangling, DUAN Liang, YUAN Guowu, YUE Kun
Computer Science. 2024, 51 (9): 147-154.  doi:10.11896/jsjkx.230800003
Shadow removal is an important task in computer vision, with the goal of detecting shadow regions in images and removing them. As image editing techniques are constrained by the quality of shadowed images, existing methods exploit knowledge from other tasks and the properties of shadows to obtain more effective feature vectors for shadow removal. Since the color and shape features of text differ from the foreground and background in shadowed images, text may be incorrectly detected as part of the shadows, producing incorrect results. To address this problem, a few-shot shadow removal method for text recognition is proposed. First, the features of text incorrectly identified as shadow are used to produce base-class and new-class data to enhance feature learning of such text in the infrastructure part of the few-shot object detection model. Second, the text itself is used to merge structurally relevant detection boxes under multiple constraints so that the objects are fixed correctly in the enhancement part of the detection-box merging algorithm. Experimental results validate the effectiveness of the proposed method on real and synthetic datasets.
Semantic-guided Neural Network Critical Data Routing Path
ZHU Fukun, TENG Zhen, SHAO Wenze, GE Qi, SUN Yubao
Computer Science. 2024, 51 (9): 155-161.  doi:10.11896/jsjkx.230900109
In recent years, with the popularity of artificial intelligence in various fields, studying interpretability methods for neural networks and understanding their operating principles has become an increasingly important topic. As a subfield of neural network interpretability, the interpretability of network pathways is garnering increasing attention. This paper focuses on the critical data routing path (CDRP), an interpretability method for network pathways. Firstly, the routing-path visualization attribution of CDRP in the input domain is analyzed using the score-class activation map (Score-CAM) method, pointing out the inherent semantic defects of the CDRP approach. Then a channel-semantics-guided CDRP method, termed Score-CDRP, is proposed, which improves the semantic consistency between the original deep neural network and its corresponding CDRP at the level of the method's mechanism. Lastly, experimental results demonstrate that the proposed Score-CDRP approach is more reasonable, effective, and robust than CDRP in terms of the visualization of routing-path heatmaps as well as the corresponding prediction and localization accuracy.
Re-parameterization Enhanced Dual-modal Realtime Object Detection Model
LI Yunchen, ZHANG Rui, WANG Jiabao, LI Yang, WANG Ziqi, CHEN Yao
Computer Science. 2024, 51 (9): 162-172.  doi:10.11896/jsjkx.230700106
The objects captured by drones at high altitude are generally small and have weak features, and they are greatly affected by complex weather conditions; object detection based on visible or infrared images alone often suffers from high rates of missed and false detections. To address this problem, this paper proposes DM-YOLO, a re-parameterization-enhanced dual-modal realtime object detection model. Firstly, visible and infrared images are effectively fused by channel concatenation, which makes efficient use of the complementary information in the dual-modal images at very low cost. Secondly, a more efficient re-parameterization module is proposed, and a more powerful backbone network, RepCSPDarkNet, is constructed on top of it, which effectively improves the feature extraction capability of the backbone for dual-modal images. Then, a multi-level feature fusion module is proposed to enhance the multi-scale feature representation of weak and small objects by fusing their multi-scale feature information with multi-receptive-field dilated convolution and an attention mechanism. Finally, the deep feature layer of the feature pyramid is removed, which reduces the model size while maintaining detection accuracy. Experimental results on the large-scale dual-modal image dataset DroneVehicle show that the detection accuracy of DM-YOLO is 2.45% higher than that of the baseline YOLOv5s and better than that of the YOLOv6 and YOLOv7 models. Furthermore, it effectively improves the accuracy and robustness of object detection under complex weather conditions while achieving a detection speed of 82 frames per second, which meets the requirements of realtime detection.
Night Vehicle Detection Algorithm Based on YOLOv5s and Bistable Stochastic Resonance
HU Pengfei, WANG Youguo, ZHAI Qiqing, YAN Jun, BAI Quan
Computer Science. 2024, 51 (9): 173-181.  doi:10.11896/jsjkx.230600056
To address the missed and false detections caused by weak illumination in night vehicle detection, an improved night vehicle detection algorithm based on bistable stochastic resonance and YOLOv5s is proposed. YOLOv5s is improved in four respects: structures in the Backbone and Neck are replaced to improve the network's ability to detect small targets; a dual attention mechanism composed of coordinate attention (CA) and the energy attention SimAM is added to improve the network's feature extraction for the target; the lightweight backbone FasterNet is adopted to reduce the number of model parameters; and the WIoU loss function is used in the Head to accelerate the convergence of the bounding-box regression loss. The effect of classical bistable stochastic resonance on the nighttime vehicle dataset is analyzed from quantitative and qualitative perspectives, and the enhanced nighttime vehicle images are passed to the improved YOLOv5s network for training. Experimental results show that, compared with the original YOLOv5s, the night vehicle detection algorithm combining the improved YOLOv5s and bistable stochastic resonance achieves better accuracy and a lower missed-detection rate on long-range small targets and densely occluded night vehicle detection tasks.
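The bistable stochastic resonance stage typically refers to the classical overdamped double-well system, given here as a reference form rather than the paper's exact parameterization:

$\frac{\mathrm{d}x}{\mathrm{d}t}=ax-bx^{3}+s(t)+\sqrt{2D}\,\xi(t),\quad a,b>0$

where $s(t)$ is the weak input (here the low-light pixel signal), $\xi(t)$ is unit white noise, and $D$ is the noise intensity; with suitable $a$ and $b$, part of the noise energy is transferred into the signal band, which is what enhances the night images before they are fed to the detector.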
Artificial Intelligence
Survey of Knowledge Graph Representation Learning for Relation Feature Modeling
NIU Guanglin, LIN Zhen
Computer Science. 2024, 51 (9): 182-195.  doi:10.11896/jsjkx.240100113
Knowledge graph representation learning techniques can transform symbolic knowledge graphs into numerical representations of entities and relations and then be effectively combined with various deep learning models to facilitate downstream knowledge-enhanced applications. In contrast to entities, relations fully embody the semantics of knowledge graphs; thus, modeling the various characteristics of relations significantly influences the performance of knowledge graph representation learning. Firstly, for the complex mapping properties of one-to-one, one-to-many, many-to-one, and many-to-many relations, relation-aware mapping-based models, specific-representation-space-based models, tensor-decomposition-based models, and neural-network-based models are reviewed. Next, focusing on modeling relation patterns such as symmetry, asymmetry, inversion, and composition, we summarize models based on modified tensor decomposition, models based on modified relation-aware mapping, and models based on rotation operations. Subsequently, considering the implicit hierarchical relations among entities, we introduce auxiliary-information-based models, hyperbolic-space-based models, and polar-coordinate-system-based models. Finally, for more complex scenarios such as sparse knowledge graphs and dynamic knowledge graphs, this paper discusses future research directions, including integrating multimodal information into knowledge graph representation learning, rule-enhanced modeling of relation patterns, and modeling relation characteristics for dynamic knowledge graph representation learning.
Survey on Event Extraction Methods:Comparative Analysis of Deep Learning and Pre-training
WANG Jiabin, LUO Junren, ZHOU Yanzhong, WANG Chao, ZHANG Wanpeng
Computer Science. 2024, 51 (9): 196-206.  doi:10.11896/jsjkx.231000123
Event extraction emerged alongside the development of information technology. As the demand for extracting useful information from a wide variety of daily information keeps increasing, research on event extraction has attracted more and more attention. This paper first introduces the development of event extraction and clarifies its development context, then introduces the two paradigms of event extraction and presents a comparative analysis of the pipeline and joint extraction paradigms. Secondly, according to the level of extraction, the development of event extraction in recent years is described at the sentence level and the text level. Then, event extraction methods are compared and analyzed from three aspects: traditional methods, deep learning-based methods, and pre-trained-model-based methods. Finally, some typical application scenarios of event extraction are introduced, and the future development of event extraction is discussed in light of the current state of the field.
Multi-modal Fusion Method Based on Dual Encoders
HUANG Xiaofei, GUO Weibin
Computer Science. 2024, 51 (9): 207-213.  doi:10.11896/jsjkx.230700212
The dual-encoder model offers faster inference than the fusion-encoder model and can pre-compute image and text representations during inference. However, the shallow interaction module used in dual-encoder models is not sufficient for complex visual-language comprehension tasks. To address this issue, this paper proposes a new multi-modal fusion method. Firstly, a pre-interactive bridge-tower structure (PBTS) is proposed to establish connections between the top layer of the unimodal encoders and each layer of the cross-modal encoder. This enables comprehensive bottom-up interaction between visual and textual representations at different semantic levels, allowing more effective cross-modal alignment and fusion. At the same time, to better learn the deep interaction between images and text, a two-stage cross-modal attention double distillation method (TCMDD) is proposed, which uses the fusion-encoder model as the teacher model and distills the knowledge of the cross-modal attention matrices of the unimodal encoders and the fusion module in both the pre-training and fine-tuning stages. Pre-training with 4 million images and fine-tuning on three public datasets are used to validate the effectiveness of the method. Experimental results show that the proposed multi-modal fusion method achieves better performance on multiple visual-language comprehension tasks.
Domain-adaptive Entity Resolution Algorithm Based on Semi-supervised Learning
DAI Chaofan, DING Huahua
Computer Science. 2024, 51 (9): 214-222.  doi:10.11896/jsjkx.230800102
Entity resolution is a fundamental task in many natural language processing applications; it aims to determine whether two data entities refer to the same real-world entity. Existing deep learning-based solutions for entity resolution typically require a large amount of annotated data, even when pre-trained language models are used for training, and obtaining such annotated data is challenging in real-world scenarios. To address this issue, a domain-adaptive entity resolution model based on semi-supervised learning is proposed. First, a classifier is trained on the source domain, and then domain adaptation is used to reduce the distributional difference between the source and target domains. Soft pseudo-labels from the augmented target domain are then added to the source domain for iterative training, enabling knowledge transfer from the source to the target domain. Comparison and ablation experiments are performed on 13 datasets from various domains. The results show that, compared with unsupervised baseline models, the proposed model achieves average F1 improvements of 2.84%, 9.16%, and 7.1% across multiple datasets; compared with supervised baseline models, it achieves comparable performance with only 20% to 40% of the labels. Ablation experiments further demonstrate the effectiveness of the proposed model, and better entity resolution results can be obtained in general; the relevant code is publicly available.
Study on Following Car Model with Different Driving Styles Based on Proximal Policy Optimization Algorithm
YAN Xin, HUANG Zhiqiu, SHI Fan, XU Heng
Computer Science. 2024, 51 (9): 223-232.  doi:10.11896/jsjkx.230700131
Autonomous driving plays a crucial role in reducing traffic congestion and improving driving comfort, and enhancing public acceptance of autonomous driving technology remains of significant research importance. Customizing different driving styles for diverse user needs can help drivers understand autonomous driving behavior, enhance the overall driving experience, and reduce psychological resistance to using autonomous driving systems. This study proposes a design approach for deep reinforcement learning models based on the proximal policy optimization (PPO) algorithm, focusing on car-following behavior in autonomous driving scenarios. Firstly, a large dataset of vehicle trajectories on German highways (HDD) is analyzed, and driving behaviors are classified based on features such as time headway (THW), distance headway (DHW), vehicle acceleration, and following speed; characteristic data for aggressive and conservative driving styles are extracted. On this basis, an encoded reward function reflecting driver style is developed. Through iterative learning, deep reinforcement learning models with different driving styles are generated using the PPO algorithm. Simulations are conducted on the highway environment platform. Experimental results demonstrate that the PPO-based driving models with different styles are able to achieve the task objectives and, compared with the traditional intelligent driver model (IDM), accurately reflect distinct driving styles in their driving behavior.
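One illustrative way to encode a style-dependent reward term on time headway for the PPO agent is sketched below; the thresholds and target values are invented for illustration and are not the statistics extracted from the trajectory data:

```python
def style_reward(thw, style="conservative"):
    """Reward shaping on time headway (THW); illustrative thresholds only."""
    target = {"aggressive": 1.0, "conservative": 2.0}[style]   # desired THW in seconds
    if thw < 0.5:                     # dangerously close, penalized in either style
        return -10.0
    # Peak reward at the style's target THW, decaying as the gap deviates from it.
    return max(0.0, 1.0 - abs(thw - target) / target)
```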
CFGT: A Lexicon-based Chinese Address Element Parsing Model
HUANG Wei, SHEN Yaodi, CHEN Songling, FU Xiangling
Computer Science. 2024, 51 (9): 233-241.  doi:10.11896/jsjkx.230900159
As a key step in the geocoding process, address element parsing directly affects the accuracy of geocoding. Owing to the diversity and complexity of Chinese address expressions, two similar address texts may be completely different in their geographical representation. Traditional address element parsing based on dictionary matching cannot handle ambiguous words well and thus shows poor recognition accuracy. A lexicon-based Chinese address element parsing model, CFGT (collaborative flat-graph transformer), is proposed, which uses lexical information such as self-matched words and nearest contextual words to enhance the character-sequence representation of address text, effectively curbing the ambiguity of address expressions. Specifically, the model first constructs two collaboration graphs, flat-lattice and flat-shift, to capture the knowledge of self-matched and nearest contextual words for address characters, and designs a fusion layer to implement collaboration between the graphs. Secondly, with the help of an improved relative position encoding, the enhancing effect of word information on the address character sequence is further strengthened. Finally, a Transformer and conditional random fields are used to parse the address elements. Experiments are conducted on multiple public datasets, such as Weibo and Resume, as well as the private dataset Address. Experimental results show that CFGT outperforms previous Chinese address element parsing models and existing models in the field of Chinese named entity recognition.
Text-Image Gated Fusion Mechanism for Multimodal Aspect-based Sentiment Analysis
ZHANG Tianzhi, ZHOU Gang, LIU Hongbo, LIU Shuo, CHEN Jing
Computer Science. 2024, 51 (9): 242-249.  doi:10.11896/jsjkx.230600117
Multimodal aspect-based sentiment analysis is an emerging task in the field of multimodal sentiment analysis that aims to identify the sentiment of each given aspect in a text and image. Although recent research on multimodal sentiment analysis has made breakthroughs, most existing models simply concatenate features during multimodal fusion without considering whether the image contains information that is semantically irrelevant to the text, which may introduce additional interference. To address these problems, this paper proposes a text-image gated fusion mechanism (TIGFM) model for multimodal aspect-based sentiment analysis, which introduces adjective-noun pairs (ANPs) extracted from the dataset images while the text interacts with the image, and treats the weighted adjectives as image auxiliary information. In addition, multimodal feature fusion is achieved by a gating mechanism that dynamically controls the contribution of the image and the image auxiliary information in the fusion stage. Experimental results demonstrate that the TIGFM model achieves competitive results on two Twitter datasets, validating the effectiveness of the proposed method.
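A minimal PyTorch sketch of such a gate, in which a sigmoid computed from both modalities decides, per dimension, how much visual (and ANP-based auxiliary) information enters the fused representation; the feature dimension and the exact inputs to the gate are assumptions, not the paper's specification:

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, text_feat, image_feat):
        # g in (0, 1): how much of the image-side feature is let through.
        g = torch.sigmoid(self.gate(torch.cat([text_feat, image_feat], dim=-1)))
        return g * image_feat + (1 - g) * text_feat

fusion = GatedFusion(dim=768)
fused = fusion(torch.randn(8, 768), torch.randn(8, 768))   # batch of 8 fused features
```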
Multimodal Sentiment Analysis Model Based on Visual Semantics and Prompt Learning
MO Shuyuan, MENG Zuqiang
Computer Science. 2024, 51 (9): 250-257.  doi:10.11896/jsjkx.230600047
With the development of deep learning, multimodal sentiment analysis has become a research hotspot. However, most multimodal sentiment analysis models either extract feature vectors from different modalities and simply combine them with a weighted sum, which fails to map the data accurately into a unified multimodal vector space, or rely on image captioning models to translate images into text, which extracts too much visual semantics without sentiment information, causes information redundancy, and ultimately degrades model performance. To address these issues, a multimodal sentiment analysis model VSPL based on visual semantics and prompt learning is proposed. The model translates images into a precise, concise, and sentiment-informative visual semantic vocabulary to alleviate information redundancy. Based on prompt learning, the obtained visual semantic vocabulary is combined with pre-designed prompt templates for the sentiment classification task to form new text, achieving modal fusion. This not only avoids the inaccurate feature-space mapping caused by weighted summation but also stimulates the potential of the pre-trained language model through prompt learning. Comparative experiments on multimodal sentiment analysis tasks show that the proposed VSPL model outperforms advanced baseline models on three public datasets. In addition, ablation experiments, feature visualization, and sample analysis are conducted to verify the effectiveness of VSPL.
Image-Text Sentiment Classification Model Based on Multi-scale Cross-modal Feature Fusion
LIU Qian, BAI Zhihao, CHENG Chunling, GUI Yaocheng
Computer Science. 2024, 51 (9): 258-264.  doi:10.11896/jsjkx.230700163
For the image-text sentiment classification task, a cross-modal feature fusion strategy combining early fusion with a Transformer model is usually used to fuse image and text features. However, this strategy tends to focus on the unique information within a single modality while ignoring the interconnections and common information among modalities, so the cross-modal feature fusion is often unsatisfactory. To solve this problem, an image-text sentiment classification method based on multi-scale cross-modal feature fusion is proposed. On the one hand, at the local scale, local feature fusion is carried out based on a cross-modal attention mechanism, so that the model not only focuses on the unique information of the image and text but also explores the connections and common information between them. On the other hand, at the global scale, global feature fusion based on an MLM loss enables the model to model the image and text data globally and further mine the relationship between them, promoting the deep fusion of image and text features. Compared with ten baseline models on two public datasets, MVSA-Single and MVSA-Multiple, the proposed method shows distinct advantages in accuracy, F1 score, and number of model parameters, verifying its effectiveness.
Offline Reinforcement Learning Algorithm for Conservative Q-learning Based on Uncertainty Weight
WANG Tianjiu, LIU Quan, WU Lan
Computer Science. 2024, 51 (9): 265-272.  doi:10.11896/jsjkx.230700151
Offline reinforcement learning, in which the agent learns from a fixed dataset without interacting with the environment, is a current hot spot in reinforcement learning. Many offline reinforcement learning algorithms regularize the value function to force the agent to choose actions within the given dataset. The conservative Q-learning (CQL) algorithm avoids this problem by assigning lower values to out-of-distribution (OOD) state-action pairs through value-function regularization. However, the algorithm is too conservative to precisely recognize out-of-distribution state-action pairs, and it is therefore difficult for it to learn the optimal policy. To address this problem, the uncertainty-weighted conservative Q-learning algorithm (UWCQL) is proposed, which introduces an uncertainty mechanism during training. UWCQL adds an uncertainty weight to the CQL regularization term, assigning a higher conservative weight to actions with high uncertainty, so that the algorithm can more effectively train the agent to choose appropriate state-action pairs from the dataset. The effectiveness of UWCQL is verified on the D4RL MuJoCo datasets against the best offline reinforcement learning algorithms, and the experimental results show that UWCQL performs better.
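For context, the conservative term that CQL adds to the usual Bellman error pushes down the Q-values of actions outside the behavior data; one hedged reading of the uncertainty-weighted variant is to scale this penalty per state-action by a weight that grows with the estimated uncertainty (for example, ensemble disagreement), with the exact weighting left to the paper:

$\mathcal{L}(\theta)=\alpha\,\mathbb{E}_{s\sim\mathcal{D}}\left[\log\sum_{a}\exp Q_{\theta}(s,a)-\mathbb{E}_{a\sim\hat{\pi}_{\beta}(\cdot|s)}Q_{\theta}(s,a)\right]+\frac{1}{2}\,\mathbb{E}_{(s,a,s')\sim\mathcal{D}}\left[\left(Q_{\theta}(s,a)-\hat{\mathcal{B}}^{\pi}\hat{Q}(s,a)\right)^{2}\right]$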
Real-time Prediction Model of Carrier Aircraft Landing Trajectory Based on Stagewise Autoencoders and Attention Mechanism
LI Zhe, LIU Yiyang, WANG Ke, YANG Jie, LI Yafei, XU Mingliang
Computer Science. 2024, 51 (9): 273-282.  doi:10.11896/jsjkx.230700149
During carrier landing, the carrier aircraft should fly along a relatively fixed trajectory to ensure that the touchdown point is located in the area of the stern arresting system. The carrier aircraft trajectory is therefore one of the important bases for the landing signal officer (LSO) to make decisions, and real-time prediction of the trajectory helps the LSO judge the landing situation and form correct guidance instructions in time. This paper proposes a real-time prediction model of the carrier aircraft landing trajectory based on stagewise autoencoders and an attention mechanism. In the first stage, a denoising autoencoder is used to extract features from historical trajectory data; in the second stage, a time-series autoencoder is constructed based on a long short-term memory (LSTM) network, and an attention mechanism is introduced to assign different weights to the encoder outputs at different time steps, adaptively learning their influence on the final prediction. The proposed model is compared with six baseline models in simulation experiments, and the results show that its overall performance is better than that of the baselines and that it can meet the application requirements of real-time and accurate prediction of the landing trajectory.
Assembly Job Shop Scheduling Algorithm Based on Discrete Variable Neighborhood Mayfly Optimization
CHEN Yali, PAN Youlin, LIU Genggeng
Computer Science. 2024, 51 (9): 283-289.  doi:10.11896/jsjkx.230900086
Due to the impact of the epidemic, it has become more urgent for enterprises to reduce costs and increase efficiency by upgrading automated flexible production lines. In this context, the assembly job shop scheduling problem (AJSSP) has once again become a research hotspot in academia and industry. AJSSP has an additional assembly stage compared with ordinary job-shop scheduling problems, which introduces mutual restrictions and multi-machine parallelism and makes the problem harder to solve. To solve this problem, a scheduling method based on a discrete variable-neighborhood mayfly algorithm (D-VNMA) is proposed. The main work is as follows: 1) an encoding and decoding mechanism with Lamarckian characteristics is adopted to realize the iterative inheritance of effective individual information; 2) circle mapping and common heuristic algorithms are used to initialize the mayfly population to ensure population diversity; 3) a novel neighborhood exploration strategy incorporating a variety of distinct neighborhood structures and search strategies is employed to enhance the diversity of search schemes and improve the efficiency of finding locally optimal solutions; 4) an improved mating strategy for male and female mayflies is proposed to accelerate the global exploration ability and the overall convergence speed of the algorithm. In the experiments, the optimal parameter settings of D-VNMA are obtained by the design-of-experiments (DOE) method, and D-VNMA is compared with other algorithms on AJSSP instance data of different sizes. Experimental results show that the probability of D-VNMA obtaining the optimal solution is increased by 30%, and its convergence efficiency is improved by 62.15%.
IRRT*-APF Path Planning Algorithm Considering Kinematic Constraints of Unmanned Surface Vehicle
LIU Yi, QI Jie
Computer Science. 2024, 51 (9): 290-298.  doi:10.11896/jsjkx.230900017
Aiming at the path planning problem of unmanned surface vehicles (USV) in unknown environments, an improved rapidly-exploring random tree-artificial potential field path planning algorithm (IRRT*-APF) that considers the kinematic constraints of the USV is proposed. An improved artificial potential field (APF) method is introduced to improve the obstacle-avoidance performance of the rapidly-exploring random tree (RRT*), and the use of taxicab geometry greatly improves the efficiency of the RRT* algorithm. The proposed IRRT*-APF method is compared with the rolling RRT* algorithm and the PSOFS algorithm in simulation experiments, and the results show that the number of turns and the turning angles of the planned path are significantly reduced, which is conducive to smooth control of the USV, while the planning time is also reduced. Further simulation experiments in environments with wind and wave disturbance show that the proposed algorithm can still plan trajectories consistent with the kinematic constraints of the USV, demonstrating strong robustness against wind and waves.
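The artificial potential field component conventionally uses Khatib-style attractive and repulsive potentials of the following form, given here as a reference rather than the paper's improved variant, with the resulting force $-\nabla(U_{\mathrm{att}}+U_{\mathrm{rep}})$ used to bias tree growth away from obstacles:

$U_{\mathrm{att}}(q)=\frac{1}{2}k_{a}\rho^{2}(q,q_{\mathrm{goal}}),\qquad U_{\mathrm{rep}}(q)=\begin{cases}\frac{1}{2}k_{r}\left(\frac{1}{\rho(q,q_{\mathrm{obs}})}-\frac{1}{\rho_{0}}\right)^{2}, & \rho(q,q_{\mathrm{obs}})\leqslant\rho_{0}\\ 0, & \text{otherwise}\end{cases}$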
Improved Differential Evolution Algorithm Based on Time-Space Joint Denoising
WANG Bin, ZHANG Xinyu, JIN Haiyan
Computer Science. 2024, 51 (9): 299-309.  doi:10.11896/jsjkx.230600074
Abstract PDF(3066KB) ( 203 )   
References | Related Articles | Metrics
In the optimization process of solving engineering problems,the evaluation of individual fitness may be affected by environmental noise,which disturbs the survival-of-the-fittest operation on the population and results in a decline in algorithm performance.In order to combat the impact of noisy environments,an improved differential evolution algorithm based on joint temporal and spatial denoising(SEDADE) is proposed.The population is divided into two subpopulations according to fitness ranking,and the subpopulation composed of poorly evaluated individuals is evolved using an estimation of distribution algorithm(EDA):a Gaussian distribution is used to model the solution space,and the randomness of the noise on multiple individuals in the solution space is exploited to offset the noise impact.The differential evolution(DE) algorithm is used to evolve the subpopulation composed of better evaluated individuals,and a time-based stagnation resampling mechanism is introduced for denoising to improve convergence accuracy.An EDA information utilization operation based on probabilistic selection is performed on the two subpopulations derived from the time-space mixed evolution,and the global information obtained from the EDA search is used to guide the search direction of DE to avoid falling into local optima.In the experiments,benchmark functions disturbed by zero-mean Gaussian noise are used,and SEDADE is found to be competitive with other algorithms.In addition,the effectiveness and rationality of the proposed mechanisms are verified through ablation experiments.
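The following numpy sketch (assumed structure, not the SEDADE code) condenses one generation of the described scheme: the worse-ranked half of the population is resampled from a Gaussian model fitted to the better half (the EDA branch), the better half is evolved by DE/rand/1/bin, and a noisy fitness is averaged over repeated evaluations as a stand-in for the stagnation resampling mechanism.
# One simplified generation: Gaussian EDA for the worse half, DE/rand/1/bin for the better half.
import numpy as np
rng = np.random.default_rng(0)

def noisy_sphere(x):                          # fitness disturbed by zero-mean Gaussian noise
    return float(np.sum(x * x) + rng.normal(0.0, 0.1))

def resampled_fitness(x, k=5):                # temporal denoising: average k repeated evaluations
    return float(np.mean([noisy_sphere(x) for _ in range(k)]))

def one_generation(pop, f=0.5, cr=0.9):
    fit = np.array([resampled_fitness(x) for x in pop])
    order = np.argsort(fit)
    good, bad = pop[order[: len(pop) // 2]], pop[order[len(pop) // 2:]]
    mu, sigma = good.mean(axis=0), good.std(axis=0) + 1e-6
    bad = rng.normal(mu, sigma, size=bad.shape)          # EDA branch: resample the worse half
    new_good = good.copy()
    for i, x in enumerate(good):                         # DE branch: DE/rand/1/bin on the better half
        a, b, c = good[rng.choice(len(good), 3, replace=False)]
        trial = np.where(rng.random(x.shape) < cr, a + f * (b - c), x)
        if resampled_fitness(trial) < resampled_fitness(x):
            new_good[i] = trial
    return np.vstack([new_good, bad])

pop = rng.uniform(-5, 5, size=(20, 4))
for _ in range(10):
    pop = one_generation(pop)
best = pop[np.argmin([resampled_fitness(x) for x in pop])]
print(np.round(best, 3))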
CCSD:Topic-oriented Sarcasm Detection
LIU Qilong, LI Bicheng, HUANG Zhiyong
Computer Science. 2024, 51 (9): 310-318.  doi:10.11896/jsjkx.230600217
Abstract PDF(1988KB) ( 222 )   
References | Related Articles | Metrics
With the development of social media,an increasing number of people express their opinions about hot topics on social platforms,and the use of sarcastic expressions has severely affected the accuracy of sentiment analysis in social media.Current topic-oriented sarcasm detection research does not consider the roles of context and commonsense knowledge simultaneously,and also ignores the scenario of sarcasm recognition under the same topic.This paper proposes a sarcasm detection with context and commonsense(CCSD) approach.Firstly,the model uses the C3KG commonsense knowledge base to generate commonsense text.Then,the target sentence,topic context,and commonsense text are concatenated as the input to a pre-trained BERT model.In addition,an attention mechanism is used to focus on important information in the target sentence and the commonsense text.Finally,sarcasm detection is realized through a gating mechanism and feature fusion.A topic-oriented sarcasm detection dataset is constructed to verify the effectiveness of the proposed model on specific topics.Experimental results show that the proposed model achieves better performance than baseline models.
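A minimal PyTorch sketch of the final fusion step, with random tensors standing in for the BERT [CLS] vectors of the concatenated sentence/topic input and of the commonsense text; the layer sizes and names are assumptions, not the CCSD implementation.
# Gated feature fusion of a sentence representation and a commonsense representation.
import torch
import torch.nn as nn

class GatedFusionClassifier(nn.Module):
    def __init__(self, dim=768, n_classes=2):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, sent_vec, cs_vec):
        g = self.gate(torch.cat([sent_vec, cs_vec], dim=-1))  # elementwise gate in [0, 1]
        fused = g * sent_vec + (1 - g) * cs_vec               # gated feature fusion
        return self.classifier(fused)                         # sarcastic / non-sarcastic logits

sent_vec = torch.randn(4, 768)   # stand-in for the BERT encoding of sentence + topic context
cs_vec = torch.randn(4, 768)     # stand-in for the BERT encoding of the commonsense text
print(GatedFusionClassifier()(sent_vec, cs_vec).shape)        # torch.Size([4, 2])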
Computer Network
Study on Adaptive Cloud-Edge Collaborative Scheduling Methods for Multi-object State Perception
ZHOU Wenhui, PENG Qinghua, XIE Lei
Computer Science. 2024, 51 (9): 319-330.  doi:10.11896/jsjkx.240200036
Abstract PDF(3627KB) ( 207 )   
References | Related Articles | Metrics
With the development of smart cities and intelligent industrial manufacturing,the demand for comprehensive information from surveillance cameras for multi-object visual analysis has become increasingly prominent.Existing research mainly focuses on resource scheduling on servers and improvements of visual models,which often struggle to adequately handle dynamic changes in system resources and task states.With the advancement of edge hardware resources and task processing models,designing an adaptive cloud-edge collaborative scheduling model to meet the real-time requirements of user tasks has become an essential approach to optimizing multi-object state perception tasks.Thus,based on an in-depth analysis of the characteristics of multi-object state perception tasks in cloud-edge scenarios,this paper proposes an adaptive task scheduler model based on soft actor-critic(ATS-SAC).ATS-SAC intelligently decides key task factors such as video stream configuration and model deployment configuration according to real-time analysis of the runtime state,thereby significantly optimizing the accuracy and delay of multi-object state perception tasks in cloud-edge scenarios.Furthermore,an action filtering method based on a user experience threshold is introduced to eliminate redundant decision actions and reduce the decision space of the model.Depending on users' varied demands for the performance of multi-object state perception tasks,the ATS-SAC model can provide three flexible scheduling strategies,namely speed mode,balance mode,and precision mode.Experimental results show that,compared with other execution methods,the scheduling strategies of the ATS-SAC model make multi-object state perception tasks more satisfactory in terms of accuracy and delay.Moreover,when the real-time operating state changes,the ATS-SAC model can dynamically adjust its scheduling strategies to maintain stable task processing results.
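The action filtering idea can be sketched as follows (hypothetical configuration lists and a placeholder quality predictor, not the ATS-SAC model): candidate actions whose predicted accuracy or delay violates the user-experience threshold are removed before the policy chooses among the rest.
# Filtering the joint action space (video configuration x model placement) by a user threshold.
from itertools import product

RESOLUTIONS = [360, 540, 720, 1080]          # candidate video-stream configurations (illustrative)
PLACEMENTS = ["edge", "cloud"]               # candidate model deployment locations (illustrative)

def predict_quality(res, place):
    # placeholder predictor: higher resolution -> higher accuracy and delay;
    # cloud placement adds network delay but a small accuracy bonus
    acc = 0.60 + 0.0003 * res + (0.05 if place == "cloud" else 0.0)
    delay = 20 + 0.08 * res + (60 if place == "cloud" else 0)
    return acc, delay

def filter_actions(min_acc, max_delay_ms):
    kept = []
    for res, place in product(RESOLUTIONS, PLACEMENTS):
        acc, delay = predict_quality(res, place)
        if acc >= min_acc and delay <= max_delay_ms:     # keep only actions meeting the threshold
            kept.append((res, place))
    return kept

# e.g. a "balance mode" style threshold on both accuracy and delay
print(filter_actions(min_acc=0.80, max_delay_ms=120))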
Edge Cloud Computing Approach for Intelligent Fault Detection in Rail Transit
LI Zhi, LIN Sen, ZHANG Qiang
Computer Science. 2024, 51 (9): 331-337.  doi:10.11896/jsjkx.231200190
Abstract PDF(2241KB) ( 188 )   
References | Related Articles | Metrics
Rail transit systems are the main carriers of transportation capacity in modern society and are extremely sensitive to safety.Because multiple components of the system are directly exposed to the environment,they are affected by various environmental conditions and are prone to failures,which may cause train delays,passenger stranding,service outages,or even catastrophic loss of life or property.Therefore,it is necessary to design a fault detection scheme so that effective maintenance measures can be taken.Different from traditional machine learning(ML) based fault classification work,this paper adopts a Chinese bidirectional encoder representations from transformers(BERT) deep learning(DL) model for intelligent fault detection.The model obtains bidirectional contextual understanding when dealing with fault detection tasks,so it can more accurately capture the semantic relationships in sentences and understand fault descriptions.The training of BERT requires a large amount of data,and there are multiple operators in the field of rail transit,each of which holds independent fault detection data.Due to the confidentiality of the data,these data cannot be shared,which limits the training of the BERT model.This paper designs and adopts a federated edge cloud computing method,allowing multiple operators to jointly train the BERT model while maintaining data privacy.Federated learning combined with edge cloud computing allows the data of rail transit operators to be preliminarily processed locally;the summarized gradients are then uploaded to the cloud for model training,and finally the trained model parameters are sent back to each edge device to realize model updates.The research results show that BERT model training using the federated edge cloud computing method is superior to existing advanced solutions on the fault detection task in the field of rail transit.This method not only solves the problem of data confidentiality,but also effectively improves the accuracy and reliability of fault detection.
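The aggregation step can be illustrated with a FedAvg-style PyTorch sketch (identifiers and the tiny linear classifier are stand-ins for the BERT fault classifier, not the deployed system): each operator trains locally on its private fault records, only parameters reach the cloud, and the averaged model is sent back to the edges.
# Federated averaging sketch: local updates on private data, parameter averaging in the cloud.
import copy
import torch
import torch.nn as nn

def local_update(global_model, features, labels, epochs=1):
    model = copy.deepcopy(global_model)            # edge-side copy of the global model
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(features), labels).backward()
        opt.step()
    return model.state_dict()                      # only parameters leave the operator

def fed_avg(states):                               # cloud-side parameter averaging
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in states]).mean(dim=0)
    return avg

# toy stand-in for the BERT fault classifier: 32-dim text features, 4 fault types
global_model = nn.Linear(32, 4)
operators = [(torch.randn(16, 32), torch.randint(0, 4, (16,))) for _ in range(3)]
for _ in range(5):                                 # communication rounds
    states = [local_update(global_model, x, y) for x, y in operators]
    global_model.load_state_dict(fed_avg(states))
print("federated rounds finished")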
CLU-Net Speech Enhancement Network for Radio Communication
YAO Yao, YANG Jibin, ZHANG Xiongwei, LI Yihao, SONG Gongkunkun
Computer Science. 2024, 51 (9): 338-345.  doi:10.11896/jsjkx.230700200
Abstract PDF(3019KB) ( 194 )   
References | Related Articles | Metrics
In order to overcome the adverse effects of environmental and channel noise on speech communication quality in radio systems and improve the speech quality of radio communication,this paper proposes a depthwise separable network called CLU-Net(channel attention and LSTM-based U-Net),which adopts a deep U-shaped architecture and long short-term memory(LSTM).In the network,depthwise separable convolution is used to implement low-complexity feature coding.The combination of attention mechanisms and LSTM attends to the relationships between different convolution channels and the context of clean speech simultaneously,and captures clean speech characteristics with fewer parameters.A variety of noisy speech datasets are tested,including public sets and self-built sets using noise collected in different environments and radio systems.Simulation results on the VoiceBank-DEMAND dataset indicate that the proposed method outperforms similar speech enhancement models in terms of objective metrics such as PESQ and STOI.Field experimental results show that the enhancement scheme can effectively suppress different types of environmental and radio noise,and its performance under low signal-to-noise ratios is superior to that of similar enhancement networks.
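A minimal PyTorch sketch of one encoder block combining the three ingredients mentioned above, with assumed layer sizes rather than the CLU-Net release: a depthwise separable convolution, channel attention over the convolution channels, and an LSTM over the frame axis.
# One encoder block: depthwise separable conv -> channel attention -> LSTM over frames.
import torch
import torch.nn as nn

class SeparableAttnBlock(nn.Module):
    def __init__(self, ch_in, ch_out):
        super().__init__()
        self.depthwise = nn.Conv1d(ch_in, ch_in, kernel_size=5, padding=2, groups=ch_in)
        self.pointwise = nn.Conv1d(ch_in, ch_out, kernel_size=1)
        self.attn = nn.Sequential(nn.Linear(ch_out, ch_out // 4), nn.ReLU(),
                                  nn.Linear(ch_out // 4, ch_out), nn.Sigmoid())
        self.lstm = nn.LSTM(ch_out, ch_out, batch_first=True)

    def forward(self, x):                       # x: (batch, channels, frames)
        h = torch.relu(self.pointwise(self.depthwise(x)))
        w = self.attn(h.mean(dim=2))            # squeeze over time, excite channels
        h = h * w.unsqueeze(-1)                 # channel attention reweighting
        out, _ = self.lstm(h.transpose(1, 2))   # temporal context over frames
        return out.transpose(1, 2)

spec = torch.randn(2, 64, 100)                  # 2 noisy utterances, 64 bands, 100 frames
print(SeparableAttnBlock(64, 128)(spec).shape)  # torch.Size([2, 128, 100])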
Parallel Construction of Edge-independent Spanning Trees in Augmented Cubes
LI Xiajing, CHENG Baolei, FAN Jianxi, WANG Yan, LI Xiaorui
Computer Science. 2024, 51 (9): 346-356.  doi:10.11896/jsjkx.230700041
Abstract PDF(3639KB) ( 182 )   
References | Related Articles | Metrics
In recent years,more and more research has been conducted on interconnection networks.Independent spanning trees(ISTs) can be used in reliable transmission,parallel transmission,secure distribution of information,and parallel diagnosis of faulty servers,and have attracted many researchers' attention.In network communication tasks such as one-to-all broadcasting,reliable communication,multi-node broadcasting,fault-tolerant broadcasting,secure message distribution,and IP fast rerouting,edge-independent spanning trees(EISTs) play a significant role.The n-dimensional augmented cube AQn is a node-symmetric variation of the n-dimensional hypercube Qn,and it has some embedding properties that the hypercube and its variants do not have.However,current EIST construction methods in augmented cubes are serial.This paper first proposes parallel algorithms for constructing 2n-1 trees rooted at any node in AQn.Then,it proves that the 2n-1 trees obtained by the algorithms are EISTs of height n and that the algorithms' time complexity is O(N),where N is the number of nodes in AQn.Finally,the correctness of the algorithms is verified by simulation experiments.
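For readers unfamiliar with AQn, the small helper below (illustrative, not the paper's parallel algorithm) enumerates the 2n-1 neighbors of a node: the n hypercube neighbors obtained by flipping one bit, plus n-1 complement neighbors obtained by flipping a whole low-order block of bits.
# Neighbor enumeration in the augmented cube AQ_n (nodes are n-bit integers).
def aq_neighbors(u, n):
    nbrs = set()
    for i in range(n):                       # hypercube edges: flip a single bit i
        nbrs.add(u ^ (1 << i))
    for i in range(1, n):                    # complement edges: flip bits 0..i together
        nbrs.add(u ^ ((1 << (i + 1)) - 1))
    return sorted(nbrs)                      # 2n-1 distinct neighbors

n = 3
print(aq_neighbors(0b000, n))                # 5 = 2n-1 neighbors of node 000 in AQ_3
print(len(aq_neighbors(0b101, n)))           # also 2n-1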
Integrated VPN Solution
TAO Zhiyong, YANG Wangdong
Computer Science. 2024, 51 (9): 357-364.  doi:10.11896/jsjkx.240200062
Abstract PDF(3947KB) ( 244 )   
References | Related Articles | Metrics
Aiming at the problems that traditional VPNs do not support carrying multiple data types,lack data security,and overburden label edge devices,an integrated VPN solution is proposed.The design includes the establishment of GRE VPN,the establishment of IPSec VPN,the virtualization of network equipment,the establishment of MPLS VPN,and the recognition and isolation of private network data,so as to realize the nesting of data among the VPN technologies and the mutual integration of these technologies.The integrated VPN supports multiple data types,ensures the security of data interaction,achieves private network data access control and address reuse,and also realizes load sharing of data.In order to verify the feasibility of the scheme,the tunnels,network resource pools,and label forwarding paths established by the scheme are tested and verified,and the expected goals are achieved.In order to highlight the advantages of the scheme,it is compared with traditional methods in terms of backplane bandwidth and port rate.The analysis results show that the backplane bandwidth and port rate of the scheme increase with the number of devices in the resource pool,its data transmission capability is multiplied compared with the traditional mode,and the data load is reduced.It is superior to the traditional scheme in load sharing,data security,manageability and maintainability,and provides a new idea for building a practical,reliable and secure VPN.
Information Security
Study on SSL/TLS Encrypted Malicious Traffic Detection Algorithm Based on Graph Neural Networks
TANG Ying, WANG Baohui
Computer Science. 2024, 51 (9): 365-370.  doi:10.11896/jsjkx.230800079
Abstract PDF(2108KB) ( 196 )   
References | Related Articles | Metrics
In order to achieve precise detection of SSL/TLS encrypted malicious traffic and address the excessive reliance on expert experience in traditional machine learning methods,a graph neural network-based model for malicious encrypted traffic detection is proposed.Through the analysis of SSL/TLS encrypted sessions,the interaction information within traffic sessions is characterized by a graph structure,transforming the detection of malicious encrypted traffic into a graph classification task.The proposed model is based on a hierarchical graph pooling architecture,which aggregates information through multiple layers of convolution and pooling and incorporates attention mechanisms to fully exploit node features and graph structure information,resulting in an end-to-end approach for malicious encrypted traffic detection.The proposed model is evaluated on the public CICAndMal2017 dataset.Experimental results demonstrate that it achieves an accuracy of 97.1% in binary classification of encrypted malicious traffic,outperforming other models with an accuracy improvement of 2.1%,a recall improvement of 3.2%,a precision improvement of 1.6%,and an F1 score improvement of 2.1%.These results indicate that the proposed method has superior representational and detection capabilities for malicious encrypted traffic compared with other methods.
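A dense-adjacency PyTorch sketch of the pipeline, deliberately simplified with assumed dimensions (not the paper's model): one normalized graph convolution over the session graph, attention-scored top-k pooling, and a mean readout feeding a benign/malicious classifier.
# Graph classification sketch: graph convolution -> attention top-k pooling -> readout -> logits.
import torch
import torch.nn as nn

class TinyGraphPoolClassifier(nn.Module):
    def __init__(self, feat=16, hidden=32, keep=4):
        super().__init__()
        self.keep = keep
        self.gcn = nn.Linear(feat, hidden)
        self.attn = nn.Linear(hidden, 1)        # node importance scores for pooling
        self.out = nn.Linear(hidden, 2)

    def forward(self, adj, x):                  # adj: (n, n) adjacency, x: (n, feat) node features
        a_hat = adj + torch.eye(adj.size(0))    # add self-loops
        deg = a_hat.sum(dim=1, keepdim=True)
        h = torch.relu(self.gcn((a_hat / deg) @ x))            # row-normalized propagation
        score = self.attn(h).squeeze(-1)
        idx = score.topk(min(self.keep, h.size(0))).indices    # keep the top-k scored nodes
        h = h[idx] * torch.sigmoid(score[idx]).unsqueeze(-1)   # attention-weighted pooled nodes
        return self.out(h.mean(dim=0))          # graph-level readout + classification

adj = (torch.rand(10, 10) > 0.7).float()        # toy session graph with 10 nodes
x = torch.randn(10, 16)                         # per-node traffic features
print(TinyGraphPoolClassifier()(adj, x))        # two logits: benign vs. malicious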
IoT Device Recognition Method Combining Multimodal IoT Device Fingerprint and Ensemble Learning
LU Xulin, LI Zhihua
Computer Science. 2024, 51 (9): 371-382.  doi:10.11896/jsjkx.230800076
Abstract PDF(6754KB) ( 200 )   
References | Related Articles | Metrics
Existing IoT device recognition methods suffer from a single feature dimension for characterizing device fingerprints and incomplete selection of traffic feature information,which easily leads to insufficient ability to characterize traffic features;they also fail to fully exploit the recognition potential of multiple network models,resulting in unsatisfactory recognition results.To address these problems,this paper proposes MultiDI(IoT device recognition method combining multimodal IoT device fingerprints and ensemble learning).First,to enhance the feature representation ability of IoT device fingerprints while preserving traffic feature information,an improved Nilsimsa algorithm and a data visualization method are combined into a multimodal IoT device fingerprint generation algorithm.Then,based on the generated IoT device fingerprint features,three neural network models are used to explore the different dimensional information of the multimodal fingerprint features,enabling more comprehensive learning and recognition of IoT device traffic features.Lastly,to further exploit the recognition potential of the multiple network models,a classification connection network is constructed using weighted classification and the LeakyReLU activation function.The classification connection network is employed for ensemble learning,integrating the recognition results from the multiple network models to enhance the accuracy of the MultiDI method for IoT device recognition.Experimental results show that the MultiDI method achieves weighted F1 scores of 91.3%,98.6% and 99.2% on three datasets,respectively,which verifies its effectiveness.Compared with multiple IoT device recognition methods,it presents a relatively good recognition effect,verifying its efficiency.
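The ensemble step can be sketched as follows (assumed sizes, not the MultiDI code): the class-probability vectors of the three base networks are concatenated and passed through a small classification connection network with LeakyReLU that learns how to weight each base model.
# Stacking-style ensemble: fuse per-model class probabilities through a small network.
import torch
import torch.nn as nn

class ClassificationConnectionNet(nn.Module):
    def __init__(self, n_models=3, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_models * n_classes, 64),
            nn.LeakyReLU(0.1),
            nn.Linear(64, n_classes),
        )

    def forward(self, per_model_probs):          # (batch, n_models, n_classes)
        flat = per_model_probs.flatten(start_dim=1)
        return self.net(flat)                    # fused device-class logits

# stand-ins for the softmax outputs of the three fingerprint networks
probs = torch.softmax(torch.randn(5, 3, 10), dim=-1)
print(ClassificationConnectionNet()(probs).argmax(dim=-1))  # fused device predictions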
Deep-learning Based DKOM Attack Detection for Linux System
CHEN Liang, SUN Cong
Computer Science. 2024, 51 (9): 383-392.  doi:10.11896/jsjkx.230700035
Abstract PDF(1723KB) ( 212 )   
References | Related Articles | Metrics
Direct kernel object manipulation(DKOM) attacks hide kernel objects through direct access to and modification of those objects.Such attacks are a long-term critical security issue in mainstream operating systems.Behavior-based online scanning can efficiently detect limited types of DKOM attacks,but the detection procedure can easily be affected by the attacks themselves.In recent years,memory-forensics-based static analysis has become an effective and secure detection approach for systems potentially attacked by DKOM.The state-of-the-art approach can identify Windows kernel objects using a graph neural network model.However,this approach cannot be adapted to Linux kernel objects and has limitations in identifying small kernel objects with few pointer fields.This paper designs and implements a deep-learning-based DKOM attack detection approach for Linux systems to address these issues.An extended memory graph structure is proposed to depict the points-to relations and the constant-field characteristics of kernel objects.Relational graph convolutional networks are used to learn the topology of the extended memory graph and classify the graph nodes.A voting-based object inference algorithm is proposed to identify the addresses of kernel objects.A DKOM attack is detected by comparing our kernel object identification results with the results of the memory forensics framework Volatility.The contributions of this paper are as follows.1)An extended memory graph structure that improves the effectiveness of the existing memory graph in capturing the features of small kernel data structures with few pointers but evident constant fields.2)On the DKOM attacks launched by five real-world rootkits,our approach achieves 20.1% higher precision and 32.4% higher recall than the existing behavior-based online scanning tool chkrootkit.
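The voting idea behind the object inference algorithm can be illustrated with the toy Python snippet below; the field names and offsets are invented for the example (real task_struct layouts differ across kernel versions and configurations), and the classifier outputs are hard-coded stand-ins.
# Voting-based object inference: each classified field node votes for a candidate object base.
from collections import Counter

# illustrative layout of a task_struct-like object: field name -> offset from the object base
FIELD_OFFSET = {"tasks_next": 0x398, "mm": 0x3f8, "pid": 0x4e8}

# (node_address, predicted_field) pairs as they would come from the graph-node classifier
classified_nodes = [
    (0xffff888100300398, "tasks_next"),
    (0xffff8881003003f8, "mm"),
    (0xffff8881003004e8, "pid"),
    (0xffff888100555000, "pid"),          # a stray misclassification
]

def vote_object_bases(nodes, min_votes=2):
    votes = Counter(addr - FIELD_OFFSET[field] for addr, field in nodes)
    # only bases supported by enough field votes are reported as recovered objects
    return [hex(base) for base, count in votes.items() if count >= min_votes]

print(vote_object_bases(classified_nodes))   # ['0xffff888100300000']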
Enhanced Location K-anonymity Privacy Protection Scheme Based on Geohash
LI Yongjun, ZHU Yuefei, BAI Lifang
Computer Science. 2024, 51 (9): 393-400.  doi:10.11896/jsjkx.230800183
Abstract PDF(2245KB) ( 199 )   
References | Related Articles | Metrics
With the wide application of LBS,location privacy protection is imperative.In recent years,location k-anonymity has become a widely used research hotspot.However,k-anonymity schemes are vulnerable to background knowledge attacks.Although some scholars have considered location-related information to varying degrees,their treatments are not comprehensive,and current anonymous-area construction schemes are relatively time-consuming.Based on this,in order to resist background knowledge attacks from adversaries,an enhanced location k-anonymity scheme is proposed,which fully considers the semantic information,time attributes,query probability and query semantics related to physical locations when constructing anonymous areas.When location points are selected,it is also ensured that the selected locations are relatively scattered.In order to reduce the time consumption of anonymous area construction,Geohash is used to encode the location information.Finally,experiments on real data sets show that the proposed scheme can provide better location privacy protection.
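A self-contained sketch of the Geohash step (illustrative, not the paper's scheme): candidate locations are encoded to Geohash strings, and anonymity-set members are preferred from distinct Geohash prefixes so that the chosen points remain spatially scattered.
# Geohash encoding plus a simple scatter-aware candidate selection.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=6):
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    bits, even, ch, code = 0, True, 0, []
    while len(code) < precision:
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)   # interleave lon/lat bits
        mid = sum(rng) / 2
        ch = (ch << 1) | (1 if val >= mid else 0)
        rng[0 if val >= mid else 1] = mid                       # halve the interval each step
        even = not even
        bits += 1
        if bits == 5:                                           # 5 bits -> one base32 character
            code.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(code)

def scattered_candidates(points, k, prefix_len=4):
    chosen, used_cells = [], set()
    for lat, lon in points:                     # prefer points falling in distinct Geohash cells
        cell = geohash(lat, lon)[:prefix_len]
        if cell not in used_cells:
            chosen.append((lat, lon))
            used_cells.add(cell)
        if len(chosen) == k:
            break
    return chosen

pts = [(39.9042, 116.4074), (39.9045, 116.4080), (39.9150, 116.4500), (39.9500, 116.3000)]
print(geohash(39.9042, 116.4074), scattered_candidates(pts, k=3))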
Cross-age Identity Membership Inference Based on Attention Feature Decomposition
LIU Yulu, WU Shuhong, YU Dan, MA Yao, CHEN Yongle
Computer Science. 2024, 51 (9): 401-407.  doi:10.11896/jsjkx.230600112
Abstract PDF(3635KB) ( 224 )   
References | Related Articles | Metrics
Generative adversarial networks(GANs) can generate high-resolution,realistic images of "non-existent" faces,so they are widely used in various artificial data synthesis scenarios,especially in the field of face image generation.However,face generators based on these models typically require highly sensitive facial images of different identities for training,which may lead to potential data leakage and enable attackers to infer identity membership relationships.To address this issue,this study investigates identity membership inference in the setting where significant differences exist between the obtained samples and the actual training samples of the queried identity,which causes a drastic decline in the performance of sample-based membership inference.A reconstruction error attack scheme based on attention feature decomposition is then designed to further enhance the attack performance.This scheme maximally eliminates the influence of factors such as background and pose between different samples,and mitigates the representation differences caused by large age spans.Extensive experiments are conducted on three representative face datasets,training generative models with three mainstream GAN architectures and performing the proposed attacks.Experimental results demonstrate that the proposed attack scheme achieves an average increase of 0.2 in AUCROC compared with previous research.
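A stripped-down PyTorch sketch of reconstruction-error membership inference under toy assumptions (a tiny generator, a single threshold, and none of the paper's attention feature decomposition): a latent code is optimized to reproduce the query image, and a small residual is taken as evidence of membership.
# Latent-code optimization against a toy generator; the residual serves as the membership signal.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(16, 64), nn.Tanh(), nn.Linear(64, 3 * 8 * 8))  # toy G
for p in generator.parameters():
    p.requires_grad_(False)                       # the attacker only optimizes the latent code

def reconstruction_error(query_img, steps=200, lr=0.05):
    z = torch.zeros(1, 16, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((generator(z) - query_img) ** 2)
        loss.backward()
        opt.step()
    return loss.item()

query = torch.randn(1, 3 * 8 * 8)                 # stand-in for a face image of the queried identity
err = reconstruction_error(query)
print("member" if err < 0.5 else "non-member", round(err, 4))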
Transaction Granularity Modifiable Consortium Blockchain Scheme Based on Dual Merkle Tree Block Structure
WANG Dong, LI Xiaoruo, ZHU Bingnan
Computer Science. 2024, 51 (9): 408-415.  doi:10.11896/jsjkx.231000054
Abstract PDF(2090KB) ( 236 )   
References | Related Articles | Metrics
With the vigorous development of blockchain technology,blockchain-based information systems have been applied in many fields,including digital currency and supply chains.Driven by the dual needs of supervision and practical application,modifiable blockchain technology has been developed.However,current modification schemes still have problems such as excessive centralization of modification authority and low modification efficiency.In response to these problems,a transaction-granularity consortium blockchain ledger modification approach is proposed.It constructs a dual Merkle tree block structure and uses elliptic curve encryption and Diffie-Hellman key exchange to encrypt and store the chameleon hash trapdoor information(i.e.,the chameleon hash private key) on the blockchain,reducing the system communication overhead of key distribution.On this basis,the modification right is bound to the user through the Merkle tree,and each modification proposal is subject to voting review by authorized nodes,which effectively prevents abuse of modification rights and further improves the regulatory capability of the blockchain system.Modification experiments show that the overall algorithm execution speed of this consortium blockchain ledger scheme reaches the millisecond level and significantly reduces the additional overhead of on-chain data operations.
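The trapdoor mechanism that makes the ledger modifiable can be illustrated with a discrete-log chameleon hash over toy parameters; the paper's scheme uses elliptic curves and realistic key sizes, so the snippet below is purely conceptual: whoever holds the trapdoor can replace a transaction while keeping the recorded hash unchanged.
# Toy chameleon hash: with trapdoor x, a new randomness r2 makes m2 collide with the old hash.
import hashlib, random

q, p, g = 1019, 2039, 4                   # tiny prime-order subgroup of Z_p* (demo only)
x = random.randrange(1, q)                # trapdoor (chameleon hash private key)
h = pow(g, x, p)                          # public key

def msg_int(m):                           # map message bytes into Z_q
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % q

def chameleon_hash(m, r):
    return (pow(g, msg_int(m), p) * pow(h, r, p)) % p

def forge_randomness(m, r, m2):           # requires knowledge of the trapdoor x
    return ((msg_int(m) - msg_int(m2) + x * r) * pow(x, -1, q)) % q

m, r = b"tx: old transaction", random.randrange(1, q)
m2 = b"tx: corrected transaction"
r2 = forge_randomness(m, r, m2)
print(chameleon_hash(m, r) == chameleon_hash(m2, r2))   # True: hash value preserved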
Web Access Control Vulnerability Detection Approach Based on Site Maps
REN Jiadong, LI Shangyang, REN Rong, ZHANG Bing, WANG Qian
Computer Science. 2024, 51 (9): 416-424.  doi:10.11896/jsjkx.230900075
Abstract PDF(2285KB) ( 194 )   
References | Related Articles | Metrics
Attackers usually exploit access control vulnerabilities in web applications to gain unauthorized access to systems or to engage in malicious activities such as data theft.Existing methods for detecting access control vulnerabilities in web applications suffer from low page coverage and high detection overhead,resulting in high false negative rates and inefficient performance.To address this issue,a web access control vulnerability detection method based on site maps is proposed using dynamic analysis.The method first establishes separate site maps for different user roles and combines them to create comprehensive site maps for each role.Then,by analyzing the site maps,the expected access control policies of the web application are derived.Illegal test cases are constructed,dynamically executed,and their results analyzed,enabling the detection of unauthorized access and privilege escalation vulnerabilities.Finally,the proposed method is validated on seven real-world open-source web applications.The results demonstrate that this approach significantly reduces overhead,achieves a page coverage rate of over 90%,and successfully detects 10 real vulnerabilities with a recall rate of 100%.
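The detection logic can be sketched as follows (hypothetical URLs, placeholder sessions, and the third-party requests library assumed; not the paper's tool): URLs missing from a role's site map are treated as out of policy for that role, and a success-like response to such a request indicates a potential access control flaw.
# Illegal test cases derived from per-role site maps, replayed with each role's own session.
import requests   # assumed available; any HTTP client would do

role_site_maps = {
    "admin": {"/admin/users", "/admin/export", "/profile"},
    "user":  {"/profile", "/orders"},
}
# placeholders for authenticated sessions of each role (cookies/tokens would be set up beforehand)
sessions = {"admin": requests.Session(), "user": requests.Session()}

def illegal_test_cases(site_maps):
    all_urls = set().union(*site_maps.values())
    for role, reachable in site_maps.items():
        for url in sorted(all_urls - reachable):      # URLs this role never reached legitimately
            yield role, url

def detect(base_url, site_maps):
    findings = []
    for role, url in illegal_test_cases(site_maps):
        resp = sessions[role].get(base_url + url, allow_redirects=False)
        if resp.status_code == 200 and "login" not in resp.text.lower():
            findings.append((role, url))              # success where a denial was expected
    return findings

# print(detect("http://target.example", role_site_maps))  # run only against a test deployment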