Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
CODEN JKIEBK
CONTENTS
Computer Science, 2024, Vol. 51, No. 7
Bug Report Reformulation Method Based on Topic Consistency Maintenance and Pseudo-correlation Feedback Library Extension
LIU Wenjie, ZOU Weiqin, CAI Biyu, CHEN Bingting. Bug Report Reformulation Method Based on Topic Consistency Maintenance and Pseudo-correlation Feedback Library Extension[J]. Computer Science, 2024, 51(7): 1-9. doi:10.11896/jsjkx.230400069
To help developers locate software bugs faster, a family of bug localization techniques based on text retrieval has been proposed. These techniques automatically recommend potentially suspicious code files associated with the bug reports submitted by users. However, because users vary in professional expertise, the quality of bug reports tends to be inconsistent, and some low-quality bug reports cannot be successfully located. To improve the quality of such bug reports, it is common to reformulate them. Existing mainstream reformulation methods, which involve query expansion and query reduction, often face issues such as inconsistent query topics before and after reformulation or the use of poor-quality pseudo-correlation feedback libraries. To address these problems, this paper proposes a bug report reformulation method that maintains topic consistency and extends the pseudo-correlation feedback library. The method consists of two parts: a query reduction stage, which maintains topic consistency by combining a concise problem description with keywords extracted from the text, and a query expansion stage, which uses multiple bug localization tools (Lucene, BugLocator, and Blizzard) to build a comprehensive pseudo-correlation feedback library, from which additional keywords for query expansion are extracted to remedy the low reformulation quality caused by inadequate feedback libraries. Finally, the outputs of the query reduction and expansion stages are combined to form the reformulated query. Experiments on six Java projects show that, among low-quality bug reports that the existing bug localization method fails to locate within the top 10 recommended files, 21%~39% can be located with the proposed reformulation method, that is, Accuracy@10 and MRR@10 reach 10%~16%. Compared with existing reformulation techniques, the proposed method improves Accuracy@10 and MRR@10 by 7%~32% and 2%~13%, respectively.
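The reduce-then-expand pipeline described in this abstract can be summarized in a short sketch. The Python code below is a minimal illustration, not the authors' implementation: the retrieval back ends (Lucene, BugLocator, Blizzard) are stubbed as caller-supplied `retrieve(query, k)` callables, and the keyword counts are arbitrary placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def reduce_query(summary: str, description: str, top_k: int = 10) -> list[str]:
    """Query reduction: keep the concise summary plus top TF-IDF keywords."""
    vec = TfidfVectorizer(stop_words="english")
    scores = vec.fit_transform([summary + " " + description]).toarray()[0]
    terms = vec.get_feature_names_out()
    keywords = [t for t, s in sorted(zip(terms, scores), key=lambda p: -p[1])[:top_k]]
    return summary.lower().split() + keywords

def expand_query(reduced: list[str], retrievers, k: int = 5, top_k: int = 5) -> list[str]:
    """Query expansion: pool the top-k files returned by several retrievers into
    a pseudo-correlation feedback library and mine extra keywords from it."""
    feedback_docs = []
    for retrieve in retrievers:  # e.g. wrappers around Lucene/BugLocator/Blizzard
        feedback_docs.extend(retrieve(" ".join(reduced), k))
    vec = TfidfVectorizer(stop_words="english")
    scores = vec.fit_transform([" ".join(feedback_docs)]).toarray()[0]
    terms = vec.get_feature_names_out()
    extra = [t for t, s in sorted(zip(terms, scores), key=lambda p: -p[1])[:top_k]]
    return reduced + extra  # the reformulated query
```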
Integrated Avionics Software Code Automatic Generation Method for ARINC653 Operating System
LING Shixiang, YANG Zhibin, ZHOU Yong. Integrated Avionics Software Code Automatic Generation Method for ARINC653 Operating System[J]. Computer Science, 2024, 51(7): 10-21. doi:10.11896/jsjkx.230600216
Integrated modular avionics (IMA) is a typical safety-critical system characterized by its distributed, heterogeneous nature and the strong coupling of computing and physical resources. As IMA systems become more complex and intelligent, software is increasingly used to implement system functionalities, and modeling and generating code for such complex software poses significant challenges. This paper presents a code generation approach for IMA systems based on the architecture analysis and design language (AADL). Firstly, an extension of the HMC4ARINC653 (heterogeneous model container for ARINC653) attribute set is proposed to describe IMA software architecture, heterogeneous functional behavior, and non-functional attributes. Secondly, mapping rules from the IMA model to C code and ARINC653 system configuration files are defined, adhering to the MISRA C safety coding guidelines; the generated code can be deployed and simulated on the ARINC653 operating system. Finally, a corresponding prototype tool is designed and implemented, and the effectiveness of the proposed method and tool is validated on the ARINC653 operating system with real industrial cases.
Study on Deep Learning Automatic Scheduling Optimization Based on Feature Importance
YANG Heng, LIU Qinrang, FAN Wang, PEI Xue, WEI Shuai, WANG Xuan. Study on Deep Learning Automatic Scheduling Optimization Based on Feature Importance[J]. Computer Science, 2024, 51(7): 22-28. doi:10.11896/jsjkx.230500220
With the rapid development of deep learning and hardware architectures, the growing diversity of models and hardware makes it increasingly difficult to manually deploy deep learning models with high performance, so current AI compiler frameworks often adopt automatic scheduling. The existing optimization of TVM automatic scheduling suffers from unbalanced datasets in the cost model and overlong scheduling time, so an automatic scheduling optimization strategy based on feature importance is designed in this paper. First, feature importance is analyzed with the XGBoost algorithm. Then, a strategy that reduces the data feature dimensions according to the importance coefficients and reassigns the data labels is adopted to improve the precision of the cost model and the efficiency of automatic scheduling. Experimental results show that the proposed optimization method reduces the automatic scheduling time of three kinds of deep learning models by 9.7%~17.2% and reduces inference time by up to 15%.
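The feature-importance step lends itself to a compact sketch. The snippet below is illustrative only: the feature dimensions, model size, and top-k cut-off are made-up placeholders, and `X`/`y` stand in for the cost model's training features and measured labels. It shows ranking features with XGBoost and shrinking the cost model's input accordingly.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 164))   # e.g. schedule feature vectors (placeholder)
y = rng.random(1000)               # measured performance labels (placeholder)

model = xgb.XGBRegressor(n_estimators=100, max_depth=6)
model.fit(X, y)

# Rank features by importance and keep only the most informative ones,
# shrinking the cost model's input dimension.
importance = model.feature_importances_
top_k = 64
keep = np.argsort(importance)[::-1][:top_k]
X_reduced = X[:, keep]
```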
Natural Language Requirements Based Approach for Automatic Test Cases Generation of SCADE Models
SHAO Wenxin, YANG Zhibin, LI Wei, ZHOU Yong. Natural Language Requirements Based Approach for Automatic Test Cases Generation of SCADE Models[J]. Computer Science, 2024, 51(7): 29-39. doi:10.11896/jsjkx.230600126
With the increasing scale and complexity of safety-critical software, model-driven development (MDD) is widely used in safety-critical fields. As an important modeling method and tool, SCADE can express deterministic concurrent behavior, has precise time semantics, and is suitable for modeling, testing and verifying safety-critical software. At present, SCADE model test cases are mainly constructed manually, which suffers from inconsistency between requirements and test cases, high cost, and proneness to error. This paper presents a method for automatically generating SCADE model test cases from natural language requirements. Firstly, a test case generation method based on model checking is presented: natural language requirements are processed into atomic propositions used to generate the assume and observer models, and trap property generation rules are provided to derive trap properties for model checking. Secondly, a test case quality evaluation method based on coverage analysis and mutation testing is presented, and mutation testing is carried out on the SCADE model. Finally, a prototype tool is designed and implemented, and an industrial case of a pilot ejection seat control system is analyzed to verify the effectiveness of the proposed method.
Advances in SQL Intelligent Synthesis Technology
LIU Yumeng, ZHAO Yijing, WANG Bicong, WANG Chao, ZHANG Baomin. Advances in SQL Intelligent Synthesis Technology[J]. Computer Science, 2024, 51(7): 40-48. doi:10.11896/jsjkx.231000143
In recent years, with the rapid development of technologies such as big data and cloud computing, large-scale data generation has deepened the dependence of various applications on database technology. However, traditional databases typically operate through the formal database query language SQL, which poses a significant difficulty for users without programming or database experience and reduces the accessibility of databases across various fields. With the rapid advancement of artificial intelligence technologies such as machine learning and deep neural networks, and especially the surge of large language model technology sparked by the emergence of ChatGPT, databases and intelligent technology have undergone a profound synthesis and technological transformation. Intelligent methods are employed to automatically translate user input into SQL, meeting the operational needs of database users with varying levels of expertise and enhancing databases' intelligence, environmental adaptability, and user-friendliness. To comprehensively survey the latest research on intelligent SQL synthesis technology, this paper examines three types of user input (example-based, text-based, and voice-based) and provides a detailed exposition of the research trajectory, representative works, and latest advances of various intelligent synthesis models. Additionally, this paper categorizes and compares the technical frameworks of these methods and provides an overall summary. Finally, it looks forward to future development directions in light of the problems and challenges of current methods.
Development on Methods and Applications of Cognitive Computing of Urban Big Data
LIU Wei, SUN Jia, WANG Peng, CHEN Yafan. Development on Methods and Applications of Cognitive Computing of Urban Big Data[J]. Computer Science, 2024, 51(7): 49-58. doi:10.11896/jsjkx.221200039
Urban big data provides theoretical and practical support for estimating urban operating states and for comprehensive decision-making, while its multi-source heterogeneity, low coupling, and dynamic change pose great challenges to traditional integrated analysis. Cognitive computing is suitable for mining time-varying, multidimensional, complex and diverse data, and can adaptively learn and evolve with the problem. Based on the characteristics of different types and structures of urban big data, this paper summarizes the corresponding processing methods according to the four stages of the cognitive process, and further classifies these methods at the conceptual level from the perspectives of knowledge-driven, data-driven, and jointly knowledge- and data-driven approaches. In this way, methods of different driving modes collaborate organically across the cognitive process, forming an urban big data cognitive closed loop from perception and understanding to decision-making behavior. The paper also summarizes the research and development status of urban big data cognitive computing in multiple application fields. Finally, the challenges of cognitive computing in urban big data construction are discussed, and future development trends are prospected.
Overview of Sample Reduction Algorithms for Support Vector Machine
ZHANG Daili, WANG Tinghua, ZHU Xinglin. Overview of Sample Reduction Algorithms for Support Vector Machine[J]. Computer Science, 2024, 51(7): 59-70. doi:10.11896/jsjkx.230400143
Support vector machine (SVM) is a supervised machine learning algorithm developed from statistical learning theory and the principle of structural risk minimization. It effectively overcomes the problems of local minima and the curse of dimensionality, has good generalization performance, and has been widely used in pattern recognition and artificial intelligence. However, the learning efficiency of SVM decreases significantly as the number of training samples increases. On large-scale training datasets, traditional SVM with standard optimization methods faces excessive memory requirements and slow training, and sometimes cannot be executed at all. To alleviate the high storage requirements and long training time of SVM on large-scale training sets, scholars have proposed SVM sample reduction algorithms. This paper first introduces the theoretical basis of SVM and then systematically reviews the current research status of SVM sample reduction algorithms from five aspects: methods based on clustering, geometric analysis, active learning, incremental learning, and random sampling. It discusses the advantages and disadvantages of these algorithms, and finally presents an outlook on future research on SVM sample reduction methods.
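As a concrete illustration of the clustering-based family in this taxonomy, the sketch below implements one simple two-step heuristic (not any specific algorithm from the survey): compress each class to cluster centers, fit a rough linear SVM on the centers, and retrain only on the samples that fall inside a band around its decision boundary. Binary labels and illustrative parameter values are assumed.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def reduce_and_train(X, y, k=20, band=1.0):
    """X: (n, d) samples; y: binary labels. k and band are illustrative."""
    centers, labels = [], []
    for c in np.unique(y):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X[y == c])
        centers.append(km.cluster_centers_)
        labels.append(np.full(k, c))
    rough = SVC(kernel="linear").fit(np.vstack(centers), np.concatenate(labels))

    # Keep only samples inside a band around the rough hyperplane: the
    # candidates most likely to end up as support vectors.
    keep = np.abs(rough.decision_function(X)) < band
    return SVC(kernel="linear").fit(X[keep], y[keep]), int(keep.sum())
```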
Efficient Query Workload Prediction Algorithm Based on TCN-A
BAI Wenchao, BAI Shuwen, HAN Xixian, ZHAO Yubo. Efficient Query Workload Prediction Algorithm Based on TCN-A[J]. Computer Science, 2024, 51(7): 71-79. doi:10.11896/jsjkx.231100200
To address the problem that database management systems cannot be tuned in time because query workloads change dynamically and are difficult to forecast effectively in big data querying, a query workload prediction algorithm based on a novel time series prediction model is proposed. First, the algorithm preprocesses the original historical user queries through filtering, temporal interval partitioning, and query workload construction to obtain a query workload sequence that is convenient for the network model to analyze and process. Second, the algorithm builds a time series prediction model with a temporal convolutional network at its core, extracting the historical trend and autocorrelation characteristics of the query workload to realize efficient time series prediction. At the same time, the algorithm integrates a specially designed temporal attention mechanism that weights the important query workloads, ensuring that the query workload sequence can be analyzed and computed efficiently by the model and thus improving prediction performance. Finally, the algorithm uses this time series prediction model to make full use of the query interval time and accurately predict future query workloads, so that the database management system can tune its own performance in advance to adapt to dynamically changing workloads. Experimental results show that the designed query workload prediction algorithm exhibits good prediction performance on several evaluation metrics and is able to accurately predict future query workloads within the query time interval.
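A causal dilated convolution block, the standard building block of a temporal convolutional network like the one at the core of this predictor, can be sketched as follows. Layer sizes and depths are illustrative rather than the paper's configuration, and the temporal attention module is omitted.

```python
import torch
import torch.nn as nn

class CausalBlock(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation           # left-pad only => causal
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (batch, channels, time)
        out = self.conv(nn.functional.pad(x, (self.pad, 0)))
        return self.relu(out) + x                          # residual connection

# Stack blocks with exponentially growing dilations to widen the receptive field.
tcn = nn.Sequential(*[CausalBlock(16, dilation=2 ** i) for i in range(4)])
y = tcn(torch.randn(8, 16, 128))                           # shape preserved: (8, 16, 128)
```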
Dynamic Treatment Regime Generation Model Combining Dead-ends and Offline Supervision Actor-Critic
YANG Shasha, YU Yaxin, WANG Yueru, XU Jingming, WEI Yangjie, LI Xinhua. Dynamic Treatment Regime Generation Model Combining Dead-ends and Offline Supervision Actor-Critic[J]. Computer Science, 2024, 51(7): 80-88. doi:10.11896/jsjkx.231000138
Reinforcement learning has low dependence on mathematical models and can easily construct and optimize models from experience, which makes it well suited to learning dynamic treatment regimes. However, existing studies still have the following problems: 1) risk is not considered when learning policy optimality, so the learned policy carries certain risks; 2) the distribution shift problem is ignored, so the learned policies can differ completely from the doctors' policies; 3) the patient's historical observation data and treatment history are ignored, so a good representation of the patient's state cannot be obtained and the optimal policy cannot be learned. On this basis, DOSAC-DTR, a dynamic treatment regime generation model combining dead-ends and offline supervised actor-critic, is proposed. First, considering the risk of the treatment actions recommended by the learned policies, the concept of dead-ends is integrated into the actor-critic framework. Second, to alleviate the distribution shift problem, physician supervision is integrated into the actor-critic framework, minimizing the gap between the learned policies and the doctors' policies while maximizing the expected return. Finally, to obtain a state representation that includes critical patient historical information, an LSTM-based encoder-decoder model is used to model the patient's historical observations and treatment history. Experiments show that DOSAC-DTR outperforms baseline approaches, with lower estimated mortality rates and higher Jaccard coefficients.
Decision Implication Preserving Attribute Reduction in Decision Context
BI Sheng, ZHAI Yanhui, LI Deyu. Decision Implication Preserving Attribute Reduction in Decision Context[J]. Computer Science, 2024, 51(7): 89-95. doi:10.11896/jsjkx.230900009
Formal concept analysis is a theory of data analysis based on concept lattices, and attribute reduction is one of the main ways of reducing a concept lattice. Decision implication is a knowledge representation and reasoning model of formal concept analysis in decision situations. Existing research on attribute reduction that preserves the knowledge information of a decision context usually uses concept rules or granular rules to preserve that information. Compared with concept rules and granular rules, decision implications have a stronger knowledge representation ability. To further reduce the difference between the knowledge represented before and after attribute reduction, this paper studies attribute reduction that preserves decision implications. Firstly, based on the semantics of decision implications, the definitions of consistent sets and reductions that preserve decision implications are given, and necessary and sufficient conditions for determining consistent sets and reductions are provided. Examples expose the problems of such reductions, and, by combining implication theory, the definitions of weak consistent sets and weak reductions are introduced. Then, the rationality of weak reduction compared with reduction is analyzed from the perspective of knowledge inclusion. Finally, necessary and sufficient conditions for judging weak consistent sets and weak reductions are provided, and a method for finding weak reductions is given by combining the decision implication canonical basis, which enriches the research on attribute reduction that preserves knowledge information.
Multilabel Feature Selection Based on Fisher Score with Center Shift and Neighborhood Intuitionistic Fuzzy Entropy
SUN Lin, MA Tianjiao. Multilabel Feature Selection Based on Fisher Score with Center Shift and Neighborhood Intuitionistic Fuzzy Entropy[J]. Computer Science, 2024, 51(7): 96-107. doi:10.11896/jsjkx.230400018
The edge samples in existing multilabel Fisher score models degrade the classification performance of the algorithm, while neighborhood intuitionistic fuzzy entropy offers stronger expressiveness and resolution when handling uncertain information. Therefore, this paper develops a multilabel feature selection method based on the Fisher score with center shift and neighborhood intuitionistic fuzzy entropy. Firstly, the multilabel domain is divided into multiple sample sets according to the labels, and the feature mean of each sample set is taken as the original center point of the samples under each label; the distance of the farthest sample is multiplied by a distance coefficient to remove the edge sample set, and a new effective sample set is defined. The score of each feature under each label is calculated after center shift processing, together with the feature score over the label set, and a multilabel Fisher score model based on center shift is established to preprocess the multilabel data. Secondly, the multilabel classification margin is introduced as an adaptive fuzzy neighborhood radius parameter, the fuzzy neighborhood similarity relation and fuzzy neighborhood granule are defined, and the upper and lower approximation sets of the multilabel fuzzy neighborhood rough set are constructed. On this basis, rough intuitionistic membership and non-membership functions of the multilabel neighborhood are proposed, and the multilabel neighborhood intuitionistic fuzzy entropy is defined. Finally, formulas for the outer and inner significance of features are derived, and a multilabel feature selection algorithm based on neighborhood intuitionistic fuzzy entropy is designed to screen the optimal feature subset. Under the multilabel K-nearest neighbor classifier, experimental results on nine multilabel datasets show that the optimal subset selected by the proposed algorithm achieves excellent classification performance.
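The per-label Fisher score with edge-sample removal admits a small sketch. The shift rule below (drop samples farther than a coefficient times the maximum distance from the class center) and the binary label column are simplifying assumptions, not the paper's exact construction.

```python
import numpy as np

def fisher_score_center_shift(X, y, coef=0.8):
    """X: (n, d) features; y: (n,) binary label column. Returns d feature scores."""
    groups = []
    for c in (0, 1):
        G = X[y == c]
        center = G.mean(axis=0)
        dist = np.linalg.norm(G - center, axis=1)
        G = G[dist <= coef * dist.max()]   # drop edge samples beyond the shifted radius
        groups.append(G)
    mu = X.mean(axis=0)
    # Classic Fisher ratio: between-class scatter over within-class scatter.
    num = sum(len(G) * (G.mean(axis=0) - mu) ** 2 for G in groups)
    den = sum(len(G) * G.var(axis=0) for G in groups) + 1e-12
    return num / den
```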
Multivariate Time Series Anomaly Detection Algorithm in Missing Value Scenario
ZENG Zihui, LI Chaoyang, LIAO Qing. Multivariate Time Series Anomaly Detection Algorithm in Missing Value Scenario[J]. Computer Science, 2024, 51(7): 108-115. doi:10.11896/jsjkx.230400109
Time series anomaly detection is an important research field in industry. Current methods focus on anomaly detection over complete time series and do not consider series containing missing values, which arise in industrial scenarios from network anomalies and sensor damage. In this paper, we propose MMAD (missing multivariate time series anomaly detection), an attention-representation-based algorithm for the common industrial setting of time series with missing values. Specifically, MMAD first models the correlation of different time stamps in the series through temporal position coding. Then, an attention representation module learns the relationships between different time stamps and encodes them into a high-dimensional embedding matrix, thereby representing a multivariate time series with missing values as a high-dimensional representation without missing values. Finally, we design a conditional normalizing flow to reconstruct the representation and use the reconstruction probability as the anomaly score: the lower the reconstruction probability, the more anomalous the sample. Experiments on three classical time series datasets show that MMAD improves average performance by 11% over other baseline methods, verifying its efficacy for multivariate time series anomaly detection with missing values.
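The temporal position coding step admits a compact illustration. The sketch below uses the standard sinusoidal encoding to mark time stamps before attention; whether MMAD uses exactly this variant is an assumption.

```python
import numpy as np

def temporal_position_encoding(T: int, d: int) -> np.ndarray:
    """Return a (T, d) matrix whose row t encodes time stamp t."""
    pos = np.arange(T)[:, None]
    i = np.arange(d)[None, :]
    angle = pos / np.power(10000, (2 * (i // 2)) / d)
    # Even dimensions get sine, odd dimensions get cosine.
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))
```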
Active Sampling of Air Quality Based on Compressed Sensing Adaptive Measurement Matrix
HUANG Weijie, GUO Xianwei, YU Zhiyong, HUANG Fangwan. Active Sampling of Air Quality Based on Compressed Sensing Adaptive Measurement Matrix[J]. Computer Science, 2024, 51(7): 116-123. doi:10.11896/jsjkx.230400111
With the continuous acceleration of urbanization, industrial development and population agglomeration are making air quality problems increasingly serious, and because of sampling cost, active sampling of air quality is drawing more and more attention. However, existing models either can only select sampling locations iteratively or can hardly update the sampling algorithm in real time. Motivated by this, an active sampling method for air quality based on a compressed sensing adaptive measurement matrix is proposed in this paper, which transforms the problem of sampling location selection into a column subset selection problem on a matrix. Firstly, historical complete data are used for dictionary learning. After column subset selection on the learned dictionary, an adaptive measurement matrix that can guide batch sampling is obtained. Finally, the unsampled data are recovered using a sparse basis matrix constructed from the characteristics of air quality data. This method uses a single compressed sensing model to integrate sampling and inference, avoiding the shortcomings of using multiple models. In addition, considering the temporal variation of air quality, after each round of active sampling the dictionary is updated online with the latest data to guide the next round. Experimental results on two real datasets show that the adaptive measurement matrix obtained after dictionary learning has better recovery performance than all baselines at multiple sampling rates below 20%.
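The dictionary-learning and sparse-recovery loop can be sketched as follows. The greedy selection rule below (ranking locations by the norm of their dictionary rows) is a simple stand-in for the paper's column subset selection, and all sizes are illustrative.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
H = rng.random((200, 60))    # historical snapshots: rows = time, cols = locations

# Learn a dictionary over the 60 locations, then pick m locations to sample.
dl = DictionaryLearning(n_components=20, alpha=1.0, random_state=0).fit(H)
D = dl.components_.T         # (60 locations, 20 atoms): acts as the sparse basis
m = 15
sampled = np.argsort(np.linalg.norm(D, axis=1))[::-1][:m]   # greedy stand-in

# Sample only those m locations of a new snapshot x, then recover all 60 values.
x = rng.random(60)
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=5, fit_intercept=False)
omp.fit(D[sampled], x[sampled])
x_hat = D @ omp.coef_        # recovered full snapshot
```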
Clustering Algorithm Based on Attribute Similarity and Distributed Structure Connectivity
SUN Haowen, DING Jiaman, LI Bowen, JIA Lianyin. Clustering Algorithm Based on Attribute Similarity and Distributed Structure Connectivity[J]. Computer Science, 2024, 51(7): 124-132. doi:10.11896/jsjkx.231000125
Clustering analysis adopts different similarity measures according to different data characteristics. However, real-world data distributions are complex, with phenomena such as irregular distribution and uneven density, and considering attribute similarity or distribution structure connectivity alone reduces clustering performance. Therefore, this paper proposes a clustering algorithm based on attribute similarity and distribution structure connectivity (ASDSC). Firstly, a complete undirected graph is constructed over all data instances, a novel similarity measure is defined that computes node similarity from the topology and attribute similarity, and an adjacency matrix is constructed to update the edge weights. Secondly, based on the adjacency matrix, a random walk with increasing steps is performed. Subsequently, the cluster centers and their number are obtained according to the connectivity centrality of nodes, and the connectivity of the other nodes is also acquired. Then, the connectivity is used to compute the dependencies among nodes, and the cluster numbers are propagated accordingly until clustering is complete. Finally, comparative experiments with five advanced clustering algorithms are conducted on 16 synthetic datasets and 10 real datasets, and the results show that ASDSC achieves excellent performance.
Multimodality and Forgetting Mechanisms Model for Knowledge Tracing
YAN Qiuyan, SUN Hao, SI Yuqing, YUAN Guan. Multimodality and Forgetting Mechanisms Model for Knowledge Tracing[J]. Computer Science, 2024, 51(7): 133-139. doi:10.11896/jsjkx.231000137
Knowledge tracing is at the core of building adaptive education systems; it is used to capture students' knowledge states and predict their future performance. Previous knowledge tracing models only model questions and skills from structural information and cannot exploit the multimodal information of questions and skills to construct their interdependence. Additionally, a student's memory level is quantified only by time, without considering the influence of different modalities. Therefore, a multimodality and forgetting mechanisms model for knowledge tracing (MFKT) is proposed. Firstly, for question and skill nodes, an image-text matching task is used to optimize the unimodal embeddings, and the association weights between questions and skills are obtained by computing the similarity between nodes after multimodal fusion, generating the question node embeddings. Secondly, the student's knowledge state is obtained through a long short-term memory network, and forgetting factors are incorporated into the response records to generate student embeddings. Finally, the correlation strength between students and questions is calculated from the student's response frequency and the effective memory rate of different modalities, and information is propagated with a graph attention network to predict the student's responses to different questions. Comparative and ablation experiments on two self-collected real classroom datasets show that the proposed method achieves better prediction accuracy than other graph-based knowledge tracing models, and that the multimodality and forgetting mechanism designs effectively improve the prediction performance of the base model. A visual analysis of a specific case further illustrates the practical effect of the method.
Multi-embedding Fusion Based on top-N Recommendation
YANG Zhenzhen, WANG Dongtao, YANG Yongpeng, HUA Renyu. Multi-embedding Fusion Based on top-N Recommendation[J]. Computer Science, 2024, 51(7): 140-145. doi:10.11896/jsjkx.230400066
Heterogeneous information networks (HINs) are widely used in recommender systems because of their rich semantic and structural information. Although HINs and network embedding have achieved good results in recommender systems, local feature amplification, the interaction of embedding vectors, and multi-embedding aggregation methods have not been fully considered. To overcome these problems, a new multi-embedding fusion recommendation (MFRec) model is proposed. Firstly, an object-contextual representation network is introduced into both branches of user and node representation learning to amplify local features and enhance the interaction of neighbor nodes. Subsequently, dilated convolution and spatial pyramid pooling are introduced into meta-path learning to obtain multi-scale information and enhance the representation of meta-paths. In addition, a multi-embedding fusion module is introduced to better fuse the embeddings of users, items and meta-paths; the interaction between embeddings is carried out in a fine-grained way, and the different importance of each feature is emphasized. Finally, experimental results on two public recommender system datasets show that the proposed MFRec outperforms existing top-N recommendation models.
Graph Contrastive Learning Incorporating Multi-influence and Preference for Social Recommendation
HU Haibo, YANG Dan, NIE Tiezheng, KOU Yue. Graph Contrastive Learning Incorporating Multi-influence and Preference for Social Recommendation[J]. Computer Science, 2024, 51(7): 146-155. doi:10.11896/jsjkx.230400147
At present, social recommendation methods based on graph neural networks mainly alleviate the cold start problem by jointly modeling the explicit and implicit relationships in social information and interaction information. Although these methods aggregate social relations and user-item interactions well, they ignore that higher-order implicit relationships do not have the same impact on every user, and, being supervised, they are susceptible to popularity bias. In addition, they mainly focus on the collaboration between users and items and do not make full use of the similarity relationships between items. Therefore, this paper proposes a social recommendation algorithm (SocGCL) that incorporates multiple influences and preferences into graph contrastive learning. On the one hand, a fusion mechanism for nodes (users and items) and a fusion mechanism for graphs are introduced, taking the similarity relationships between items into account: the node fusion mechanism distinguishes the different impacts of different nodes in the graph on the target node, while the graph fusion mechanism aggregates the node embedding representations of multiple graphs. On the other hand, by adding random noise for cross-layer graph contrastive learning, the cold start problem and popularity bias of social recommendation are effectively alleviated. Experimental results on two real-world datasets show that SocGCL outperforms the baselines and effectively improves social recommendation performance.
Two Stage Rumor Blocking Method Based on EHEM in Social Networks
LIU Wei, WU Fei, GUO Zhen, CHEN Ling. Two Stage Rumor Blocking Method Based on EHEM in Social Networks[J]. Computer Science, 2024, 51(7): 156-166. doi:10.11896/jsjkx.230800169
The rise of online social networks has brought a series of challenges and risks, including the spread of false and malicious rumors, which can mislead the public and disrupt social stability. Blocking the spread of rumors has therefore become a hot topic in social network research. While significant efforts have been made on rumor blocking, existing work is still limited in how accurately it describes information propagation in social networks. To address this issue, this paper proposes a novel model, the extended heat energy model (EHEM), to characterize information propagation. EHEM fully accounts for several key aspects of information propagation, including the dynamic adjustment of node activation probabilities, the cascading mechanism of propagation, and the dynamic transition of node states. By incorporating these factors, EHEM gives a more precise representation of the explosive and complex nature of information propagation. Furthermore, since nodes that initially believe a rumor may shift to believing the truth in the real world, a correction threshold is introduced to determine whether a node undergoes such a belief transition. Additionally, because the importance of a node determines its influence spread, a multidimensional quality measure of nodes is proposed to assess their importance. Finally, a two stage rumor containment (TSRC) algorithm is proposed, which first prunes the network using the multidimensional quality measure and then selects the optimal set of positive seeds through simulation. Experimental results on four real-world datasets demonstrate that the proposed algorithm outperforms six comparative algorithms (Random, Betweenness, MD, PR, PWD, and ContrId) on multiple metrics.
Survey of 3D Point Clouds Upsampling Methods
HAN Bing, DENG Lixiang, ZHENG Yi, REN Shuang. Survey of 3D Point Clouds Upsampling Methods[J]. Computer Science, 2024, 51(7): 167-196. doi:10.11896/jsjkx.230900110
With the popularity of three-dimensional (3D) scanning devices such as depth cameras and laser radars, representing 3D data as point clouds is becoming increasingly common, and the analysis and processing of point cloud data is arousing great interest in computer vision research. In practice, the quality of raw point clouds obtained directly from sensors is affected by many factors, such as the self-occlusion of objects, mutual occlusion between objects, differences in scanning accuracy, reflectivity and transparency, environmental limitations during scanning, and the hardware limitations of scanning equipment, inevitably yielding noisy, hollow, sparse point clouds. Obtaining high-quality dense and complete point clouds is therefore an urgent task. Point cloud upsampling is an important point cloud processing task that aims to transform sparse, non-uniform and noisy point clouds into dense, uniform and noiseless ones, and the quality of its results affects the quality of various downstream tasks. Researchers have accordingly explored and proposed a variety of point cloud upsampling methods from multiple perspectives to improve computational efficiency and network performance and to solve the difficult issues in point cloud upsampling. To promote future research on point cloud upsampling, this paper first introduces the background and importance of this critical task. It then comprehensively classifies and reviews existing point cloud upsampling methods by task type, including geometric point cloud upsampling (GPU), arbitrary point cloud upsampling (APU), multi-attribute point cloud upsampling (MAPU), multi-modal point cloud upsampling (MMPU), scene point cloud upsampling (ScenePU) and sequential point cloud upsampling (SequePU). The performance of these upsampling networks is then analyzed and compared in detail. Finally, the remaining problems and challenges are analyzed and possible future research directions are explored, hoping to provide new ideas for further research on 3D point cloud upsampling and its downstream tasks (such as surface reconstruction).
Study on Algorithm of Depth Image Super-resolution Guided by High-frequency Information of Color Images
LI Jiaying, LIANG Yudong, LI Shaoji, ZHANG Kunpeng, ZHANG Chao. Study on Algorithm of Depth Image Super-resolution Guided by High-frequency Information of Color Images[J]. Computer Science, 2024, 51(7): 197-205. doi:10.11896/jsjkx.230400102
Depth image information is an important part of 3D scene information. However, due to the limitations of acquisition equipment and the diversity of imaging environments, depth images acquired by depth sensors often have low resolution and little high-frequency information, which limits their further application in various computer vision tasks. Depth image super-resolution attempts to improve the resolution of depth images and is a practical and valuable task. RGB images of the same scene have high resolution and rich texture information, and some depth image super-resolution algorithms achieve significant performance gains by introducing RGB images of the same scene to provide guidance. However, because of the structural inconsistency between RGB images and depth maps, fully and effectively utilizing RGB information remains extremely challenging. To this end, this paper proposes a depth image super-resolution method guided by the high-frequency information of color images. Specifically, a high-frequency feature extraction module is designed to adaptively learn the high-frequency information of color images and guide the reconstruction of depth map edges. In addition, a feature self-attention module is designed to capture the global dependencies between features and extract deeper features that help recover details in the depth image. After cross-modal fusion, the depth image features and color-image-guided features are reconstructed, and the proposed multi-scale feature fusion module fuses the spatial structure information across features of different scales to obtain reconstruction information covering multi-level receptive fields. Finally, the depth reconstruction module recovers the corresponding high-resolution depth map. Comprehensive qualitative and quantitative experiments on public datasets demonstrate that the proposed method outperforms the comparison methods, verifying its effectiveness.
Foggy Weather Object Detection Method Based on YOLOX_s
LOU Zhengzheng, ZHANG Xin, HU Shizhe, WU Yunpeng. Foggy Weather Object Detection Method Based on YOLOX_s[J]. Computer Science, 2024, 51(7): 206-213. doi:10.11896/jsjkx.230400086
This paper proposes a foggy weather object detection model based on depth-wise separable convolution and an attention mechanism, aiming at fast and accurate detection of objects in foggy scenes. The model consists of a dehazing module and a detection module, which are jointly trained. To ensure accuracy and real-time performance in foggy scenes, the dehazing module adopts AODNet to dehaze input images, reducing the interference of fog with the objects to be detected. In the detection module, an improved version of the YOLOX_s model outputs the confidence scores and position coordinates of the detected objects. To enhance detection performance, depth-wise separable convolution and an attention mechanism are employed on top of YOLOX_s to improve the feature extraction capability and expand the receptive field of the feature maps. The proposed model improves detection accuracy in foggy scenes without increasing the model parameters or computational complexity. Experimental results demonstrate that the proposed model performs excellently on the RTTS dataset and a synthesized foggy object detection dataset, effectively enhancing detection accuracy in foggy weather scenarios: compared to the baseline model, the average precision (mAP@50_95) improves by 1.9% and 2.37%, respectively.
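Depth-wise separable convolution, which the detection module substitutes for standard convolution, factors a dense convolution into a per-channel spatial filter followed by a 1x1 channel mixer. The sketch below is a generic PyTorch block with illustrative channel sizes, not the paper's exact layer.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        # Depth-wise: one spatial filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        # Point-wise: 1x1 convolution mixes channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# For a k x k kernel this costs in*k*k + in*out multiply-accumulates per pixel
# instead of in*out*k*k for a dense convolution.
block = DepthwiseSeparableConv(64, 128)
print(block(torch.randn(1, 64, 80, 80)).shape)   # torch.Size([1, 128, 80, 80])
```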
Image Captioning Generation Method Based on External Prior and Self-prior Attention
LI Yongjie, QIAN Yi, WEN Yimin. Image Captioning Generation Method Based on External Prior and Self-prior Attention[J]. Computer Science, 2024, 51(7): 214-220. doi:10.11896/jsjkx.230600167
Image captioning, a multimodal task combining computer vision and natural language processing, aims to comprehend the content of images and generate appropriate textual captions. Existing image captioning methods often employ self-attention mechanisms to capture long-range dependencies within a sample. However, this approach overlooks the potential correlations among different samples and fails to utilize prior knowledge, resulting in discrepancies between the generated content and the reference captions. To address these issues, this paper proposes an image captioning approach based on external prior and self-prior attention (EPSPA). The external prior module implicitly considers the potential correlations among samples while removing interference from other samples. Meanwhile, the self-prior attention effectively utilizes attention weights from previous layers to simulate prior knowledge and guide the model in feature extraction. Evaluation results of EPSPA on publicly available datasets under multiple metrics demonstrate that it outperforms existing methods while maintaining a low parameter count.
Deep Feature Learning and Feature Clustering of Streamlines in 3D Flow Fields
CHEN Jie, JIN Linjiang, ZHENG Hongbo, QIN Xujia. Deep Feature Learning and Feature Clustering of Streamlines in 3D Flow Fields[J]. Computer Science, 2024, 51(7): 221-228. doi:10.11896/jsjkx.230500033
Flow field visualization converts fluid motion data into visual forms for better understanding and analysis of the flow in the field, and streamline-based visualization is currently the most popular approach. This paper proposes a method for learning and clustering the features of streamlines in 3D flow fields. Firstly, a convolutional autoencoder-based method is designed to extract streamline features. The autoencoder consists of an encoder and a decoder: the encoder uses convolutional layers to reduce the dimensionality of input streamlines and extract features, while the decoder uses transposed convolution to upsample the streamline features and restore the streamlines. By training to continuously reduce the difference between the input and restored streamlines, the encoder learns to extract more accurate streamline features. Secondly, this paper improves the CFSFDP (clustering by fast search and find of density peaks) algorithm for clustering streamline features. To address CFSFDP's reliance on manually selected cluster centers and its sensitivity to the distance parameter, this paper improves its metric calculation, realizes automatic selection of cluster centers, and introduces adaptive calculation of the cutoff distance parameter using Gaussian kernel density estimation. Experimental results show that the method performs well in streamline feature learning and clustering.
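The convolutional autoencoder for streamline features can be sketched in a few lines. The sketch below assumes streamlines resampled to a fixed length of 64 3D points; the channel sizes and latent width are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class StreamlineAE(nn.Module):
    def __init__(self, latent: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(              # (B, 3, 64) -> (B, latent, 16)
            nn.Conv1d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv1d(16, latent, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(              # (B, latent, 16) -> (B, 3, 64)
            nn.ConvTranspose1d(latent, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(16, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)                         # the streamline feature
        return self.decoder(z), z

model = StreamlineAE()
x = torch.randn(8, 3, 64)                           # 8 streamlines of 64 points
recon, feat = model(x)
loss = nn.functional.mse_loss(recon, x)             # reconstruction objective
```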
Action Recognition Model Based on Improved Two Stream Vision Transformer
LEI Yongsheng, DING Meng, SHEN Yao, LI Juhao, ZHAO Dongyue, CHEN Fushi. Action Recognition Model Based on Improved Two Stream Vision Transformer[J]. Computer Science, 2024, 51(7): 229-235. doi:10.11896/jsjkx.230500054
To address the poor resistance to background interference and low accuracy of existing action recognition methods, an improved two-stream vision Transformer action recognition model is proposed. The model adopts a segmented sampling method to increase its ability to process long sequences; a parameter-free attention module is embedded in the network head to enhance the model's feature representation while reducing interference from the action background; and a temporal attention module is embedded at the network tail to fully extract temporal features by integrating high-level semantic information in the time domain. A new joint loss function is proposed to increase inter-class differences and reduce intra-class differences, and a decision fusion layer is adopted to make full use of the optical flow and RGB stream features. Comparative and ablation experiments on the benchmark datasets UCF101 and HMDB51 verify the effectiveness of the proposed method. The comparison results show that its accuracy is 3.48% and 7.76% higher than the temporal segment network on the two datasets, respectively, outperforming current mainstream algorithms with good recognition performance.
Lane Detection Method Based on RepVGG
CAI Wenliang, HUANG Jun. Lane Detection Method Based on RepVGG[J]. Computer Science, 2024, 51(7): 236-243. doi:10.11896/jsjkx.230400128
Aiming at the slow detection speed and low accuracy of existing lane detection methods, lane detection is treated as a classification problem, and a lane detection method based on RepVGG is proposed. On top of the RepVGG model, feature maps of different levels are fused in the backbone network to reduce the loss of spatial positioning information and improve the accuracy of lane positioning. The lane is modeled as a whole, and post-processing corrects the lane line predictions from both global and local perspectives. A distribution-guided lane existence prediction branch is introduced to learn lane existence features directly from the localization distribution; working together with the post-processing, it further improves detection accuracy while increasing inference speed. Experiments on the TuSimple and CULane datasets show that the proposed method achieves a good balance between speed and accuracy: on the CULane dataset, inference is 1.13 times faster than UFLDv2, and the F1 score improves from 74.7% to 77.1% compared with UFLDv2.
Occluded Face Recognition Based on Deep Image Prior and Robust Markov Random Field
LI Xiaoxin, DING Weijie, FANG Yi, ZHANG Yuancheng, WANG Qihui. Occluded Face Recognition Based on Deep Image Prior and Robust Markov Random Field[J]. Computer Science, 2024, 51(7): 244-256. doi:10.11896/jsjkx.230400127
The occlusion-caused difference between test and training images is one of the most challenging issues for real-world face recognition systems. Most existing occluded face recognition methods based on deep neural networks (DNNs) need large-scale occluded face images to train the network models. However, any external object in the real world can become an occlusion, and limited training data cannot exhaust all possible objects. Moreover, training networks on large-scale occluded face images contradicts the human visual mechanism: human eyes learn to detect occlusions from only small-scale unoccluded face images, without ever seeing occlusions. To simulate the occlusion detection mechanism of human vision, we combine the deep image prior with a robust Markov random field model to construct a novel occlusion detection model, DIP-rMRF, based on small-scale data, and propose a uniform zero-filling method to effectively exploit the occlusion detection results of DIP-rMRF. Experimental results with six advanced DNN-based face recognition methods (VGGFace, LCNN, PCANet, SphereFace, InterpretFR and FROM) on three face datasets (Extended Yale B, AR and LFW) show that DIP-rMRF can effectively preprocess face images with occlusions and with quasi-occlusions caused by extreme illumination, and greatly improves the performance of existing DNN models for face recognition under occlusion.
Lightweight Deep Neural Network Models for Edge Intelligence: A Survey
XU Xiaohua, ZHOU Zhangbing, HU Zhongxu, LIN Shixun, YU Zhenjie. Lightweight Deep Neural Network Models for Edge Intelligence: A Survey[J]. Computer Science, 2024, 51(7): 257-271. doi:10.11896/jsjkx.240100045
With the rapid development of the Internet of Things (IoT) and artificial intelligence (AI), the combination of edge computing and AI has given rise to a new research field called edge intelligence. Edge intelligence possesses appropriate computing power and can provide real-time, efficient, and intelligent responses, with significant applications in areas such as smart cities, industrial IoT, smart healthcare, autonomous driving, and smart homes. To improve model accuracy, traditional deep neural networks often adopt deeper and larger architectures, resulting in significant increases in model parameters, storage requirements, and computational complexity. However, because IoT terminal devices are limited in computing power, storage space, and energy, deep neural networks are difficult to deploy directly on them. Lightweight deep neural networks with low memory footprint, low computational cost, high accuracy, and real-time inference capability have therefore become a research hotspot. This paper first reviews the development of edge intelligence and analyzes the practical requirements for lightweight deep neural networks on intelligent terminals. Two approaches to building lightweight deep neural network models are presented: model compression techniques and lightweight architecture design. The paper then discusses in detail five main model compression techniques, namely parameter pruning, parameter quantization, low-rank decomposition, knowledge distillation, and mixed compression; summarizes their respective performance advantages and limitations; and evaluates their compression effects on commonly used datasets. It further analyzes lightweight architecture design strategies, including adjusting convolution kernel size, reducing the number of input channels, decomposing convolution operations, and adjusting convolution width, and compares several commonly used lightweight network models. Finally, future research directions for lightweight deep neural networks in edge intelligence are discussed.
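Among the compression techniques surveyed, parameter pruning is the most direct to demonstrate. The sketch below applies magnitude-based unstructured pruning with PyTorch's built-in utilities; the toy model and the 60% pruning amount are illustrative choices, not values from the survey.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Zero out the 60% smallest-magnitude weights in each Linear layer.
for module in model:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.6)
        prune.remove(module, "weight")   # bake the pruning mask into the tensor

sparsity = (model[0].weight == 0).float().mean()
print(f"layer-0 sparsity: {sparsity:.0%}")   # ~60%
```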
Literal Chunk Contradiction and Clause Regular Contradiction in Propositional Logic
WANG Chenglong, HE Xingxing, ZANG Hui, LI Yingfang, WANG Danchen, LI Tianrui. Literal Chunk Contradiction and Clause Regular Contradiction in Propositional Logic[J]. Computer Science, 2024, 51(7): 272-277. doi:10.11896/jsjkx.230500237
The resolution principle is a concise, reliable, and complete inference rule in automated reasoning. Contradiction-separation-based dynamic multi-clause synergized automated deduction extends the resolution principle, and the contradiction is the core part of the theory. Because contradictions have complex structures and few generation strategies, a new strategy for generating contradictions is proposed: multiple standard contradictions are used to generate a literal chunk contradiction, and a new contradiction is then obtained by adding complementary contradiction sets. The focus is on the properties of the contradictions generated from literal chunk contradictions with a special structure, namely clause regular contradictions; it is shown that a clause regular contradiction with a specific structure remains a contradiction after adding clauses. Finally, an algorithm for generating contradictions is proposed, which provides a reference for generating new contradictions on computers.
-
KHGAS:Keywords Guided Heterogeneous Graph for Abstractive Summarization
毛兴静, 魏勇, 杨昱睿, 琚生根. 基于关键词异构图的生成式摘要研究[J]. 计算机科学, 2024, 51(7): 278-286.
MAO Xingjing, WEI Yong, YANG Yurui, JU Shenggen. KHGAS:Keywords Guided Heterogeneous Graph for Abstractive Summarization[J]. Computer Science, 2024, 51(7): 278-286. - MAO Xingjing, WEI Yong, YANG Yurui, JU Shenggen
- Computer Science. 2024, 51 (7): 278-286. doi:10.11896/jsjkx.230500059
- Abstract PDF(2542KB) ( 201 )
- References | Related Articles | Metrics
-
Abstractive summarization is a crucial task in natural language processing that aims to generate concise and informative summaries from a given text.Deep learning-based sequence-to-sequence models have become the mainstream approach for generating abstractive summaries,achieving remarkable performance gains.However,existing models still suffer from issues such as semantic ambiguity and low information content due to the lack of attention to the dependency relationships between key concepts and sentences in the input text.To address this challenge,the keywords guided heterogeneous graph model for abstractive summarization is proposed.This model leverages extracted keywords and constructs a heterogeneous graph with both keywords and sentences as input to model the dependency relationships between them.A document encoder and a graph encoder are used to capture the textual information and the dependency relationships in the heterogeneous graph,respectively.Moreover,a hierarchical graph attention mechanism is introduced in the decoder to improve the model's attention to significant information when generating summaries.Extensive experiments on the CNN/Daily Mail and XSum datasets demonstrate that the proposed model outperforms existing methods in terms of the ROUGE evaluation metric.Human evaluations also reveal that the summaries generated by the proposed model contain more key information and are more readable than those of the baseline models.
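A minimal sketch of the keyword-sentence heterogeneous graph construction the model takes as input; the tokenizer and frequency-based keyword picker below stand in for the paper's actual extraction step.

```python
# Hedged sketch: bipartite keyword-sentence graph; keyword picking is a toy stand-in.
from collections import Counter

def build_hetero_graph(sentences, num_keywords=5):
    """Return (keywords, edges): edges link each keyword to sentences containing it."""
    tokens = [s.lower().split() for s in sentences]
    freq = Counter(w for t in tokens for w in set(t))  # document frequency
    keywords = [w for w, _ in freq.most_common(num_keywords)]
    edges = [(k, i) for k in keywords
             for i, t in enumerate(tokens) if k in t]
    return keywords, edges

kws, edges = build_hetero_graph(["the model fuses keywords and sentences",
                                 "keywords guide the graph encoder"])
print(kws, edges)
```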
-
Overlap Event Extraction Method with Language Granularity Fusion Based on Joint Learning
闫婧涛, 李旸, 王素格, 潘邦泽. 基于联合学习的语言粒度融合的重叠事件抽取方法[J]. 计算机科学, 2024, 51(7): 287-295.
YAN Jingtao, LI Yang, WANG Suge, PAN Bangze. Overlap Event Extraction Method with Language Granularity Fusion Based on Joint Learning[J]. Computer Science, 2024, 51(7): 287-295. - YAN Jingtao, LI Yang, WANG Suge, PAN Bangze
- Computer Science. 2024, 51 (7): 287-295. doi:10.11896/jsjkx.230700118
- Abstract PDF(2553KB) ( 200 )
- References | Related Articles | Metrics
-
Event extraction is a crucial task in information extraction.Existing event extraction methods generally assume that only one event occurs in a sentence.However,overlapping events are inevitable in real scenarios.Therefore,this paper designs an overlap event extraction method with language granularity fusion based on joint learning.In this method,a strategy that increases and decreases the token count layer by layer is designed to represent fragments of different language granularities.On this basis,a sentence representation with progressive language granularity fusion is constructed.By introducing event information perception,a gating-based sentence representation fusing language granularity and event information is established.Finally,through joint learning of the fragment relationships and role relationships between words,the identification of event trigger words,arguments,event types and argument roles is realized.Experiments conducted on the FewFC and DuEE1.0 datasets demonstrate that the LGFEE model proposed in this paper achieves an improvement of 0.8% and 0.6% respectively in the F1 score for the event type discrimination task.It also exhibits higher recall rates and F1 scores in the trigger word recognition,argument recognition,and argument role classification tasks,which verifies the validity of the LGFEE model.
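The layer-by-layer granularity idea can be pictured as enumerating token spans of growing length; the plain span enumeration below is our illustration, not the LGFEE implementation.

```python
# Hedged sketch: fragments of growing language granularity as token spans.
def granularity_spans(tokens, max_len=3):
    """Return {granularity n: [(start, end) spans covering n tokens]}."""
    return {
        n: [(i, i + n) for i in range(len(tokens) - n + 1)]
        for n in range(1, max_len + 1)
    }

print(granularity_spans(["the", "plant", "exploded"], max_len=2))
# {1: [(0, 1), (1, 2), (2, 3)], 2: [(0, 2), (1, 3)]}
```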
-
CINOSUM:An Extractive Summarization Model for Low-resource Multi-ethnic Language
翁彧, 罗皓予, 超木日力格, 刘轩, 董俊, 刘征. CINOSUM:面向多民族低资源语言的抽取式摘要模型[J]. 计算机科学, 2024, 51(7): 296-302.
WENG Yu, LUO Haoyu, Chaomurilige, LIU Xuan, DONG Jun, LIU Zheng. CINOSUM:An Extractive Summarization Model for Low-resource Multi-ethnic Language[J]. Computer Science, 2024, 51(7): 296-302. - WENG Yu, LUO Haoyu, Chaomurilige, LIU Xuan, DONG Jun, LIU Zheng
- Computer Science. 2024, 51 (7): 296-302. doi:10.11896/jsjkx.231100201
- Abstract PDF(3339KB) ( 215 )
- References | Related Articles | Metrics
-
To address the issue that existing models cannot handle abstractive summarization for low-resource multi-ethnic languages,this paper proposes an extractive summarization model,CINOSUM,based on CINO(a Chinese minority pre-trained language model).We construct a multi-ethnic language summarization dataset,MESUM,to extend the linguistic scope of text summarization.To overcome the poor performance of previous models on low-resource languages,a unified sentence extraction framework is employed for extractive summarization across various ethnic languages.In addition,we introduce a joint training strategy for multilingual datasets that effectively broadens applications in low-resource languages,thereby greatly improving the model's adaptability and flexibility.Finally,this paper conducts an extensive experimental study on the MESUM dataset,and the results reveal that the CINOSUM model achieves superior performance in low-resource multilingual environments,including Tibetan and Uyghur,with significant improvements in the ROUGE evaluation metric.
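A sketch of the unified sentence-extraction step: score each sentence against the document representation with a shared encoder and keep the top-k; `encode` here is a toy stand-in for CINO sentence embeddings, not the paper's model.

```python
# Hedged sketch: unified extractive scoring; `encode` is a toy stand-in for CINO.
import numpy as np

def encode(sentence, dim=16):
    """Toy bag-of-words embedding for demonstration only."""
    v = np.zeros(dim)
    for w in sentence.lower().split():
        v[hash(w) % dim] += 1.0
    return v

def extract_summary(sentences, k=2):
    """Score each sentence against the mean document vector; keep the top-k."""
    doc_vec = np.mean([encode(s) for s in sentences], axis=0)
    scores = [float(encode(s) @ doc_vec) for s in sentences]
    top = sorted(np.argsort(scores)[-k:])  # restore original sentence order
    return [sentences[i] for i in top]
```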
-
Text Classification Method Based on Multi Graph Convolution and Hierarchical Pooling
魏子昂, 彭舰, 黄飞虎, 琚生根. 融合多图卷积与层级池化的文本分类模型[J]. 计算机科学, 2024, 51(7): 303-309.
WEI Ziang, PENG Jian, HUANG Feihu, JU Shenggen. Text Classification Method Based on Multi Graph Convolution and Hierarchical Pooling[J]. Computer Science, 2024, 51(7): 303-309. - WEI Ziang, PENG Jian, HUANG Feihu, JU Shenggen
- Computer Science. 2024, 51 (7): 303-309. doi:10.11896/jsjkx.230400164
- Abstract PDF(2021KB) ( 239 )
- References | Related Articles | Metrics
-
Text classification,as a critical task in natural language processing,aims to assign labels to input documents.The co-occurrence relationships between words offer key perspectives on text characteristics and vocabulary distribution,while word embeddings supply rich semantic information that shapes global vocabulary interactions and latent semantic relationships.Previous research has struggled to incorporate both aspects adequately,or has disproportionately emphasized one over the other.To address this issue,this paper proposes a novel method that adaptively fuses these two types of information,aiming to strike a balance that improves model performance while considering both structural relationships and embedding information.The method first constructs text co-occurrence graphs and text embedding graphs from the text data,reflecting the contextual structure and the semantic embedding information,respectively.Graph convolution is then utilized to enhance node embeddings.In the graph pooling layer,node embeddings are fused and nodes of higher importance are identified by a hierarchical pooling model,learning document-level representations layer by layer.Furthermore,a gated fusion module is introduced to adaptively fuse the embeddings of the two graphs.The proposed approach is validated with extensive experiments on five publicly available text classification datasets,and the experimental results show the superior performance of the HTGNN model in text classification tasks.
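A sketch of the sliding-window co-occurrence graph, one of the two graphs the model fuses; the window size of 2 is an illustrative choice, not a value from the paper.

```python
# Hedged sketch: undirected co-occurrence edges weighted by sliding-window counts.
from collections import defaultdict

def cooccurrence_graph(tokens, window=2):
    edges = defaultdict(int)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            if w != tokens[j]:
                edges[tuple(sorted((w, tokens[j])))] += 1  # undirected edge key
    return dict(edges)

print(cooccurrence_graph("graphs help classify short text".split()))
```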
-
Device Fault Inference and Prediction Method Based on Dynamic Graph Representation
张慧, 张骁雄, 丁鲲, 刘姗姗. 基于动态图表示的设备故障推理预测方法[J]. 计算机科学, 2024, 51(7): 310-318.
ZHANG Hui, ZHANG Xiaoxiong, DING Kun, LIU Shanshan. Device Fault Inference and Prediction Method Based on Dynamic Graph Representation[J]. Computer Science, 2024, 51(7): 310-318. - ZHANG Hui, ZHANG Xiaoxiong, DING Kun, LIU Shanshan
- Computer Science. 2024, 51 (7): 310-318. doi:10.11896/jsjkx.231000223
- Abstract PDF(2840KB) ( 273 )
- References | Related Articles | Metrics
-
Effective operation and maintenance ensures that equipment keeps running properly.Nevertheless,as equipment becomes more and more sophisticated,the complexity and difficulty of maintaining and troubleshooting these devices are constantly increasing.As a result,operation and maintenance modes that rely only on manual effort are gradually unable to meet the requirements of intelligent equipment.Intelligent operation and maintenance,which applies emerging technologies such as artificial intelligence to the operation and maintenance process,can strongly support equipment operation and maintenance tasks.However,many existing methods still have deficiencies,such as failing to consider dynamic characteristics.To solve these problems,a device fault inference and prediction method based on dynamic knowledge graph representation learning is proposed.The method predicts whether a target device is potentially associated with a faulty device at a given time during the operation and maintenance process.It combines dynamic knowledge graph representation learning with graph representation inference models,updates the graph network based on real-time data,and employs graph representation inference models to infer new fault data.Firstly,it takes advantage of a dynamic knowledge graph to represent the equipment operation and maintenance data,so as to record the evolution of the equipment over time.This representation effectively captures dynamic changes in the relationships between devices.Next,the time-aware representations of the source faulty equipment and the target equipment in the dynamic knowledge graph are obtained through representation learning.Finally,the time-aware representations are used as inputs for fault inference prediction,which predicts whether there exists any potential correlation between the pieces of equipment,so as to assist operation and maintenance engineers in solving the corresponding equipment fault problems.Experiments on multiple datasets verify the effectiveness of the proposed method.
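A minimal sketch of the dynamic-graph side of the method: the operation and maintenance history as timestamped triples, with the time-window lookup that inference starts from; the triple schema and example facts are our assumptions.

```python
# Hedged sketch: timestamped triples as a dynamic knowledge graph snapshot.
def neighbors_in_window(triples, device, t_now, horizon):
    """triples: (head, relation, tail, timestamp) tuples from the O&M history.
    Return the facts touching `device` inside the look-back window."""
    return [(h, r, t, ts) for (h, r, t, ts) in triples
            if device in (h, t) and t_now - horizon <= ts <= t_now]

history = [("pump1", "connected_to", "valve3", 100),
           ("pump1", "reported_fault", "overheat", 140)]
print(neighbors_in_window(history, "pump1", t_now=150, horizon=20))
```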
-
Multi-agent Cooperative Algorithm for Obstacle Clearance Based on Deep Deterministic Policy Gradient and Attention Critic
王宪伟, 冯翔, 虞慧群. 基于深度确定性策略梯度与注意力Critic的多智能体协同清障算法[J]. 计算机科学, 2024, 51(7): 319-326.
WANG Xianwei, FENG Xiang, YU Huiqun. Multi-agent Cooperative Algorithm for Obstacle Clearance Based on Deep Deterministic Policy Gradient and Attention Critic[J]. Computer Science, 2024, 51(7): 319-326. - WANG Xianwei, FENG Xiang, YU Huiqun
- Computer Science. 2024, 51 (7): 319-326. doi:10.11896/jsjkx.230600129
- Abstract PDF(3056KB) ( 220 )
- References | Related Articles | Metrics
-
Dynamic obstacles have always been a key factor hindering the development of autonomous navigation for agents.Obstacle avoidance and obstacle clearance are two effective ways to address the issue.In recent years,multi-agent obstacle avoidance(collision avoidance) has been an active research area,and numerous excellent multi-agent obstacle avoidance algorithms exist.However,the problem of multi-agent obstacle clearance remains relatively unexplored,and corresponding algorithms are scarce.To address this issue,a multi-agent cooperative algorithm for obstacle clearance based on deep deterministic policy gradient and attention Critic(MACOC) is proposed.Firstly,the first multi-agent cooperative environment model for obstacle clearance is created,and the kinematic models of the agents and dynamic obstacles are defined.Four simulation environments are constructed with different numbers of agents and dynamic obstacles.Secondly,the process of cooperative obstacle clearance by multiple agents is defined as a Markov decision process(MDP) model,and the state space,action space,and reward function for the multi-agent system are constructed.Finally,a multi-agent cooperative algorithm for obstacle clearance based on deep deterministic policy gradient and attention Critic is proposed and compared with classical multi-agent algorithms in the simulated obstacle clearance environments.Experimental results show that the proposed MACOC algorithm achieves a higher obstacle clearance success rate,faster speed,and better adaptability to complex environments than the compared algorithms.
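A sketch of the kind of reward function such an MDP formulation uses, combining a clearance bonus, a collision penalty, and dense distance shaping; the weights and terms are illustrative assumptions, not MACOC's actual reward.

```python
# Hedged sketch: an illustrative per-step reward for cooperative clearance.
def reward(cleared: bool, collided: bool, dist_to_obstacle: float) -> float:
    r = 0.0
    r += 10.0 if cleared else 0.0   # bonus for removing an obstacle
    r -= 5.0 if collided else 0.0   # safety penalty for collisions
    r -= 0.1 * dist_to_obstacle     # dense shaping: move toward the obstacle
    return r
```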
-
Deep-Init:Non Joint Initialization Method for Visual Inertial Odometry Based on Deep Learning
史殿习, 高云琦, 宋林娜, 刘哲, 周晨磊, 陈莹. Deep-Init:基于深度学习的视觉惯性里程计非联合初始化方法[J]. 计算机科学, 2024, 51(7): 327-336.
SHI Dianxi, GAO Yunqi, SONG Linna, LIU Zhe, ZHOU Chenlei, CHEN Ying. Deep-Init:Non Joint Initialization Method for Visual Inertial Odometry Based on Deep Learning[J]. Computer Science, 2024, 51(7): 327-336. - SHI Dianxi, GAO Yunqi, SONG Linna, LIU Zhe, ZHOU Chenlei, CHEN Ying
- Computer Science. 2024, 51 (7): 327-336. doi:10.11896/jsjkx.230500036
- Abstract PDF(3339KB) ( 188 )
- References | Related Articles | Metrics
-
For a non-linear monocular VIO system,the initialization process is crucial,and the initialization result directly affects the accuracy of state estimation during the whole system operation.To this end,this paper introduces a deep learning method into the initialization process of the monocular VIO system and proposes an efficient non-joint initialization method(referred to as Deep-Init).The core of this method is to use a deep neural network to accurately estimate the random error terms,such as bias and noise,of the gyroscope in the IMU,so as to obtain the key parameter of the initialization process,i.e.,the gyroscope bias.At the same time,we loosely couple the IMU pre-integration to SfM.The absolute scale,velocity and gravity vector are quickly recovered by position and rotation alignment using least squares,and these serve as initial values to guide the non-linear tightly coupled optimization framework.The accuracy of the rotation estimates in the IMU is greatly increased because the deep neural network compensates the gyroscope data,which effectively improves the signal-to-noise ratio of the IMU data.This also reduces the number of least-squares equation failures,further reducing the computational effort.Using the pre-integrated gyroscope data with the error terms removed to replace the rotation estimates in SfM,and treating the IMU rotation as the reference value,not only avoids the errors introduced by taking inaccurate SfM values as ground truth but also effectively improves the accuracy of system state estimation.Moreover,it adapts effectively to scenarios where SfM estimation is poor,such as high-speed motion,drastic lighting changes and texture repetition.The validity of the proposed method is verified on the EuRoC dataset,and the experimental results show that the proposed Deep-Init initialization method achieves good results in terms of both accuracy and time consumption.
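A sketch of the two numerical steps described above: bias compensation of the raw gyroscope data, then a linear least-squares alignment recovering scale, velocity, and gravity; the matrix setup is assumed, not taken from the paper.

```python
# Hedged sketch: bias compensation + least-squares alignment; A, b are assumed
# to encode the position/rotation alignment constraints.
import numpy as np

def compensate_gyro(raw_gyro: np.ndarray, predicted_bias: np.ndarray) -> np.ndarray:
    """Remove the network-estimated bias before IMU pre-integration."""
    return raw_gyro - predicted_bias

def solve_alignment(A: np.ndarray, b: np.ndarray) -> np.ndarray:
    """x stacks [scale, velocity, gravity], recovered from SfM/IMU constraints."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```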
-
Data-free Model Evaluation Method Based on Feature Chirality
苗壮, 季时鹏, 吴波, 付睿智, 崔浩然, 李阳. 基于特征手性的数据无关模型评估方法[J]. 计算机科学, 2024, 51(7): 337-344.
MIAO Zhuang, JI Shipeng, WU Bo, FU Ruizhi, CUI Haoran, LI Yang. Data-free Model Evaluation Method Based on Feature Chirality[J]. Computer Science, 2024, 51(7): 337-344. - MIAO Zhuang, JI Shipeng, WU Bo, FU Ruizhi, CUI Haoran, LI Yang
- Computer Science. 2024, 51 (7): 337-344. doi:10.11896/jsjkx.230500179
- Abstract PDF(3883KB) ( 230 )
- References | Related Articles | Metrics
-
Evaluating the performance of convolutional neural network models is crucial,and model evaluation serves as a key component in the process,widely used in model design,comparison,and application.However,most existing model evaluation methods rely on running models on test data to obtain evaluation indexes,so they cannot handle situations where testing data is difficult to obtain due to privacy,copyright,confidentiality,or other reasons.To address this problem,this paper proposes a novel model evaluation method,based on feature chirality,that requires no testing data.The method obtains evaluation indexes by calculating the kernel distance between different models.The negative correlation between model performance and kernel distance is then used to analyze model parameters and obtain the relative performance ranking of different models without accessing any testing data.Experimental results show that,when using the Euclidean distance,the proposed blind evaluation method achieves the highest accuracy across seventeen classic CNNs,including AlexNet,VGGNets,ResNets and EfficientNets.Thus,this method is an effective and viable approach to model evaluation.
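A sketch of a kernel-distance index under the Euclidean metric the experiments single out; reading the convolution filters as flattened rows is our assumption about the bookkeeping, not the paper's exact definition.

```python
# Hedged sketch: mean pairwise Euclidean distance over flattened conv filters,
# the kind of data-free index the abstract describes.
import numpy as np

def mean_kernel_distance(kernels: np.ndarray) -> float:
    """kernels: (n, d) array, each row a flattened conv filter from a trained model."""
    diffs = kernels[:, None, :] - kernels[None, :, :]
    d = np.sqrt((diffs ** 2).sum(-1))          # (n, n) pairwise distances
    n = len(kernels)
    return float(d.sum() / (n * (n - 1)))      # mean over off-diagonal pairs
```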
-
Unsupervised Domain Adaptation Based on Entropy Filtering and Class Centroid Optimization
田青, 卢章虎, 杨宏. 基于熵值过滤和类质心优化的无监督域适应[J]. 计算机科学, 2024, 51(7): 345-353.
TIAN Qing, LU Zhanghu, YANG Hong. Unsupervised Domain Adaptation Based on Entropy Filtering and Class Centroid Optimization[J]. Computer Science, 2024, 51(7): 345-353. - TIAN Qing, LU Zhanghu, YANG Hong
- Computer Science. 2024, 51 (7): 345-353. doi:10.11896/jsjkx.230500144
- Abstract PDF(2496KB) ( 184 )
- References | Related Articles | Metrics
-
As one of the emerging research directions in the field of machine learning,unsupervised domain adaptation mainly uses source domain supervision information to assist the learning of unlabeled target domains.Recently,many unsupervised domain adaptation methods have been proposed,but they still have deficiencies in relation mining.Specifically,existing methods usually apply a uniform processing strategy to all target domain samples,ignoring the discrepancies among target domain samples in relation mining.Therefore,this paper proposes a novel method called entropy filtering and class centroid optimization(EFCO).The proposed method utilizes a generative adversarial network architecture to label target domain samples.With the obtained pseudo-labels,each sample's entropy value is calculated and compared with a predefined threshold to further categorize target domain samples.Simple samples keep their pseudo-labels,while difficult samples are classified using the idea of contrastive learning.By combining source domain data and simple samples,a more robust classifier is learned to classify difficult samples,and class centroids of the source and target domains are obtained.Inter-domain and intra-domain discrepancies are minimized by optimizing inter-domain contrastive alignment and instance contrastive alignment.Finally,the method is compared with several advanced domain adaptation methods on three standard datasets,and the results indicate that it outperforms the comparison methods.
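The entropy filter itself is compact enough to sketch directly: prediction entropy is computed per target sample and compared with a threshold to split simple from difficult samples; the threshold value here is illustrative.

```python
# Hedged sketch: entropy-based split of target-domain samples; 0.5 is illustrative.
import numpy as np

def split_by_entropy(probs: np.ndarray, threshold: float = 0.5):
    """probs: (n, num_classes) softmax outputs on target-domain samples.
    Returns (simple_idx, difficult_idx)."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    simple = entropy < threshold
    return np.where(simple)[0], np.where(~simple)[0]
```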
-
Adaptive Grey Wolf Optimizer Based on IMQ Inertia Weight Strategy
于明洋, 李婷, 许静. 基于IMQ惯性权重策略的自适应灰狼优化算法[J]. 计算机科学, 2024, 51(7): 354-361.
YU Mingyang, LI Ting, XU Jing. Adaptive Grey Wolf Optimizer Based on IMQ Inertia Weight Strategy[J]. Computer Science, 2024, 51(7): 354-361. - YU Mingyang, LI Ting, XU Jing
- Computer Science. 2024, 51 (7): 354-361. doi:10.11896/jsjkx.230600181
- Abstract PDF(2859KB) ( 219 )
- References | Related Articles | Metrics
-
Aiming at the problems of low optimization accuracy and slow convergence of the grey wolf optimizer(GWO),this paper proposes an adaptive grey wolf optimization algorithm(ISGWO) based on an IMQ inertia weight strategy.The algorithm exploits the properties of the IMQ function to adjust the inertia weight nonlinearly,which better balances its global exploration and local exploitation abilities.At the same time,it adaptively updates the positions of individuals based on a Sigmoid exponential function to better search and optimize the solution space of the problem.Six basic functions and 29 CEC2017 functions are used to test ISGWO and compare it with six commonly used algorithms,and the experimental results show that ISGWO has superior convergence accuracy and speed.
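A sketch of a nonlinear inertia weight built from the inverse multiquadric (IMQ) function 1/√(x²+c²), decaying from w_max to w_min over the iterations; ISGWO's exact schedule and constants may differ from this illustration.

```python
# Hedged sketch: IMQ-shaped inertia weight, normalized to [w_min, w_max].
import math

def imq_weight(t, t_max, w_min=0.4, w_max=0.9, c=1.0):
    g = lambda x: 1.0 / math.sqrt(x * x + c * c)   # inverse multiquadric
    x = t / t_max                                  # normalized iteration in [0, 1]
    frac = (g(x) - g(1.0)) / (g(0.0) - g(1.0))     # 1 at start, 0 at the end
    return w_min + (w_max - w_min) * frac

print([round(imq_weight(t, 100), 3) for t in (0, 50, 100)])  # decays 0.9 -> 0.4
```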
-
Schema Validation Approach for Constraint-enhanced RDFS Ontology
赵晓非, 柴争义, 袁超, 张振. 一种约束增强的RDFS本体的模式验证方法[J]. 计算机科学, 2024, 51(7): 362-372.
ZHAO Xiaofei, CHAI Zhengyi, YUAN Chao, ZHANG Zhen. Schema Validation Approach for Constraint-enhanced RDFS Ontology[J]. Computer Science, 2024, 51(7): 362-372. - ZHAO Xiaofei, CHAI Zhengyi, YUAN Chao, ZHANG Zhen
- Computer Science. 2024, 51 (7): 362-372. doi:10.11896/jsjkx.230800034
- Abstract PDF(2510KB) ( 191 )
- References | Related Articles | Metrics
-
The constraint-enhanced RDFS(RDFS(c)) ontology overcomes RDFS's limited ability to describe constraints.However,the introduction of constraints brings challenges to ontology validation.This paper proposes a decidable schema validation approach for RDFS(c) ontology,based on analyzing the dependencies between constraints.First,the RDFS(c) schema is transformed into first-order predicate logic expressions,and the checking of its characteristics is transformed into satisfiability checking of the first-order expressions.On this basis,a constraint dependency graph reflecting the mend-violation relationships between constraints is established and reduced where necessary.We then derive the decidability of the schema validation tasks by identifying the finite loops in the graph,and finally validate the schema by reasoning on the constraint dependencies.The contribution of this paper is twofold.On the one hand,through the transformation to first-order expressions and the reasoning based on the dependencies of the corresponding first-order constraints,the proposed approach has strong applicability;in particular,the constraint dependency analysis minimizes backtracking,thereby ensuring the efficiency of the validation process.On the other hand,because it is independent of any specific constraint modeling language,the approach is also a general solution for analyzing the decidability of RDFS(c) schema validation tasks.
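A sketch of the loop-identification step on the constraint dependency graph, using ordinary depth-first cycle detection; the adjacency-dict encoding of mend-violation edges is our assumption.

```python
# Hedged sketch: DFS cycle detection on a constraint dependency graph.
def find_cycle(graph):
    """graph: {constraint: [constraints it can violate/mend]}. True iff a loop exists."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}
    def dfs(v):
        color[v] = GRAY                     # on the current DFS path
        for u in graph.get(v, []):
            if color.get(u, WHITE) == GRAY:
                return True                 # back edge -> loop found
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[v] = BLACK
        return False
    return any(color[v] == WHITE and dfs(v) for v in list(graph))

print(find_cycle({"c1": ["c2"], "c2": ["c3"], "c3": ["c1"]}))  # True
```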
-
Application,Challenge and New Strategy of Block Chain Technology in Metaverse
孙力. 元宇宙中区块链技术的应用、挑战和新策略[J]. 计算机科学, 2024, 51(7): 373-379.
SUN Li. Application,Challenge and New Strategy of Block Chain Technology in Metaverse[J]. Computer Science, 2024, 51(7): 373-379. - SUN Li
- Computer Science. 2024, 51 (7): 373-379. doi:10.11896/jsjkx.230800072
- Abstract PDF(2353KB) ( 238 )
- References | Related Articles | Metrics
-
In recent years,with the development of virtual reality,artificial intelligence and other technologies,a metaverse system framework with the immersive Internet at its core has emerged.After analyzing the challenges faced by the core technologies of the metaverse environment,this study describes the role that integrating blockchain technology can play in the metaverse system and its related core technologies,and points out how the latency and scaling limitations of existing blockchain operating mechanisms restrict their application in the metaverse environment.Using a sharding mechanism and Stackelberg game theory,this study proposes a new blockchain-based metaverse application strategy,designs a corresponding user incentive scheme,and verifies the effectiveness of the scheme through numerical experiments.Finally,based on the advantages and remaining problems of this strategy,future research directions are outlined.
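As a toy picture of the Stackelberg interaction underlying such an incentive scheme: the platform (leader) posts a reward rate, and each user (follower) best-responds with the effort maximizing its quadratic-cost payoff; all payoff forms here are illustrative assumptions, not the paper's model.

```python
# Hedged sketch: toy Stackelberg game; follower payoff r*e - c*e^2 gives e*(r) = r/(2c).
def follower_effort(r: float, cost: float) -> float:
    return r / (2.0 * cost)            # follower's best response

def leader_utility(r: float, costs, value_per_effort: float = 1.0) -> float:
    total_effort = sum(follower_effort(r, c) for c in costs)
    return (value_per_effort - r) * total_effort   # value earned minus rewards paid

best_r = max((i / 100 for i in range(1, 100)),
             key=lambda r: leader_utility(r, [0.5, 1.0, 2.0]))
print(f"leader's best reward rate ≈ {best_r:.2f}")  # 0.50 = half the effort value
```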
-
Host Anomaly Detection Framework Based on Multifaceted Information Fusion of Semantic Features for System Calls
樊燚, 胡涛, 伊鹏. 针对系统调用的基于语义特征的多方面信息融合的主机异常检测框架[J]. 计算机科学, 2024, 51(7): 380-388.
FAN Yi, HU Tao, YI Peng. Host Anomaly Detection Framework Based on Multifaceted Information Fusion of Semantic Features for System Calls[J]. Computer Science, 2024, 51(7): 380-388. - FAN Yi, HU Tao, YI Peng
- Computer Science. 2024, 51 (7): 380-388. doi:10.11896/jsjkx.230400023
- Abstract PDF(2763KB) ( 206 )
- References | Related Articles | Metrics
-
Obfuscation attacks can bypass host security protection mechanisms while achieving the same attack effect by modifying the system call sequence generated by a running process.Existing system call-based host anomaly detection methods cannot effectively detect system call sequences modified by obfuscation attacks.This paper proposes a host anomaly detection method based on the fusion of multifaceted semantic information of system calls.Starting from the multifaceted semantic information of the system call sequence,the method fully mines deep semantic information through system call semantic abstraction and system call semantic feature extraction,and uses a multi-channel TextCNN to fuse the multifaceted information for anomaly detection.Semantic abstraction maps each specific system call to its type and,by extracting the sequence's abstract semantic information,shields the detection result from changes to specific system calls.System call semantic feature extraction uses an attention mechanism to obtain the key semantic features that represent the sequence's behavior pattern.Experimental results on the ADFA-LD dataset show that the false alarm rate of this method for detecting general host anomalies is below 2.2%,with an F1 score of 0.980,and the false alarm rate for detecting obfuscation attacks is below 2.8%,with an F1 score of 0.969.Its detection performance is better than that of other methods.
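A sketch of the semantic abstraction step: each concrete system call maps to its functional type, so substituting an equivalent call (the basis of obfuscation attacks) leaves the abstract sequence unchanged; the category table below is a small illustrative subset, not the paper's full mapping.

```python
# Hedged sketch: system call -> semantic type mapping; table is illustrative.
SYSCALL_TYPE = {
    "open": "FILE", "openat": "FILE", "read": "FILE", "write": "FILE",
    "fork": "PROC", "clone": "PROC", "execve": "PROC",
    "socket": "NET", "connect": "NET", "sendto": "NET",
}

def abstract_sequence(calls):
    """Map concrete calls to types; unknown calls fall back to OTHER."""
    return [SYSCALL_TYPE.get(c, "OTHER") for c in calls]

print(abstract_sequence(["openat", "read", "clone", "connect"]))
# ['FILE', 'FILE', 'PROC', 'NET'] — swapping open/openat changes nothing here
```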
-
PDF Malicious Indicators Extraction Technique Based on Improved Symbolic Execution
宋恩舟, 胡涛, 伊鹏, 王文博. 基于符号执行优化的PDF恶意指标提取技术[J]. 计算机科学, 2024, 51(7): 389-396.
SONG Enzhou, HU Tao, YI Peng, WANG Wenbo. PDF Malicious Indicators Extraction Technique Based on Improved Symbolic Execution[J]. Computer Science, 2024, 51(7): 389-396. - SONG Enzhou, HU Tao, YI Peng, WANG Wenbo
- Computer Science. 2024, 51 (7): 389-396. doi:10.11896/jsjkx.230300117
- Abstract PDF(2239KB) ( 209 )
- References | Related Articles | Metrics
-
Malicious PDF documents are a common attack method of APT organizations.Analyzing indicators extracted from embedded JavaScript code is an important means of determining a document's maliciousness.However,attackers can adopt heavy obfuscation,sandbox detection and other evasion techniques to interfere with analysis.Therefore,this paper innovatively applies symbolic execution to PDF indicator extraction.We propose a PDF malicious indicator extraction technique based on improved symbolic execution and implement SYMBPDF,an indicator extraction system consisting of three modules:code parsing,symbolic execution and indicator extraction.In the code parsing module,we implement extraction and reorganization of inline JavaScript code.In the symbolic execution module,we design a code rewriting method that forces branch shifting,thereby improving the code coverage of symbolic execution,and we design a concurrency strategy and two constraint-solving optimizations to improve efficiency.In the indicator extraction module,we integrate and record malicious indicators.In this paper,1 271 malicious samples are processed and evaluated.The success rate of indicator extraction is 92.2%,the indicator effectiveness is 91.7%,and the code coverage and system performance are 8.5% and 32.3% higher,respectively,than before optimization.
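A sketch of what forcing branch shifting can look like at the source level: a guard is rewritten so the symbolic engine can explore the protected path; the JavaScript fragment and the FORCE_BRANCH flag are purely illustrative, not SYMBPDF's actual rewriting rules.

```python
# Hedged sketch: naive textual guard rewriting to make both branches reachable.
def rewrite_guard(js_source: str, guard: str) -> str:
    """Replace `if (guard)` with `if (FORCE_BRANCH || (guard))` so the solver
    can also explore the guarded payload path."""
    return js_source.replace(f"if ({guard})", f"if (FORCE_BRANCH || ({guard}))")

src = "if (app.viewerVersion < 9) { dropPayload(); }"
print(rewrite_guard(src, "app.viewerVersion < 9"))
```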
-
Privacy Incentive Mechanism for Mobile Crowd-sensing with Comprehensive Scoring
傅彦铭, 张思远. 基于综合评分的移动群智感知隐私激励机制[J]. 计算机科学, 2024, 51(7): 397-404.
FU Yanming, ZHANG Siyuan. Privacy Incentive Mechanism for Mobile Crowd-sensing with Comprehensive Scoring[J]. Computer Science, 2024, 51(7): 397-404. - FU Yanming, ZHANG Siyuan
- Computer Science. 2024, 51 (7): 397-404. doi:10.11896/jsjkx.230400181
- Abstract PDF(2267KB) ( 243 )
- References | Related Articles | Metrics
-
The efficient operation of mobile crowd-sensing(MCS) largely depends on whether a large number of users participate in the sensing tasks.In reality,however,rising sensing costs and the risk of privacy disclosure dampen users' enthusiasm to participate,so an effective means is needed that both ensures users' privacy security and encourages users to participate actively in tasks.In response to these issues,a new privacy incentive mechanism,bilateral auction with comprehensive scoring(BCS),based on local differential privacy protection technology,is proposed.This incentive mechanism includes three parts:an auction mechanism,a data perturbation and aggregation mechanism,and a reward and punishment mechanism.The auction mechanism comprehensively considers the impact of various factors on users' sensing tasks,which improves task matching to some extent.The data perturbation and aggregation mechanism strikes a balance between privacy protection and data accuracy,protecting user privacy well while ensuring data quality.The reward and punishment mechanism rewards users with high integrity and activity to encourage active participation in sensing tasks.Experimental results indicate that BCS can improve platform revenue and the task matching rate while ensuring the quality of sensing data.
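A sketch of the perturbation step with a standard local differential privacy primitive (binary randomized response); BCS's actual mechanism and epsilon value are not specified here, so both are assumptions.

```python
# Hedged sketch: binary randomized response, a standard LDP primitive.
import math, random

def randomized_response(bit: int, epsilon: float) -> int:
    """Report the true bit with probability e^eps/(e^eps+1), else flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if random.random() < p_truth else 1 - bit

reports = [randomized_response(1, epsilon=1.0) for _ in range(10000)]
print(sum(reports) / len(reports))  # ≈ e/(e+1) ≈ 0.73; the platform debiases aggregates
```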
-
Lagrangian Dual-based Privacy Protection and Fairness Constrained Method for Few-shot Learning
王静红, 田长申, 李昊康, 王威. 基于拉格朗日对偶的小样本学习隐私保护和公平性约束方法[J]. 计算机科学, 2024, 51(7): 405-412.
WANG Jinghong, TIAN Changshen, LI Haokang, WANG Wei. Lagrangian Dual-based Privacy Protection and Fairness Constrained Method for Few-shot Learning[J]. Computer Science, 2024, 51(7): 405-412. - WANG Jinghong, TIAN Changshen, LI Haokang, WANG Wei
- Computer Science. 2024, 51 (7): 405-412. doi:10.11896/jsjkx.230500012
- Abstract PDF(2207KB) ( 203 )
- References | Related Articles | Metrics
-
Few-shot learning aims to use a small amount of data for training while significantly improving model performance,and it is an important approach to addressing the privacy and fairness issues of sensitive data in neural network models.In few-shot learning,training neural network models risks privacy and fairness problems,because small sample datasets often contain sensitive data and such sensitive data may be discriminatory.In addition,in many domains data is difficult or impossible to access for reasons such as privacy or security.Moreover,in differential privacy models the introduction of noise not only reduces model utility but also unbalances model fairness.To address these challenges,this paper proposes a sample-level adaptive privacy filtering algorithm based on the Rényi differential privacy filter,which exploits Rényi differential privacy to compute the privacy loss more accurately.Furthermore,it proposes a Lagrangian dual-based privacy and fairness constraint algorithm,which adds the differential privacy constraint and the fairness constraint to the objective function through a Lagrangian method and introduces Lagrangian multipliers to balance these constraints.The Lagrangian multiplier method transforms the constrained objective into its dual problem,thus optimizing both privacy and fairness and achieving a balance between them through the Lagrangian function.It is shown that the proposed method improves the performance of the model while ensuring its privacy and fairness.
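A minimal sketch of the constrained objective: the privacy and fairness violations enter the loss through Lagrange multipliers that are updated by dual ascent; the callables are placeholders for the paper's actual loss and constraint functions.

```python
# Hedged sketch: Lagrangian with privacy/fairness terms and dual-ascent updates.
def lagrangian(loss, priv_violation, fair_violation, lam_p, lam_f):
    """L(theta, lambda) = f(theta) + lam_p*g_priv(theta) + lam_f*g_fair(theta)."""
    return loss + lam_p * priv_violation + lam_f * fair_violation

def dual_ascent_step(lam, violation, step=0.01):
    """Multipliers grow while a constraint is violated and stay nonnegative."""
    return max(0.0, lam + step * violation)
```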
-
Backdoor Attack Method in Autoencoder End-to-End Communication System
甘润, 魏祥麟, 王超, 王斌, 王敏, 范建华. 自编码器端到端通信系统后门攻击方法[J]. 计算机科学, 2024, 51(7): 413-421.
GAN Run, WEI Xianglin, WANG Chao, WANG Bin, WANG Min, FAN Jianhua. Backdoor Attack Method in Autoencoder End-to-End Communication System[J]. Computer Science, 2024, 51(7): 413-421. - GAN Run, WEI Xianglin, WANG Chao, WANG Bin, WANG Min, FAN Jianhua
- Computer Science. 2024, 51 (7): 413-421. doi:10.11896/jsjkx.230400113
- Abstract PDF(2955KB) ( 227 )
- References | Related Articles | Metrics
-
End-to-end communication systems based on auto-encoders do not require explicitly designed communication protocols,resulting in lower complexity than traditional modular communication systems,as well as higher flexibility and robustness.However,the weak interpretability of the auto-encoder model brings new security risks to the end-to-end communication system.Experiments show that,when the channel is unknown and the decoder is trained separately,adding carefully designed triggers at the channel layer can cause an otherwise well-performing decoder to misjudge,without affecting its performance on samples without triggers,thus achieving a backdoor attack on the communication system.This paper designs a trigger generation model and proposes a backdoor attack method that jointly trains the trigger generation model with the auto-encoder model,realizing automatic generation of dynamic triggers and increasing the stealthiness of the attack while improving its success rate.To verify the effectiveness of the proposed method,four different auto-encoder models are implemented,and the backdoor attack effects under different signal-to-noise ratios,poisoning rates,trigger sizes,and trigger signal ratios are studied.Experimental results show that under a 6 dB signal-to-noise ratio,the attack success rate and the clean sample recognition rate of our proposal are both greater than 92% for all four auto-encoder models.
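A sketch of how channel-layer poisoning operates during training: a trigger perturbation is added to a fraction of transmitted symbol batches whose labels are flipped to the attacker's target; the shapes and the 10% poisoning rate are illustrative assumptions.

```python
# Hedged sketch: injecting a channel-layer trigger into a training batch.
import numpy as np

def poison_batch(symbols, labels, trigger, rate=0.1, target_label=0):
    """symbols: (n, dim) transmitted signals; labels: (n,) message classes."""
    n = len(symbols)
    idx = np.random.choice(n, int(rate * n), replace=False)
    symbols, labels = symbols.copy(), labels.copy()
    symbols[idx] += trigger         # add the trigger to the channel signal
    labels[idx] = target_label      # decoder learns trigger -> target class
    return symbols, labels
```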
-
Blockchain Anonymous Transaction Tracking Method Based on Node Influence
李致远, 徐丙磊, 周颖仪. 基于节点影响力的区块链匿名交易追踪方法[J]. 计算机科学, 2024, 51(7): 422-429.
LI Zhiyuan, XU Binglei, ZHOU Yingyi. Blockchain Anonymous Transaction Tracking Method Based on Node Influence[J]. Computer Science, 2024, 51(7): 422-429. - LI Zhiyuan, XU Binglei, ZHOU Yingyi
- Computer Science. 2024, 51 (7): 422-429. doi:10.11896/jsjkx.230400177
- Abstract PDF(2811KB) ( 214 )
- References | Related Articles | Metrics
-
With the rapid development of blockchain technology,illegal transactions conducted with the help of virtual currencies are becoming increasingly common and are still growing rapidly.To combat such crimes,blockchain transaction data is currently studied for transaction tracking mainly from the perspectives of network analysis and graph data mining.However,existing studies are deficient in terms of effectiveness,generalizability,and efficiency,and cannot effectively track newly registered addresses.To address these issues,a node-influence-based blockchain transaction tracking method,NITT,for the account balance model is proposed in this paper,aiming to track the main fund flows of a specific target address.Compared with traditional methods,the proposed method introduces a temporal strategy to reduce the size of the graph data,and it selects the more influential and important account addresses using a multiple weight assignment strategy.Experimental results on real datasets show that the proposed method has clear advantages in terms of effectiveness,generalizability and efficiency.
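A sketch of the two pre-processing moves described above: a time window shrinks the transaction graph, and a composite weight keeps each sender's most influential flows; the amount-and-recency weighting is our illustrative assumption, not NITT's actual scoring.

```python
# Hedged sketch: temporal pruning plus a composite edge weight per sender.
def prune_graph(txs, t_start, t_end, top_k=3):
    """txs: (sender, receiver, amount, timestamp) tuples.
    Keep each sender's top_k flows inside the time window."""
    window = [t for t in txs if t_start <= t[3] <= t_end]
    by_sender = {}
    for s, r, amt, ts in window:
        score = amt * (1 + (ts - t_start) / (t_end - t_start))  # amount x recency
        by_sender.setdefault(s, []).append((score, r, amt, ts))
    return {s: sorted(v, reverse=True)[:top_k] for s, v in by_sender.items()}
```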