Computer Science

Semi-supervised Object Detection with Sequential Three-way Decision

SONG Faxing, MIAO Duoqian, ZHANG Hongyun

Computer Science. 2023, 50 (10): 1-6. doi:10.11896/jsjkx.230600035

The need for large-scale data in deep learning and the complexity of annotating object detection data have promoted the development of semi-supervised object detection. In recent years, semi-supervised object detection has achieved many excellent results. However, uncertainty in pseudo-labels remains an unavoidable problem. A strong semi-supervised method requires an appropriate filtering threshold to balance pseudo-label noise against recall, so as to retain as many accurate and effective labels as possible. To solve this problem, this paper introduces a sequential three-way decision algorithm into semi-supervised object detection. The algorithm divides the pseudo-labels output by the model into clean foreground labels, noisy foreground labels, and clean background labels according to different filtering thresholds, and adopts a different processing strategy for each. For noisy foreground labels, a negative class learning loss is used, thereby avoiding learning noise information from them. Experimental results show the performance advantage of this algorithm. On the COCO dataset, the method achieves a performance of 35.2% when supervised data accounts for only 10%, outperforming the supervised result by 11.34%.
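The three-way split of pseudo-labels described above can be illustrated with a minimal sketch. The threshold values `low` and `high`, the function names, and the specific loss form `-log(1 - p)` (a common formulation of negative learning) are assumptions made for illustration; the abstract does not give the paper's actual thresholds or loss definition.

```python
import math

def three_way_partition(scores, low=0.3, high=0.7):
    """Split pseudo-label confidence scores into three regions.

    `low` and `high` are hypothetical thresholds, not values from the paper.
    Returns the indices of the scores that fall in each region.
    """
    regions = {"clean_foreground": [], "noisy_foreground": [], "clean_background": []}
    for i, s in enumerate(scores):
        if s >= high:
            regions["clean_foreground"].append(i)
        elif s >= low:
            regions["noisy_foreground"].append(i)
        else:
            regions["clean_background"].append(i)
    return regions

def negative_learning_loss(probs, excluded_class):
    """Assumed negative-learning loss: penalize probability mass on a class
    the sample is believed NOT to belong to, i.e. -log(1 - p_excluded)."""
    p = probs[excluded_class]
    return -math.log(max(1.0 - p, 1e-12))  # clamp to avoid log(0)

regions = three_way_partition([0.9, 0.5, 0.1])
loss = negative_learning_loss([0.2, 0.8], excluded_class=0)
```

In this sketch only the middle region would be trained with the negative loss, while the two clean regions could be used as ordinary foreground and background targets.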

Rule Extraction Based on OE-cp-Approximation Concepts in Incomplete Formal Contexts

NIU Lihui, MI Jusheng, BAI Yuzhang

Computer Science. 2023, 50 (10): 7-17. doi:10.11896/jsjkx.230600037

In many practical applications, data loss can be caused by measurement errors, biases in data understanding, and transmission distortions. A formal context with such incomplete data is called an incomplete formal context. To enrich the knowledge discovery model in incomplete formal contexts, this paper draws on the three-way idea to construct the common-possible (cp) approximation concept in an incomplete formal context, using the positive operator and the necessity-possibility operators from rough set theory. The relationships between the object-induced cp-approximation concept and the classical, attribute-oriented, and object-induced three-way approximation concepts are discussed, and an algorithm for constructing object-induced cp-approximation concepts from the classical and attribute-oriented concepts is formulated. Further, the acquisition of approximate decision rules in incomplete decision formal contexts is discussed based on the OE-cp-approximation concept. Positive decision rules and possibility decision rules are proposed for OE-cp-consistent incomplete decision formal contexts and compared with the decision rules obtained from strongly consistent incomplete decision formal contexts.

Method of Updating Formal Concept Under Covering Multi-granularity

WANG Taibin, LI Deyu, ZHAI Yanhui

Computer Science. 2023, 50 (10): 18-27. doi:10.11896/jsjkx.230600049

Multi-granularity formal concept analysis is an important tool for data mining and knowledge discovery. This paper studies methods for coarsening and refining formal concepts under multi-granularity. First, it is proved that the existing concept coarsening update algorithm can lose concepts, and the algorithm is supplemented and improved by analyzing the characteristics of the missing concepts. Second, it is proved that the existing concept refinement update algorithm generates redundant concepts and has high time complexity. The refinement update algorithm is therefore optimized, and the performance advantages of the proposed concept refinement algorithm are verified by time complexity analysis and comparative experiments.

Two-sided Matching Method for Online Consultation Platform Considering Demand Priority

FAN Tingrui, LIU Dun, YE Xiaoqing

Computer Science. 2023, 50 (10): 28-36. doi:10.11896/jsjkx.230600042

In recent years, with the rapid development of the Internet and smart medical care, online consultation platforms have gradually become an important channel for meeting the basic medical needs of the public. As the number of patients and doctors on these platforms keeps growing, the quality of doctors' consultation responses has become uneven, and problems such as untimely responses to patients' questions and seriously low response rates continue to emerge. How to mine patients' demand information and doctors' service information from a large amount of online medical content, describe patients' demand satisfaction and doctors' service ability, and achieve accurate matching are therefore problems that need to be solved. To this end, this paper proposes a multi-stage matching model combined with machine learning algorithms to improve matching accuracy and diversity. First, from the perspectives of both doctors and patients, machine learning algorithms and sentiment analysis tools are combined with prospect theory to fully evaluate patient preferences and doctors' professional capabilities. Second, considering the hierarchical structure of patient needs, a multi-stage dynamic matching model is constructed, guided by the idea of granular computing. Finally, the validity of the method is verified on real data from haodf.com.

Feature Selection Algorithm Based on Rough Set and Density Peak Clustering

CAO Dongtao, SHU Wenhao, QIAN Jin

Computer Science. 2023, 50 (10): 37-47. doi:10.11896/jsjkx.230600038

Feature selection can effectively remove redundant and irrelevant features from high-dimensional data and retain important features, thus reducing the complexity of model computation and improving model accuracy. To deal with noisy data that may affect classification during feature selection, such as outliers and boundary points, a feature selection method based on rough set and density peak clustering is proposed. First, noisy data are removed by the density peak clustering method and cluster centers are selected. Then, drawing on the idea of rough set theory, the data are partitioned around the cluster centers, and a feature importance measure is defined under the assumption that data points in the same cluster have the same label. Finally, a heuristic feature selection algorithm is designed to pick the feature subset that yields a purer, more homogeneous cluster structure. Experimental comparisons of classification accuracy, number of selected features, and running time are conducted with other algorithms on six UCI datasets, and the results verify the effectiveness and efficiency of the proposed algorithm.
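The first step above relies on density peak clustering. As a rough illustration, the sketch below computes the two standard quantities of that technique: the local density `rho` (number of neighbors within a cutoff distance) and `delta` (distance to the nearest point of higher density). The function name, the cutoff `d_c`, and the toy data are assumptions; how the paper turns these quantities into noise removal and center selection is not stated in the abstract.

```python
import math

def density_peaks(points, d_c):
    """Return (rho, delta) for each point, as in standard density peak clustering.

    rho[i]   = number of other points closer than the cutoff d_c.
    delta[i] = distance to the nearest point with strictly higher rho;
               for a point of maximal rho, the largest distance to any point.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Toy data (hypothetical): a small cross-shaped cluster plus one distant point.
points = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (10, 10)]
rho, delta = density_peaks(points, d_c=1.2)
```

In this toy example the point at the origin has both high `rho` and high `delta`, which marks a candidate cluster center, while the distant point has low `rho` and high `delta`, which is the usual signature of an outlier.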

Imbalanced Undersampling Based on Constructive Neural Network and Global Density Information

YAN Yuanting, MA Yingao, REN Yanping, ZHANG Yanping

Computer Science. 2023, 50 (10): 48-58. doi:10.11896/jsjkx.230600022

Undersampling is one of the mainstream data-level techniques for dealing with imbalanced data. In recent years, researchers have proposed numerous undersampling methods, but most of them focus on how to select representative majority class samples to avoid the loss of informative data. How to maintain the structure of the original majority class during undersampling is still an open challenge. To this end, an undersampling method for imbalanced data classification is proposed based on a constructive neural network and data density. First, it detects the local patterns of the majority class with a simplified constructive process. Then, two sample selection strategies are designed to maintain the structure of the selected groups according to the original majority distribution information. Finally, because the randomness of local pattern learning may lead to non-optimal sampling results, the bagging technique is introduced to further improve learning performance. Comparative experiments with 13 comparison methods on 59 datasets verify the effectiveness of the proposed method in terms of three metrics: G-mean, AUC, and F1-score.

Adaptive Spectral Clustering Algorithm Combining Shared Nearest Neighbors and Manifold Distance

ZHANG Ximei, XIE Bin, MI Jusheng, XU Tongtong, ZHANG Yiling

Computer Science. 2023, 50 (10): 59-70. doi:10.11896/jsjkx.230600010

The spectral clustering algorithm is built on graph theory. It transforms the clustering problem into a graph partitioning problem, can identify clusters of arbitrary shape, and is easy to implement, so it is more adaptable than traditional clustering algorithms. However, the distance measures commonly used in this algorithm cannot account for both global and local consistency and are easily affected by noise. The clustering results depend on the similarity matrix constructed from the input data, and the two-step strategy, in which a relaxed partition matrix is obtained by eigendecomposition and then discretized in an independent step, makes it difficult to reach a jointly optimal solution. Therefore, an adaptive spectral clustering algorithm (SNN-MSC) combining shared nearest neighbors and manifold distance is proposed. A new manifold distance with exponential terms and scaling factors is introduced. It can flexibly adjust the ratio between the similarity of data in the same manifold and the similarity of data in different manifolds, and it incorporates a density factor into the manifold distance to eliminate the effect of noise. Shared nearest neighbors are used to redefine the similarity measure, so that the spatial structure and local relations between data points can be mined. At the same time, a rank constraint is applied to the Laplacian matrix so that the number of connected components in the similarity matrix equals the number of clusters. The method adaptively optimizes the data similarity matrix and the clustering structure during optimization, without any discretization operation. In comparison experiments on artificial datasets and real UCI datasets, the proposed algorithm shows better performance on multiple clustering validity indexes.
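To make the shared-nearest-neighbor idea concrete, here is a minimal sketch that counts how many of their k nearest neighbors two points have in common. It uses plain Euclidean distance and hypothetical function names; the manifold distance, density factor, and rank constraint of SNN-MSC are not reproduced here, since the abstract does not define them.

```python
import math

def knn_sets(points, k):
    """For each point, the set of indices of its k nearest other points."""
    sets = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        sets.append(set(others[:k]))
    return sets

def snn_similarity(points, k):
    """Matrix whose (i, j) entry is the number of k-nearest neighbors
    shared by points i and j (a basic shared-nearest-neighbor similarity)."""
    nn = knn_sets(points, k)
    n = len(points)
    return [[len(nn[i] & nn[j]) for j in range(n)] for i in range(n)]

# Toy data (hypothetical): two well-separated groups on a line.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
sim = snn_similarity(points, k=2)
```

Points in the same group share neighbors and get a positive similarity, while points from different groups share none, which is why this measure reflects local structure better than raw distance.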

Optimal Granularity Selection and Attribute Reduction in Meso-granularity Space

LI Teng, LI Deyu, ZHAI Yanhui, ZHANG Shaoxia

Computer Science. 2023, 50 (10): 71-79. doi:10.11896/jsjkx.230500218

Conventional formal concept analysis adopts a meso-granularity formal context to meet the requirements of cross-layer granulation of data. However, it neither effectively combines the search for the optimal granularity with attribute reduction nor efficiently solves the problem of combinatorial explosion in a multi-granular context. Therefore, based on the connection between granularity selection and attribute reduction in the meso-granularity space, a new optimal granularity selection method (i.e., optimal granularity reduction) is proposed to perform optimal granularity selection and attribute reduction simultaneously. To address the combinatorial explosion in searching for optimal granularity reductions, a stepwise search method is designed that updates the granularity space with information already found, eliminating a large number of non-optimal granularity reductions and significantly improving search efficiency. Experimental results demonstrate the effectiveness and superiority of this method.

Novel Graph Convolutional Network Based on Multi-granularity Feature Fusion for Aspect-based Sentiment Analysis

DENG Ruhan, ZHANG Qinghua, HUANG Shuaishuai, GAO Man

Computer Science. 2023, 50 (10): 80-87. doi:10.11896/jsjkx.230600036

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that aims to detect the sentiment polarity of aspects in a given sentence. With the rise of deep learning and graph convolutional networks (GCNs), GCNs constructed over dependency trees have been widely applied to ABSA and have achieved satisfactory results. However, most studies feed only the node features of the last GCN layer to the classifier, ignoring the node features of other layers, and GCNs suffer from the over-smoothing problem. In recent years, some researchers have ensembled the multi-layer node features of GCNs, improving the performance of sentiment classification models. This paper proposes a model for ABSA that combines adaptive spatial feature fusion with highway networks, namely the highway graph convolutional network based on multi-granularity feature fusion (MGFF-HGCN). First, the model constructs a GCN from the syntactic dependency structure and bidirectional context information, and highway networks are introduced to alleviate over-smoothing in deep GCNs, allowing the GCN to be made deeper. Then, an adaptive fusion mechanism is employed to fuse the more comprehensive, multi-granularity node feature information obtained from the various highway GCN (HGCN) layers. Finally, experimental results on public datasets show that the proposed method is comparable to the benchmark models and is able to accurately capture finer-grained syntactic information and long-range dependencies.

Classification Uncertainty Minimization-based Semi-supervised Ensemble Learning Algorithm

HE Yulin, ZHU Penghui, HUANG Zhexue, Fournier-Viger PHILIPPE

Computer Science. 2023, 50 (10): 88-95. doi:10.11896/jsjkx.230600048

Semi-supervised ensemble learning (SSEL) is a paradigm that fuses semi-supervised learning and ensemble learning. It improves the diversity of ensemble learning by introducing unlabeled samples and at the same time addresses the problem of insufficient sample size for ensemble learning. In addition, SSEL can improve the generalization capability of a classification system by integrating multiple classifiers trained on highly credible labeled samples. Existing research has demonstrated the mutual benefit between semi-supervised learning and ensemble learning from both theoretical and practical perspectives. However, existing SSEL algorithms are unable to make full use of unlabeled samples, which limits their prediction capability on classification problems with few labeled samples. This paper proposes a novel classification uncertainty minimization-based semi-supervised ensemble learning (CUM-SSEL) algorithm, which uses information entropy as the confidence criterion and exploits its properties to minimize classification uncertainty when predicting unlabeled samples. The feasibility, rationality, and effectiveness of the CUM-SSEL algorithm are verified through a series of experiments. Experimental results demonstrate that CUM-SSEL is a valid algorithm for semi-supervised learning problems.
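Using information entropy as a confidence criterion can be sketched as follows: a predicted class distribution with low Shannon entropy is treated as a confident prediction. The function names and the threshold `max_entropy` are assumptions for illustration only; the abstract does not describe how CUM-SSEL actually selects or weights unlabeled samples.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_confident(predictions, max_entropy):
    """Indices of unlabeled samples whose prediction entropy does not exceed
    `max_entropy` (a hypothetical threshold, not a value from the paper)."""
    return [i for i, probs in enumerate(predictions) if entropy(probs) <= max_entropy]

# Hypothetical ensemble outputs for three unlabeled samples.
predictions = [[0.9, 0.1], [0.5, 0.5], [0.99, 0.01]]
confident = select_confident(predictions, max_entropy=0.4)
```

A uniform prediction such as `[0.5, 0.5]` has the maximum entropy for two classes (ln 2) and is rejected, while sharply peaked predictions pass the filter.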

Fusion Tracker:Single-object Tracking Framework Fusing Image Features and Event Features

WANG Lin, LIU Zhe, SHI Dianxi, ZHOU Chenlei, YANG Shaowu, ZHANG Yongjun

Computer Science. 2023, 50 (10): 96-103. doi:10.11896/jsjkx.220900075

Object tracking is a fundamental research problem in computer vision. As the mainstream sensor for object tracking, conventional cameras provide rich scene information. However, due to the limitations of their sampling principle, conventional cameras suffer from overexposure or underexposure under extreme lighting conditions and from motion blur in high-speed scenes. In contrast, the event camera is a bionic sensor that senses changes in light intensity and outputs event streams. It offers high dynamic range and high temporal resolution but has difficulty capturing static targets. Inspired by the characteristics of conventional and event cameras, a dual-modal fusion single-object tracking method, called fusion tracker, is proposed. The method adaptively fuses visual cues from conventional and event camera data through feature enhancement, and it uses an attention-based feature matching network to match object cues in template frames with search frames, establishing long-term feature associations and making the tracker focus on object information. The fusion tracker addresses the semantic loss caused by correlation operations during feature matching and improves object tracking performance. Experiments on two publicly available datasets demonstrate the superiority of the approach, and ablation experiments validate the effectiveness of the key parts of the fusion tracker. The fusion tracker effectively improves the robustness of object tracking in complex scenarios and provides reliable tracking results for downstream applications.

Unbiased Scene Graph Generation Based on Adaptive Regularization Algorithm

LI Haochen, CAO Fuyuan, QIAO Shichang

Computer Science. 2023, 50 (10): 104-111. doi:10.11896/jsjkx.221000084

Given an image, scene graph generation aims to obtain, through an object detection module, visual triplets of entities and the relationships between them, namely subject, relationship, and object, and to construct a structured semantic representation. Scene graphs can be applied to downstream tasks such as image retrieval and visual question answering. However, due to the long-tail distribution of relationships between entities in the dataset, existing models tend to predict coarse-grained head relationships. Such scene graphs cannot effectively assist downstream tasks. Previous works generally adopt rebalancing strategies such as resampling and reweighting to address the long-tail problem. However, because these models repeatedly learn the tail relationship samples, they are prone to overfitting. To solve these problems, an adaptive regularized unbiased scene graph generation method is proposed. Specifically, the method designs a regularization term based on prior relation frequency to adaptively adjust the weights of the model's fully connected classifier, so that the model produces balanced predictions. The proposed method is tested on the Visual Genome dataset. The experimental results show that it not only prevents the model from overfitting but also alleviates the negative impact of the long-tail distribution on scene graph generation, and that state-of-the-art scene graph generation methods combined with the proposed method achieve better unbiased scene graph generation performance.

Forgery Face Detection Based on Multi-scale Transformer Fusing Multi-domain Information

MA Xin, JI Lixin, LI Shaomei

Computer Science. 2023, 50 (10): 112-118. doi:10.11896/jsjkx.220900048

At present, the proliferation of “face-changing” fake videos generated by deep forgery technologies such as Deepfakes poses a considerable threat to citizens' privacy and national political security. It is therefore of great significance to study deepfake face detection in videos. To address the insufficient extraction of facial features and the weak generalization ability of existing forged face detection methods, this paper proposes a forged face detection method based on a multi-scale Transformer that fuses multi-domain information. First, following the idea of multi-domain feature fusion, features are extracted from both the frequency domain and the RGB domain of video frames to improve the generalization of the model. Second, EfficientNet and a multi-scale Transformer are combined into a multi-level feature extraction network that extracts finer forgery features. Test results on open-source datasets show that the proposed method has better detection performance than existing methods, and cross-dataset experiments show that the proposed model generalizes better.

Study on Fine-grained Image Classification Based on ConvNeXt Heatmap Localization and Contrastive Learning

ZHENG Shijie, WANG Gaocai

Computer Science. 2023, 50 (10): 119-125. doi:10.11896/jsjkx.220900196

To address the challenges of high intra-class variation and low inter-class variation in fine-grained image classification, a multi-branch fine-grained image classification method is proposed that is based on the ConvNeXt network and uses GradCAM heatmaps for cropping and attention erasure. The method uses GradCAM to obtain the network's attention heatmap through gradient backpropagation, locates the region with discriminative features, and crops and enlarges that region so that the network focuses on deeper local features. At the same time, supervised contrastive learning is introduced to enlarge inter-class differences and reduce intra-class differences. Finally, a heatmap attention erasure operation is performed so that the network can attend to other regions useful for classification while still focusing on the most discriminative features. The proposed method achieves classification accuracies of 91.8%, 94.9%, 94.0%, and 94.4% on the CUB-200-2011, Stanford Cars, FGVC Aircraft, and Stanford Dogs datasets, respectively, which is better than many mainstream fine-grained image classification methods. The method also achieves top-3 and top-1 classification accuracy on the CUB-200-2011 and Stanford Dogs datasets, respectively.

Active Learning-based Text Entity and Relation Joint Extraction Method

DING Hongxin, ZOU Peinie, ZHAO Junfeng, WANG Yasha

Computer Science. 2023, 50 (10): 126-134. doi:10.11896/jsjkx.230300079

Unstructured text data contains a large amount of valuable knowledge. Entities and relations extracted from it can form structured knowledge, help build knowledge graphs, and support downstream tasks, so entity and relation extraction has a wide range of application prospects. Currently, entity and relation extraction mostly uses deep learning methods. However, training deep learning models consumes large annotated datasets, resulting in high labor cost, so reducing the workload of manual annotation is one of the focuses of research. Active learning is a subfield of machine learning that aims to maximize a model's performance gain while annotating as few samples as possible, by selecting the most valuable samples to be labeled and handed over to the model for training. Its potential to reduce training data complements the data-hungry nature of deep learning. Deep active learning, which applies active learning in deep learning, has therefore become a hot research topic in entity and relation extraction. In this context, this paper applies deep active learning to joint entity and relation extraction, using active learning in the training process of the deep learning model to minimize the manually labeled data required while maintaining model performance. A deep learning model for joint entity and relation extraction based on a unified label space and matrix annotation is implemented, and a variety of active learning query strategies are designed and implemented on top of it. The validity of the method is verified on text datasets and common joint entity and relation extraction datasets in the medical field. Several methods are proposed for choosing when to stop model training, based on the training loss curve of the model, model performance on the training set, and prediction stability on reserved data. How to choose the stopping time in practical application scenarios is studied experimentally. An intelligent text annotation tool based on active learning for joint entity and relation extraction is designed and implemented, which allows users to annotate entities and relations in text. The tool implements a deep learning model for entity and relation extraction together with active learning methods to minimize users' annotation workload.

Co-Forecasting for Multi-modal Traffic Flow Based on Graph Contrastive Learning

XIAO Yang, QIN Jianyang, LI Kenli, WANG Ge, LI Rui, LIAO Qing

Computer Science. 2023, 50 (10): 135-145. doi:10.11896/jsjkx.230700127

Accurate traffic flow prediction in urban areas is important for guiding urban vehicle scheduling and public transportation system optimization. So far, most existing traffic flow prediction methods consider only a single type of traffic flow in a regular grid area, ignoring the spatial irregularity and heterogeneity of the traffic network and the interaction among different kinds of traffic flow. To address these problems, this paper proposes a co-forecasting method for multi-modal traffic flow based on graph contrastive learning, named CoF-MGCL, to reveal the effect of the interaction among various traffic flows on traffic demand in irregular and heterogeneous areas. Specifically, multi-modal traffic data are collected, including the individual and total traffic flow of various travel types (e.g., bike and taxi traffic flow). A heterogeneous graph with multiple relations, including geographical proximity and functional similarity, is then constructed for irregular areas. A heterogeneous graph encoding module fuses the multiple relations in the heterogeneous graph to learn high-quality representations of the various traffic flows in different areas. The learned representations of the individual traffic flows are integrated via an attention mechanism, and the result is contrasted with the representation of the total traffic flow via graph contrastive learning, so as to capture the interactive correlation among different traffic flows. Finally, a mutual information regularization is introduced for multi-modal traffic flow co-forecasting, maximizing multi-modal information learning. To enable multi-modal traffic flow forecasting in irregular areas, two new multi-modal traffic flow datasets, for the Manhattan borough of New York and for Chicago, are constructed and used for experiments. Experimental results demonstrate that the proposed method can be combined with existing uni-modal traffic flow forecasting methods to obtain performance gains of 0.60% to 12.13% in terms of root-mean-square error (RMSE) and mean absolute error (MAE), verifying its effectiveness.
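For readers unfamiliar with the two reported metrics, the sketch below computes RMSE and MAE from their standard definitions. The `relative_gain` helper assumes that a "performance gain" means the relative reduction in error against a baseline; the abstract does not say how its percentages were computed, and the sample numbers are invented for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error: sqrt of the mean squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_gain(baseline_error, improved_error):
    """Assumed definition of gain: percentage reduction in error vs. a baseline."""
    return (baseline_error - improved_error) / baseline_error * 100.0

# Invented example values, not data from the paper.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
example_rmse = rmse(y_true, y_pred)
example_mae = mae(y_true, y_pred)
```

Because RMSE squares the differences, a single large error (here the last sample) raises it more than it raises MAE, which is why the two are usually reported together.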

Knowledge Enhanced Relationship Prediction Model for Enterprise Entities

WANG Jiaqi, LI Wengen, GUAN Jihong, XING Ting, WEI Xiaomin, SHAO Bingqing, FU Chongjie

Computer Science. 2023, 50 (10): 146-155. doi:10.11896/jsjkx.221000063

With the development of knowledge graphs, a variety of industrial knowledge graphs have come into being. However, these industrial knowledge graphs lack sufficient relationships among enterprises, such as upstream-downstream, supply, cooperation, and competition relationships, which greatly limits their applications. Most existing methods for predicting enterprise entity relationships focus on fact triples and cannot fully utilize multiple perspectives such as enterprise descriptions and associated entity descriptions. To solve this problem, KERP, a knowledge enhanced relationship prediction model for enterprise entities, is proposed. The model first improves enterprise feature representations using a multi-view entity feature learning module. It then uses a graph attention network to obtain higher-order semantic representations of entities and fuses them with lower-order semantic representations learned by TransR for knowledge enhancement. Finally, it predicts enterprise entity relationships with the convolutional decoder ConvE. Experimental results on a new energy automobile industrial knowledge graph show that KERP predicts relationships between enterprises better than existing models, with an improvement of 6.7% in F1 value. Generalization is also evaluated on multiple datasets, and the results demonstrate that KERP generalizes well to general entity relationship prediction tasks.

Intention-based Multi-agent Motion Planning Method with Deep Reinforcement Learning

PENG Yingxuan, SHI Dianxi, YANG Huanhuan, HU Haomeng, YANG Shaowu

Computer Science. 2023, 50 (10): 156-164. doi:10.11896/jsjkx.220900031

The challenges of multi-agent motion planning lie in the lack of effective cooperation approaches, heavy dependence on communication, and the lack of information screening mechanisms. To this end, an intention-based multi-agent deep reinforcement learning motion planning method is proposed, which helps agents reach their goals while avoiding collisions without explicit communication. First, the concept of intention is introduced into the multi-agent motion planning problem: visual images are combined with history maps to predict the intentions of agents, so that each agent can anticipate the actions of other agents and thus collaborate effectively. Second, a convolutional neural network architecture based on an attention mechanism is designed. This architecture is used both to predict the intentions of agents and to select their actions, filtering useful visual input information while reducing the reliance of multi-agent cooperation on communication. Third, a value-based deep reinforcement learning algorithm is proposed to learn the motion planning strategy. Improving the objective function and the calculation of the Q values makes the strategy more stable. In tests on six different PyBullet simulation scenes, the proposed method improves the cooperation efficiency of multi-agent teams by an average of 10.74%, a significant performance advantage over other advanced multi-agent motion planning methods.

Ventilator and Sedative Management in Networked ICUs Based on Federated Learning

CAO Linxiao, LIU Jia, ZHU Yifei, ZHOU Haoquan, GONG Wei, YU Weihua, LI Chaoyou

Computer Science. 2023, 50 (10): 165-175. doi:10.11896/jsjkx.220900177

The proliferation of medical IoT devices and the abundance of medical data open up new possibilities for smart healthcare. Patients in the intensive care unit (ICU) rely on numerous medical IoT devices to continuously monitor and manage their health status. Among the common therapeutic interventions in ICUs, invasive mechanical ventilation and sedation are the ones most often administered to maintain patients' respiratory function and enhance the quality of care. However, existing therapeutic interventions rely heavily on physician judgment. This paper proposes a data-driven optimal policy learning framework named MFed that allows distributed learning of optimal intervention policies across networked ICUs. A differentially private federated learning method is constructed to overcome privacy limitations on medical data. MFed further ensures worst-case performance with distributionally robust optimization and adaptively filters out noisy data. Extensive experiments on a real-world ICU dataset show that the proposed method improves accuracy by 36.75% compared to other state-of-the-art baselines.

Bidirectional Inference Model with Multiple Latent Variables Based on Variational Auto-encoders

ZHAO Yanbin, SU Jindian

Computer Science. 2023, 50 (10): 176-183. doi:10.11896/jsjkx.220900201

One of the key tasks of an open-domain dialog system is to generate diverse and coherent dialog responses. However, one-way inference from the preceding context alone cannot achieve this goal. To solve this problem, this paper proposes a bidirectional inference model based on multiple latent variables, MLVBI (multiple latent variables bidirectional inference). First, a variational auto-encoder is incorporated into the language model and one-way inference is extended to two-way inference. That is, after the corpus is divided into context, query and response, forward inference is used to infer the response from the query to learn the word order information, and reverse inference is used to infer the query from the response to learn additional topic information at the same time. The model then integrates the two inference directions to generate more coherent responses. Second, in order to solve the problem of insufficient explanation ability of a single latent variable in the two-way inference process, this paper introduces multiple latent variables in the inference process to further improve the diversity of generated conversations. Experimental results show that MLVBI obtains the best accuracy and diversity on two open-domain datasets, DailyDialog and PersonalChat, and ablation experiments also show the effectiveness of two-way inference and multiple latent variables.

Spatial Crowdsourcing Task Pricing Algorithm Based on Nash Bidding

LIN Weida, DONG Hongbin, ZHAO Bingxu

Computer Science. 2023, 50 (10): 184-192. doi:10.11896/jsjkx.220900130

Task pricing is an important step for crowdsourcing platforms to solve profit-driven task allocation and maximize profits. However, there are relatively few studies on task pricing that account for worker expectations, and most existing studies do not consider the dynamic demands of workers and tasks. Furthermore, obtaining complete worker information is difficult due to worker privacy and sensor limitations. In order to solve the above problems, a pricing algorithm for spatial crowdsourcing tasks based on Nash bidding is proposed. The algorithm first obtains the price range of the task through a machine learning algorithm, and then conducts Nash bidding on the price range. In order to solve the problem of large price fluctuations caused by dynamic supply and demand, an adjustment mechanism is designed to stabilize the average price of tasks. Finally, in order to simulate the Nash equilibrium point, two different gradient functions are used to search for the task price with the largest number of matches. The proposed algorithm is tested on the gMission data set and a synthetic data set. The results show that the number of matches and the average task price of the algorithm are 60% and 1.57 times those of the MCMF algorithm, respectively, and that its time cost is 9.6% of that of the MCMF algorithm. Experimental results show the effectiveness of the proposed algorithm.

Aspect-based Sentiment Analysis Based on Aspect Semantic and Gated Filtering Network

HE Zhihao, CHEN Hongmei, LUO Chuan

Computer Science. 2023, 50 (10): 193-202. doi:10.11896/jsjkx.220900192

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task, which aims to predict the sentiment polarity of text toward a specific aspect. Currently, given the excellent capabilities of recurrent neural networks (RNN) in sequence modeling and the outstanding performance of convolutional neural networks (CNN) in learning local patterns, some works have combined the two to mine sentiment information and achieved good results. However, few works consider aspect information while applying the combination of the two to ABSA. In aspect-based sentiment analysis tasks, most works treat the aspect as an independent whole interacting with the contexts, but the representation of the aspect is too simple and lacks real semantics. To address the above issues, this paper proposes a neural network model based on aspect semantic and gated filtering network (ASGFN) to better mine aspect-based sentiment information. First, an aspect encoding module is designed to capture context-specific aspect semantic information, which is based on a global context fusion multi-head attention mechanism with a graph convolutional neural network to construct an aspect representation containing specific semantics. Second, a gated filtering network is designed to connect the RNN and the CNN as a way to enhance the interaction of the aspect with the contexts, combining the advantages of the RNN and the CNN to extract the sentiment feature. Finally, the sentiment feature is combined with the aspect representation to generate a semantic representation that predicts sentiment polarity. Sentiment classification accuracies of 84.72%, 78.64%, and 76.22% are achieved on three public datasets, restaurant, laptop, and twitter, respectively. Experimental results demonstrate the effectiveness of the proposed model, which can improve the performance of ABSA.

Multi-surrogate Multi-task Optimization Approach Based on Two-layer Knowledge Transfer

MA Hui, FENG Xiang, YU Huiqun

Computer Science. 2023, 50 (10): 203-213. doi:10.11896/jsjkx.220900242

Evolutionary multi-task optimization is a new research direction in the field of computational intelligence. It focuses on how to handle multiple optimization tasks effectively and simultaneously through an evolutionary algorithm, so as to enhance the performance of solving each task individually. Based on this, a multi-surrogate multi-task optimization approach based on two-layer knowledge transfer (AMS-MTO) is proposed, which achieves cross-domain optimization by transferring knowledge between surrogates and within surrogates at the same time. Specifically, the knowledge transfer within the surrogates realizes the cross-dimensional transfer of decision variable information through differential evolution, so as to keep the algorithm from falling into local optima. The learning between surrogates adopts two strategies: implicit knowledge transfer and explicit knowledge transfer. The former uses the selective crossover of populations to generate offspring and promote the exchange of genetic information. The latter is mainly the transfer of elite individuals, which can make up for the strong randomness of implicit transfer. To evaluate the effectiveness of the AMS-MTO algorithm, we carry out an empirical study on 8 benchmark problems with up to 100 dimensions. We also give a convergence proof and compare the algorithm with existing algorithms. Experimental results show that when solving expensive single-objective optimization problems, the AMS-MTO algorithm has higher efficiency, better performance and faster convergence speed.

UAV Anti-tank Policy Training Model Based on Curriculum Reinforcement Learning

LIN Zeyang, LAI Jun, CHEN Xiliang, WANG Jun

Computer Science. 2023, 50 (10): 214-222. doi:10.11896/jsjkx.220700121

In the intelligent era, the battle for the land battlefield expands from planar land control to vertical land control. UAV anti-tank operation plays a crucial role in the battle for land control in future intelligent war. Because deep reinforcement learning methods face problems such as decision space explosion and sparse reward when solving complex problems, this paper puts forward a dynamic multi-agent curriculum learning method based on VDN. In this method, curriculum learning is added into the training process of multi-agent deep reinforcement learning and combined with the Stein variational gradient descent algorithm to improve the curriculum learning process. This addresses the problems of poor initial training effect, long training time and difficult convergence of reinforcement learning in complex tasks. In addition, the curriculum learning model is constructed in the multi-agent particle environment and the UAV anti-tank combat scene respectively, and the transfer of the model and training prior knowledge from easy to difficult is realized. Experimental results show that the DyMA-CL curriculum learning mechanism can improve the reinforcement learning training process, and the reinforcement learning agent can obtain better initial training effect, model convergence speed and final effect when learning difficult tasks.

Biomedical Relationship Extraction Method Based on Prompt Learning

WEN Kunjian, CHEN Yanping, HUANG Ruizhang, QIN Yongbin

Computer Science. 2023, 50 (10): 223-229. doi:10.11896/jsjkx.220900108

Extracting the relationship between entities from unstructured biomedical text data is of great significance for the development of biomedical informatization. At the same time, it is also a research hotspot in the field of natural language processing. At present, there are two difficulties in correctly extracting the relationship between entities in biomedical data. First, in biomedicine, entity words are mostly composed of compound words and unknown words, which makes it difficult for the model to learn the semantic features inside the entity. Second, because labeled biomedical data are scarce and neural networks have a large number of parameters, the neural network is prone to overfitting. Therefore, a biomedical relationship extraction method based on prompt learning is proposed in this paper. Annotation labels are added to entities as prompts to enhance entity semantics and connect them with context information. In addition, building on the traditional prompt optimization method, this paper uses a continuous template to alleviate the performance deviation caused by the manual design of the template. At the same time, a deep prefix is used to control the deep prompting ability of attention, so that the model can still achieve good results when dealing with a small amount of data.

Hybrid Bayesian Network Structure Learning via Evolutionary Order Search

LI Mingjia, QIAN Hong, ZHOU Aimin

Computer Science. 2023, 50 (10): 230-238. doi:10.11896/jsjkx.221000046

The Bayesian network is an effective tool for uncertainty knowledge representation and reasoning. Learning and discovering its structure is the basis of reasoning via this tool. Existing Bayesian network structure learning algorithms often encounter the dilemma of balancing effectiveness and efficiency in real-world applications such as intelligent education. On the one hand, score-and-search methods can find high-quality solutions, but they suffer from high algorithmic complexity. On the other hand, hybrid methods are highly efficient but the quality of the found solutions is not satisfactory. To address the above dilemma, this paper proposes an evolutionary order search based hybrid Bayesian network structure learning method called EvOS. The proposed EvOS first constructs an undirected graph skeleton through a constraint algorithm, then applies an evolutionary algorithm to search for the optimal node order, and finally uses the found node order to guide the greedy search so as to obtain the Bayesian network structure. This paper conducts an empirical study to verify the effectiveness and efficiency of the proposed EvOS on commonly used benchmark datasets as well as the real-world task of educational knowledge structure discovery. Experimental results show that, compared with the score-and-search methods, EvOS is able to achieve up to 100 times speedup while maintaining similar accuracy, and its effectiveness is significantly better than that of the hybrid methods.

vsocket:an RDMA-based Acceleration Method Compatible with Standard Socket

CHEN Yunfang, MAO Haotian, ZHANG Wei

Computer Science. 2023, 50 (10): 239-247. doi:10.11896/jsjkx.220800048

In order to be compatible with Linux standard sockets and utilize RDMA to improve the performance of programs using sockets, this paper proposes to construct a middleware, the Viscore Socket adaptor (referred to as vsocket), between the upper-layer application and the underlying RDMA. By intercepting the socket API, we seamlessly transfer the data stream sent and received by the upper-layer application through the Linux socket to the RDMA bearer. The vsocket bypasses the kernel and implements a memory management mechanism in user space for TCP and UDP. It utilizes the RC type RDMA network to support TCP acceleration, uses the UD type RDMA network to support UDP acceleration, and reuses Linux UDP to assist routing. Experimental results show that vsocket can ensure compatibility with the Linux standard socket interface, get rid of the limitation of the Linux kernel network protocol stack, and improve the network performance.

Edge Server Placement Algorithm Based on Spectral Clustering

GUO Yingya, WANG Lijuan, GENG Haijun

Computer Science. 2023, 50 (10): 248-257. doi:10.11896/jsjkx.220900211

With the rapid development of the Internet of Things (IoT) and 5G networks, mobile edge computing has attracted widespread attention from industry and academia for its low access latency, low bandwidth costs, and low energy consumption. In mobile edge computing, edge servers provide services for mobile user requests, and the placement of edge servers has an important impact on edge computing performance and user experience. At present, the placement algorithm of edge servers only considers the geographical location of server placement, and lacks the consideration of the number of users connected to the base station. Therefore, in the case of uneven distribution of actual users, the average user access delay caused by the server placement position obtained by the existing algorithm is large. In order to better solve the above problems, this paper proposes a latency minimization edge server placement algorithm based on spectral clustering. When solving the problem of edge server placement, the algorithm not only considers the geographical location of the base station, but also takes into account the important parameter of the number of users connected to different base stations, which can effectively reduce the average access latency of users and make the workload of each edge server more balanced at the same time. In the simulation experiment, this paper uses the real base station dataset of Shanghai Telecom to test the performance of the proposed server placement algorithm. Simulation experiment results show that the user-distributed access delay minimization edge server placement algorithm has significant advantages in solving the edge server placement problem. In terms of access latency, the performance of LAMP algorithm is increased by 37.9% compared with K-means algorithm. Compared with the K-means algorithm, the performance of the LAMP algorithm can be improved by up to 82.85% in terms of load balancing. The LAMP algorithm exhibits superior performance in reducing access latency and balancing edge server workloads.

UAV Geographic Location Routing Protocol Based on Cross Layer Link Quality State Awareness

ZHOU Yanling, MI Zhichao, LU Yanxia, WANG Hai

Computer Science. 2023, 50 (10): 258-265. doi:10.11896/jsjkx.230500221

The geographical location routing protocol has been widely used in FANET networks due to its low overhead and good scalability. However, its strategy of relying on the nearest neighbor node as the relay in greedy forwarding process still has certain limitations. This paper proposes a cross layer link quality state aware unmanned aerial vehicle geographic location routing protocol (CLAQ-GPSR) suitable for frequently changing topology and congested network environments by sensing channel link quality. By establishing a communication security zone, establishing a measurement model for link load and inter flow interference, using delivery ratio ETX to measure link quality, and combining data from the physical layer, MAC layer, and network layer to comprehensively measure the most reliable relay nodes, communication quality can be improved. At the same time, the combination of left and right hand forwarding rules is used to accelerate the forwarding speed in path recovery and avoid routing loops and other issues that occur in traditional peripheral forwarding. Through comparative analysis on network simulation platforms, it is found that, compared with the traditional GPSR, W-GeoR, and DGF-ETX protocols, the proposed protocol has advantages in terms of packet delivery success rate, end-to-end latency, and average hop count.
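
As a rough illustration of the ETX-based relay choice described above, the following Python sketch uses the conventional definition ETX = 1 / (d_f * d_r), where d_f and d_r are the forward and reverse delivery ratios of a link. The scoring rule (geographic progress divided by ETX) and the names `etx` and `pick_relay` are assumptions made for this example; the abstract does not give the actual CLAQ-GPSR metric, which also accounts for link load and inter-flow interference.

```python
import math


def etx(forward_ratio, reverse_ratio):
    """Expected transmission count of a link: 1 / (d_f * d_r).

    Returns infinity when either direction delivers nothing.
    """
    product = forward_ratio * reverse_ratio
    return math.inf if product <= 0 else 1.0 / product


def pick_relay(current, destination, neighbors):
    """Pick the neighbor with the most progress per expected transmission.

    neighbors maps a name to (position, forward_ratio, reverse_ratio).
    Hypothetical scoring rule: (distance gained toward destination) / ETX.
    Returns None when no neighbor is closer to the destination.
    """
    here = math.dist(current, destination)
    best_name, best_score = None, 0.0
    for name, (position, d_f, d_r) in neighbors.items():
        progress = here - math.dist(position, destination)
        if progress <= 0:
            continue  # this neighbor does not move the packet closer
        score = progress / etx(d_f, d_r)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


neighbors = {
    "far_but_lossy": ((8.0, 0.0), 0.5, 0.5),
    "near_but_clean": ((5.0, 0.0), 0.9, 0.9),
}
print(pick_relay((0.0, 0.0), (10.0, 0.0), neighbors))
```

Pure greedy forwarding would choose the neighbor closest to the destination; weighting progress by ETX lets a shorter but more reliable hop win, which is the kind of trade-off the abstract attributes to link quality awareness.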

Performance Analysis of Multi-server Gated Service System Based on BiLSTM Neural Networks

YANG Zhijun, HUANG Wenjie, DING Hongwei

Computer Science. 2023, 50 (10): 266-274. doi:10.11896/jsjkx.221000221

In order to meet the requirements of fast operation, low delay, good performance and fairness, a multi-server gated service system is proposed and its predictive analysis is carried out using BiLSTM (bi-directional long short-term memory) neural networks. Multi-server access is used to reduce the network delay and improve system performance. Both synchronous and asynchronous approaches can be used when multiple servers are scheduled. Firstly, the system model of multi-server gated service is investigated. Secondly, the average queue length, average cycle period and average delay of multi-server gated service are solved on the basis of the single-server case using the analytical methods of embedded Markov chain and probability generating function. Meanwhile, simulation experiments are conducted using Matlab to compare the theoretical and simulated values of the single-server and multi-server systems, and to compare the multi-server synchronous and asynchronous approaches. Finally, a BiLSTM neural network is constructed to predict the performance of the multi-server system. Experiments show that the asynchronous approach of this multi-server system is superior to the synchronous approach and the single-server system, and the multi-server asynchronous system has better performance, lower delay and higher efficiency. Comparing the three basic multi-server service systems, the gated service system is more stable while ensuring fairness. The BiLSTM neural network prediction algorithm can accurately predict the performance of the system and improve the computational efficiency, which provides a guideline for the performance evaluation of polling systems.

Cost-minimizing Task Offload Strategy for Mobile Devices Under Service Cache Constraint

ZHANG Junna, CHEN Jiawei, BAO Xiang, LIU Chunhong, YUAN Peiyan

Computer Science. 2023, 50 (10): 275-281. doi:10.11896/jsjkx.220900185

Edge computing provides more computing and storage capabilities at the edge of the network to effectively reduce the execution delay and power consumption of mobile devices. Since applications consume more and more computing and storage resources, task offloading has become one of the effective solutions to address the inherent limitations of mobile terminals. However, existing research on task offloading often ignores the diversity of service requirements for different types of tasks and the fact that edge servers have limited service capabilities, resulting in infeasible offloading decisions. Therefore, we study the task offloading problem that can optimize the execution cost of mobile devices under service cache constraints. We first design a collaborative offloading model integrating the remote cloud, edge servers and local devices to balance the load of edge servers. Meanwhile, the cloud server is used to make up for the limited service caching capacity of the edge server. Secondly, a task offloading algorithm suitable for cloud-edge-device collaboration is proposed to optimize the execution delay and energy cost of mobile devices. When a task is offloaded, an improved greedy algorithm is used to select the best edge server. Then, the offloading decision of the task is determined by comparing the execution cost of the task at different locations. Experimental results show that the proposed algorithm can effectively reduce the execution cost of mobile devices compared with the comparison algorithms.
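
The final decision step above, comparing the execution cost of a task at different locations, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the cost is taken to be a weighted sum of delay and energy, an edge server is only a valid choice when it has the required service cached, and the names `Option`, `weighted_cost` and `choose_location` are hypothetical. The abstract does not give the paper's actual cost model.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One candidate execution location for a task (illustrative fields)."""
    location: str          # e.g. "local", "edge", "cloud"
    delay_s: float         # estimated completion delay in seconds
    energy_j: float        # energy drawn from the mobile device in joules
    feasible: bool = True  # False if the required service is not cached there


def weighted_cost(option, delay_weight=0.5):
    """Assumed cost model: convex combination of delay and device energy."""
    return delay_weight * option.delay_s + (1.0 - delay_weight) * option.energy_j


def choose_location(options, delay_weight=0.5):
    """Return the feasible location with the lowest weighted cost."""
    candidates = [o for o in options if o.feasible]
    if not candidates:
        raise ValueError("no feasible execution location")
    return min(candidates, key=lambda o: weighted_cost(o, delay_weight)).location


options = [
    Option("local", delay_s=2.0, energy_j=3.0),
    Option("edge", delay_s=0.5, energy_j=0.4, feasible=False),  # service not cached
    Option("cloud", delay_s=1.2, energy_j=0.6),
]
print(choose_location(options))
```

In this example the edge server would be cheapest, but because the service is not cached there the task falls back to the cloud, which mirrors the role the abstract gives the cloud server.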

Bidirectional Quality Control Strategies Based on CIDA and PI-cosine in Crowdsourcing

LIU Qingju, PAN Qingxian, TONG Xiangrong, YU Song, PAN Yanan

Computer Science. 2023, 50 (10): 282-290. doi:10.11896/jsjkx.221000133

With the popularity of mobile smart terminals, collecting large-scale perceptual data through crowdsourcing becomes easier and easier. The selfishness of crowdworkers makes them want to get the most pay with the least effort, and even collude with each other and submit crowdsourced data arbitrarily, resulting in poor quality of crowdsourced task completion. This paper proposes a jury-based quality control strategy, a mechanism that solves the data validation problem. To address the behaviors that degrade the quality of crowdsourcing, this paper uses the proposed community influence detection algorithm (CIDA) to detect conspiracy leaders and their organizations after determining the presence of spam employees and conspiracy organizations, and finally uses an improved similarity detection algorithm (PI-Cosine) to screen out spam employees. These two aspects are used to improve the quality of crowdsourcing data. Experiments show that the proposed method outperforms the Cosine similarity detection algorithm by 12.3% on the accuracy and F1-score measures.
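
For orientation, the baseline that PI-Cosine is compared against, plain cosine similarity between workers' submissions, can be sketched as below. The encoding of submissions as numeric answer vectors, the 0.99 threshold and the function names are assumptions for this example; the PI-Cosine improvement itself is not described in the abstract and is not reproduced here.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no direction
    return dot / (norm_a * norm_b)


def suspicious_pairs(submissions, threshold=0.99):
    """Return worker pairs whose submissions are nearly identical.

    submissions maps a worker id to that worker's answer vector.
    The threshold is an illustrative value, not one taken from the paper.
    """
    items = sorted(submissions.items())
    return [
        (u, v)
        for (u, a), (v, b) in combinations(items, 2)
        if cosine_similarity(a, b) >= threshold
    ]


submissions = {
    "w1": [1, 0, 1, 1],
    "w2": [1, 0, 1, 1],  # identical to w1, so the pair is flagged
    "w3": [0, 1, 0, 1],
}
print(suspicious_pairs(submissions))
```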

Reliability Constraint-oriented Workflow Scheduling Strategy in Cloud Environment

LI Jinliang, LIN Bing, CHEN Xing

Computer Science. 2023, 50 (10): 291-298. doi:10.11896/jsjkx.220800039

As more and more computationally intensive dependent applications are offloaded to the cloud environment for execution, the problem of workflow scheduling has received extensive attention. This paper targets the multi-objective workflow scheduling problem in the cloud environment. Considering that servers may experience performance fluctuations and downtime during task execution, triangular fuzzy numbers from fuzzy theory are used to represent task execution time and data transmission time. An adaptive particle swarm optimization algorithm based on the genetic algorithm (APSOGA) is proposed. The purpose is to comprehensively optimize the completion time and execution cost of the workflow under the reliability constraints of the workflow. In order to avoid the premature convergence problem of the traditional particle swarm optimization algorithm, the proposed algorithm introduces the random two-point crossover operation and single-point mutation operation of the genetic algorithm, which effectively improves the search performance of the algorithm. Experimental results show that, compared with other strategies, the APSOGA-based scheduling strategy can effectively reduce the time and cost of reliability-constrained scientific workflows in cloud environments.
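
To make the triangular fuzzy number representation concrete, here is a small Python sketch. It relies only on standard fuzzy arithmetic: a triangular fuzzy number is a triple (low, mode, high), addition is component-wise, and the centroid (low + mode + high) / 3 is one common way to obtain a crisp value. The class name `TFN` and the choice of the centroid for defuzzification are assumptions; the abstract does not say how the paper compares or defuzzifies fuzzy times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFN:
    """Triangular fuzzy number (low, mode, high) for an uncertain duration."""
    low: float
    mode: float
    high: float

    def __add__(self, other):
        # Sum of two triangular fuzzy numbers is taken component-wise.
        return TFN(self.low + other.low, self.mode + other.mode, self.high + other.high)

    def centroid(self):
        # Centroid defuzzification; an assumed choice for this sketch.
        return (self.low + self.mode + self.high) / 3.0


# Hypothetical task: execution takes about 3 s, transferring its output about 1 s.
execution = TFN(2.0, 3.0, 5.0)
transfer = TFN(1.0, 1.0, 2.0)
finish = execution + transfer
print(finish, finish.centroid())
```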

Cross-domain User Authentication via Wi-Fi Sensing of Continuous Activities

KONG Hao, YU Jiadi

Computer Science. 2023, 50 (10): 299-307. doi:10.11896/jsjkx.220900163

Nowadays, Internet of Things (IoT)-based user authentication has been gradually developed. Some works utilize widespread Wi-Fi signals to sense user activities and extract individual uniqueness for user authentication. However, users must perform an independent activity under a known domain (i.e., environment, location, and orientation), before the system can conduct user authentication. In order to break through the limitation of existing methods, this paper proposes a cross-domain user authentication method based on Wi-Fi signals, CroAuth, to realize user authentication across environments, locations, and orientations when users perform continuous activities. To release the requirement of performing independent activities, this paper proposes a continuous activity separation algorithm based on dynamic time warping, which can separate specific activity sequences from diversified continuous activities. Then, this paper designs a cross-domain user authentication method based on Siamese neural network to extract domain-independent features, which can characterize essential behavioral uniqueness of each user under various environments, locations, and orientations. Finally, a knowledge distillation method is utilized to construct a few-shot cross-domain user authentication model. Experimental results show that CroAuth can authenticate users under cross-environment, location, and orientation scenarios when users perform diversified continuous activities.
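
Dynamic time warping itself is a standard technique, and the sketch below shows how it could be used to locate a known activity template inside a longer stream. It is a simplified illustration, assuming one-dimensional samples and fixed-width candidate windows; the function names are hypothetical and CroAuth's actual separation algorithm is not specified in the abstract.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Uses absolute difference as the local cost between two samples.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def best_window(stream, template, width):
    """Start index of the fixed-width window that best matches the template."""
    starts = range(len(stream) - width + 1)
    return min(starts, key=lambda s: dtw_distance(stream[s:s + width], template))


stream = [0, 0, 1, 2, 3, 0, 0]
template = [1, 2, 3]
print(best_window(stream, template, width=3))
```

Because the warping path may repeat samples, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the sequences differ in length, which is why the measure suits activities performed at varying speeds.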

IPSec VPN Closure Detection Method Based on Side-channel Features

SUN Yunxiao, LI Jun, WANG Bailing

Computer Science. 2023, 50 (10): 308-314. doi:10.11896/jsjkx.230500141

IPSec VPN can be divided into closed networks and open networks according to different application scenarios. Closed networks are generally used to customize virtual private networks, and open network proxies are commonly used to avoid network auditing. Therefore, the identification and classification of IPSec VPN network types is of great significance for network supervision. According to the difference in traffic complexity between the two network types, a method for IPSec VPN closure detection using side-channel features of the encrypted traffic is proposed. The distributions of the IPSec encrypted traffic frame length sequence and the TCP maximum segment size (MSS) in the tunnel are extracted, and information entropy is introduced to measure the distribution of MSS values. The information entropy of MSS values and the standard deviation of the frame length sequence are used as feature vectors. Machine learning algorithms such as support vector machine and random forest are used for training and prediction. Experimental results indicate that the accuracy of closure detection using this classification method exceeds 96% and that the method can effectively identify VPN tunnels used for open proxies.
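
The two features named above can be computed with the Python standard library alone. The sketch below is illustrative: it assumes Shannon entropy in bits and the population standard deviation, and the function names and sample values are made up for the example. The abstract does not state which entropy base or standard deviation variant the authors use.

```python
import math
from collections import Counter
from statistics import pstdev


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def tunnel_features(mss_values, frame_lengths):
    """Feature vector for one tunnel: (MSS entropy, frame length std dev).

    frame_lengths must contain at least one value for pstdev to be defined.
    """
    return (shannon_entropy(mss_values), pstdev(frame_lengths))


# Hypothetical observations: a tunnel with few distinct MSS values and
# uniform frame sizes versus one with many distinct MSS values.
uniform_tunnel = tunnel_features([1460, 1460, 1460, 1460], [100, 100, 100])
diverse_tunnel = tunnel_features([1460, 1400, 1360, 1200], [90, 400, 1500])
print(uniform_tunnel, diverse_tunnel)
```

A tunnel that carries traffic for many unrelated endpoints would be expected to show higher MSS entropy and a wider spread of frame lengths, which is the intuition behind feeding these two numbers to a classifier.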

Utility-optimized Local Differential Privacy Joint Distribution Estimation Mechanisms

YIN Shiyu, ZHU Youwen, ZHANG Yue

Computer Science. 2023, 50 (10): 315-326. doi:10.11896/jsjkx.221000053

Compared with traditional centralized differential privacy, local differential privacy (LDP) has the advantage of not relying on trusted third parties, but it also has the problem of low data utility. The utility-optimized local differential privacy (ULDP) can improve the accuracy of estimation results by taking advantage of the sensitivity differences of different inputs. Two-dimensional data joint distribution calculation can be widely used in many data analysis scenarios. However, how to realize two-dimensional data joint distribution estimation under the ULDP model is still an important problem that has not yet been solved. Aiming at this problem, the definition of the two-dimensional ULDP model is given first, taking into account the four cases of whether the two attributes are sensitive or not. Secondly, under this model, two mechanisms, joint utility-optimized randomized response (JuRR) and Cartesian product randomized response (CPRR), are proposed for the joint distribution estimation problem, and the unbiasedness of the estimation results is proved theoretically. Finally, comparative experiments are carried out on real datasets to discuss the influence of different parameters on the estimation error. Experimental results show that the proposed two mechanisms have better data utility.
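
As background for the mechanisms named above, the following Python sketch shows ordinary k-ary randomized response applied to the Cartesian product of two attributes, together with its standard unbiased frequency estimator. This is the plain LDP baseline, not JuRR or CPRR: it treats every value as sensitive and does not exploit the sensitivity differences that ULDP relies on. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import product


def grr_probs(epsilon, k):
    """Keep probability p and per-other-value probability q for k-ary RR."""
    e = math.exp(epsilon)
    return e / (e + k - 1), 1.0 / (e + k - 1)


def perturb(value, domain, epsilon, rng):
    """Report the true value with probability p, else a uniform other value."""
    p, _ = grr_probs(epsilon, len(domain))
    if rng.random() < p:
        return value
    return rng.choice([v for v in domain if v != value])


def estimate(reports, domain, epsilon):
    """Unbiased frequency estimate: (observed_share - q) / (p - q)."""
    p, q = grr_probs(epsilon, len(domain))
    n = len(reports)
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


# Joint domain of two binary attributes, perturbed as a single value.
domain = list(product("ab", "xy"))
rng = random.Random(0)
reports = [perturb(("a", "x"), domain, 1.0, rng) for _ in range(1000)]
print(estimate(reports, domain, 1.0))
```

The estimates always sum to one because 1 - k*q equals p - q, but individual entries can be slightly negative on small samples, which is a known property of this estimator rather than a bug.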

Android Application Privacy Disclosure Detection Method Based on Static and Dynamic Combination

DING Xuhui, ZHANG Linlin, ZHAO Kai, WANG Xusheng

Computer Science. 2023, 50 (10): 327-335. doi:10.11896/jsjkx.220800181

Under the background of big data, the problem of Android software stealing users' personal information is becoming more and more serious. Aiming at the problems of high false positive rate in static analysis and missed detections in dynamic analysis, a privacy disclosure detection method based on the combination of static and dynamic features is proposed. The multi-dimensional static features and dynamic features extracted from the application are fused, the gradient descent algorithm is used to allocate optimal weights for SVM, RF, XGBoost, LightGBM and CatBoost, and the risk of privacy disclosure is detected by weighted voting in ensemble learning. In an experimental analysis of 2,951 applications, the accuracy of this method reaches 95.14%, which is clearly better than using a single feature type or a single classifier, and the method can effectively detect the privacy disclosure risk of Android applications.
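
The weighted voting step can be illustrated with a short Python sketch. It assumes soft voting over each classifier's predicted probability of privacy disclosure, with weights supplied by the caller; in the paper the weights are learned by gradient descent, which is not reproduced here, and the probabilities, weights and threshold below are made-up values.

```python
def weighted_vote(probabilities, weights, threshold=0.5):
    """Combine per-classifier probabilities into one weighted decision.

    probabilities maps a classifier name to its P(privacy disclosure).
    weights maps the same names to non-negative weights.
    Returns (weighted_score, is_flagged).
    """
    total_weight = sum(weights[name] for name in probabilities)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(weights[name] * p for name, p in probabilities.items()) / total_weight
    return score, score >= threshold


# Hypothetical outputs of three of the five classifiers named in the abstract.
probabilities = {"SVM": 0.2, "RF": 0.9, "XGBoost": 0.8}
weights = {"SVM": 1.0, "RF": 2.0, "XGBoost": 2.0}
print(weighted_vote(probabilities, weights))
```

Giving the tree-based models more weight flags this application, whereas a weighting dominated by the SVM would not, which shows why the choice of weights matters.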

Smart Contract Vulnerability Detection System Based on Ontology Reasoning

CHEN Ruixiang, JIAO Jian, WANG Ruohua

Computer Science. 2023, 50 (10): 336-342. doi:10.11896/jsjkx.220900183

With the development of the blockchain, smart contracts based on Ethereum have attracted more and more attention from all walks of life, but they also face more security threats. For the security problems of Ethereum smart contracts, various vulnerability detection methods have emerged, such as symbolic execution, formal verification, deep learning and other technologies. However, most of the existing methods cover an incomplete set of vulnerability types and lack interpretability. To solve these problems, a smart contract vulnerability detection system based on ontology reasoning at the level of the Solidity high-level language is designed and implemented. The smart contract source code is parsed into an abstract syntax tree, and the information is extracted. The extracted information is used to construct the vulnerability detection ontology, and the reasoning engine is used to infer vulnerabilities from the ontology. In the experiment, other detection tools are selected for comparison with this system, and all tools are used to detect 100 smart contract source code samples. The results show that the system has a good detection effect: it can detect various types of smart contract vulnerabilities and can give information about the cause of each vulnerability.

Reliable Smart Contract Automatic Generation Based on Event-B

ZHU Jian, HU Kai, WANG Jun, LI Jie, YE Yafei, SHI Xiyan

Computer Science. 2023, 50 (10): 343-349. doi:10.11896/jsjkx.220800134

A smart contract is a new computable transaction agreement that executes contract terms in code. Its application scenarios and scale are growing day by day, carrying up to billions of dollars of various assets. However, smart contracts may cause serious economic losses due to code defects. Therefore, the trusted development of smart contracts is particularly critical. This paper proposes a method of trusted verification and automatic generation of smart contracts based on the set theory language Event-B, which is a formal method based on refinement and can be used for specification, design and verification of software systems. Through the model verification of smart contracts and the automatic generation technology of executable codes, an automatic conversion tool EB2S is developed, which bridges the semantic gap and technical barriers between formal models and smart contract programming languages. Finally, this paper selects a typical online payment smart contract scenario and applies the Event-B-based smart contract design and verification method to automatically generate the smart contract code, which verifies the effectiveness of the conversion tool.

Grouping Storage Optimization Method for Blockchain Ledger Based on Erasure Code

ZHANG Yushu, HE Xiaotong, XIAO Xiangli, ZHU Youwen, WANG Liangming

Computer Science. 2023, 50 (10): 350-361. doi:10.11896/jsjkx.220800193

The traditional blockchain system adopts a full-copy redundant storage mode in which each node stores the same ledger, so the storage burden of the blockchain is very large. Existing blockchain storage optimization methods can reduce the data storage overhead, but they still suffer from poor scalability and low availability. Thus, this paper proposes a grouping storage optimization method for the blockchain ledger based on erasure code. The method introduces a new type of blockchain node, the grouping storage (GS) node, to solve the above problems. Since the storage cost of the blockchain ledger lies mainly in the block files, the GS node uses erasure code to encode each block file and stores the encoded block file in groups. In this way, each organization maintains the same ledger, which greatly reduces the storage overhead of the blockchain and improves its availability. For the storage expansion of the consortium blockchain, this paper improves and extends the file system of Hyperledger Fabric based on the GS node, and redesigns its process of storing, recovering, and synchronizing block files, which enables the scheme to work on an actual blockchain architecture. Finally, theoretical analysis and experimental results show that the proposed GS node significantly reduces storage overhead and performs well in scalability and availability.
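
The idea of encoding a block file and spreading the pieces over a group of nodes can be illustrated with a deliberately simple sketch. It assumes a single XOR parity shard, which tolerates the loss of only one shard; the abstract does not say which erasure code the GS node uses, and the function names `encode` and `recover` are hypothetical, not taken from the paper or from Hyperledger Fabric.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(block: bytes, k: int) -> List[bytes]:
    """Split a block file into k zero-padded data shards plus one XOR parity shard.

    Assumption: a toy (k+1, k) single-parity code stands in for the
    erasure code used in the paper, which the abstract does not specify.
    """
    shard_len = -(-len(block) // k)  # ceiling division
    padded = block.ljust(shard_len * k, b"\x00")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def recover(shards: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the block file when at most one shard (data or parity) is missing."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("a single-parity code tolerates only one lost shard")
    restored = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the missing shard.
        restored[missing[0]] = reduce(xor_bytes, present)
    return b"".join(restored[:-1])[:original_len]


if __name__ == "__main__":
    block = b"block file contents for height 42"
    group = encode(block, k=4)   # five shards, one per node in a group
    group[2] = None              # simulate one node of the group going offline
    assert recover(group, len(block)) == block
```

In this toy layout each node of a five-node group stores roughly a quarter of the block file instead of a full copy, and the file survives the loss of any one node. Tolerating more failures would need a stronger code than the single parity shard assumed here.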

Study on Adversarial Robustness of Deep Learning Models Based on SVD

ZHAO Zitian, ZHAN Wenhan, DUAN Hancong, WU Yue

Computer Science. 2023, 50 (10): 362-368. doi:10.11896/jsjkx.220800090

The emergence of adversarial attacks poses a substantial threat to the large-scale deployment of deep neural networks (DNNs) in real-world scenarios, especially in security-related domains. Most current defense methods are based on heuristic assumptions and lack an analysis of model robustness. Improving the robustness of DNNs, together with the interpretability and credibility of that robustness, has become an essential part of artificial intelligence security. This paper proposes to analyze the robustness of the model from the perspective of singular values. In the adversarial environment, an improvement in model robustness is accompanied by a smoother distribution of singular values. Further analysis shows that a smooth distribution of singular values means that the model draws its classification confidence from more diverse sources and thus has higher adversarial robustness. Based on this analysis, an adversarial training algorithm based on singular value suppression (SVS) is proposed. Experiments show that the algorithm improves the robustness of the model and achieves accuracies of 55.3% and 54.51% on CIFAR-10 and SVHN, respectively, under the powerful white-box PGD (Projected Gradient Descent) attack, exceeding the most representative adversarial training methods at present.
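
For readers unfamiliar with the PGD attack used in the evaluation, the following is a minimal sketch of the standard L-infinity variant. It assumes a toy linear classifier with logistic loss instead of a deep network, and the names `loss_grad` and `pgd_linf` as well as all parameter values are illustrative assumptions, not settings from the paper.

```python
import math
from typing import List


def loss_grad(w: List[float], x: List[float], y: int) -> List[float]:
    """Gradient of the logistic loss log(1 + exp(-y * w.x)) with respect to x.

    Assumption: a linear model stands in for the DNN; y is +1 or -1.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(margin))  # equals -y * sigmoid(-margin)
    return [coeff * wi for wi in w]


def sign(v: float) -> int:
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def pgd_linf(w: List[float], x: List[float], y: int,
             eps: float, alpha: float, steps: int) -> List[float]:
    """Iterated signed-gradient ascent on the loss, projected onto the eps-ball around x."""
    x_adv = list(x)
    for _ in range(steps):
        grad = loss_grad(w, x_adv, y)
        # Ascent step of size alpha in the direction that increases the loss.
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection: clip every coordinate back into [x - eps, x + eps].
        x_adv = [min(max(xa, x0 - eps), x0 + eps) for xa, x0 in zip(x_adv, x)]
    return x_adv


if __name__ == "__main__":
    w, x, y = [2.0, -1.0, 0.5], [1.0, 0.5, -0.2], 1
    x_adv = pgd_linf(w, x, y, eps=0.3, alpha=0.1, steps=10)
    clean = y * sum(wi * xi for wi, xi in zip(w, x))
    attacked = y * sum(wi * xi for wi, xi in zip(w, x_adv))
    assert attacked < clean  # the attack lowers the classification margin
```

The attack is white-box because it reads the gradient of the model's own loss. Adversarial training methods such as the one described above train on inputs perturbed in this way.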

Semantic-based Multi-architecture Binary Function Name Prediction Method

SHAO Wenqiang, CAI Ruijie, SONG Enzhou, GUO Xixi, LIU Shengli

Computer Science. 2023, 50 (10): 369-376. doi:10.11896/jsjkx.220800175

Rich, readable source information is important for reverse engineering, and high-quality function names are especially important for program understanding. However, software publishers often release executables stripped of source-level debugging information, either to prevent reverse engineering or to reduce the size of the software, and the lack of readable information makes reverse analysis more difficult. To this end, a multi-architecture function name prediction (MFNP) method is proposed. To resolve the differences between architectures, it uses LLVM RetDec to decompile X86, ARM, and MIPS binaries into intermediate language (IR) files. Function names in the readable intermediate language .ll files are compared for morphological and semantic similarity, and similar function names are fused to reduce function name data sparsity. The basic blocks, which carry the semantic information of sequential instructions, and the control flow graph of the function body, which takes basic blocks as its basic units, are used as semantic features of function bodies and combined with neural networks to predict function names for stripped binaries on the three architectures X86, MIPS, and ARM. Compared with DEBIN, MFNP additionally supports function name prediction for stripped binaries under the MIPS architecture. Compared with NERO, Precision and F1 improve by 13.86% and 11.93%, respectively. The results verify the effectiveness of selecting sequential instruction sequences and control flow graphs built on basic blocks as semantic features in MFNP.

Authenticated Encryption Scheme of Self-synchronous-like ZUC Algorithm

XU Rui, PENG Changgen, XU Dequan

Computer Science. 2023, 50 (10): 377-382. doi:10.11896/jsjkx.220800007

Aiming at the security, efficiency, and lightweight requirements of authenticated encryption with the ZUC algorithm, this paper proposes ZUCAE, an authenticated encryption scheme with associated data based on a self-synchronous-like ZUC algorithm. By improving the LFSR layer of the ZUC stream cipher algorithm (ZUC-256), the scheme designs and implements ZUC-SSL, an algorithm similar to a self-synchronous stream cipher, and uses it to make the ciphertext participate in the state update function when generating the authentication code. The scheme encrypts the message with the ZUC-256 algorithm, optimizes the initialization module, embeds the associated data into the initialization process, generates the keystream and encrypts in parallel, and authenticates the message before decryption, which reduces the computation time and increases the security of the scheme. Security analysis shows that the algorithm can resist the current mainstream attacks on LFSR-based stream ciphers, and that the self-synchronous-like stream cipher design enhances the security of the authentication code. Efficiency experiments against AES-GCM and AEGIS show that, for large data sizes, the scheme is more efficient than AES-GCM and comparable to AEGIS, so it has practical value.
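
The general shape of authenticated encryption with associated data, including the "authenticate before decrypting" step, can be sketched as follows. This is not ZUCAE: the sketch swaps in a generic encrypt-then-MAC construction from the Python standard library, with an HMAC-SHA-256 counter-mode keystream standing in for ZUC-256 and an HMAC tag standing in for the ZUC-SSL authentication code. The names `seal` and `open_sealed` are hypothetical, and the code is for illustration only, not for production use.

```python
import hashlib
import hmac
from typing import Tuple


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive independent encryption and authentication keys from one master key."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Stand-in keystream (HMAC-SHA-256 in counter mode). Assumption: NOT ZUC-256."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:n])


def _tag(key: bytes, nonce: bytes, ad: bytes, ct: bytes) -> bytes:
    """Tag over nonce, associated data, and ciphertext, with length prefixes."""
    msg = (len(nonce).to_bytes(8, "big") + nonce
           + len(ad).to_bytes(8, "big") + ad + ct)
    return hmac.new(key, msg, hashlib.sha256).digest()


def seal(key: bytes, nonce: bytes, ad: bytes, pt: bytes) -> Tuple[bytes, bytes]:
    """Encrypt the plaintext, then authenticate the ciphertext and associated data."""
    ks = _keystream(_subkey(key, b"enc"), nonce, len(pt))
    ct = bytes(p ^ k for p, k in zip(pt, ks))
    return ct, _tag(_subkey(key, b"mac"), nonce, ad, ct)


def open_sealed(key: bytes, nonce: bytes, ad: bytes, ct: bytes, tag: bytes) -> bytes:
    """Verify the tag first; decrypt only if authentication succeeds."""
    expected = _tag(_subkey(key, b"mac"), nonce, ad, ct)
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed")
    ks = _keystream(_subkey(key, b"enc"), nonce, len(ct))
    return bytes(c ^ k for c, k in zip(ct, ks))


if __name__ == "__main__":
    key, nonce = b"k" * 32, b"n" * 16
    ct, tag = seal(key, nonce, b"header", b"hello world")
    assert open_sealed(key, nonce, b"header", ct, tag) == b"hello world"
```

Because the tag is checked before any keystream is generated for decryption, a tampered ciphertext or a mismatched associated-data value is rejected without doing the decryption work. A nonce must never be reused with the same key in a stream-cipher construction like this one.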