
Supervised and Sponsored by Chongqing Southwest Information Co., Ltd.
CODEN JKIEBK


Review of Fake News Detection on Social Media
CHEN Jing, ZHOU Gang, LI Shunhang, ZHENG Jiali, LU Jicang, HAO Yaohui. Review of Fake News Detection on Social Media[J]. Computer Science, 2024, 51(11): 1-14. doi:10.11896/jsjkx.240700101

Abstract
Fake news on social media not only jeopardizes cyberspace security, but also plays a pivotal role in major events, severely misleading the public and having a negative effect on political and social order. Therefore, this paper outlines social media fake news detection techniques, establishing a theoretical foundation for building efficient detection technology and curbing the proliferation of fake news on social media. Firstly, it analyzes in depth the connotation and essence of fake news, explores its generation mechanism and specific manifestations on social platforms, and defines the basic framework and objectives of the detection task. Next, from the perspective of semantic consistency, it focuses on three major levels, namely content semantics, social context awareness, and knowledge-driven detection, comparing and organizing typical detection methods. On this basis, it explores the latest research advances in enhancing the explainability of detection algorithms. Furthermore, from an adversarial perspective, it analyzes the challenges faced by current social media fake news detection tasks and the opportunities that large-scale language models bring to detection research. Finally, the future development of social media fake news detection technology is discussed.

Study on Fake News Detection Technology in Resource-constrained Environments
WU Chenglong, HU Minghao, LIAO Jinzhi, YANG Hui, ZHAO Xiang. Study on Fake News Detection Technology in Resource-constrained Environments[J]. Computer Science, 2024, 51(11): 15-22. doi:10.11896/jsjkx.240700099

Abstract
In recent years, social media has become fertile ground for the spread and proliferation of fake news due to its openness and convenience. Compared to unimodal fake news, multimodal fake news, which combines various forms of information such as text and images, creates more confusing false content and has more far-reaching effects. Existing methods for multimodal fake news detection predominantly rely on small models. The rapid development of multimodal large models offers new perspectives for addressing this issue; these models, though, are typically parameter-intensive and computationally demanding, making them challenging to deploy in environments with limited computational and energy resources. To address these challenges, this study proposes a multimodal fake news detection model based on the multimodal large model Long-CLIP, which is capable of processing long texts and attending to both coarse-grained and fine-grained details. Additionally, by employing an efficient coarse-to-fine layer-wise pruning method, a more lightweight multimodal fake news detection model is obtained to suit resource-constrained scenarios. Finally, on the Weibo dataset, the proposed model is compared with current popular multimodal large models, before and after fine-tuning, and with other pruning methods, and its effectiveness is verified. Results indicate that the Long-CLIP-based multimodal fake news detection model significantly reduces model parameters and inference time compared to current popular multimodal large models, while maintaining superior detection performance. After compression, the model achieves a 50% reduction in parameters and a 1.92 s decrease in inference time, with only a 0.01 drop in detection accuracy.
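The coarse-to-fine layer-wise pruning idea can be illustrated with a toy sketch. This is not the paper's actual algorithm: the importance proxy (mean absolute weight) and both keep ratios are illustrative assumptions. A coarse pass keeps the highest-scoring layers; a fine pass zeroes the smallest-magnitude weights inside each kept layer.

```python
def layer_importance(weights):
    """Proxy importance score for a layer: mean absolute weight."""
    flat = [abs(w) for row in weights for w in row]
    return sum(flat) / len(flat)

def coarse_to_fine_prune(layers, layer_keep_ratio=0.5, weight_keep_ratio=0.5):
    """Coarse step: keep only the most important layers.
    Fine step: within each kept layer, zero the smallest-magnitude weights."""
    scored = sorted(layers, key=layer_importance, reverse=True)
    kept = scored[: max(1, int(len(layers) * layer_keep_ratio))]
    pruned = []
    for weights in kept:
        flat = sorted(abs(w) for row in weights for w in row)
        threshold = flat[int(len(flat) * (1 - weight_keep_ratio))]
        pruned.append([[w if abs(w) >= threshold else 0.0 for w in row]
                       for row in weights])
    return pruned
```

In a real setting the layers would be transformer blocks and the importance score would come from a held-out validation loss rather than raw weight magnitudes.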

Fake News Detection Based on Cross-modal Interaction and Feature Fusion Network
PENG Guangchuan, WU Fei, HAN Lu, JI Yimu, JING Xiaoyuan. Fake News Detection Based on Cross-modal Interaction and Feature Fusion Network[J]. Computer Science, 2024, 51(11): 23-29. doi:10.11896/jsjkx.231200186

Abstract
In recent years, the surge in fake news has adversely affected people's decision-making. Many existing fake news detection methods emphasize the exploration and utilization of multimodal information, such as text and images. However, how to generate discriminative features for the detection task and effectively aggregate features of different modalities remains an open question. In this paper, we propose a novel fake news detection model, the cross-modal interaction and feature fusion network (CMIFFN). To generate discriminative features, a supervised contrastive learning-based feature learning module is designed. By performing intra-modality and inter-modality supervised contrastive learning simultaneously, it ensures that the similarity between heterogeneous features is smaller and the similarity between similar features is greater. In addition, to mine more useful multimodal information, this paper designs a multi-stage cross-modal interaction module to learn cross-modal interaction features carrying graph structure information. The method introduces a consistency-evaluation-based attention mechanism to effectively aggregate modality-specific features and cross-modal interaction features by learning multimodal consistency weights. Experiments on two benchmark datasets, Weibo and Twitter, show that CMIFFN is significantly superior to state-of-the-art multimodal fake news detection methods.
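The supervised contrastive objective named above can be sketched generically: each anchor is pulled toward same-label samples and pushed away from different-label ones. In the paper this loss is applied both within each modality and across modalities; the sketch below shows only the generic per-batch loss, and the temperature value is an illustrative assumption.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def sup_contrastive_loss(feats, labels, temp=0.5):
    """Supervised contrastive loss over one batch:
    anchors attract same-label samples, repel different-label ones."""
    loss, n = 0.0, len(feats)
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue
        denom = sum(math.exp(cosine(feats[i], feats[j]) / temp)
                    for j in range(n) if j != i)
        for j in pos:
            loss += -math.log(math.exp(cosine(feats[i], feats[j]) / temp) / denom)
    return loss / n
```

When the label assignment matches the geometry of the features (similar vectors share a label), the loss is lower than under a mismatched assignment, which is exactly the discriminability the module is designed to encourage.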

Multi-source Heterogeneous Data Progressive Fusion for Fake News Detection
YU Yongxin, JI Ke, GAO Yuan, CHEN Zhenxiang, MA Kun, ZHAO Xiaofan. Multi-source Heterogeneous Data Progressive Fusion for Fake News Detection[J]. Computer Science, 2024, 51(11): 30-38. doi:10.11896/jsjkx.240700004

Abstract
Social media platforms are inundated with a vast amount of unverified information, much of which originates from heterogeneous data from multiple sources; it spreads so widely and quickly that it poses a significant threat to individuals and society. Therefore, it is crucial to effectively detect and prevent fake news. Current fake news detection models typically rely on a single data source for a news item's textual and visual information, which leads to subjective news reports and incomplete data coverage. To address these limitations, a model is proposed that detects fake news by progressively fusing multi-source heterogeneous data. Firstly, multi-source heterogeneous data collection, screening, and cleaning are conducted to create a multi-source multimodal dataset containing reports on each event from diverse perspectives. Next, the features produced by the textual feature extractor and the visual feature extractor are input into the multi-source fusion module to achieve a progressive fusion of features from the various sources. Additionally, sentiment features extracted from text and frequency-domain features extracted from images are incorporated into the model to enable multi-level feature extraction. Finally, a soft attention mechanism is adopted for feature integration. Experimental results and analysis show that the proposed model delivers better detection performance than existing popular methods, providing an effective solution for fake news detection in the era of big data.
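The final soft-attention integration step can be sketched as a softmax-weighted average of per-source feature vectors. In the paper the attention scores are learned; here they are passed in as plain numbers purely for illustration.

```python
import math

def soft_attention_fuse(features, scores):
    """Fuse per-source feature vectors using softmax weights over scores.
    features: list of equal-length vectors, one per source.
    scores:   one raw attention score per source."""
    exp = [math.exp(s) for s in scores]
    total = sum(exp)
    weights = [e / total for e in exp]          # softmax normalisation
    dim = len(features[0])
    fused = [sum(w * f[k] for w, f in zip(weights, features))
             for k in range(dim)]
    return fused, weights
```

With equal scores every source contributes equally; a source with a higher score dominates the fused vector, which is how the model can down-weight an unreliable source.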

Multimodal Adaptive Fusion Based Detection of Fake News in Short Videos
ZHU Feng, ZHANG Tinghui, LI Peng, XU He. Multimodal Adaptive Fusion Based Detection of Fake News in Short Videos[J]. Computer Science, 2024, 51(11): 39-46. doi:10.11896/jsjkx.240700062

Abstract
With the rapid development of the Internet and social media, the dissemination of news is no longer limited to traditional media channels. Semantically rich multimodal data has become the carrier of news, and fake news has spread widely alongside it. As the proliferation of fake news has an unpredictable impact on individuals and society, fake news detection has become a current research hotspot. Existing multimodal fake news detection methods focus only on text and image data; they not only fail to fully utilize the multimodal information in short videos but also ignore the consistency and difference features between modalities, making it difficult to exploit the full advantages of multimodal fusion. To solve this problem, a fake news detection model for short videos based on multimodal adaptive fusion is proposed. The model extracts features from the multimodal data in short videos, uses cross-modal alignment fusion to obtain the consistency and complementarity features among modalities, and then performs adaptive fusion based on the contribution of each modality's features to the final fusion result. Finally, a classifier is used to detect fake news. Experiments conducted on a publicly available short video dataset demonstrate that the accuracy, precision, recall, and F1-score of the proposed model are higher than those of state-of-the-art models.

Analysis of User Evaluation Indicator for AIGC Digital Illustration Design Principles
XU Jun, ZHOU Peijin, ZHANG Haijing, ZHANG Hao, XU Yuzhong. Analysis of User Evaluation Indicator for AIGC Digital Illustration Design Principles[J]. Computer Science, 2024, 51(11): 47-53. doi:10.11896/jsjkx.240700085

Abstract
Based on digital illustration design principles and the technical principles of AIGC, a production process from literary text to digital illustration is constructed, and the actual effects and existing problems of AIGC digital illustration are demonstrated through experiments. The current state of digital illustration is reviewed, and the relationship between AIGC and digital illustration design principles is analyzed. The main processes and key technologies currently used for AIGC digital illustration are summarized. Then, multiple AI algorithms are used to build a production process from literary text to digital illustration, and multiple sets of experiments are conducted. Finally, a questionnaire based on indicators such as the degree of fit between text and image is designed to collect user evaluations and to analyze the generation rules, characteristics, and usability of AIGC digital illustration. AIGC can meet certain narrative and artistic-style requirements, but its effectiveness decreases as the narrative content of the text increases. It also performs poorly on rare content, and its image details cannot represent complex narrative scenarios. Through theoretical analysis and experimental comparison, it can be concluded that AIGC offers great advantages in production efficiency for digital illustration. However, due to a lack of understanding of narrative content, its image expression has shortcomings. Solving practical problems still relies on the close involvement of designers, and collaboration from all parties is required to promote the healthy development of this new technology.

Graph Convolution Spatio-Temporal Attention Fusion and Graph Reconstruction Method for Rumor Detection
CHEN Xin, RONG Huan, GUO Shangbin, YANG Bin. Graph Convolution Spatio-Temporal Attention Fusion and Graph Reconstruction Method for Rumor Detection[J]. Computer Science, 2024, 51(11): 54-64. doi:10.11896/jsjkx.240300189

Abstract
The rapid development of the Internet has brought convenience to people's social lives, but it has also created conditions for the generation and spread of rumors. The fast propagation and harmful impact of rumors have attracted wide social attention. However, in complex social networks, the dynamically changing state of rumor propagation, the interference information present during propagation, and the uncertainty of propagation all make rumor detection difficult. To solve these problems, this study proposes a graph convolution spatio-temporal attention fusion and graph reconstruction method (STAFRGCN) for rumor detection, in which every post under detection is checked twice to reduce the probability of misjudgment. Firstly, a temporal progressive convolution module (TPC) is used to integrate the propagation-state information of the posts under detection along the time dimension. Then, attention is used to extract and fuse the main propagation features in both the temporal and spatial dimensions, and the fusion result is used for the first rumor detection. After that, the overall graph structure of the detected posts' propagation is adjusted based on long short-term memory (LSTM) prediction and a graph reconstruction method, and this is combined with the first detection results for the second detection. Experiments show that the detection accuracy of STAFRGCN on the Twitter15, Twitter16, and Weibo datasets is 92.2%, 91.8%, and 96.5%, respectively. Compared with the SOTA model (KAGN), accuracy increases by 3.0%, 1.5%, and 1.4% on the three datasets, respectively.

Online Log Parsing Method Based on Bert and Adaptive Clustering
LU Jiawei, LU Shida, LIU Sisi, WU Chengrong. Online Log Parsing Method Based on Bert and Adaptive Clustering[J]. Computer Science, 2024, 51(11): 65-72. doi:10.11896/jsjkx.230900161

Abstract
Log parsing is a technique for extracting valid information from raw log files, used in areas such as system troubleshooting, performance analysis, and security auditing. The main challenges of log parsing are the unstructured nature, diversity, and dynamism of log data: different systems and applications may use different log formats, and log formats may change over time. Therefore, this paper proposes BertLP, an online log parsing method that automatically adapts to different log sources and log format variations. It uses a pre-trained language model, Bert, combined with an adaptive clustering algorithm, to recognize static and dynamic words in logs and group logs into log templates. Instead of manually defining log templates or regular expressions and counting word frequencies, BertLP automatically identifies log fields and types by learning the semantic and structural features of log messages. Comparative experiments on public log datasets show that BertLP improves log parsing accuracy by 6.1% compared with the best available method and performs better on log parsing tasks.
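To make the template idea concrete, here is a deliberately simplified sketch of the static/dynamic word split and template grouping. BertLP uses Bert embeddings and adaptive clustering for this step; the regex heuristic below (masking numeric and hex tokens) is a stand-in, not the paper's method.

```python
import re

def to_template(line):
    """Mask dynamic tokens (numbers, hex ids) with <*> to form a template;
    everything left is treated as a static token."""
    masked = []
    for tok in line.split():
        if re.fullmatch(r"\d+(\.\d+)*|0x[0-9a-fA-F]+", tok):
            masked.append("<*>")
        else:
            masked.append(tok)
    return " ".join(masked)

def group_logs(lines):
    """Group raw log lines under their shared template."""
    groups = {}
    for line in lines:
        groups.setdefault(to_template(line), []).append(line)
    return groups
```

A learned parser replaces the regex with a per-token static/dynamic classification, so it can also mask dynamic words that carry no numeric signature (user names, file paths, host names).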

Multi-view Attributed Graph Clustering Based on Contrast Consensus Graph Learning
LIU Pengyi, HU Jie, WANG Hongjun, PENG Bo. Multi-view Attributed Graph Clustering Based on Contrast Consensus Graph Learning[J]. Computer Science, 2024, 51(11): 73-80. doi:10.11896/jsjkx.231000198

Abstract
Multi-view attributed graph clustering divides the nodes of graph data with multiple views into different clusters and has attracted widespread attention from researchers in recent years. Many multi-view attributed graph clustering methods based on graph neural networks have been proposed and have achieved considerable clustering performance. However, since graph neural networks struggle with the graph noise that arises during data collection, it is difficult for such methods to further improve clustering performance. Therefore, a new multi-view attributed graph clustering method based on contrastive consensus graph learning is proposed to reduce the impact of noise on clustering and obtain better results. The method consists of four steps. First, graph filtering is used to remove noise on the graph while retaining the intact graph structure. Then, a small number of nodes are selected for learning the consensus graph, reducing computational complexity. Subsequently, graph contrastive regularization is used to help learn the consensus graph. Finally, spectral clustering is applied to obtain the clustering results. Extensive experimental results show that, compared with current state-of-the-art methods, the proposed method effectively reduces the impact of noise in graph data on clustering and achieves considerable clustering results with fast execution.
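The graph-filtering step is commonly realized as a low-pass filter of the form X ← (I − 0.5·L_sym)^k X, which smooths node features over their neighborhoods and attenuates high-frequency noise. The sketch below is a generic version of that filter; the paper's exact filter form and iteration count are not specified here.

```python
import numpy as np

def graph_filter(adj, features, k=2):
    """Low-pass graph filter: repeatedly apply X <- (I - 0.5 * L_sym) X,
    where L_sym is the symmetric normalized Laplacian with self-loops."""
    n = adj.shape[0]
    adj = adj + np.eye(n)                      # add self-loops
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(deg ** -0.5)
    lap = np.eye(n) - d_inv_sqrt @ adj @ d_inv_sqrt   # symmetric Laplacian
    smooth = np.eye(n) - 0.5 * lap
    out = features.astype(float)
    for _ in range(k):
        out = smooth @ out
    return out
```

On two connected nodes with features 1 and 0, each pass pulls the values toward each other (0.75/0.25 after one pass, 0.625/0.375 after two), which is the noise-suppressing behavior the clustering pipeline relies on.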

Advantage Weighted Double Actors-Critics Algorithm Based on Key-Minor Architecture for Policy Distillation
YANG Haolin, LIU Quan. Advantage Weighted Double Actors-Critics Algorithm Based on Key-Minor Architecture for Policy Distillation[J]. Computer Science, 2024, 51(11): 81-94. doi:10.11896/jsjkx.231000170

Abstract
Offline reinforcement learning (offline RL) defines the task of learning from a fixed batch of data, which avoids the risk of interacting with the environment and improves the efficiency and stability of learning. The advantage-weighted actor-critic algorithm, which combines sample-efficient dynamic programming with maximum-likelihood policy updating, makes use of a large amount of offline data and quickly performs fine-grained online policy adjustment. However, the algorithm uses a random experience replay mechanism, its actor-critic model uses only one set of actors, and its data sampling and replay are unbalanced. In view of these problems, an advantage-weighted double actors-critics algorithm based on policy distillation with data experience optimization and replay (DOR-PDAWAC) is proposed. It adopts a mechanism that prefers new data while repeatedly replaying both old and new data, uses double actors to increase exploration, and uses a key-minor architecture for policy distillation that divides the actors into a key actor and a minor actor to improve performance and efficiency. The algorithm is applied to MuJoCo tasks in the widely used D4RL dataset, and experimental results show that it achieves better performance in terms of learning efficiency and other aspects.
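The "advantage weighted" policy update at the heart of this algorithm family reweights the log-likelihood of each logged action by its exponentiated advantage, so actions that outperformed the value baseline dominate the update. A minimal sketch follows; the temperature and clipping values are illustrative, not from the paper.

```python
import math

def awr_weights(q_values, v_value, temperature=1.0, clip=20.0):
    """Advantage-weighted regression weights: each logged action's weight is
    exp((Q - V) / temperature), clipped for numerical stability."""
    return [min(math.exp((q - v_value) / temperature), clip) for q in q_values]
```

These weights would multiply the per-sample policy log-likelihood loss, so an action with zero advantage keeps weight 1, better-than-baseline actions are amplified, and worse ones are suppressed.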

Distance-generalized Based (α,β)-core Decomposition on Bipartite Graphs
ZHANG Yihao, HUA Zhengyu, YUAN Long, ZHANG Fan, WANG Kai, CHEN Zi. Distance-generalized Based (α,β)-core Decomposition on Bipartite Graphs[J]. Computer Science, 2024, 51(11): 95-102. doi:10.11896/jsjkx.231000130

Abstract
(α,β)-core decomposition is a fundamental problem in graph analysis and has been widely adopted for e-commerce fraud detection and interest group recommendation. Nevertheless, the (α,β)-core model only considers the distance-1 neighborhood, which prevents it from providing more fine-grained structural information. Motivated by this, the (α,β,h)-core model is proposed in this paper, which requires each vertex in one part (resp. the other part) of the bipartite graph to have at least α (resp. β) other vertices at distance no greater than h within the subgraph. Because distance-h neighborhoods are considered, the new model can identify more fine-grained structural information, as verified in our experiments, which also makes the corresponding (α,β,h)-core decomposition challenging. To address this problem, an efficient algorithm based on a computation-sharing strategy is proposed, and its time complexity is analyzed. As obtaining neighbors within distance h is time-consuming, a lower bound related to the (α,β,h)-core is designed to avoid unnecessary distance-h neighbor computation and further improve efficiency. Experimental results on eight real graphs demonstrate the effectiveness of the proposed model and the efficiency of its algorithm.
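For reference, the classical distance-1 (α,β)-core that the paper generalizes can be computed by iterative peeling: repeatedly delete vertices whose degree falls below the threshold for their side until the subgraph stabilizes. The (α,β,h)-core replaces the degree test with a count of vertices within distance h. A minimal peeling sketch of the distance-1 case:

```python
def ab_core(left_adj, right_adj, alpha, beta):
    """Peel a bipartite graph to its (alpha, beta)-core: every remaining
    left vertex keeps >= alpha surviving neighbours, every right vertex
    keeps >= beta. Adjacency is given as dicts of neighbour sets."""
    left, right = set(left_adj), set(right_adj)
    changed = True
    while changed:
        changed = False
        drop_l = {u for u in left if len(left_adj[u] & right) < alpha}
        drop_r = {v for v in right if len(right_adj[v] & left) < beta}
        if drop_l or drop_r:
            left -= drop_l
            right -= drop_r
            changed = True
    return left, right
```

Peeling terminates because each iteration strictly shrinks the vertex set; the surviving vertices form the unique maximal (α,β)-core.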

Hierarchical Hypergraph-based Attention Neural Network for Service Recommendation
YANG Dongsheng, WANG Guiling, ZHENG Xin. Hierarchical Hypergraph-based Attention Neural Network for Service Recommendation[J]. Computer Science, 2024, 51(11): 103-111. doi:10.11896/jsjkx.231100010

Abstract
With the rapid growth of services and APIs on the Internet and the Web, it has become increasingly challenging for developers to quickly and accurately find APIs that meet their needs, which calls for an efficient recommendation system. Graph neural networks have achieved great success in service recommendation, but many such methods are still limited to simple interactions and ignore the intrinsic relationships between mashups and API calls. To address this issue, this paper proposes a hierarchical hypergraph-based attention neural network for service recommendation (H-HGSR) for API recommendation. First, eight types of hyperedges are defined, and the corresponding hypergraph adjacency matrix generation methods are explored. Then, node-level and hyperedge-level attention mechanisms are proposed. The node-level attention mechanism aggregates important information from different neighbors under specific types of hypergraph adjacency matrices to capture high-order relationships between mashups and APIs; the hyperedge-level attention mechanism weights the combination of node embeddings generated from the different types of hypergraph adjacency matrices. By learning the importance of node-level and hyperedge-level attention, more accurate embedding representations are obtained. Finally, a multi-layer perceptron (MLP) is used for service recommendation. Extensive experiments are conducted on the real-world ProgrammableWeb dataset, and the overall comparison results show that the proposed H-HGSR framework outperforms state-of-the-art service recommendation methods.

Review of Visual Representation Learning
WANG Shuaiwei, LEI Jie, FENG Zunlei, LIANG Ronghua. Review of Visual Representation Learning[J]. Computer Science, 2024, 51(11): 112-132. doi:10.11896/jsjkx.231100089

Abstract
Representation learning is an important step in artificial intelligence algorithms, as well-designed representations can boost downstream tasks. With the development of deep learning in computer vision, visual representation learning has become increasingly important, aiming to transform complex visual information into representations that are easier for artificial intelligence algorithms to learn. In this paper, we focus on current research widely used in visual representation learning, which we categorize, according to the degree and type of data dependency, as pre-trained visual representation learning, generative visual representation learning, contrastive visual representation learning, decoupled visual representation learning, and visual representation learning combined with language information. Specifically, pre-trained visual representation learning applies supervised pre-training models to visual representation learning; generative visual representation learning uses generative models to learn visual representations; and contrastive visual representation learning covers the various network frameworks that use contrastive learning to learn visual representations. Besides these, the paper presents the applications of VAE and GAN in decoupled visual representation learning, as well as various approaches that improve visual representation learning with language information. Finally, evaluation metrics in visual representation learning and future perspectives are summarized.

Research Progress of Image 3D Object Detection in Autonomous Driving Scenario
ZHOU Yan, XU Yewen, PU Lei, XU Xuemiao, LIU Xiangyu, ZHOU Yuexia. Research Progress of Image 3D Object Detection in Autonomous Driving Scenario[J]. Computer Science, 2024, 51(11): 133-147. doi:10.11896/jsjkx.231000075

Abstract
2D object detection techniques have significant limitations when applied to autonomous driving scenarios because they cannot describe the size, depth, and other properties of the physical environment. Numerous researchers have made extensive explorations in the field of image 3D object detection, aligned with the practical requirements of autonomous driving. To study this domain comprehensively, this paper reviews recent literature published both domestically and internationally. It introduces two main categories of methods, image-based 3D object detection and 3D object detection that fuses image and point cloud data, and further subdivides these categories according to how the network processes its input data. The paper describes representative methods within each category, summarizes the strengths and weaknesses of each method, and conducts a comparative analysis of their performance. Additionally, it provides a detailed introduction to relevant datasets and evaluation metrics for 3D object detection in autonomous driving scenarios. Finally, the paper analyzes the challenges and difficulties in the field of image 3D object detection and outlines potential future research directions.

Study on Road Crack Detection Based on Weakly Supervised Semantic Segmentation
ZHAO Weidong, LU Ming, ZHANG Rui. Study on Road Crack Detection Based on Weakly Supervised Semantic Segmentation[J]. Computer Science, 2024, 51(11): 148-156. doi:10.11896/jsjkx.231000148

Abstract
Most existing weakly supervised semantic segmentation methods are based on a block-then-detect process, which increases the annotation workload. Moreover, existing automatic block classification methods feed all blocks into the model to predict block categories, which increases the number of misjudged blocks and hurts the performance of subsequent semantic segmentation. To address these problems, this paper proposes a road crack block classification model based on deep reinforcement learning. According to the characteristics of road crack images, the states, actions, and rewards received by the agent are designed. The agent is trained to select crack blocks autonomously, and the selection results are used as block labels for multi-size block road crack detection. Comparative experiments on several datasets show that the proposed model outperforms existing methods in both road crack segmentation performance and crack width measurement accuracy.

Feature Interpolation Based Deep Graph Contrastive Clustering Algorithm
YANG Xihong, ZHENG Qun, ZHANG Jiaxin, WANG Pei, ZHU En. Feature Interpolation Based Deep Graph Contrastive Clustering Algorithm[J]. Computer Science, 2024, 51(11): 157-165. doi:10.11896/jsjkx.231000209

Abstract
Mixup is an effective data augmentation technique in computer vision, widely used to expand the training distribution by interpolating input images and labels to generate new samples. However, in graph node clustering tasks, designing robust interpolation methods is challenging due to the irregularity and connectivity of graph data, as well as the unsupervised nature of the problem. To address these challenges, we propose a novel approach that leverages dedicated encoders with non-shared parameters to extract embedding features from different views of the graph, effectively integrating both node features and structural information. We then introduce Mixup into the clustering task by performing mixed interpolation on the embedding features along with their corresponding pseudo-labels. To ensure the reliability of these pseudo-labels, we apply a confidence threshold to retain only high-confidence predictions, and we incorporate an exponential moving average (EMA) mechanism to update model parameters while taking historical information into account during training. Furthermore, we incorporate a graph contrastive learning module to enhance feature consistency across different views, reducing information redundancy and improving the discriminative power of the model. Extensive experiments on six datasets demonstrate the effectiveness of the proposed method.
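The two mechanisms named above, embedding-level Mixup on pseudo-labelled features and EMA parameter updating, can be sketched as follows. The Beta distribution parameter and decay rate are illustrative choices, not values from the paper.

```python
import random

def mixup(feat_a, feat_b, label_a, label_b, alpha=0.6):
    """Interpolate two embedding vectors and their (one-hot) pseudo-labels
    with a Beta(alpha, alpha)-sampled mixing coefficient."""
    lam = random.betavariate(alpha, alpha)
    feat = [lam * a + (1 - lam) * b for a, b in zip(feat_a, feat_b)]
    label = [lam * a + (1 - lam) * b for a, b in zip(label_a, label_b)]
    return feat, label

def ema_update(teacher, student, decay=0.99):
    """Exponential moving average of parameters:
    teacher <- decay * teacher + (1 - decay) * student."""
    return [decay * t + (1 - decay) * s for t, s in zip(teacher, student)]
```

Because the mixed label is a convex combination of one-hot pseudo-labels, it remains a valid probability vector, and the EMA teacher changes slowly, which is what lets it carry historical information across training steps.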

Unsupervised Target Drift Correction and Tracking Based on Hidden Space Matching
FAN Xiaopeng, PENG Li, YANG Jielong. Unsupervised Target Drift Correction and Tracking Based on Hidden Space Matching[J]. Computer Science, 2024, 51(11): 166-173. doi:10.11896/jsjkx.230900078

Abstract
Object tracking is a basic research problem in computer vision. As tracking technology has developed, existing trackers face two main challenges, namely reliance on a large amount of annotated data and tracking drift, which seriously limit improvements in tracker performance. To overcome these challenges, an unsupervised target tracking and hidden space matching method is proposed. Firstly, image pairs are generated in the foreground via a correctable optical flow method. Secondly, the generated image pairs are used to train a siamese tracker from scratch. Finally, the hidden space matching method is used to solve the problem of losing the target when it deforms greatly, is occluded, leaves the field of view, or drifts. Experimental results show that the proposed algorithm, UHOT, improves performance significantly on multiple datasets and demonstrates strong robustness in difficult scenarios. Compared with the latest unsupervised algorithm SiamDF, UHOT gains 8% on the VOT dataset, making it comparable to state-of-the-art supervised siamese trackers.
-
High-precision Real-time Semantic Segmentation Algorithm Architecture for Autonomous Driving
耿焕同, 李嘉兴, 蒋骏, 刘振宇, 范子辰. 面向自动驾驶的高精度实时语义分割算法架构[J]. 计算机科学, 2024, 51(11): 174-181.
GENG Huantong, LI Jiaxing, JIANG Jun, LIU Zhenyu, FAN Zichen. High-precision Real-time Semantic Segmentation Algorithm Architecture for Autonomous Driving[J]. Computer Science, 2024, 51(11): 174-181. - GENG Huantong, LI Jiaxing, JIANG Jun, LIU Zhenyu, FAN Zichen
- Computer Science. 2024, 51 (11): 174-181. doi:10.11896/jsjkx.231000009
-
Abstract
PDF(2932KB) ( 215 )
- References | Related Articles | Metrics
-
The proportional integration differentiation (PID) semantic segmentation architecture mitigates the overshoot problem of dual-branch architectures, where fine-grained features are easily overwhelmed by surrounding contextual information. However, the high-resolution boundary branch in this architecture significantly impacts inference speed. To address this issue, an efficient PID architecture based on spatial attention mechanisms and a lightweight auxiliary semantic branch is proposed. The designed lightweight attention fusion module is used to extract precise contextual information and guide the fusion of the various feature maps. Additionally, a fast aggregation pyramid pooling module is introduced to rapidly aggregate semantic information across multiple scales. Finally, a deep supervision training strategy, combined with the Canny edge detection operator, is designed to enhance training effectiveness. In comparison with the baseline, the proposed model achieves a 6% increase in accuracy at the cost of slightly increased latency. It strikes a good balance between accuracy and speed on the Cityscapes, CamVid, and KITTI datasets, outperforming existing models in the same speed range. Notably, the model achieves an accuracy of 78.5% at 120.9 frames/s on the Cityscapes test set.
-
Few-Shot Learning Method Based on Symmetric Convolutional Block Network and Prototype Calibration
刘帅, 白雪飞, 高小方. 基于对称卷积块网络和原型校准的小样本学习方法[J]. 计算机科学, 2024, 51(11): 182-190.
LIU Shuai, BAI Xuefei, GAO Xiaofang. Few-Shot Learning Method Based on Symmetric Convolutional Block Network and Prototype Calibration[J]. Computer Science, 2024, 51(11): 182-190. - LIU Shuai, BAI Xuefei, GAO Xiaofang
- Computer Science. 2024, 51 (11): 182-190. doi:10.11896/jsjkx.230900022
-
Abstract
PDF(2949KB) ( 231 )
- References | Related Articles | Metrics
-
To address the issues of poor generalization performance in few-shot learning models based on prototype networks and inaccurate class prototypes obtained from a small number of samples, a novel few-shot learning method is proposed in this paper. Firstly, a symmetric convolutional block network (SCB-Net) consisting of bidirectional convolutional block attention modules and residual blocks is used to adaptively learn features at different depths of the image, so as to extract a more representative description of category features and effectively improve the generalization ability of the model. Secondly, an inverse Euclidean label propagation prototype calibration algorithm (IELP-PC) is introduced. It employs pseudo-labeling to augment the support set samples and subsequently calibrates the class prototypes using inverse Euclidean distance weighting over the support set samples, thereby improving the model's classification accuracy. Experimental results on two commonly used datasets, mini-ImageNet and tiered-ImageNet, demonstrate the effectiveness of the proposed method. Compared with the baseline model, the proposed method improves 5-way 1-shot accuracy by 6.44% and 7.83%, and 5-way 5-shot accuracy by 2.68% and 2.02%, respectively.
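The inverse-Euclidean-distance weighting behind the prototype calibration step can be illustrated with a minimal sketch. The full IELP-PC algorithm also involves label propagation over pseudo-labeled samples, which is omitted here; the function name and the eps smoothing term are hypothetical:

```python
import numpy as np

def calibrate_prototype(prototype, support, eps=1e-8):
    """Re-estimate a class prototype as an inverse-Euclidean-distance
    weighted mean of the (possibly pseudo-labeled) support samples:
    samples closer to the current prototype receive larger weights.

    prototype: (d,) array; support: (n, d) array.
    """
    dists = np.linalg.norm(support - prototype, axis=1)
    weights = 1.0 / (dists + eps)
    weights /= weights.sum()          # normalize to a convex combination
    return weights @ support
```

Because nearby samples dominate the weighted mean, outlying pseudo-labeled samples perturb the calibrated prototype only slightly.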
-
Facial Forgery Detection Based on Key Frames and Fused Spatial-Temporal Features
程燕. 基于关键帧与时空特征融合的人脸伪造检测[J]. 计算机科学, 2024, 51(11): 191-197.
CHENG Yan. Facial Forgery Detection Based on Key Frames and Fused Spatial-Temporal Features[J]. Computer Science, 2024, 51(11): 191-197. - CHENG Yan
- Computer Science. 2024, 51 (11): 191-197. doi:10.11896/jsjkx.240100063
-
Abstract
PDF(3050KB) ( 251 )
- References | Related Articles | Metrics
-
Deep learning-based facial forgery detection is commonly approached as a binary classification problem. The accuracy of model training results is affected not only by the quality and quantity of training data, but also by the training strategy and network architecture design. In this paper, we propose a new method based on key frames and spatial-temporal features. Firstly, weighted optical flow energy analysis is used to detect the key frames in a video. Then, the optical flow and LBP features of the key frames are fused to form feature maps with both spatial and temporal characteristics. After data augmentation, the feature maps are fed into a CNN model for training. Evaluations conducted on the FaceForensics++ and Celeb-df datasets demonstrate that the proposed method achieves superior or comparable detection accuracy. Cross-dataset experimental results show that the proposed method, using the EfficientNet-V2 structure, achieves the best performance on the FaceForensics++ database with an accuracy of 90.1%. Furthermore, the overall performance of the XceptionNet structure surpasses that of other methods, achieving an accuracy of over 80%, thus demonstrating the superior generalization performance of the proposed method.
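The LBP component of the fused feature maps is a standard texture descriptor. A minimal 8-neighbour LBP over a grayscale image might look like the following; this is a generic sketch, not the paper's exact variant, and the function name is hypothetical:

```python
import numpy as np

def lbp_8(img):
    """Basic 8-neighbour local binary pattern for a grayscale image.

    Each interior pixel is encoded by thresholding its 8 neighbours
    against the centre value and packing the bits into one byte.
    Returns an array two pixels smaller in each dimension.
    """
    img = np.asarray(img, dtype=float)
    c = img[1:-1, 1:-1]
    # neighbour offsets in clockwise order; index = bit position
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    return code
```

On a constant image every neighbour ties with the centre, so every bit is set; on an isolated bright pixel no neighbour reaches the centre value, so the code is 0.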
-
Crack Detection of Concrete Pavement Based on Attention Mechanism and Deep Feature Optimization
夏淑芳, 袁彬, 瞿中. 基于注意力机制和深层特征优化的混凝土路面裂缝检测[J]. 计算机科学, 2024, 51(11): 198-204.
XIA Shufang, YUAN Bin, QU Zhong. Crack Detection of Concrete Pavement Based on Attention Mechanism and Deep Feature Optimization[J]. Computer Science, 2024, 51(11): 198-204. - XIA Shufang, YUAN Bin, QU Zhong
- Computer Science. 2024, 51 (11): 198-204. doi:10.11896/jsjkx.240100082
-
Abstract
PDF(3062KB) ( 262 )
- References | Related Articles | Metrics
-
Automatic crack detection is key to ensuring the quality of concrete pavement and improving the efficiency of road maintenance. Aiming at the shortcomings of existing methods in attending to crack features and the problem that crack detail information is easily lost in deep feature maps, this paper proposes a network model that integrates an attention mechanism and a deep feature optimization strategy, using VGG-16 as the backbone network. Firstly, a lightweight shuffle attention mechanism is introduced after the middle- and high-level convolutions of the backbone network, aiming to improve the sensitivity of the network to crack features. Secondly, in order to further enhance the ability to capture crack features, a corresponding attention module is embedded in the side output of each stage. Finally, a spatially separable pyramid module is proposed and an attention fusion module is designed to optimize the deep feature maps and restore more crack details. The side network assists in generating the final prediction image by fusing the low-level and high-level features at multiple levels. The network uses the binary cross-entropy loss function as the evaluation function, and the trained network model can accurately identify crack positions in the input original image under complex backgrounds. To verify the effectiveness of the proposed method, it is compared with six methods on three datasets, DeepCrack, CFD, and Crack500. The proposed algorithm shows excellent performance, with an F-score of 87.19%.
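The binary cross-entropy loss mentioned above is standard for pixel-wise crack segmentation; for a predicted probability map it can be written as:

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy between a predicted crack
    probability map and a binary ground-truth mask.

    pred, target: arrays of the same shape, values in [0, 1].
    Predictions are clipped away from 0/1 to keep the logs finite.
    """
    p = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))
```

A perfect prediction drives the loss toward 0, while a maximally uncertain prediction of 0.5 everywhere gives roughly ln 2 per pixel.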
-
Scene Graph Generation Combined with Object Attribute Recognition
周浩, 罗廷金, 崔国恒. 结合对象属性识别的图像场景图生成方法研究[J]. 计算机科学, 2024, 51(11): 205-212.
ZHOU Hao, LUO Tingjin, CUI Guoheng. Scene Graph Generation Combined with Object Attribute Recognition[J]. Computer Science, 2024, 51(11): 205-212. - ZHOU Hao, LUO Tingjin, CUI Guoheng
- Computer Science. 2024, 51 (11): 205-212. doi:10.11896/jsjkx.230900013
-
Abstract
PDF(3118KB) ( 211 )
- References | Related Articles | Metrics
-
Scene graph generation (SGG) plays an important role in deep visual understanding tasks. Existing SGG methods mainly focus on the locations and categories of objects, as well as the relationships between objects, while ignoring that object attributes also contain rich semantic information. This paper proposes an SGG model that integrates object attribute recognition. Firstly, to achieve multi-label object attribute recognition, we propose composite classifiers that combine a multi-class classifier trained with an improved group cross-entropy loss and a binary classifier trained with a binary cross-entropy loss, which improves the accuracy and recall of multiple attribute predictions. Then, the attribute recognition branch is fused into the SGG framework: as a kind of context information, the attribute features are fed into the relationship branch for better relationship classification. Finally, compared with several baseline models, our method achieves better performance in both object attribute prediction and relationship recognition on the VG150 dataset.
-
Review of Generative Reinforcement Learning Based on Sequence Modeling
姚天磊, 陈希亮, 余沛毅. 基于序列建模的生成式强化学习研究综述[J]. 计算机科学, 2024, 51(11): 213-228.
YAO Tianlei, CHEN Xiliang, YU Peiyi. Review of Generative Reinforcement Learning Based on Sequence Modeling[J]. Computer Science, 2024, 51(11): 213-228. - YAO Tianlei, CHEN Xiliang, YU Peiyi
- Computer Science. 2024, 51 (11): 213-228. doi:10.11896/jsjkx.231000037
-
Abstract
PDF(2636KB) ( 369 )
- References | Related Articles | Metrics
-
Reinforcement learning is a branch of machine learning concerned with learning how to make decisions: a sequential decision-making problem in which an agent repeatedly interacts with the environment to find the optimal strategy through trial and error. Reinforcement learning can be combined with generative models to optimize their performance, and is typically used to fine-tune generative models and improve their ability to create high-quality content. The reinforcement learning process can also be seen as a general sequence modeling problem: modeling the distribution over task trajectories and generating a series of actions through pre-training so as to obtain high returns. By modeling the input information, generative reinforcement learning can better handle uncertain and unknown environments, and more efficiently transform sequence data into strategies for decision-making. Firstly, an introduction is given to reinforcement learning algorithms and sequence modeling methods, and the modeling process of data sequences is analyzed. The development status of reinforcement learning is discussed according to the different neural network models used. On this basis, relevant methods combined with generative models are summarized, and the application of reinforcement learning methods in generative pre-trained models is analyzed. Finally, the development status of the relevant technologies in theory and application is summarized.
-
Multi-granular and Generalized Long-tailed Classification Based on Leading Forest
杨金业, 徐计, 王国胤. 基于引领森林的多粒度广义长尾分类[J]. 计算机科学, 2024, 51(11): 229-238.
YANG Jinye, XU Ji, WANG Guoyin. Multi-granular and Generalized Long-tailed Classification Based on Leading Forest[J]. Computer Science, 2024, 51(11): 229-238. - YANG Jinye, XU Ji, WANG Guoyin
- Computer Science. 2024, 51 (11): 229-238. doi:10.11896/jsjkx.231100112
-
Abstract
PDF(2995KB) ( 224 )
- References | Related Articles | Metrics
-
Long-tailed classification is an inevitable and challenging task in the real world. Traditional methods usually focus only on inter-class imbalanced distributions; however, recent studies have begun to emphasize intra-class long-tailed distributions, i.e., within the same class, there are far more samples with head attributes than with tail ones. Due to the implicitness of the attributes and the complexity of their combinations, the intra-class imbalance problem is even more difficult to deal with. For this purpose, a generalized long-tailed classification framework (Cognisance) is proposed in this paper, aiming to build a multi-granularity joint solution model for the long-tailed classification problem through invariant feature learning. Firstly, the framework constructs a coarse-grained leading forest (CLF) through unsupervised learning to better characterize the distribution of samples across different attributes within a class, and thus constructs different environments in the process of invariant risk minimization. Secondly, the framework designs a new metric learning loss, multi-center loss (MCL), to gradually eliminate confusing attributes during the feature learning process. Additionally, the framework does not depend on a specific model structure and can be integrated with other long-tailed classification methods as an independent component. Experimental results on the ImageNet-GLT and MSCOCO-GLT datasets show that the proposed method achieves the best performance, and existing methods all gain an improvement of 2%~8% in the Top1-Accuracy metric by integrating with this framework.
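The multi-center loss is described only at a high level in the abstract. One plausible minimal reading, with each class keeping several centers and each sample pulled toward the nearest center of its own class, is sketched below; the formulation and names are hypothetical, not the paper's exact definition:

```python
import numpy as np

def multi_center_loss(features, labels, centers):
    """Sketch of a multi-center metric loss: each class keeps several
    centers, and a sample is pulled toward the nearest center of its
    own class (mean squared distance over the batch).

    features: (n, d) array; labels: length-n sequence of class ids;
    centers: dict mapping class id -> (m, d) array of that class's centers.
    """
    total = 0.0
    for x, y in zip(features, labels):
        d2 = np.sum((centers[y] - x) ** 2, axis=1)  # distance to each center
        total += d2.min()                            # nearest center only
    return total / len(features)
```

Using the nearest of several centers lets one class cover several attribute-level modes instead of collapsing them onto a single centroid.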
-
KBQA Algorithm Introducing Core Entity Attention Evaluation
赵卫东, 晋艳峰, 张睿, 林沿铮. 一种引入核心实体关注度评估的KBQA算法[J]. 计算机科学, 2024, 51(11): 239-247.
ZHAO Weidong, JIN Yanfeng, ZHANG Rui, LIN Yanzheng. KBQA Algorithm Introducing Core Entity Attention Evaluation[J]. Computer Science, 2024, 51(11): 239-247. - ZHAO Weidong, JIN Yanfeng, ZHANG Rui, LIN Yanzheng
- Computer Science. 2024, 51 (11): 239-247. doi:10.11896/jsjkx.231000015
-
Abstract
PDF(1739KB) ( 205 )
- References | Related Articles | Metrics
-
There is much knowledge base question answering (KBQA) research on complex semantics and complex syntax, but most of it assumes that the subject entity of the question has already been obtained; insufficient attention has been paid to multi-intention and multi-entity questions, and identifying the core entity of an interrogative sentence is the key to natural language understanding. To address this problem, a KBQA model introducing core entity attention is proposed. Based on the attention mechanism and attention enhancement techniques, the proposed model assesses the importance of each recognized entity mention, obtains the entity mention attention, removes potential interfering items, and captures the core entity of the user's question, so as to solve the semantic understanding problem of multi-entity and multi-intention interrogative sentences. The evaluated results are introduced into subsequent Q&A reasoning as importance weights. Finally, comparative experiments are conducted against KVMem, GraftNet, PullNet and other models on the English MetaQA dataset, the multi-entity question MetaQA dataset, and the multi-entity question HotpotQA dataset. For multi-entity questions, the proposed model achieves better experimental results on evaluation metrics such as Hits@n, accuracy and recall.
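The mention-importance assessment can be illustrated with a generic dot-product attention sketch; the paper's actual scoring and enhancement mechanisms are not specified in the abstract, so all names and the temperature parameter here are hypothetical:

```python
import numpy as np

def mention_attention(question_vec, mention_vecs, temperature=1.0):
    """Score recognized entity mentions by (scaled) dot-product attention
    against the question representation and normalize with softmax;
    the top-scoring mention would be treated as the core entity.

    question_vec: (d,) array; mention_vecs: (n, d) array.
    Returns a length-n weight vector summing to 1.
    """
    scores = mention_vecs @ question_vec / temperature
    scores -= scores.max()            # shift for numerical stability
    w = np.exp(scores)
    return w / w.sum()
```

The resulting weights can then flow into downstream reasoning as the importance weights the abstract mentions.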
-
Incrementally and Flexibly Extracting Parallel Corpus from Web
刘小峰, 郑禹铖, 李东阳. 一种灵活高效的增量式Web平行语料抽取方法[J]. 计算机科学, 2024, 51(11): 248-254.
LIU Xiaofeng, ZHENG Yucheng, LI Dongyang. Incrementally and Flexibly Extracting Parallel Corpus from Web[J]. Computer Science, 2024, 51(11): 248-254. - LIU Xiaofeng, ZHENG Yucheng, LI Dongyang
- Computer Science. 2024, 51 (11): 248-254. doi:10.11896/jsjkx.231000096
-
Abstract
PDF(1698KB) ( 191 )
- References | Related Articles | Metrics
-
Extracting parallel corpora from the web is important for machine translation and other multilingual processing tasks. This paper proposes an incremental web parallel corpus extraction method, which incrementally updates language text length statistics for domains by continuously downloading, scanning and analyzing Common Crawl's web crawling archive. For any given language pair of interest, the web sites to be crawled are determined based on the language text length statistics for domains and crawled according to the target language pair, while non-target domains and links are discarded. It also proposes a new intermediate sentence alignment method, which globally aligns sentences based on semantic similarity within multilingual domains. Experiments show that: 1) our extraction method can continuously obtain new parallel corpora and flexibly obtain the target language pairs of interest by extracting the specified language pairs; 2) the proposed intermediate method is significantly better than the global method in terms of alignment efficiency, and can complete alignments that cannot be completed by local methods; 3) out of 6 language directions, the extracted parallel corpora are superior to existing open-source web parallel corpora in 4 medium-low resource languages and close to the best available open-source web parallel corpora in 2 high-resource languages.
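Semantic sentence alignment within a multilingual domain can be sketched as a greedy best-pair search over a cosine similarity matrix. This is a generic illustration, assuming pre-computed L2-normalized sentence embeddings; the similarity threshold and function name are hypothetical, not the paper's intermediate method:

```python
import numpy as np

def align_sentences(src_embs, tgt_embs, threshold=0.5):
    """Greedy one-to-one sentence alignment by cosine similarity:
    repeatedly take the highest-scoring remaining pair above the
    threshold, then exclude that row and column.

    src_embs: (m, d), tgt_embs: (n, d), both L2-normalized.
    Returns a list of (src_index, tgt_index) pairs.
    """
    sim = src_embs @ tgt_embs.T       # cosine similarity for unit vectors
    pairs = []
    while True:
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        if sim[i, j] < threshold:
            break
        pairs.append((int(i), int(j)))
        sim[i, :] = -np.inf           # each sentence used at most once
        sim[:, j] = -np.inf
    return pairs
```

Sentences with no sufficiently similar counterpart simply remain unaligned, which is the desired behaviour for noisy web text.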
-
Knowledge Annotation Platform-based Knowledge Graph Construction and Application for Water Conservancy Hub Projects
张军珲, 昝红英, 欧佳乐, 阎子悦, 张坤丽. 基于知识标注平台的水利枢纽工程知识图谱构建及应用[J]. 计算机科学, 2024, 51(11): 255-264.
ZHANG Junhui, ZAN Hongying, OU Jiale, YAN Ziyue, ZHANG Kunli. Knowledge Annotation Platform-based Knowledge Graph Construction and Application for Water Conservancy Hub Projects[J]. Computer Science, 2024, 51(11): 255-264. - ZHANG Junhui, ZAN Hongying, OU Jiale, YAN Ziyue, ZHANG Kunli
- Computer Science. 2024, 51 (11): 255-264. doi:10.11896/jsjkx.231100079
-
Abstract
PDF(3114KB) ( 219 )
- References | Related Articles | Metrics
-
The generation of a significant volume of heterogeneous data in water resources has facilitated the creation and utilization of domain knowledge graphs, but it has also led to discrepancies in the construction processes of these graphs. To address the complexities involved in building water resources knowledge graphs, an efficient approach based on a knowledge annotation platform is proposed. Taking the intelligent application of knowledge in the Xiaolangdi Water Conservancy Hub project as an example, and using the engineering data of the hub, the proposed method is applied to construct a water conservancy hub project knowledge graph (WCHP-KG) in the field of water conservancy. Firstly, focusing on the Xiaolangdi Water Conservancy Hub project, a conceptual classification and relationship description scheme is established based on industry terminology and existing vocabularies, forming the pattern layer of WCHP-KG. Then, through BiLSTM-CRF and sequence labeling models, under the guidance of water conservancy experts, a knowledge annotation platform is used to semi-automatically annotate and manually proofread unstructured texts, achieving knowledge fusion and constructing the data layer of WCHP-KG. Results indicate that WCHP-KG covers 43 water conservancy entities and 110 entity relationships. Through practical validation, the proposed WCHP-KG provides a solid structured knowledge base for applications related to the Xiaolangdi Water Conservancy Hub project, and provides a reliable reference for engineering decision-making and management, validating the efficacy of the proposed construction method. In the future, WCHP-KG will be further expanded and the construction process will be improved to meet the needs of more application scenarios and fields.
-
Emotion Elicited Question Generation Model in Dialogue Scenarios
胥备, 许鹏. 对话场景下的情感引导问题生成模型[J]. 计算机科学, 2024, 51(11): 265-272.
XU Bei, XU Peng. Emotion Elicited Question Generation Model in Dialogue Scenarios[J]. Computer Science, 2024, 51(11): 265-272. - XU Bei, XU Peng
- Computer Science. 2024, 51 (11): 265-272. doi:10.11896/jsjkx.231000002
-
Abstract
PDF(2131KB) ( 220 )
- References | Related Articles | Metrics
-
Human-machine dialogue systems have been widely used in intelligent services. Existing human-machine dialogue systems can perceive the interlocutor's emotional state and give a response with an appropriate emotion based on context. However, it is difficult to ensure that a response with a specific emotion will elicit the same emotion from people; for example, a response with a "joy" emotion does not guarantee that people will experience "joy". In some scenarios, human-machine dialogue systems need to guide users toward a specific emotional state to facilitate the continuous development of a conversation or improve interaction efficiency, such as in psychological companionship dialogue or online intelligent teaching. Current human-machine dialogue systems focus on coarse-grained emotion eliciting, such as "positive/negative", and therefore struggle to handle fine-grained emotion eliciting. On the other hand, research on dialogue psychology indicates that "questions" in a conversation can significantly affect the emotions of interlocutors. Against this background, a question generation model for emotion eliciting in dialogue scenarios is proposed. The model is based on the GPT pre-trained model and incorporates knowledge of the emotion to be elicited into response generation. The model also introduces a contextual emotional perception mechanism and a common sense knowledge fusion mechanism, and uses multi-task learning to enhance its emotion perception and conversational response generation abilities. Given that this is the first work to propose a question generation task for fine-grained emotion eliciting, an emotion eliciting dataset has been constructed for training and experiments, and an automatic evaluation method based on prompt learning has been designed. Finally, automatic evaluation and human evaluation demonstrate that the proposed model can generate questions that effectively elicit target emotions.
-
Polyphone Disambiguation Based on Pre-trained Model
高贝贝, 张仰森. 基于预训练模型的多音字消歧方法[J]. 计算机科学, 2024, 51(11): 273-279.
GAO Beibei, ZHANG Yangsen. Polyphone Disambiguation Based on Pre-trained Model[J]. Computer Science, 2024, 51(11): 273-279. - GAO Beibei, ZHANG Yangsen
- Computer Science. 2024, 51 (11): 273-279. doi:10.11896/jsjkx.230900006
-
Abstract
PDF(1660KB) ( 242 )
- References | Related Articles | Metrics
-
Grapheme-to-phoneme conversion (G2P) is an important part of a Chinese text-to-speech (TTS) system. The key issue in G2P is selecting the correct pronunciation for a polyphonic character from several alternatives. Existing methods usually struggle to fully grasp the semantics of words that contain polyphonic characters, and fail to effectively handle the imbalanced distribution of datasets. To solve these problems, this paper proposes a polyphone disambiguation method based on the pre-trained model RoBERTa, called cross-lingual translation RoBERTa (CLTRoBERTa). Firstly, the cross-lingual translation module generates a translation of the word containing the polyphonic character as an additional input feature to improve the model's semantic comprehension. Secondly, a hierarchical learning rate optimization strategy is employed to adapt the different layers of the neural network. Finally, the model is enhanced with a sample weight module to address the imbalanced distribution of the dataset. Experimental results show that CLTRoBERTa mitigates the performance differences caused by uneven dataset distribution and achieves 99.08% accuracy on the public Chinese polyphone with pinyin (CPP) dataset, outperforming other baseline models.
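Hierarchical (layer-wise) learning rate strategies for pre-trained encoders typically keep the base rate for the layer nearest the output and scale each lower layer down by a constant factor. A minimal sketch of this common scheme (the base rate and decay factor are hypothetical, not the paper's values):

```python
def layerwise_lrs(num_layers, base_lr=2e-5, decay=0.95):
    """Layer-wise learning rates for a pre-trained encoder: the layer
    nearest the output keeps base_lr, and each layer below it is
    scaled down by a constant decay factor, so lower (more generic)
    layers change more slowly during fine-tuning.

    Returns a list lr[0..num_layers-1], with index 0 the bottom layer.
    """
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]
```

In practice these per-layer rates are handed to the optimizer as separate parameter groups, one group per encoder layer.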
-
Decomposition Multi-objective Optimization Algorithm with Weight Vector Generation Strategy Based on Non-Euclidean Geometry
孙良旭, 李林林, 刘国莉. 基于非欧几何权向量产生策略的分解多目标优化算法[J]. 计算机科学, 2024, 51(11): 280-291.
SUN Liangxu, LI Linlin, LIU Guoli. Decomposition Multi-objective Optimization Algorithm with Weight Vector Generation Strategy Based on Non-Euclidean Geometry[J]. Computer Science, 2024, 51(11): 280-291. - SUN Liangxu, LI Linlin, LIU Guoli
- Computer Science. 2024, 51 (11): 280-291. doi:10.11896/jsjkx.230900007
-
Abstract
PDF(4703KB) ( 214 )
- References | Related Articles | Metrics
-
As the number of objectives increases, multi-objective problems (MOPs) become more and more difficult to solve. Decomposition-based multi-objective evolutionary algorithms show better performance; however, when solving MOPs with complex Pareto fronts, decomposition-based algorithms exhibit poor population diversity and their performance deteriorates. To address these issues, this paper proposes a decomposition multi-objective optimization algorithm with a weight vector generation strategy based on non-Euclidean geometry. By fitting the non-dominated front in a non-Euclidean geometric space and estimating its parameters, weight vectors are generated by normal statistical sampling of the objective variables of the non-dominated solutions, so as to guide the evolution direction of the population and maintain its diversity. Meanwhile, the neighborhoods of the sub-problems are rebuilt periodically, improving the efficiency of co-evolution in the decomposition algorithm and the performance of the algorithm. Experimental results on the MaF benchmark test problems show that, compared with the MOEA/D, NSGA-III and AR-MOEA algorithms, the proposed algorithm delivers significantly better performance in solving multi-objective optimization problems.
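Normal statistical sampling of weight vectors from the current non-dominated solutions can be sketched as follows. This is a generic illustration in plain Euclidean space; the paper's non-Euclidean front-fitting step is omitted, and all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_weight_vectors(nd_objs, n_vectors):
    """Generate decomposition weight vectors by sampling each objective
    from a normal distribution fitted to the current non-dominated
    objective values, then projecting onto the unit simplex so that
    each vector's components are non-negative and sum to 1.

    nd_objs: (n, m) non-dominated objective vectors.
    Returns an (n_vectors, m) array of weight vectors.
    """
    mu = nd_objs.mean(axis=0)
    sigma = nd_objs.std(axis=0) + 1e-12   # avoid degenerate zero spread
    w = np.abs(rng.normal(mu, sigma, size=(n_vectors, nd_objs.shape[1])))
    return w / w.sum(axis=1, keepdims=True)
```

Sampling around the observed front concentrates weight vectors where non-dominated solutions actually lie, which is what steers the population and preserves diversity.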
-
Equipment Anomaly Diagnosis Based on DGA and Sparse Support Vector Machine
潘连荣, 张福泉, 何井龙, 杨加意. 基于DGA和稀疏化支持向量机的设备异常诊断[J]. 计算机科学, 2024, 51(11): 292-297.
PAN Lianrong, ZHANG Fuquan, HE Jinglong, YANG Jiayi. Equipment Anomaly Diagnosis Based on DGA and Sparse Support Vector Machine[J]. Computer Science, 2024, 51(11): 292-297. - PAN Lianrong, ZHANG Fuquan, HE Jinglong, YANG Jiayi
- Computer Science. 2024, 51 (11): 292-297. doi:10.11896/jsjkx.230500096
-
Abstract
PDF(1751KB) ( 212 )
- References | Related Articles | Metrics
-
In order to effectively improve the accuracy and efficiency of machine learning-based equipment anomaly diagnosis, a fault diagnosis model based on a sparse support vector machine is proposed. Firstly, the principle of anomaly diagnosis and the characteristic gases are analyzed, and the relationship between fault types and characteristic gases is given. Secondly, the data are preprocessed in 4 respects: cleaning, normalization, balancing and division. Then, to address the lack of sparsity in the least squares support vector machine, a method is proposed that maps data samples into a high-dimensional kernel space and clusters the mapped data by kernel-space distance with a spectral clustering algorithm; this preprocesses the data for the least squares support vector machine and thereby realizes its sparsification. Finally, a specific experimental analysis is carried out on a small-sample dataset. The results show that, for 9 types of faults, compared with other diagnosis models based on different types of support vector machines, the proposed diagnosis model needs only 11 iterations to obtain the maximum fitness value, and its average diagnosis accuracy is 96.67%, achieving higher accuracy and efficiency.
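The sparsification idea (mapping samples into a kernel space, clustering them there, and training the least squares SVM on representatives only) can be sketched as below. For brevity, a small k-means on kernel-matrix rows stands in for full spectral clustering, and all names and parameters are hypothetical:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Pairwise RBF (Gaussian) kernel matrix for rows of X."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_space_representatives(X, k, gamma=1.0, iters=20):
    """Sparsification sketch: embed samples via their kernel-matrix rows
    (a stand-in for the high-dimensional kernel space), cluster them,
    and keep one representative sample per cluster, so the LS-SVM is
    trained on k points instead of n. Returns sorted sample indices."""
    K = rbf_kernel(X, gamma)
    # deterministic farthest-point initialization of k centers
    chosen = [0]
    while len(chosen) < k:
        d = np.min(((K[:, None, :] - K[chosen][None, :, :]) ** 2).sum(-1), axis=1)
        chosen.append(int(d.argmax()))
    centers = K[chosen].copy()
    for _ in range(iters):                      # plain k-means loop
        d = ((K[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = K[assign == j].mean(axis=0)
    reps = []
    for j in range(k):                          # nearest sample to each center
        idx = np.where(assign == j)[0]
        if len(idx) == 0:
            continue
        d2c = ((K[idx] - centers[j]) ** 2).sum(axis=1)
        reps.append(int(idx[d2c.argmin()]))
    return sorted(reps)
```

Training the LS-SVM on only the returned representatives is what removes the dense support-vector problem that motivates the sparsification.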
-
Research on Semantic-aware Ciphertext Retrieval in Cloud Environments: A Survey
刘源龙, 戴华, 李张晨, 周倩, 易训, 杨庚. 云环境中语义感知密文检索研究综述[J]. 计算机科学, 2024, 51(11): 298-306.
LIU Yuanlong, DAI Hua, LI Zhangchen, ZHOU Qian, YI Xun, YANG Geng. Research on Semantic-aware Ciphertext Retrieval in Cloud Environments: A Survey[J]. Computer Science, 2024, 51(11): 298-306. - LIU Yuanlong, DAI Hua, LI Zhangchen, ZHOU Qian, YI Xun, YANG Geng
- Computer Science. 2024, 51 (11): 298-306. doi:10.11896/jsjkx.231000111
-
Abstract
PDF(1635KB) ( 234 )
- References | Related Articles | Metrics
-
With the continuous development of cloud computing and big data technology, data owners are increasingly inclined to outsource their data to cloud servers. To ensure the security of these data, many privacy-preserving ciphertext retrieval techniques for cloud environments have been proposed. However, traditional privacy-preserving search schemes usually do not consider the semantic relationship between keywords and documents. To address this problem, semantic-aware ciphertext retrieval schemes for cloud environments have become a research hotspot in recent years. This paper surveys the existing research on semantic-aware ciphertext retrieval in cloud environments, presenting the system models, security models, and retrieval frameworks mainly adopted. It categorizes and summarizes existing semantic-aware searchable encryption schemes from the perspective of the core technology used for semantic expansion, illustrating their advantages and limitations. Finally, it concludes the existing research work and discusses future research directions in this field.
-
Review of Research on Blockchain Sharding Techniques
谭朋柳, 徐滕, 涂若欣. 区块链分片技术研究综述[J]. 计算机科学, 2024, 51(11): 307-320.
TAN Pengliu, XU Teng, TU Ruoxin. Review of Research on Blockchain Sharding Techniques[J]. Computer Science, 2024, 51(11): 307-320. - TAN Pengliu, XU Teng, TU Ruoxin
- Computer Science. 2024, 51 (11): 307-320. doi:10.11896/jsjkx.231200078
-
Abstract
PDF(2471KB) ( 231 )
- References | Related Articles | Metrics
-
Blockchain technology is characterized by decentralization and tamper resistance, and has a wide range of application prospects. However, it is difficult for blockchain systems to support large-scale distributed data management and transactions, so the performance and scalability of blockchains have become important research directions. At present, researchers have proposed solutions that improve the performance and scalability of blockchains by modifying the on-chain data structure and consensus algorithm, and by adding off-chain technology. Among them, the most practical method for achieving horizontal scalability as the network grows is sharding. As an on-chain scaling method, sharding divides the entire blockchain network into multiple segments to facilitate the simultaneous processing of multiple transactions or contracts. Each shard can operate independently, with its own transaction history and state, improving the performance and scalability of the blockchain without sacrificing decentralization. Previous studies on blockchain sharding have focused on transaction consensus within shards, while ignoring sharding strategy mechanisms and sharding architectures. Therefore, this paper first systematically analyzes existing sharding blockchains and divides their design process into several parts: architecture setting, node selection, node allocation, transaction distribution, transaction processing, and shard reconstruction, analyzing the functions and properties of each part. Secondly, sharding architectures are classified and summarized. The paper focuses on various sharding strategies and mechanisms, analyzes their advantages and disadvantages, compares mainstream sharding blockchain systems, and analyzes their scalability and reliability, including system throughput, latency, communication overhead, node randomness, shard security, and cross-shard smart contracts. Finally, future research directions are proposed.
-
Knowledge Graph Based Approach to Cyberspace Geographic Mapping Construction
吴越, 胡威, 李城龙, 杨家海, 李祉岐, 尹琴, 夏昂, 党芳芳. 基于知识图谱的网络空间地理图谱构建方法[J]. 计算机科学, 2024, 51(11): 321-328.
WU Yue, HU Wei, LI Chenglong, YANG Jiahai, LI Zhiqi, YIN Qin, XIA Ang, DANG Fangfang. Knowledge Graph Based Approach to Cyberspace Geographic Mapping Construction[J]. Computer Science, 2024, 51(11): 321-328. - WU Yue, HU Wei, LI Chenglong, YANG Jiahai, LI Zhiqi, YIN Qin, XIA Ang, DANG Fangfang
- Computer Science. 2024, 51 (11): 321-328. doi:10.11896/jsjkx.231000127
-
Abstract
PDF(2332KB) ( 244 )
- References | Related Articles | Metrics
-
In the digital information era, with the rapid development of the Internet and the increasing importance of cybersecurity, cyberspace geographic mapping is regarded as a new means of cognizing and managing cyberspace. By synthesizing cyberspace information and geospatial information, it can display the cyberspace situation more comprehensively and from multiple perspectives. However, current research on cyberspace geographic mapping lacks a fine-grained portrayal of the cyberspace model, as well as specific methods for constructing and applying cyberspace geographic maps. To address these problems, with the goal of cyberspace cognition, this paper proposes a four-layer, four-level cyberspace hierarchical model with a time reference axis. In addition, to better understand the complex cyberspace environment, a specific framework for constructing a cyberspace geographic map, as well as a method for constructing a cyberspace ontology, is proposed in conjunction with knowledge graph technology. Based on real mapping data from Censys, a prototype cyberspace geographic map of a simulated park network is successfully constructed. This study proposes an improved approach to the hierarchical structure of cyberspace and introduces knowledge graphs into the research field of cyberspace geography, which not only helps improve the understanding of cyberspace, but also has practical application significance for cybersecurity, resource management, fault recovery, and decision making.
-
Intelligent Penetration Path Planning and Solution Optimization Based on Reinforcement Learning
李成恩, 朱东君, 贺杰彦, 韩兰胜. 基于强化学习的智能化渗透路径规划与求解优化[J]. 计算机科学, 2024, 51(11): 329-339.
LI Cheng’en, ZHU Dongjun, HE Jieyan, HAN Lansheng. Intelligent Penetration Path Planning and Solution Optimization Based on Reinforcement Learning[J]. Computer Science, 2024, 51(11): 329-339.
- Computer Science. 2024, 51 (11): 329-339. doi:10.11896/jsjkx.231000207
Abstract
Against the background of the widespread application of big data technology, traditional penetration testing's heavy reliance on expert experience and manual operation has become an increasingly significant problem. Automated penetration testing aims to solve this, so as to discover system security vulnerabilities more accurately and comprehensively. Finding the optimal penetration path is the central task in automated penetration testing. However, current mainstream research suffers from the following problems: 1) seeking the optimal path in the original solution space, which contains numerous redundant paths, significantly increases the complexity of problem solving; 2) vulnerability-exploitation actions and actions that obtain positive rewards are insufficiently exploited. Problem solving can be optimized by eliminating a large number of redundant penetration paths and by employing exploit-sample enhancement and positive-reward-sample enhancement. Therefore, this paper proposes the MASK-SALT-DQN algorithm, which integrates solution-space transformation with sample enhancement. It qualitatively and quantitatively analyzes the influence of the proposed algorithm on the solving process and proposes the compression ratio to measure the benefit of solution-space transformation. Experiments indicate that the proportion of redundant paths in the original solution space consistently remains above 83%, proving the necessity of solution-space transformation. In the standard experimental scenario, the theoretical compression ratio is 57.2, and the error between the experimental compression ratio and the theoretical value is only 1.40%. Moreover, compared with baseline methods, MASK-SALT-DQN achieves the best performance in all experimental scenarios, confirming its effectiveness and superiority.
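The masking idea behind shrinking the solution space can be sketched in a few lines: invalid or redundant actions are excluded before greedy action selection, so the agent only ever searches the reduced space. This is a generic DQN action-masking sketch with toy Q-values, not MASK-SALT-DQN itself.

```python
import numpy as np

# Hedged sketch of action masking in a DQN agent: actions ruled out by the
# solution-space transformation are forced to -inf before argmax, so greedy
# selection never picks them. Q-values and the mask are toy data.

q_values = np.array([0.4, 1.2, -0.3, 0.9, 0.1])        # Q(s, a) for 5 actions
valid    = np.array([True, False, True, True, False])  # mask from domain rules

masked_q = np.where(valid, q_values, -np.inf)          # forbid masked actions
best_action = int(np.argmax(masked_q))

print(best_action)  # 3: the highest-Q action among the valid ones
```

Note that action 1 has the highest raw Q-value but is masked out, so the agent falls back to the best valid action.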
-
Malicious Encrypted Traffic Detection Method Based on Conversation Statistical Encoder Model
巩思越, 刘辉, 王宝会. 基于会话统计编码器的恶意加密流量检测方法研究[J]. 计算机科学, 2024, 51(11): 340-346.
GONG Siyue, LIU Hui, WANG Baohui. Malicious Encrypted Traffic Detection Method Based on Conversation Statistical Encoder Model[J]. Computer Science, 2024, 51(11): 340-346.
- Computer Science. 2024, 51 (11): 340-346. doi:10.11896/jsjkx.231000121
Abstract
With the development and widespread application of network technology, encrypted traffic has become a key technology for protecting user privacy. However, malware and attackers also use encrypted traffic to hide their behavior and evade traditional network intrusion detection systems. Existing malicious encrypted traffic detection methods have several problems: statistics-based methods rely on expert experience for feature extraction, and features of different protocols cannot be generalized; deep learning methods based on raw inputs suffer from incomplete information and field-padding issues, leading to insufficient semantic representation of encrypted traffic interactions. To solve these problems, this paper proposes the conversation statistic encoder model (CSEM). The method draws on the Transformer encoder and introduces a new traffic-packet feature parsing method, departing from the traditional mode of feeding raw byte streams into deep neural networks. It can construct a fixed-length vector representation for each traffic packet without zero padding, while avoiding dependence on any specific encryption protocol during feature extraction. A hybrid deep neural network is then constructed, providing a new approach to malicious encrypted traffic detection. The proposed method is verified on the DataCon dataset and a self-built dataset; on DataCon it achieves a recall of 0.991 1, a precision of 0.940 7, and an F1 score of 0.965 2, reaching the current best level, with an F1 score 9% higher than that of a random forest model.
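The key property claimed above, a fixed-length per-packet vector with no zero padding, can be illustrated with a toy statistical encoder. The specific features below (length, direction, inter-arrival time, payload byte entropy) are assumptions for illustration, not CSEM's actual feature set.

```python
import math

# Illustrative sketch (feature names assumed, not CSEM's actual features):
# each packet maps to a fixed-length statistical vector, so no zero padding
# is needed regardless of payload length or encryption protocol.

def packet_features(length, direction, inter_arrival, payload):
    """Encode one packet as a fixed-length feature vector."""
    # Byte-value entropy of the (encrypted) payload, protocol-agnostic.
    counts = [0] * 256
    for b in payload:
        counts[b] += 1
    n = len(payload) or 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)
    return [length, direction, inter_arrival, entropy]

vec = packet_features(1500, 1, 0.002, bytes(range(256)))
print(len(vec))  # always 4, independent of payload size
```

A uniform 256-byte payload gives the maximum entropy of 8 bits per byte, which is typical of well-encrypted data.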
-
Dynamic Instrumentation Method for Embedded Physical Devices
司健鹏, 洪征, 周振吉, 陈乾, 李涛. 一种面向嵌入式设备的动态插桩方法[J]. 计算机科学, 2024, 51(11): 347-355.
SI Jianpeng, HONG Zheng, ZHOU Zhenji, CHEN Qian, LI Tao. Dynamic Instrumentation Method for Embedded Physical Devices[J]. Computer Science, 2024, 51(11): 347-355.
- Computer Science. 2024, 51 (11): 347-355. doi:10.11896/jsjkx.230700091
Abstract
Most existing dynamic instrumentation methods are based on the x86/x64 instruction set, which is poorly compatible with the reduced instruction set computing (RISC) architectures commonly used in embedded devices; when applied to embedded devices, such methods suffer from low instrumentation efficiency and high resource consumption. This paper proposes DIEB, a dynamic instrumentation method for embedded physical devices. DIEB uses control-transfer instructions as probes to dynamically perform binary instrumentation on target processes in embedded devices. It proposes a lightweight instruction interpretation method and sets up an instruction execution area based on the operating environment: DIEB interprets instructions in this simulated execution area to obtain their execution results. While the target process runs, DIEB interprets control-transfer instructions to obtain their destination addresses and tracks the execution flow of the target process, thereby performing dynamic instrumentation efficiently on resource-constrained embedded devices. Taking the ARM instruction set as the verification object, experiments are carried out on physical devices such as the NetGear R7000. Experimental results show that the DIEB instrumentation process runs normally, and the time delay caused by instrumentation is much smaller than that of a ptrace-based instrumentation method. In addition, DIEB runs stably in multi-threaded environments and accurately records the execution-flow traces of concurrent threads.
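The kind of destination-address computation an interpreter must perform for control-transfer instructions can be sketched for the simplest case, the A32 B/BL encoding. This covers only one encoding and is a hedged illustration of the idea, not DIEB's interpreter.

```python
# Hedged sketch of one branch-target computation an ARM interpreter needs:
# the A32 B/BL encoding carries a 24-bit word offset, and the PC reads as
# the instruction address + 8. Real instrumentation must handle many more
# control-transfer encodings (BX, BLX, conditional forms, Thumb, ...).

def a32_branch_target(pc, insn):
    """Return the destination of an A32 B/BL instruction located at pc."""
    imm24 = insn & 0x00FFFFFF
    offset = imm24 << 2
    if offset & 0x02000000:            # sign bit of the 26-bit byte offset
        offset -= 0x04000000           # sign-extend to a negative offset
    return (pc + 8 + offset) & 0xFFFFFFFF

# B with imm24 = 1 jumps 4 bytes past pc+8; imm24 = 0xFFFFFE branches to self.
print(hex(a32_branch_target(0x1000, 0xEA000001)))  # 0x100c
print(hex(a32_branch_target(0x2000, 0xEAFFFFFE)))  # 0x2000
```

Interpreting just these instructions is enough to follow the execution flow from one basic block to the next, which is what makes control-transfer instructions natural probe points.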
-
PRFL:Privacy-preserving Robust Aggregation Method for Federated Learning
高琦, 孙奕, 盖新貌, 王友贺, 杨帆. PRFL:一种隐私保护联邦学习鲁棒聚合方法[J]. 计算机科学, 2024, 51(11): 356-367.
GAO Qi, SUN Yi, GAI Xinmao, WANG Youhe, YANG Fan. PRFL:Privacy-preserving Robust Aggregation Method for Federated Learning[J]. Computer Science, 2024, 51(11): 356-367.
- Computer Science. 2024, 51 (11): 356-367. doi:10.11896/jsjkx.231000158
Abstract
Federated learning allows users to jointly train a model by exchanging model parameters, reducing the risk of data leakage. However, studies have found that user privacy can still be inferred from model parameters, and many privacy-preserving model aggregation methods have been proposed in response. Moreover, malicious users can corrupt federated aggregation by submitting carefully constructed poisoned models, and under privacy-preserving aggregation they can mount even stealthier poisoning attacks. To provide privacy protection while resisting poisoning attacks, a privacy-preserving robust aggregation method for federated learning, named PRFL, is proposed. PRFL not only effectively defends against poisoning attacks launched by Byzantine users, but also guarantees the privacy of local models and the accuracy and efficiency of the global model. Specifically, a lightweight privacy-preserving aggregation method under a dual-server architecture is first proposed, which achieves privacy-preserving model aggregation while preserving global model accuracy without introducing significant overhead. A secret model-distance computation method is then proposed, which allows the two servers to compute model distances without exposing local model parameters; based on this method and the local outlier factor (LOF) algorithm, a poisoned-model detection method is designed. Finally, the security of PRFL is analyzed. Experimental results on two real image datasets show that PRFL attains model accuracy similar to FedAvg in the absence of attacks, effectively defends against three advanced poisoning attacks, and outperforms the existing Krum, Median, and Trimmed-mean methods in both independent identically distributed (IID) and non-IID settings.
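The LOF-based detection step can be illustrated in the clear on toy update vectors. This sketch omits the paper's secret two-server distance computation entirely and simply runs scikit-learn's LOF on plaintext updates; the data, neighbor count, and contamination rate are assumptions.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Sketch of the LOF-based poisoned-model detection step on toy data. In PRFL
# the pairwise distances are computed secretly across two servers; here the
# updates are in the clear purely to show how LOF separates the outliers.

rng = np.random.default_rng(0)
benign = rng.normal(0.0, 0.1, size=(8, 4))     # 8 benign client updates
poisoned = rng.normal(5.0, 0.1, size=(2, 4))   # 2 crafted poisoned updates
updates = np.vstack([benign, poisoned])

lof = LocalOutlierFactor(n_neighbors=3, contamination=0.2)
labels = lof.fit_predict(updates)              # -1 = outlier, 1 = inlier

flagged = np.where(labels == -1)[0]
print(flagged)  # indices of the suspected poisoned updates
```

Because poisoned updates sit far from the dense cluster of benign updates, their local density is much lower than their neighbors', which is exactly the signal LOF scores.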
-
Federated Learning Model Based on Update Quality Detection and Malicious Client Identification
雷诚, 张琳. 基于更新质量检测和恶意客户端识别的联邦学习模型[J]. 计算机科学, 2024, 51(11): 368-378.
LEI Cheng, ZHANG Lin. Federated Learning Model Based on Update Quality Detection and Malicious Client Identification[J]. Computer Science, 2024, 51(11): 368-378.
- Computer Science. 2024, 51 (11): 368-378. doi:10.11896/jsjkx.231100044
Abstract
As a distributed machine learning paradigm, federated learning alleviates the data-silo problem: only model parameters are transmitted between the server and clients, without sharing local data, which improves the privacy of training data. At the same time, this makes federated learning vulnerable to malicious client attacks. Existing research mainly focuses on intercepting updates uploaded by malicious clients. This paper studies a federated learning model based on update quality detection and malicious client identification, named umFL, to improve the training performance of the global model and the robustness of federated learning. Specifically, client importance is computed from the loss value of each round of client training, and the subset of clients participating in each round is selected through update quality detection. The similarity between the updated local model and the previous round's global model is computed to determine whether a client makes a positive update, and negative updates are filtered out. Meanwhile, a beta distribution function is introduced to update each client's reputation value; clients with low reputation are marked as malicious and excluded from subsequent training. The effectiveness of the proposed algorithm is tested on the MNIST and CIFAR10 datasets using convolutional neural networks. Experimental results show that the proposed model remains robust under attacks by 20%~40% malicious clients. In particular, with 40% malicious clients, umFL improves test accuracy by 40% and 20% on MNIST and CIFAR10, respectively, over traditional federated learning, and convergence speed improves by 25.6% and 22.8%, respectively.
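The two mechanisms named in the abstract, similarity-based update filtering and Beta-distribution reputation, can be sketched together. The similarity threshold, prior counts, and variable names below are illustrative assumptions, not umFL's exact parameters.

```python
import math

# Hedged sketch of umFL's two ingredients as the abstract describes them:
# (1) cosine similarity between a client update and the previous global
# model to filter negative updates, and (2) a Beta-distribution reputation
# score. The threshold and Beta(1, 1) prior are illustrative assumptions.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def update_reputation(alpha, beta, positive):
    """Beta reputation: count positive/negative rounds; score = Beta mean."""
    if positive:
        alpha += 1
    else:
        beta += 1
    return alpha, beta, alpha / (alpha + beta)

global_dir = (1.0, 0.5)          # previous global model direction (toy)
client_update = (0.9, 0.6)       # client's uploaded update (toy)
positive = cosine(client_update, global_dir) > 0.0   # assumed threshold

alpha, beta, score = update_reputation(1, 1, positive)
print(positive, round(score, 2))  # True 0.67
```

A client that repeatedly submits negative updates accumulates beta counts, its Beta mean drops, and once it falls below a cutoff the client is excluded from later rounds.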
-
Application of Parameter Decoupling in Differentially Privacy Protection Federated Learning
王梓行, 杨敏, 魏子重. 参数解耦在差分隐私保护下的联邦学习中的应用[J]. 计算机科学, 2024, 51(11): 379-388.
WANG Zihang, YANG Min, WEI Zichong. Application of Parameter Decoupling in Differentially Privacy Protection Federated Learning[J]. Computer Science, 2024, 51(11): 379-388.
- Computer Science. 2024, 51 (11): 379-388. doi:10.11896/jsjkx.231200034
Abstract
Federated learning (FL) is an advanced privacy-preserving machine learning technique that trains shared models through multi-party collaboration by exchanging model parameters, without centrally aggregating raw data. Although FL participants do not explicitly share data, many studies show that they still face various privacy inference attacks that leak private information. To address this issue, the academic community has proposed various solutions. One strict privacy-protection approach applies local differential privacy (LDP) to federated learning: random noise is added to model parameters before participants upload them, effectively resisting inference attacks by malicious adversaries. However, the noise introduced by LDP degrades model performance, and recent research suggests that this degradation is related to the additional inter-client heterogeneity that LDP introduces. To address the FL performance degradation caused by LDP, a parameter-decoupling federated learning scheme with differential privacy protection (PD-LDPFL) is proposed. Besides the basic model issued by the server, each client also learns personalized input and output models locally. During transmission, a client uploads only the noised parameters of the basic model, while the personalized models are kept local, adaptively transforming the input and output distributions of the client's local data to alleviate the additional heterogeneity introduced by LDP and reduce accuracy loss. In addition, it is found that even with a higher privacy budget, the scheme naturally resists some gradient-based privacy inference attacks, such as deep leakage from gradients. Experiments on three commonly used datasets, MNIST, FMNIST, and CIFAR-10, show that the scheme not only achieves better performance than traditional differentially private federated learning, but also provides additional security.
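The decoupling idea, noising and uploading only the shared base parameters while personalized layers never leave the client, can be sketched as follows. The Gaussian noise mechanism, the sigma value, and the layer names are assumptions for illustration, not PD-LDPFL's exact construction.

```python
import numpy as np

# Minimal sketch of parameter decoupling under LDP: only the shared "base"
# parameters are perturbed and uploaded; the personalized input/output
# layers stay on the client. Gaussian noise with sigma=0.5 is an assumed
# stand-in for the scheme's actual randomization mechanism.

rng = np.random.default_rng(42)

params = {
    "input_personal":  rng.normal(size=4),   # kept local, never uploaded
    "base_hidden":     rng.normal(size=4),   # shared with the server
    "output_personal": rng.normal(size=4),   # kept local, never uploaded
}

def make_upload(params, sigma=0.5):
    """Return only the base parameters, perturbed with Gaussian noise."""
    return {name: value + rng.normal(0.0, sigma, size=value.shape)
            for name, value in params.items()
            if name.startswith("base_")}

upload = make_upload(params)
print(sorted(upload))  # ['base_hidden'] -- personalized layers are withheld
```

Because gradient-inversion attacks only ever see the noised base parameters, the withheld personalized layers give the attacker an incomplete view of the client's model, which is the intuition behind the extra robustness reported above.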
-
DDoS Attack Detection Model Based on Statistics and Ensemble Autoencoders in SDN
李春江, 尹少平, 池浩田, 杨静, 耿海军. SDN中基于统计与集成自编码器的DDoS攻击检测模型[J]. 计算机科学, 2024, 51(11): 389-399.
LI Chunjiang, YIN Shaoping, CHI Haotian, YANG Jing, GENG Haijun. DDoS Attack Detection Model Based on Statistics and Ensemble Autoencoders in SDN[J]. Computer Science, 2024, 51(11): 389-399.
- Computer Science. 2024, 51 (11): 389-399. doi:10.11896/jsjkx.230900028
Abstract
Software-defined networking (SDN) is a novel network architecture that provides fine-grained, centralized network management services, characterized by control/forwarding separation, centralized control, and open interfaces. Because management logic is centralized in the control layer, controllers have become prime targets for distributed denial-of-service (DDoS) attacks. Traditional statistics-based DDoS detection algorithms often suffer from high false-positive rates and fixed thresholds, while detection algorithms based on machine learning models often involve substantial computational resource consumption and poor generalization. To address these challenges, this study proposes a two-tier DDoS attack detection model based on statistical features and ensemble autoencoders. The statistics-based tier extracts Rényi entropy features and sets a dynamic threshold to flag suspicious traffic; the ensemble-autoencoder tier then makes a more accurate DDoS judgment on that suspicious traffic. The two-layer design not only enhances detection performance and mitigates the high false-alarm problem, but also effectively shortens detection time, reducing computational resource consumption. Experimental results show that the model achieves high accuracy in different network environments, with the lowest F1 score across datasets exceeding 98.5%, demonstrating strong generalization capability.
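The first-tier statistic can be made concrete: Rényi entropy over the destination-address distribution in a traffic window collapses when a flood concentrates packets on few targets. The order alpha=2 and the toy traffic below are illustrative assumptions; the paper's dynamic-threshold logic is omitted.

```python
import math
from collections import Counter

# Sketch of the first-tier statistic: Rényi entropy of order alpha over the
# destination-IP distribution in a traffic window. A DDoS flood concentrates
# traffic on few destinations, so entropy drops; the model flags windows
# whose entropy falls below a dynamic threshold (not modeled here).

def renyi_entropy(items, alpha=2.0):
    counts = Counter(items)
    n = sum(counts.values())
    probs = [c / n for c in counts.values()]
    if alpha == 1.0:                      # limit case: Shannon entropy
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)

normal = ["10.0.0.%d" % (i % 8) for i in range(64)]    # spread over 8 hosts
attack = ["10.0.0.1"] * 60 + ["10.0.0.2"] * 4          # focused flood

print(renyi_entropy(normal) > renyi_entropy(attack))   # True: entropy collapses
```

For the uniform window the order-2 entropy is exactly log2(8) = 3 bits, while the flood window drops well below 1 bit, which is the gap a dynamic threshold exploits.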
-
Multi-type K-nearest Neighbor Query Scheme with Mutual Privacy-preserving in Road Networks
曾聪爱, 刘亚丽, 陈书仪, 朱秀萍, 宁建廷. 保护两方隐私的多类型的路网K近邻查询方案[J]. 计算机科学, 2024, 51(11): 400-417.
ZENG Congai, LIU Yali, CHEN Shuyi, ZHU Xiuping, NING Jianting. Multi-type K-nearest Neighbor Query Scheme with Mutual Privacy-preserving in Road Networks[J]. Computer Science, 2024, 51(11): 400-417.
- Computer Science. 2024, 51 (11): 400-417. doi:10.11896/jsjkx.230900158
Abstract
In the Internet of Vehicles scenario, existing location-based-service privacy-preserving schemes have issues such as not supporting parallel queries for multiple types of K-nearest-neighbor points of interest, difficulty in protecting the privacy of both in-vehicle users and the location-based service provider (LBSP), and inability to resist malicious attacks. To solve these issues, a multi-type K-nearest-neighbor query scheme with mutual privacy preservation in road networks, named MTKNN-MPP, is proposed. By applying an improved k-out-of-n oblivious transfer protocol to the K-nearest-neighbor query, multiple types of K-nearest-neighbor points of interest can be queried in a single round while protecting both the privacy of the in-vehicle user's query content and the privacy of the LBSP's point-of-interest information. An on-board-unit caching mechanism further reduces computational cost and communication overhead. Security analysis shows that MTKNN-MPP effectively protects the location privacy and query-content privacy of in-vehicle users as well as the privacy of the LBSP's point-of-interest information, ensures the anonymity of the vehicle's identity, and resists malicious attacks such as collusion attacks, replay attacks, inference attacks, and man-in-the-middle attacks. Performance evaluation shows that, compared with existing typical K-nearest-neighbor query schemes, MTKNN-MPP achieves higher security, and query latency is reduced by 43.23%~93.70% for single-type queries and by 81.07%~93.93% for multi-type queries.
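The functionality being protected can be shown in the clear: answer a multi-type KNN query in one pass over the POI table. This plaintext sketch deliberately omits the oblivious-transfer layer that hides the query from the LBSP, and uses Euclidean distance as a stand-in for road-network distance; the POI data is invented.

```python
import heapq

# Plaintext sketch of the query MTKNN-MPP answers privately: for each
# requested POI type, return the K nearest points of interest in a single
# round. Euclidean distance stands in for road-network distance, and the
# oblivious-transfer protection of the query content is omitted.

pois = [
    ("gas",     (1.0, 1.0)), ("gas",     (5.0, 5.0)),
    ("parking", (0.5, 0.2)), ("parking", (4.0, 1.0)),
    ("food",    (2.0, 2.0)), ("food",    (0.1, 0.9)),
]

def multi_type_knn(query_pos, types, k=1):
    """Answer a multi-type KNN query in one pass over the POI table."""
    qx, qy = query_pos
    result = {}
    for t in types:
        candidates = [(x, y) for typ, (x, y) in pois if typ == t]
        result[t] = heapq.nsmallest(
            k, candidates, key=lambda p: (p[0] - qx) ** 2 + (p[1] - qy) ** 2)
    return result

print(multi_type_knn((0.0, 0.0), ["gas", "food"], k=1))
```

In MTKNN-MPP the vehicle obtains exactly this answer, but via oblivious transfer, so the LBSP learns neither which types were requested nor which POIs were returned.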