Computer Science, 2024, Vol. 51, Issue (6A): 231000052-6. doi: 10.11896/jsjkx.231000052
SHI Xiaosu1,2, LI Xin1, JIAN Ling2, NI Huajian3
Abstract: In the application scenarios of internet security supervision and the crackdown on cyber crime, existing methods for identifying harmful multimedia content generally suffer from low computational efficiency, an inability to accurately recognize locally sensitive information, and detection limited to a single type of cyber crime. To address these problems, this paper proposes a two-stage model for identifying harmful multimedia content. The model separates information filtering from content detection across two stages, and runs scene recognition and element-level object detection as parallel tasks. The first stage uses EfficientNet-B2 to build a high-throughput pre-filtering module that quickly screens out the roughly 80% of data containing normal content; the second stage builds three modules with different network structures based on MEAL-V2, Faster R-CNN, and NetVLAD to meet the recognition requirements of multi-dimensional scenes and multi-feature elements. Results show that the model reaches 57 FPS on a T4 GPU, and both precision and recall for harmful multimedia content exceed 97%; compared with traditional models, recognition accuracy improves by up to 3.09% on the NPDI dataset and 19.26% on a self-built test set.
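The two-stage design described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: off-the-shelf torchvision backbones stand in for the trained modules, the MEAL-V2 classifier is approximated by a plain ResNet-50, the NetVLAD retrieval branch is omitted, and the 0.9 filtering threshold and class counts are hypothetical placeholders.

import torch
import torchvision.models as models
from torchvision.models.detection import fasterrcnn_resnet50_fpn

class TwoStageHarmfulContentModel(torch.nn.Module):
    """Sketch of a two-stage pipeline: a fast pre-filter followed by
    scene recognition and element detection (stand-in backbones only)."""

    def __init__(self, num_scenes=8, num_elements=5, normal_threshold=0.9):
        super().__init__()
        # Stage 1: high-throughput EfficientNet-B2 pre-filter
        # (binary: normal vs. suspect), meant to discard most normal content.
        self.prefilter = models.efficientnet_b2(weights=None, num_classes=2)
        self.normal_threshold = normal_threshold
        # Stage 2a: scene-level classifier; the paper uses MEAL-V2,
        # approximated here by a plain ResNet-50.
        self.scene_classifier = models.resnet50(weights=None, num_classes=num_scenes)
        # Stage 2b: element-level detector for locally sensitive objects
        # (Faster R-CNN). The NetVLAD branch is omitted in this sketch.
        self.detector = fasterrcnn_resnet50_fpn(weights=None, num_classes=num_elements)

    @torch.no_grad()
    def forward(self, image):
        batch = image.unsqueeze(0)
        # Stage 1: screen out content confidently predicted as normal.
        p_normal = torch.softmax(self.prefilter(batch), dim=1)[0, 0]
        if p_normal > self.normal_threshold:
            return {"label": "normal", "stage": 1}
        # Stage 2: scene recognition and element detection on the remainder
        # (run sequentially here; the paper runs these tasks in parallel).
        scene = int(self.scene_classifier(batch).argmax(dim=1))
        detections = self.detector([image])[0]
        return {"label": "suspect", "stage": 2,
                "scene": scene, "boxes": detections["boxes"]}

# Usage (random weights, illustrative only):
model = TwoStageHarmfulContentModel().eval()
result = model(torch.rand(3, 480, 640))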
CLC Number:
[1]HUANG Y,KONG A W K.Using a CNN ensemble for detecting pornographic and upskirt images[C]//2016 IEEE 8th International Conference on Biometrics Theory,Applications and Systems(BTAS).IEEE,2016:1-7.
[2]CONNIE T,AL-SHABI M,GOH M.Smart content recognition from images using a mixture of convolutional neural networks[M]//IT Convergence and Security 2017:Volume 1.Singapore:Springer Singapore,2017:11-18.
[3]MOUSTAFA M.Applying deep learning to classify pornographic images and videos[J].arXiv:1511.08899,2015.
[4]XIE X.Research on detection technology of pornographic information based on deep learning[D].Chengdu:University of Electronic Science and Technology of China,2020.
[5]ZHANG D L.Research on special video content detection algorithm based on deep features[D].Beijing:Minzu University of China,2019.
[6]GU Y,LI J,JING B,et al.Internet content information detection and filtration system[J].Application Research of Computers,2008(9):2834-2835,2862.
[7]HE K,ZHANG X,REN S,et al.Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:770-778.
[8]TAN M,LE Q.EfficientNet:Rethinking model scaling for convolutional neural networks[C]//International Conference on Machine Learning.PMLR,2019:6105-6114.
[9]GOU J,YU B,MAYBANK S J,et al.Knowledge distillation:A survey[J].International Journal of Computer Vision,2021,129:1789-1819.
[10]TOUVRON H,CORD M,DOUZE M,et al.Training data-efficient image transformers & distillation through attention[C]//International Conference on Machine Learning.PMLR,2021:10347-10357.
[11]TOUVRON H,VEDALDI A,DOUZE M,et al.Fixing the train-test resolution discrepancy[C]//Proceedings of the 33rd International Conference on Neural Information Processing Systems.2019:8252-8262.
[12]MAHAJAN D,GIRSHICK R,RAMANATHAN V,et al.Exploring the limits of weakly supervised pretraining[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:181-196.
[13]CAI H,ZHU L,HAN S.ProxylessNAS:Direct neural architecture search on target task and hardware[J].arXiv:1812.00332,2018.
[14]SHEN Z,SAVVIDES M.MEAL V2:Boosting vanilla ResNet-50 to 80%+ top-1 accuracy on ImageNet without tricks[J].arXiv:2009.08453,2020.
[15]REN S,HE K,GIRSHICK R,et al.Faster R-CNN:Towards real-time object detection with region proposal networks[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2017,39(6):1137-1149.
[16]SHAFIQ M,GU Z.Deep residual learning for image recognition:a survey[J].Applied Sciences,2022,12(18):8972.
[17]DONG S.The research on object detection algorithm based on improved SSD and FPN algorithm[D].Chongqing:Southwest University,2023.
[18]ZHANG Z H.Study on scene recognition algorithm based on NetVLAD[D].Chongqing:Chongqing University,2022.
[19]ARANDJELOVIC R,GRONAT P,TORII A,et al.NetVLAD:CNN architecture for weakly supervised place recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:5297-5307.
[20]AVILA S,THOME N,CORD M,et al.Pooling in image representation:The visual codeword point of view[J].Computer Vision and Image Understanding,2013,117(5):453-465.