Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 231000052-6. doi: 10.11896/jsjkx.231000052

• Information Security •

Multimedia Harmful Information Recognition Method Based on Two-stage Algorithm

SHI Xiaosu1,2, LI Xin1, JIAN Ling2, NI Huajian3   

    1 Information Network Security Academy,People’s Public Security University of China,Beijing 100091,China
    2 Network Security Corps,Shanghai Public Security Bureau,Shanghai 200025,China
    3 Shanghai SUPREMIND Technology Co.,Ltd,Zhejiang 310000,China
  • Published:2024-06-06
  • About author:SHI Xiaosu,born in 1996,postgraduate.Her main research interests include image processing and big data analysis.
    LI Xin,born in 1977,Ph.D,associate professor,is a professional member of CCF(No.51691M).His main research interests include cyber security and big data analysis.
  • Supported by:
    Applied Innovation Plan of the Ministry of Public Security of China(2020YYCXSHSJ019).

Abstract: In Internet content security supervision and the fight against Internet crime, existing methods for identifying harmful multimedia information generally suffer from low computational efficiency, an inability to accurately identify locally sensitive information, and recognition capabilities limited to a single type of cybercrime. To solve these problems, this paper proposes a multimedia harmful information recognition model based on a two-stage algorithm. The method performs information filtering and content detection in separate stages, and splits scene recognition from element target detection. The first stage uses EfficientNet-B2 to build a high-throughput pre-filter model that quickly filters out 80% of images and short videos with normal content. In the second stage, three modules with different network structures are built on the Meal-V2, Faster RCNN, and NetVLAD networks to meet the recognition requirements of multi-dimensional scenes and multi-feature elements. The results show that the model reaches a computing efficiency of 57 FPS (frames per second) on a T4 card, and that both the accuracy and the recall of harmful multimedia information recognition exceed 97%. Compared with traditional models, recognition accuracy on the NPDI dataset and on the self-built test dataset increases by 3.09% and 19.26%, respectively.
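
The control flow of the two-stage design described in the abstract can be sketched as a simple cascade. This is only an illustration, not the authors' implementation: the names `two_stage_recognize`, `Verdict`, `stub_prefilter`, and `stub_detectors`, the two thresholds, and the rule that the highest second-stage score decides the result are all assumptions made for the example. The abstract does not state how the outputs of the three second-stage modules are combined, and the stub scorers below stand in for the real EfficientNet-B2, Meal-V2, Faster RCNN, and NetVLAD models.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical interface: a scorer maps one media item to a score in [0, 1],
# where a higher score means "more likely harmful".
Scorer = Callable[[dict], float]


@dataclass
class Verdict:
    harmful: bool
    stage: int                # 1 = cleared by the pre-filter, 2 = decided in stage two
    scores: Dict[str, float]  # scores that were actually computed for this item


def two_stage_recognize(
    item: dict,
    prefilter: Scorer,
    detectors: Dict[str, Scorer],
    pass_threshold: float = 0.1,  # assumed value, not taken from the paper
    harm_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> Verdict:
    """Run a cheap pre-filter first; call the heavier detectors only if needed."""
    risk = prefilter(item)
    if risk < pass_threshold:
        # Stage 1: clearly normal content exits early and skips stage two.
        return Verdict(False, 1, {"prefilter": risk})
    # Stage 2: every specialised module scores the item.
    scores = {name: detect(item) for name, detect in detectors.items()}
    # Assumed fusion rule: the highest module score decides.
    top = max(scores.values(), default=0.0)
    return Verdict(top >= harm_threshold, 2, scores)


# Stub scorers that read precomputed numbers from the item, standing in
# for real model inference.
def stub_prefilter(item: dict) -> float:
    return item["risk"]


stub_detectors: Dict[str, Scorer] = {
    "scene": lambda item: item["scene"],    # scene-level classification
    "object": lambda item: item["object"],  # local element detection
    "video": lambda item: item["video"],    # short-video aggregation
}

if __name__ == "__main__":
    normal = {"risk": 0.02, "scene": 0.0, "object": 0.0, "video": 0.0}
    flagged = {"risk": 0.90, "scene": 0.2, "object": 0.8, "video": 0.1}
    print(two_stage_recognize(normal, stub_prefilter, stub_detectors))
    print(two_stage_recognize(flagged, stub_prefilter, stub_detectors))
```

The early exit is what makes a high-throughput first stage worthwhile: items cleared in stage one never reach the slower second-stage modules.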

Key words: Two-stage algorithm, Multimedia, Harmful information recognition

CLC Number: 

  • TP391.41