Computer Science ›› 2026, Vol. 53 ›› Issue (1): 423-429. doi: 10.11896/jsjkx.241200005

• Information Security •

Deep Learning Model Protection Method Based on Robust Partitioned Watermarking

LYU Zhenghao1,2, XIAN Hequn1,3

  1 College of Computer Science and Technology, Qingdao University, Qingdao, Shandong 266071, China;
  2 Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100085, China;
  3 Cryptography and Cyber Security Whampoa Institute, Guangzhou 510700, China
  • Received: 2024-12-02 Revised: 2025-03-14 Online: 2026-01-08
  • Corresponding author: XIAN Hequn (xianhq@126.com)
  • Author e-mail: 15688778816@163.com
  • About author: LYU Zhenghao, born in 1999, postgraduate. His main research interests include AI robustness and intellectual property protection.
    XIAN Hequn, born in 1979, Ph.D., professor, master's supervisor. His main research interests include cryptography and network and information systems security.
  • Supported by:
    National Natural Science Foundation of China (62102212) and Open Fund Project of the Key Laboratory of Cyberspace Security Defense (2024-ZD-04).

Abstract: Machine learning involves high data collection and training costs, so model owners may worry that their models will be copied or used without authorization, infringing on their intellectual property (IP). Protecting the intellectual property of machine learning models has therefore become a pressing issue. In response, researchers have introduced the concept of model watermarking. Just as digital watermarking embeds identifiable marks into images, model watermarking embeds specific identifiers into machine learning models to support copyright verification. However, existing watermarking schemes face limitations in practical applications. First, embedding a watermark inevitably affects model performance to some degree. Second, watermarks can be removed through techniques such as fine-tuning. To address these problems, this paper proposes a novel neural network watermarking scheme that embeds the watermark by region and in stages. The method aims both to minimize the impact on model performance and to improve the robustness of the watermark itself. Experiments on the MNIST, CIFAR-10, and CIFAR-100 datasets validate the effectiveness of the scheme. The results show that it maintains the watermark retention rate while having minimal impact on model performance; compared with existing baseline watermarking schemes, it improves model performance by up to 18 percentage points. In addition, the proposed scheme is strongly robust against attacks such as fine-tuning and is unaffected by model pruning. Even an attacker who attempts to remove the watermark completely can do so only at the cost of significantly degrading the model's performance.

Key words: Deep neural network, Model watermarking, Copyright verification, Artificial intelligence security, Watermark robustness, Model performance

CLC Number:

  • TP309
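
The abstract reports two quantities: a watermark retention rate and a performance difference in percentage points. This page does not give the paper's exact definitions, so the following is only a minimal sketch under stated assumptions. It assumes verification compares a model's predicted labels on a set of watermark samples with the labels the owner embedded, and that accuracies are given as fractions in [0, 1]. The function names `watermark_retention_rate` and `accuracy_gap_points` are hypothetical and do not come from the paper.

```python
from typing import Sequence


def watermark_retention_rate(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Fraction of watermark samples whose predicted label still matches
    the label the owner embedded (assumed verification rule, not the paper's)."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if len(expected) == 0:
        raise ValueError("at least one watermark sample is required")
    matches = sum(1 for p, e in zip(predicted, expected) if p == e)
    return matches / len(expected)


def accuracy_gap_points(acc_a: float, acc_b: float) -> float:
    """Difference between two accuracies in [0, 1], expressed in percentage points."""
    return (acc_a - acc_b) * 100.0


# Hypothetical values for illustration only; they are not results from the paper.
retention = watermark_retention_rate([1, 2, 3, 4], [1, 2, 0, 4])
gap = accuracy_gap_points(0.90, 0.85)
```

Under this reading, a robust scheme would keep the retention rate high after fine-tuning or pruning, while the gap in test accuracy between a watermarked model and an unwatermarked one stays small.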