Computer Science, 2022, Vol. 49, Issue (10): 74-82. doi: 10.11896/jsjkx.210900137

• High Performance Computing

Implementation of FPGA-based High-performance and Scalable SM4-GCM Algorithm

ZHAI Jia-qi1, LI Bin1, ZHOU Qing-lei1,2, CHEN Xiao-jie2

  1. 1 School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China
    2 State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China
  • Received: 2021-09-16 Revised: 2022-03-14 Online: 2022-10-15 Published: 2022-10-13
  • Corresponding author: LI Bin (iebinli@zzu.edu.cn)
  • About author: ZHAI Jia-qi (zhaijiaqi777@163.com), born in 1998, postgraduate. His main research interests include high-performance computing and information security.
    LI Bin, born in 1986, Ph.D, lecturer. His main research interests include high-performance computing and information security.
  • Supported by:
    National Natural Science Foundation of China (61702518) and National Key R&D Program "Public Safety Risk Prevention and Control and Emergency Technology Assembly" Key Special Project (2018XXXXXXX01).

Abstract: With the rapid development of big data and 5G technology, information encryption in high-speed communication systems has become a new research hotspot. How to increase data throughput and reduce the difficulty of adapting encryption algorithms to different application scenarios, while ensuring high data security, has become an important research topic. Traditional software implementations of the SM4-GCM algorithm have low throughput and are difficult to apply in changing 5G and big data scenarios. To address this problem, this paper exploits the reconfigurability of FPGAs, analyzes the characteristics of the SM4-GCM algorithm in depth, and uses the Mastrovito algorithm, the Karatsuba algorithm and a fast reduction algorithm to design two high-performance, scalable circuit structures with separated data and control paths. The two structures accelerate SM4-GCM with full-pipeline technology and four-way parallel technology, respectively. While ensuring high security, they achieve high throughput and can be flexibly ported to various application scenarios. Experimental results show that a single SM4-GCM module reaches a throughput of 28.16 Gbps and 28.8 Gbps in the two proposed schemes, respectively, outperforming similar published designs in terms of performance and scalability.

Key words: SM4, Galois/Counter Mode, FPGA, High throughput rate, Scalable
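The abstract names the Karatsuba algorithm and a fast reduction algorithm among the building blocks of the design, presumably for the GF(2^128) multiplication that GCM uses in its authentication step. As a rough software illustration only, and not the authors' circuit, the sketch below multiplies two field elements with a recursive Karatsuba carry-less product followed by a shift-and-XOR reduction. Several things here are assumptions: all function names are hypothetical, operands use plain polynomial bit order (bit i is the coefficient of x^i) rather than GCM's bit-reflected convention, and the Mastrovito matrix formulation is not shown. The field polynomial x^128 + x^7 + x^2 + x + 1 is the one GCM specifies.

```python
# Illustrative sketch, not the paper's hardware design.
# Assumption: plain bit order (bit i = coefficient of x^i), not GCM's reflected order.

MASK128 = (1 << 128) - 1


def clmul(a: int, b: int) -> int:
    """Schoolbook carry-less (polynomial over GF(2)) multiplication."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def karatsuba_clmul(a: int, b: int, bits: int = 128) -> int:
    """Carry-less product of two `bits`-wide operands via Karatsuba splitting."""
    if bits <= 16:
        return clmul(a, b)  # small operands: fall back to schoolbook
    half = bits // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = karatsuba_clmul(a_lo, b_lo, half)
    hi = karatsuba_clmul(a_hi, b_hi, half)
    # Three half-size products instead of four: the middle term reuses lo and hi.
    mid = karatsuba_clmul(a_lo ^ a_hi, b_lo ^ b_hi, half) ^ lo ^ hi
    return (hi << bits) ^ (mid << half) ^ lo


def fold(h: int) -> int:
    """Multiply h by x^7 + x^2 + x + 1 using only shifts and XORs."""
    return h ^ (h << 1) ^ (h << 2) ^ (h << 7)


def reduce128(p: int) -> int:
    """Reduce a product of up to 255 bits modulo x^128 + x^7 + x^2 + x + 1."""
    t = fold(p >> 128)            # x^128 is congruent to x^7 + x^2 + x + 1
    t2 = fold(t >> 128)           # second fold handles the few overflow bits
    return (p & MASK128) ^ (t & MASK128) ^ t2


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^128) in plain polynomial bit order."""
    return reduce128(karatsuba_clmul(a, b))


if __name__ == "__main__":
    # x^127 * x = x^128, which reduces to x^7 + x^2 + x + 1
    print(hex(gf_mul(1 << 127, 2)))  # 0x87
```

In hardware, each recursion level would correspond to a layer of XOR and AND logic rather than a function call, and the two folds become fixed wiring plus XOR gates, which is why this style of reduction is attractive for pipelined designs.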

CLC Number:

  • TP309