Computer Science ›› 2023, Vol. 50 ›› Issue (11): 15-22. doi: 10.11896/jsjkx.220900250
DENG Lin, ZHANG Yao, LUO Jiahao
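Abstract: As processor designs grow more complex while design cycles remain limited, every processor design team must find a way to evaluate performance quickly and effectively. A complete performance benchmark suite takes a long time to run; in the pre-silicon verification stage in particular, the high time cost prevents design teams from using the full suite for performance evaluation and analysis. This paper presents Fast-Eval, a fast performance evaluation method for general-purpose processors. Fast-Eval builds on the SimPoint technique and combines the FastParallel-BBV method, selection of optimal simulation points, and hot migration of simulation points to significantly shorten both BBV (basic block vector) generation time and performance testing time. Experimental results show that, compared with performance data obtained from complete runs of the SPEC CPU 2006 benchmarks at the REF data size, the proposed method on an ARM64 processor reduces BBV generation time to 16.88% of the original and performance evaluation time to 1.26% of the original, with an average relative error of 0.53% in the evaluation results. On an FPGA development board, the average relative error across the test suite reaches 0.40%, and the running time is only 0.93% of that of a complete run.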
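To make the SimPoint idea behind the abstract concrete, the following is a minimal sketch of how simulation points can be selected from per-interval BBVs and used to estimate whole-program CPI. It is not the Fast-Eval implementation: FastParallel-BBV and hot migration are not modeled, the deterministic farthest-point initialization stands in for SimPoint's random projection and repeated randomized clustering, and all function names and the toy data are hypothetical.

```python
# Simplified SimPoint-style sampling (illustrative assumptions throughout).

def normalize(bbv):
    """Scale a basic block vector so its entries sum to 1 (assumes a nonzero sum)."""
    total = sum(bbv)
    return [x / total for x in bbv]

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(vectors, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(dist2(v, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def assign(vectors, centroids):
    """Label each interval with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist2(v, centroids[c]))
            for v in vectors]

def kmeans(vectors, k, iters=20):
    """Plain Lloyd iterations; returns (centroids, labels)."""
    centroids = init_centroids(vectors, k)
    for _ in range(iters):
        labels = assign(vectors, centroids)
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(vectors, centroids)

def select_simpoints(bbvs, k):
    """Return [(interval_index, weight)]: one representative interval per cluster."""
    vectors = [normalize(b) for b in bbvs]
    centroids, labels = kmeans(vectors, k)
    points = []
    for c in range(k):
        members = [i for i, l in enumerate(labels) if l == c]
        if not members:
            continue
        rep = min(members, key=lambda i: dist2(vectors[i], centroids[c]))
        points.append((rep, len(members) / len(vectors)))
    return points

def estimate_cpi(points, cpi_per_interval):
    """Weighted CPI estimate using only the selected simulation points."""
    return sum(w * cpi_per_interval[i] for i, w in points)

def relative_error(estimate, reference):
    """Relative error of the sampled estimate against the full-run value."""
    return abs(estimate - reference) / abs(reference)

# Toy program with three phases (A, B, C) over ten equal-length intervals.
A, B, C = [90, 10, 0, 0], [0, 20, 80, 0], [0, 0, 10, 90]
bbvs = [A, A, B, A, C, B, A, B, C, A]
cpi = [1.0, 1.0, 2.0, 1.0, 0.5, 2.0, 1.0, 2.0, 0.5, 1.0]

simpoints = select_simpoints(bbvs, k=3)
sampled = estimate_cpi(simpoints, cpi)
full = sum(cpi) / len(cpi)
print(simpoints, sampled, relative_error(sampled, full))
```

In this toy setup only three of the ten intervals would need to be executed. The time savings and error figures reported in the abstract come from the authors' experiments, not from a sketch like this one.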