Computer Science ›› 2022, Vol. 49 ›› Issue (11A): 211000012-7. doi: 10.11896/jsjkx.211000012
宋立国1, 胡承秀2, 王亮1
SONG Li-guo1, HU Cheng-xiu2, WANG Liang1
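Abstract: Processors are evolving from single-core to many-core designs. This article first introduces the current state of many-core processor development. It then focuses on three aspects (energy efficiency, performance, and reliability) and analyzes the latest research results on many-core processors from outside China at several levels, including architecture, on-chip storage, and software. Considering integrated-circuit trends in the post-Moore era, it points out that adaptive techniques and three-dimensional integration will be the focus of many-core processor development. The article concludes that the future of many-core processors lies in the innovative fusion of different topologies, of software programming with hardware definition, and of classic design with new devices and new processes.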