Computer Science ›› 2022, Vol. 49 ›› Issue (11A): 211000012-7. doi: 10.11896/jsjkx.211000012

• Interdiscipline & Application •

Summary and Analysis of Research on ManyCore Processor Technologies

SONG Li-guo1, HU Cheng-xiu2, WANG Liang1

  1. 1 Beijing Microelectronics Technology Institute, Beijing 100076, China
    2 Beijing Institute of Aerospace Systems Engineering, Beijing 100076, China
  • Online: 2022-11-10 Published: 2022-11-21
  • Corresponding author: SONG Li-guo (songlg123456@163.com)
  • About author: SONG Li-guo, born in 1973, Ph.D., researcher. His main research interests include high-performance computing and many-core processor architecture.

Abstract: Processors are evolving from single-core to many-core designs. This paper first introduces the current state of many-core processor development. It then surveys recent research from abroad at three levels (architecture, on-chip storage and software) and analyzes the main contributions and basic ideas of these papers from the perspectives of energy efficiency, performance and reliability. In light of integrated-circuit trends in the post-Moore era, it identifies two main technical directions for many-core processors: emerging adaptive architecture technology and three-dimensional integration technology. Finally, it argues that the future of many-core processors lies in an innovative fusion of different topologies, of software programming with hardware definition, and of classical design with new devices and new processes.

Key words: Manycore processor, Network-on-chip, Memory-on-chip, Software scheduling

CLC Number:

  • TP368