Computer Science ›› 2016, Vol. 43 ›› Issue (4): 188-191. doi: 10.11896/j.issn.1002-137X.2016.04.038
杨际祥
YANG Ji-xiang
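Abstract: Development efficiency and speedup are two important issues in multicore parallel programming that affect the further development of multicore systems. To address them, a lightweight multicore multithreading library (UCMLib) was designed and implemented. Built on the concept of task primitives, the library provides two modes for expressing logical parallelism: data parallelism and task parallelism. It encapsulates and abstracts the complexity of multithreaded programming, giving developers a high-level programming method that does not require explicit reasoning about locks and races, which lowers the difficulty of parallel programming and improves development efficiency. The UCMLib task scheduler improves the speedup of parallel programs through effective construction and management of task queues and worker threads. Performance tests show that, as the computation scale increases, UCMLib achieves a slightly better speedup than the TPL library in both data parallelism and task parallelism. Finally, possible performance improvements and issues that require further research are discussed.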
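The abstract does not show UCMLib's interface, so the following is only a minimal sketch of the general idea it describes: a shared task queue drained by worker threads, with one entry point for task parallelism and one for data parallelism. The class `TinyScheduler` and its methods `spawn`, `parallel_for`, and `wait` are hypothetical names chosen for illustration and are not taken from UCMLib or TPL. Python is used here only for brevity; in CPython the global interpreter lock prevents CPU-bound threads from running in parallel, so the sketch illustrates structure rather than speedup.

```python
import queue
import threading


class TinyScheduler:
    """Hypothetical stand-in for a task scheduler; not UCMLib's actual API."""

    def __init__(self, num_workers=4):
        self._tasks = queue.Queue()  # shared task queue
        self._workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(num_workers)
        ]
        for worker in self._workers:
            worker.start()

    def _run(self):
        # Each worker repeatedly takes one task from the queue and runs it.
        while True:
            func, args = self._tasks.get()
            try:
                func(*args)
            finally:
                self._tasks.task_done()

    def spawn(self, func, *args):
        """Task parallelism: enqueue one independent unit of work."""
        self._tasks.put((func, args))

    def parallel_for(self, start, stop, body, chunk=1024):
        """Data parallelism: split [start, stop) into chunks, one task each."""
        def run_chunk(lo, hi):
            for i in range(lo, hi):
                body(i)

        for lo in range(start, stop, chunk):
            self.spawn(run_chunk, lo, min(lo + chunk, stop))

    def wait(self):
        """Block until every queued task has finished."""
        self._tasks.join()


# Usage: the caller never touches a lock directly.
sched = TinyScheduler(num_workers=4)

squares = [0] * 10

def store_square(i):
    squares[i] = i * i  # each index is written by exactly one task

sched.parallel_for(0, 10, store_square, chunk=3)  # data parallelism

results = []
sched.spawn(results.append, "task-a")  # task parallelism
sched.spawn(results.append, "task-b")

sched.wait()
```

In this sketch the caller expresses work as tasks or as a loop body and the queue handles synchronization, which is the kind of abstraction the abstract attributes to UCMLib. A production scheduler would typically add per-worker queues, work stealing, and error propagation, none of which are shown here.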