利用空间优化的增强学习Sarsa改进预取算法

doi:10.11896/j.issn.1002-137X.2019.03.048

计算机科学 ›› 2019, Vol. 46 ›› Issue (3): 327-331.doi: 10.11896/j.issn.1002-137X.2019.03.048

利用空间优化的增强学习Sarsa改进预取算法

梁媛,袁景凌,陈旻骋

(武汉理工大学计算机科学与技术学院武汉 430070)

收稿日期:2018-01-26 修回日期:2018-04-01 出版日期:2019-03-15 发布日期:2019-03-22
通讯作者: 袁景凌(1975-),女,教授,博士生导师,CCF会员,主要研究方向为绿色计算、机器学习和数据挖掘,E-mail:yuanjingling@126.com(通信作者)
作者简介:梁媛(1994-),女,硕士生,CCF学生会员,主要研究方向为机器学习,E-mail:lyuan0416@whut.edu.cn;陈旻骋(1990-),男,博士生,主要研究方向为机器学习、数据挖掘。
基金资助:
国家自然科学基金(61303029),湖北省自然科学基金重点类项目——创新群体项目(2017CFA012),湖北省技术创新专项重大项目(2017AAA122)资助

Prefetching Algorithm of Sarsa Learning Based on Space Optimization

LIANG Yuan, YUAN Jing-ling, CHEN Min-cheng

School of Computer Science and Technology,Wuhan University of Technology,Wuhan 430070,China

Received:2018-01-26 Revised:2018-04-01 Online:2019-03-15 Published:2019-03-22

摘要/Abstract

摘要： 数据中心是高性能计算机的集群中心,CPU集群运行繁忙,不规则的数据结构和算法频繁使用,使得大多数基于时空局部性的预取技术不再适用。文中引用语义局部性的概念,使用增强学习Sarsa算法来近似语义位置,预测不规则数据结构和算法未来的内存访问。由于状态空间和动态空间过大,采用Deep Q-learning方法优化状态-动作空间,将新状态与旧状态拟合,相似则采取相似的做法,从而提高泛化能力。在标准数据集SPECCPU 2006上的实验证明,所提方法的泛化能力强,能够有效提高Cache的命中率。

关键词: Deep Q-learning, Sarsa, 语义局部性, 预取技术, 状态-动作空间优化

Abstract: As the cluster centers for high-performance computers,data centers are busy with CPU clusters.Irregular data structures and algorithms are frequently used,so that most prefetching technologies based on spatio-temporal locality are no longer applicable.This paper referred to the concept of semantic locality,used the reinforcement learning Sarsa algorithm to approximate semantic locations,and predicted irregular data structures and future memory accesses of algorithms.Due to the large state space and action space,this paper used Deep Q-learning method to optimize the State-action space to fit the new state with the old one and took a similar approach if the two states are similar to improve the generalization ability.The experiment on the standard data set SPECCPU 2006 proves that this method has a widegene-ralization ability and can improve the Cache hit rate effectively.

Key words: Deep Q-learning, Prefetching technology, Sarsa, Semantic locality, State-action space optimization

中图分类号:

TP302

梁媛,袁景凌,陈旻骋. 利用空间优化的增强学习Sarsa改进预取算法[J]. 计算机科学, 2019, 46(3): 327-331. https://doi.org/10.11896/j.issn.1002-137X.2019.03.048

LIANG Yuan, YUAN Jing-ling, CHEN Min-cheng. Prefetching Algorithm of Sarsa Learning Based on Space Optimization[J]. Computer Science, 2019, 46(3): 327-331. https://doi.org/10.11896/j.issn.1002-137X.2019.03.048

参考文献

[1]SMITH A J.Cache Memories[J].Acm Computing Surveys,
1982,14(3):473-530.
[2]WENISCH T F.Temporal memory streaming[D].Pittsburgh:Carnegie Mellon University,2007.
[3]SOMOGYI S,WENISCH T F,AILAMAKI A,et al.Spatio-
temporal memory streaming[J].Acm Sigarch Computer Architecture News,2009,37(3):69-80.
[4]COOKSEY R N,JOURDAN S J.Method and apparatus for
next-line prefetching from a predicted memory address:US,US 7093077 B2[P].2006.
[5]JOSEPH D,GRUNWALD D.Prefetching using Markov predictors[C]∥Proceedings of the 24th Annual International Sympo-sium on Computer Architecture.IEEE,1997:252-263.
[6]SOMOGYI S,WENISCH T F,AILAMAKI A,et al.Spatial
Memory Streaming[C]∥International Symposium on Computer Architecture.IEEE,2006:252-263.
[7]SOMOGYI S,WENISCH T F,AILAMAKI A,et al.Spatio-
temporal memory streaming[J].AcmSigarch Computer Architecture News,2009,37(3):69-80.
[8]PELED L,MANNOR S,WEISER U,et al.Semantic locality and context-based prefetching using reinforcement learning[C]∥International Symposium on Computer Architecture.ACM,2015:285-297.
[9]MOZER S,M C,HASSELMO M.Reinforcement Learning:An Introduction[J].IEEE Transactions on Neural Networks,2005,16(1):285-286.
[10]IPEK E,MUTLU O,CARUANA R.Self-Optimizing Memory Controllers:A Reinforcement Learning Approach[C]∥International Symposium on Computer Architecture.IEEE,2008:39-50.
[11]SUTTON R S,MCALLESTER D,SINGH S,et al.Policy Gradient Methods for Reinforcement Learning with Function Approximation[J].Advances in Neural Information Processing Systems,2000,12:1057-1063.
[12]TAN F,YAN P,GUAN X.Deep Reinforcement Learning:From Q-Learning to Deep Q-Learning[C]∥International Conference on Neural Information Processing.Springer,Cham,2017:475-483.
[13]LIN L M,WANG H,WANG Y X.Sarsa Reinforcement Lear-
ning Algorithm Based on Neural Networks[J].Computer Technology & Development,2006(1):30-32.(in Chinese)
林联明,王浩,王一雄.基于神经网络的Sarsa强化学习算法.计算机技术与发展,2006(1):30-32.
[14]DURYEA E,GANGER M,HU W.Exploring Deep Reinforcement Learning with Multi Q-Learning[J].Intelligent Control & Automation,2016,7(4):129-144.
[15]HENNING J L.SPEC CPU2006 benchmark descriptions[J].
Acm Sigarch Computer Architecture News,2006,34(4):1-17.

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

利用空间优化的增强学习Sarsa改进预取算法

Prefetching Algorithm of Sarsa Learning Based on Space Optimization

PDF (PC)

摘要/Abstract

引用本文

使用本文

参考文献

相关文章 2

Metrics

本文评价

推荐阅读 0

[1]	李斌, 刘全. 基于最小二乘的双权重学习法 Double Weighted Learning Algorithm Based on Least Squares 计算机科学, 2020, 47(12): 210-217. https://doi.org/10.11896/jsjkx.191100084
[2]	曹新平刘美华韩真古志民张建鑫. 预取技术研究进展计算机科学, 2003, 30(8): 28-30.