Associated Model of Bi-Markov Decision Processes

Computer Science, 2009, Vol. 36, Issue (9): 161-166.

WANG Zhen-zhen, XING Han-cheng

Online: 2018-11-16 Published: 2018-11-16

Abstract

Human thinking often operates on two levels when dealing with problems. People first consider a problem as a whole, i.e., they form a general plan, and then they deal with the specific details. The human being is itself a good example of a system with multi-resolution characteristics: it can not only generalize bottom-up across multiple levels (the granularity of the view of a problem becomes "coarse", analogous to abstraction), but also instantiate top-down (the granularity of the view becomes "fine", analogous to specification). We therefore construct a semi-Markov decision process consisting of two Markov decision processes that run on two levels, respectively: the ideal space (generalization) and the actual space (instantiation). We call it an associated bi-Markov decision model. We then discuss how to find an optimal policy under this associated model. Finally, an example is given to show that the associated bi-Markov decision process model can economize "mind" and offers a good tradeoff between computational validity and computational feasibility.

Key words: Markov decision processes, Reinforcement learning, Optimal policy

WANG Zhen-zhen, XING Han-cheng. Associated Model of Bi-Markov Decision Processes[J]. Computer Science, 2009, 36(9): 161-166.