Computer Science ›› 2009, Vol. 36 ›› Issue (9): 161-166.
WANG Zhen-zhen, XING Han-cheng
Abstract: Human problem solving often operates on two levels. People first treat a problem as a whole, i.e., they form a general plan, and then they deal with the details. Human thought is thus a good example of a multi-resolution process. It can generalize bottom-up across levels (the granularity of the view of the problem becomes "coarse", analogous to abstraction), and it can also instantiate top-down (the granularity becomes "fine", analogous to specification). Accordingly, we constructed a semi-Markov decision process consisting of two Markov decision processes that run on two levels: the ideal space (generalization) and the actual space (instantiation). We call it an associated bi-Markov decision model. We then discussed how to find an optimal policy under this associated model. Finally, an example was given to show that the associated bi-Markov decision process model economizes on "mind" and offers a good tradeoff between computational validity and computational feasibility.
Key words: Markov decision processes, Reinforcement learning, Optimal policy
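The two-level idea in the abstract (plan on a coarse level, then instantiate on a fine level) can be illustrated with a toy sketch. This is not the authors' associated bi-Markov model. It is a minimal example under assumptions that do not come from the paper: a six-cell corridor, blocks of two cells as the coarse states, a cost of -1 per move, a discount factor of 0.95, and hypothetical helper names such as `make_step` and `two_level_action`.

```python
# Toy two-level planner. NOT the paper's algorithm: all sizes, rewards and
# helper names below are assumptions chosen only for illustration.
ACTIONS = (-1, +1)            # move left / move right
N_CELLS, BLOCK = 6, 2         # assumed: 6 fine cells grouped into blocks of 2
GOAL_CELL = N_CELLS - 1
N_BLOCKS = N_CELLS // BLOCK
GOAL_BLOCK = GOAL_CELL // BLOCK


def make_step(states, goal):
    """Deterministic chain dynamics: cost -1 per move, episode ends at goal."""
    allowed = set(states)

    def step(state, action):
        if state == goal:
            return state, 0.0, True
        moved = state + action
        nxt = moved if moved in allowed else state  # blocked moves stay put
        return nxt, -1.0, nxt == goal

    return step


def q_value(step, values, state, action, gamma):
    """One-step lookahead value of taking `action` in `state`."""
    nxt, reward, done = step(state, action)
    return reward + (0.0 if done else gamma * values[nxt])


def value_iteration(states, step, gamma=0.95, tol=1e-9):
    """Standard value iteration; returns (values, greedy policy)."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(step, values, s, a, gamma) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(step, values, s, a, gamma))
        for s in states
    }
    return values, policy


def two_level_action(cell, coarse_policy):
    """Pick a fine action by solving only the local problem set by the coarse plan."""
    block = cell // BLOCK
    if block == GOAL_BLOCK:
        target = GOAL_CELL
    else:
        direction = coarse_policy[block]
        neighbour = block + direction
        # Subgoal: the cell of the neighbouring block that borders this block.
        target = neighbour * BLOCK if direction > 0 else neighbour * BLOCK + BLOCK - 1
    local = sorted(set(range(block * BLOCK, (block + 1) * BLOCK)) | {target})
    _, local_policy = value_iteration(local, make_step(local, target))
    return local_policy[cell]


# Coarse ("ideal") level: plan over blocks.
coarse_states = list(range(N_BLOCKS))
_, coarse_policy = value_iteration(coarse_states, make_step(coarse_states, GOAL_BLOCK))

# Flat fine ("actual") level, solved in full only for comparison.
fine_states = list(range(N_CELLS))
_, fine_policy = value_iteration(fine_states, make_step(fine_states, GOAL_CELL))


def rollout(start):
    """Follow the two-level policy from `start` and count the moves to the goal."""
    fine_step = make_step(fine_states, GOAL_CELL)
    cell, steps = start, 0
    while cell != GOAL_CELL:
        cell, _, _ = fine_step(cell, two_level_action(cell, coarse_policy))
        steps += 1
    return steps
```

In this sketch each local problem covers at most `BLOCK + 1` cells, so the fine level is never solved over the whole corridor at once. Whether a decomposition of this kind keeps the policy optimal in general depends on the model, which is the question the paper addresses.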
WANG Zhen-zhen, XING Han-cheng. Associated Model of Bi-Markov Decision Processes[J]. Computer Science, 2009, 36(9): 161-166.
URL: https://www.jsjkx.com/EN/
https://www.jsjkx.com/EN/Y2009/V36/I9/161