TY - JOUR
T1 - A K-step look-ahead analysis of Value Iteration algorithms for Markov decision processes
AU - Herzberg, Meir
AU - Yechiali, Uri
PY - 1996/2/8
Y1 - 1996/2/8
N2 - We introduce and analyze a general look-ahead approach for Value Iteration Algorithms used in solving both discounted and undiscounted Markov decision processes. This approach, based on the value-oriented concept interwoven with multiple adaptive relaxation factors, leads to accelerating procedures which perform better than the separate use of either the concept of value oriented or of relaxation. Evaluation and computational considerations of this method are discussed, practical guidelines for implementation are suggested and the suitability of enhancing the method by incorporating Phase 0, Action Elimination procedures and Parallel Processing is indicated. The method was successfully applied to several real problems. We present some numerical results which support the superiority of the developed approach, particularly for undiscounted cases, over other Value Iteration variants.
AB - We introduce and analyze a general look-ahead approach for Value Iteration Algorithms used in solving both discounted and undiscounted Markov decision processes. This approach, based on the value-oriented concept interwoven with multiple adaptive relaxation factors, leads to accelerating procedures which perform better than the separate use of either the concept of value oriented or of relaxation. Evaluation and computational considerations of this method are discussed, practical guidelines for implementation are suggested and the suitability of enhancing the method by incorporating Phase 0, Action Elimination procedures and Parallel Processing is indicated. The method was successfully applied to several real problems. We present some numerical results which support the superiority of the developed approach, particularly for undiscounted cases, over other Value Iteration variants.
KW - Adaptive relaxation factor
KW - Look-ahead analysis
KW - Markov processes
KW - Modified policy iteration
KW - Value iteration
UR - http://www.scopus.com/inward/record.url?scp=0030574666&partnerID=8YFLogxK
U2 - 10.1016/0377-2217(94)00208-8
DO - 10.1016/0377-2217(94)00208-8
M3 - Article
AN - SCOPUS:0030574666
SN - 0377-2217
VL - 88
SP - 622
EP - 636
JO - European Journal of Operational Research
JF - European Journal of Operational Research
IS - 3
ER -