Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes

N. L. Zhang; W. Zhang

doi:10.1613/jair.761

Published: Feb 1, 2001

Abstract

Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: It enabled value iteration to converge after only a few iterations on all the test problems.

Issue

Vol. 14 (2001)

Section

Articles

JAIR is published by AI Access Foundation, a nonprofit public charity whose purpose is to facilitate the dissemination of scientific results in artificial intelligence. JAIR, established in 1993, was one of the first open-access scientific journals on the Web, and has been a leading publication venue since its inception.
