to ISI Home Page
isd home
About ISD
education at isd
employment
environment
news
people
research
AI Seminars
div3admin

environment
Manuela Veloso
CMU


"Bounding the Suboptimality of Reusing Subproblems"

1/25/1999: [time not recorded]
[location not recorded]

Abstract: I have always been interested in the problem of reusing solution plans to simpler problems to solve more complex ones. It is a well recognized difficult and open question to understand the trade-offs in solution quality of this reuse process. We have been studying this problem within the context of action selection in a non-deterministic environment. Markov decision processes (MDPs) provide a framework for representing this action selection problem, and there are a number of algorithms that learn optimal policies within this formulation. This framework has also been used to study state space abstraction, problem decomposition, and policy reuse. These techniques sacrifice optimality of their solution for improved learning speed. In this talk, we examine the suboptimality of reusing policies that are solutions to subproblems. This is done within a restricted class of MDPs, namely those where non-zero reward is received only upon reaching a goal state. We introduce the definition of a subproblem within this class and provide motivation for how reuse of subproblem solutions can speed up learning. The contribution of this work is the derivation of a tight bound on the loss in optimality from this reuse. We examine a bound that is based on Bellman error, which applies to all MDPs, but does not provide us with a tight bound. We contribute our own theoretical result that gives an empirically tight bound on this suboptimality. This is joint work with Michael Bowling.


Last updated: Mon Jun 19 17:44:06 2006

 

 

 

 

 
USC Home Page ISI Home Page