Sridhar Mahadevan
Michigan State University
donotspam.mahadeva@cse.msu.edu
"Time, Value, and Memory: A Framework for Autonomous Learning and Sequential Decision-Making"
12/13/2000: [time not recorded]
[location not recorded]
Abstract: Autonomous agents, whether biological or synthetic, are embedded in an
external environment which they primarily experience and act on
sequentially. The sequential nature of perception and behavior raises
a fundamental set of challenges in determining optimal courses of
action for achieving long-term goals. The consequences of a specific
action may not be experienced until many steps later. Furthermore,
perceptual constraints may hide some of the information needed to make
appropriate decisions unambiguously. Finally, computing optimal
decisions may be intractable.
This talk describes a general mathematical framework for modeling
autonomous learning and sequential decision-making, based on three
fundamental building blocks: time, value, and memory. Time refers to
the structure of events underlying decisions. Values predict the
future, and reflect both long-term goals and uncertainty in perception
and action. Memory summarizes the past: it is an essential component
of action in perceptually aliased situations.
We will present algorithms we recently developed that exploit
hierarchy and modularity to represent temporally extended actions,
multi-agent task coordination, and memory organization. These
algorithms are tested on several case studies which illustrate the
interdisciplinary scope of the framework, including selective visual
attention, multi-agent manufacturing and scheduling, and spatial
navigation.
Last updated: Mon Jun 19 17:44:06 2006