SOALA: An Overview
This project develops an Autonomous Learning Agent (ALA) as a building block for self-organizing groups of agents. Each agent learns to reconfigure a part of the environment; collaboratively, the agents reconfigure the entire environment. Each ALA explores a 3D visual environment and builds a model, which it then uses to perform a reconfiguration task. Our ALA has the following new features.
1. Each agent builds an object-centered, grounded, affordance-based, and multi-layered (OGAM) model of the environment.
2. This model enables the agent to perform reconfiguration tasks within the environment.
3. A multi-layered architecture allows the agent to learn the model.
4. A set of novel graph-learning algorithms performs Exploration-Driven Actionable Model Generalization (EDAMG). These algorithms build the model by guiding the agent in prediction, exploratory navigation, and scene recognition.
The ALA agents learn to perform a complex task in the environment: reconfiguring the objects. Figure 1 illustrates the reconfiguration task using a set of blocks.

Figure 1. The reconfiguration task in the environment.
The reconfiguration task is specified to the agent through a reward function. Figure 2 illustrates the agent making changes in the environment, by moving the blocks, to maximize the reward function.

Figure 2. The reconfiguring behavior of the agent after learning the model of the environment.
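To make the notion of a task-specifying reward function concrete, the following is a minimal sketch. The function name, the goal layout, and the distance-based scoring are illustrative assumptions, not details of the actual project; the key point is that the agent receives only a scalar score for each configuration.

```python
# Hypothetical sketch: a reward function for a block-reconfiguration task.
# The agent never sees the goal directly; it only receives the scalar reward.

def reward(block_positions, goal_positions):
    """Reward is higher the closer each block is to its goal position.

    block_positions, goal_positions: dicts mapping block id -> (x, y, z).
    """
    total_distance = 0.0
    for block_id, (x, y, z) in block_positions.items():
        gx, gy, gz = goal_positions[block_id]
        total_distance += ((x - gx) ** 2 + (y - gy) ** 2 + (z - gz) ** 2) ** 0.5
    # Negate the distance so that maximizing reward moves blocks toward the goal.
    return -total_distance

goal = {"A": (0, 0, 0), "B": (1, 0, 0)}
print(reward({"A": (0, 0, 0), "B": (1, 0, 0)}, goal))  # 0.0 (goal reached)
print(reward({"A": (3, 4, 0), "B": (1, 0, 0)}, goal))  # -5.0 (block A is off by 5 units)
```

Maximizing this reward is then equivalent to moving every block to its goal position, without the goal ever being stated to the agent explicitly.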
The design of the ALA takes the view that the agent explores a space of visual relationships with object features. This space is continuous and can be open. This view stands in sharp contrast with the view that robots navigate in an environment associated with a bounded global coordinate system [Thrun 1999]. The model built by the agent has the following unique features.
• Object-centered model: The model is built from features that specify the relationships between the agent and the features of the objects in the environment.
• Grounded model: The features in the model are grounded in the visual data received as percepts from the objects.
• Affordance-based model: The features in each model state are functions of the agent's action parameters. For example, the length of a feature is a function of the number of pulses sent to an effector.
• Multi-layered model: The agent builds models of incremental complexity. This complexity is addressed by building the models along the dimensions of object properties: shape, behavior, interaction, and task-specificity.
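The affordance-based property above can be made concrete with a small sketch. The class name, the linear pulse-to-length calibration, and all values are illustrative assumptions rather than the project's actual representation; the point is that the feature's extent is stored in terms of the agent's own action parameter.

```python
# Illustrative sketch: a feature whose value is expressed in terms of the
# agent's own action parameters (an "affordance-based" feature).

class AffordanceFeature:
    """Stores a visual feature's extent as a count of effector pulses.

    pulses_per_unit is a hypothetical calibration constant: how many
    pulses the effector needs to traverse one unit of the feature.
    """

    def __init__(self, name, pulses, pulses_per_unit=10):
        self.name = name
        self.pulses = pulses
        self.pulses_per_unit = pulses_per_unit

    def length(self):
        # The feature's length is a function of the agent's action
        # parameter (the pulse count), not of an external coordinate system.
        return self.pulses / self.pulses_per_unit

edge = AffordanceFeature("block_top_edge", pulses=35)
print(edge.length())  # 3.5
```

Because the length is derived from the pulse count, the same feature description remains meaningful to any agent with the same effector, without reference to a global coordinate frame.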
The architecture allows a multi-layered model to be built and makes use of the model. To this end, the learning and execution methods of the architecture are organized at the same levels as the OGAM models. This architecture differs from the SSH architecture [Kuipers 2000], where each layer of a model is pre-specified by a set of semantics.
Each learning method in our architecture builds a unique graph structure. For example, one method builds the shape graph of an object, whereas another builds a behavioral graph of the same object. In building its graph, the latter method makes use of the graph generated by the former. Thus our learning algorithms are layered. A method at one layer differs from a method at another not merely in its level of generalization, but in the semantics of its layer. Our learning algorithms differ from conventional learning algorithms in the following ways.
• Actionable models: They produce models that are actionable; the models increase the agent's ability to act in the real world, not merely to classify pre-enumerated data.
• Incremental-model-building-oriented exploration: They guide the agent to explore the environment for data such that the agent can build its models incrementally.
• Exploration-supportive model usage: They use the model, at every stage of model building, for navigation, prediction, and scene recognition.
• Actionable incremental generalization: The agent incrementally discretizes its continuous relationship space, exploring data based on its navigational capabilities, so as to generalize its models.
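The layering of the graph-learning methods described above can be sketched as follows. All names, the graph encodings, and the consistency rule are hypothetical illustrations, not the project's actual algorithms; the sketch only shows the layering idea: the behavioral graph is built on top of, and constrained by, the previously learned shape graph.

```python
# Hypothetical sketch of layered graph building: a behavioral graph is
# constructed on top of a previously learned shape graph, so the higher
# layer reuses (rather than re-learns) the lower layer's structure.

def build_shape_graph(observed_edges):
    """Layer 1: nodes are object features, edges are spatial adjacencies."""
    graph = {}
    for a, b in observed_edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def build_behavior_graph(shape_graph, transitions):
    """Layer 2: edges are labeled by the action that moves the agent
    between features, and must be consistent with the shape graph."""
    behavior = {}
    for (src, action, dst) in transitions:
        # Only accept transitions the lower layer allows: the shape
        # graph constrains which behavioral edges are admissible.
        if dst in shape_graph.get(src, set()):
            behavior.setdefault(src, {})[action] = dst
    return behavior

shape = build_shape_graph([("corner1", "edge1"), ("edge1", "corner2")])
behavior = build_behavior_graph(shape, [("corner1", "slide_right", "edge1"),
                                        ("corner1", "jump", "corner2")])
print(behavior)  # {'corner1': {'slide_right': 'edge1'}}
```

Here the "jump" transition is rejected because the shape graph records no adjacency between corner1 and corner2, illustrating how a method at one layer depends on the semantics of the graph produced at the layer below.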
To enable different reconfiguring ALAs to perform collaborative reconfiguration, we have added one more layer to our architecture. This layer is based on a hormone-based messaging mechanism. Through it, the agents can reorganize themselves within the environment based on the models they have developed.
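As a rough intuition for hormone-style messaging, the sketch below shows a message that diffuses from agent to neighboring agent, decaying at each hop so that nearby agents respond strongly and distant agents weakly. The function names, topology, decay factor, and threshold are all illustrative assumptions; the project's actual mechanism is not specified here.

```python
# Rough sketch of hormone-style messaging: a "hormone" level diffuses
# through the agent network, decaying at each hop, so an agent's response
# strength falls off with its distance from the source.

def propagate_hormone(neighbors, source, strength, decay=0.5, threshold=0.1):
    """Breadth-first diffusion of a hormone level through the agent graph.

    neighbors: dict mapping agent id -> list of adjacent agent ids.
    Returns the hormone level received by each reached agent.
    """
    levels = {source: strength}
    frontier = [source]
    while frontier:
        next_frontier = []
        for agent in frontier:
            passed = levels[agent] * decay
            if passed < threshold:
                continue  # the hormone has decayed below the response threshold
            for nb in neighbors.get(agent, []):
                if nb not in levels:  # each agent absorbs the hormone once
                    levels[nb] = passed
                    next_frontier.append(nb)
        frontier = next_frontier
    return levels

topology = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(propagate_hormone(topology, "a", strength=1.0))
# {'a': 1.0, 'b': 0.5, 'c': 0.25}
```

A distance-graded signal of this kind lets agents reorganize locally, with no global coordinator, which fits the self-organizing character of the agent group described above.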