SOALA: An Overview

 

This project develops an Autonomous Learning Agent (ALA) as a building block for a self-organizing group of agents. Each of these agents learns to reconfigure a part of the environment; collaboratively, they reconfigure the entire environment.

Each ALA explores a 3D visual environment and builds a model. It uses this model to perform a reconfiguration task. Our ALA has the following new features.

1. Each agent builds an object-centered, grounded, affordance-based, and multi-layered (OGAM) model of the environment.

2. This model enables the agent to perform reconfiguration tasks within the environment.

3. A multi-layered architecture allows the agent to learn the model.

4. A set of novel graph learning algorithms performs Exploration-Driven Actionable Model Generalization (EDAMG). These algorithms build the model by guiding the agent in prediction, exploratory navigation, and scene recognition.

Reconfiguration Task:

The ALA agents learn to perform a complex task in the environment: reconfiguring the objects. Figure 1 illustrates the reconfiguration task using a set of blocks.

 

[Figure: the first and last states of the world]


Figure 1. The reconfiguration task in the environment.

The reconfiguration task is specified to the agent through a reward function. Figure 2 illustrates the agent making changes in the environment, by moving the blocks, in order to maximize the reward.
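As an illustration, a reward function for the block-reconfiguration task might score a configuration by its distance from a goal configuration. The block names, coordinates, and distance measure below are hypothetical, a minimal sketch rather than the project's actual formulation:

```python
# Hypothetical sketch: a reward function that scores a block
# configuration by how closely it matches a goal configuration.
# Block positions are (x, y) grid coordinates.

def reward(state, goal):
    """Negative total distance between each block and its goal position."""
    total = 0.0
    for block, (gx, gy) in goal.items():
        x, y = state[block]
        total += abs(x - gx) + abs(y - gy)  # Manhattan distance
    return -total  # maximized (at 0) when every block is at its goal

start = {"A": (0, 0), "B": (3, 1)}
goal = {"A": (2, 0), "B": (3, 3)}
print(reward(start, goal))  # -4: blocks A and B are each 2 moves away
print(reward(goal, goal))   # 0.0: goal configuration reached
```

The agent prefers actions whose resulting state increases this reward, which drives it toward the goal configuration.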

[Figure: the agent's output behavior]

Figure 2. The reconfiguring behavior of the agent after learning the model of the environment.

OGAM Model:

The design of the ALA takes the view that the agent explores a space of visual relationships over object features. This space is continuous and can be open-ended. This is in sharp contrast with the view that robots navigate an environment associated with a bounded global coordinate system [Thrun 1999]. The model built by the agent has the following unique features.

- Object-centered Model: The model is built from features that specify relationships between the agent and the objects in the environment.

- Grounded Model: The features in the model are grounded in the visual data received as percepts from the objects.

- Affordance-based Model: The features in each model state are functions of the agent's action parameters. For example, the length of a feature is a function of the number of pulses sent to an effector.

- Multi-layered Model: The agent builds models of increasing complexity along the dimensions of object properties: shape, behavior, interaction, and task-specificity.
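The four model layers can be pictured as nested structures, each defined over the layer below it. The following Python sketch is purely illustrative; the field names and feature encodings are assumptions, not the project's actual representation:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the four OGAM layers for a single object.
@dataclass
class ObjectModel:
    shape: dict = field(default_factory=dict)        # layer 1: geometric features
    behavior: dict = field(default_factory=dict)     # layer 2: how features change under actions
    interaction: dict = field(default_factory=dict)  # layer 3: effects of contact with other objects
    task: dict = field(default_factory=dict)         # layer 4: task-specific role

block = ObjectModel()
# Grounded, affordance-based feature: an edge length measured in
# effector pulses rather than in absolute units.
block.shape["edge_length_pulses"] = 120
# The behavior layer is defined over the shape feature below it.
block.behavior["push"] = {"feature": "edge_length_pulses", "delta": 0}
print(block)
```

The key property the sketch tries to capture is that each layer refers to entries of the layer beneath it, so complexity grows incrementally rather than all at once.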

Architecture:

The architecture allows a multi-layered model to be built and used. To this end, it has learning and execution methods organized at the same levels as the OGAM model layers. This architecture differs from the SSH architecture [Kuipers 2000], in which each layer of the model is pre-specified by a fixed set of semantics.
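As a rough illustration of this pairing, each layer could expose a learning method that consumes the model produced by the layer below it, together with an execution method that acts on the layer's own model. Everything in this sketch (layer names, interfaces, percepts) is assumed for illustration:

```python
# Hypothetical sketch: the architecture pairs a learning method and an
# execution method at each layer; each layer builds on the model
# produced by the layer below it.

class Layer:
    def __init__(self, name, learn, execute):
        self.name = name
        self.learn = learn      # builds this layer's model from percepts + lower model
        self.execute = execute  # uses this layer's model to act

def learn_shape(percepts, _lower):
    # Layer 1 works on raw percepts only.
    return {"features": sorted(set(percepts))}

def learn_behavior(_percepts, lower):
    # Layer 2 reuses the shape model built below it.
    return {"push": lower["features"]}

layers = [
    Layer("shape", learn_shape, lambda m: "track " + m["features"][0]),
    Layer("behavior", learn_behavior, lambda m: "push toward " + m["push"][0]),
]

percepts, lower, model = ["edge", "corner", "edge"], {}, {}
for layer in layers:
    lower = layer.learn(percepts, lower)
    model[layer.name] = lower
    print(layer.name, "->", layer.execute(lower))
```

The point of the sketch is the data flow: the behavior layer never touches raw percepts directly; it is defined over the shape model, mirroring the layered OGAM design.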

Graph Learning Algorithms:

Each learning method in our architecture builds a unique graph structure. For example, one method builds the shape graph of an object, while another builds a behavioral graph of the same object; in building its graph, the latter method makes use of the graph generated by the former. Our learning algorithms are therefore layered. The difference between a method at one layer and a method at another lies not merely in the level of generalization, but in the semantics of the layer. Our learning algorithms differ from conventional learning algorithms in the following ways.

- Actionable Models: Produce models that are actionable: they increase the agent's ability to act in the real world, not merely to classify pre-enumerated data.

- Incremental-Model-Building-Oriented Exploration: Guide the agent to explore the environment for data so that it can build models incrementally.

- Exploration-Supportive Model Usage: Use the model, at every stage of model building, for navigation, prediction, and scene recognition.

- Actionable Incremental Generalization: The agent incrementally discretizes its continuous relationship space, exploring data according to its navigational capabilities, in order to generalize its models.
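To make the graph layering concrete, the sketch below builds a toy shape graph and then derives a behavior graph from it. The node names, actions, and state-naming convention are hypothetical illustrations, not the project's actual algorithms:

```python
# Hypothetical sketch of layered graph learning: a shape graph of an
# object's features, then a behavior graph derived from it.

# Layer 1: shape graph -- features (nodes) and their adjacent corners.
shape_graph = {
    "edge_a": ["corner_1", "corner_2"],
    "edge_b": ["corner_2", "corner_3"],
}

# Layer 2: behavior graph -- built *from* the shape graph. For each
# shape feature, each action leads to a named successor state.
def build_behavior_graph(shape_graph, actions):
    graph = {}
    for feature in shape_graph:
        for action in actions:
            successor = feature + "_after_" + action
            graph.setdefault(feature, []).append((action, successor))
    return graph

behavior_graph = build_behavior_graph(shape_graph, ["rotate", "approach"])
print(behavior_graph["edge_a"])
```

The behavior-graph builder never invents nodes of its own; its states are indexed by the shape graph's features, which is the sense in which the second method "makes use of" the first method's graph.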

Self-Organizing Agents:

In order for different reconfiguring ALAs to perform collaborative reconfiguration, we have added one more layer to our architecture. This layer is based on a hormone-based messaging mechanism. Through it, the agents can reorganize themselves within the environment based on the models they have developed.
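Hormone-style messages typically propagate from agent to neighboring agent with decaying strength, so that a signal influences nearby agents strongly and distant agents weakly or not at all. The following sketch assumes a simple flooding scheme with a decay factor and a threshold; the agent names and parameters are illustrative, not the project's actual mechanism:

```python
# Hypothetical sketch: a "hormone" signal floods outward from a source
# agent, decaying at each hop; agents that receive it below a threshold
# are not activated at all.

def propagate(neighbors, source, strength, decay=0.5, threshold=0.2):
    """Return each agent's received hormone strength."""
    received = {source: strength}
    frontier = [source]
    while frontier:
        nxt = []
        for agent in frontier:
            for nb in neighbors.get(agent, []):
                s = received[agent] * decay
                # Only pass the signal on if it is still strong enough
                # and stronger than what the neighbor already has.
                if s >= threshold and s > received.get(nb, 0.0):
                    received[nb] = s
                    nxt.append(nb)
        frontier = nxt
    return received

# A chain of four agents: a1 - a2 - a3 - a4.
neighbors = {"a1": ["a2"], "a2": ["a1", "a3"], "a3": ["a2", "a4"], "a4": ["a3"]}
print(propagate(neighbors, "a1", 1.0))  # a4 never hears the signal
```

Because activation depends only on local neighbor links and signal strength, no agent needs a global view; reorganization emerges from these local exchanges, which is the appeal of hormone-style messaging for self-organizing groups.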