This section shows how distributed resource management can be viewed as a scheduling problem and why it may be advantageous to do so.
The basic problem in distributed resource management is coordinating the actions of loosely-coupled resources so that they collaboratively accomplish certain tasks: e.g., three sensors should scan the same region of space at the same time in order to obtain appropriate data for tracking a target. While performing their actions, the resources are subject to certain constraints - relating to their physical design, their interaction architecture, legal requirements, etc. - that must be respected by the coordination mechanism.
With tightly-coupled resources, it might be possible to perform coordination on an action-by-action basis: i.e., when a resource finishes its current action, it communicates with the other resources to determine what action it should perform next. However, in the ANTs context, resources are loosely coupled: the communication latency is significant compared with the durations of the resources' actions and/or the required real-time responsiveness. This means, given that coordination is typically NP-hard, that by the time a resource has determined what action to perform, it will likely already be too late to perform the action.
However, if the tasks to be performed and the consequences of the resources' actions are reasonably predictable, then the resources can do some look-ahead to reduce the significance of the communication latency. For example, in the electronic warfare challenge problem, a target is assumed to move more-or-less in a straight line with constant speed: it may deviate or turn, but it should not instantaneously jump from one end of the sensor network to the other. Thus, instead of determining just its next action, a resource can predict, based on the target's motion, what is going to be required of it over some reasonably extended time period and determine an appropriate sequence of actions.
Such a sequence of actions may be considered a schedule if a definite start time can be assigned to each action. If the outcomes of some actions can determine whether or not subsequent actions are executed or affect those actions' start times, then the schedule might be considered a temporal plan.
Of course, sometimes a resource's predictions will turn out to be inaccurate. Rather than obliviously persevere with its existing schedule, a resource may monitor the real world and if it seems to be deviating too much from what was predicted, dynamically reschedule its actions. However, if this happens too frequently, then the resources will end up devoting too much time and effort to scheduling: scheduling is an appropriate approach only when predictions are reasonably good on average.
Several ANTs projects developed a scheduling-based approach to distributed resource management, as detailed below.
Honeywell:
Kestrel Institute:
Altarum:
In a typical constraint optimization or satisfaction problem, the main subjects of consideration are a set of variables (of known types) to which values are to be assigned, and a set of constraints, each of which determines which combinations of values are permissible. For example, suppose that a set of integer variables are to be assigned values in the range 1 to 3, that constraints exist between certain pairs of the variables, and that each constraint requires the two variables in the constrained pair to have different values. This problem is essentially an instance of the well-known problem of coloring the nodes of a graph with a fixed number of colors.
Variables: xi ∈ {1, 2, 3}
Constraints: for all {i, j} ∈ C, xi ≠ xj
where C is some set of unordered pairs of variable indices.
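As a concrete illustration (a minimal sketch, not code from any ANTs project), the following represents such a 3-coloring instance and counts the violated constraints; this count, or the fraction of constraints violated, is the quality measure for approximate solutions discussed below. The instance and variable names are invented.

```python
import random

def count_violations(assignment, constraints):
    """Number of constraints {i, j} whose two variables share a value."""
    return sum(1 for i, j in constraints if assignment[i] == assignment[j])

# A small instance: 5 variables with domain {1, 2, 3} and a set C of
# unordered pairs that must take different values (graph 3-coloring).
C = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
x = {i: random.randint(1, 3) for i in range(5)}   # a random (approximate) assignment

print("violated constraints:", count_violations(x, C))
print("fraction violated:", count_violations(x, C) / len(C))
```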
There are several classes of constraint problems:
Constraint problems are challenging to solve exactly because they are often impossible to decompose; that is, there is often no way to split a given problem into two or more independent sub-problems that could be solved separately and their solutions readily composed to give a solution to the original problem. Put another way, a change in the assignment to one variable can potentially affect all other variables, requiring their values to be changed to (re)establish satisfaction or true optimality. This is the case even for scalable problems: even though a single variable's change may directly affect only a limited number of other variables, those other variables' changes can propagate further to yet more variables, and so on.
In practical applications, however, approximate solutions may be acceptable; e.g., an approximate solution to a satisfaction problem may violate a small fraction of the constraints, and an approximate solution to an optimization problem may have a total penalty that is only marginally higher than the true minimum. The quality of an approximate solution may be measured in terms of the fraction of constraints that are violated or the difference between the total penalty of the violated constraints and the true minimum. Finding approximate solutions may be much easier in practice than finding exact solutions for at least four reasons:
One problem that arises with approximate solutions, however, is that the quality of the best solutions may be unknown: for satisfaction problems, the smallest number of constraints that have to be violated may be unknown; for optimization problems, the smallest total penalty that has to be incurred may be unknown. This of course introduces the usual problem for local search algorithms - it may not be known whether a solution that is locally optimal is also globally optimal.
Given that constraint problems may take a significant time to solve, even approximately, a problem statement may need to consider whether or not intermediate, approximate solutions are available and, if so, how their quality varies over time. For example, one of the themes of the ANTs program is "good-enough, soon-enough solutions", based on the observation that for many real-time problems, it is more important to get some solution in time than to get a perfect solution too late. Consequently, ANTs problems formulated as constraint problems would emphasize the quality of solutions produced in the short term over, say, the asymptotic solution quality.
When a constraint problem is also dynamic, then a problem statement may emphasize the solution quality during the transition period: given that a high-quality solution has been computed for some sets of variables/constraints, how does the solution quality behave when the variables/constraints change? For ANTs problems, it is important that the solution be quickly adapted to the change, rapidly returning to near optimal.
A constraint problem is distributed when the task of assigning values to variables is performed by communicating agents, each of which has knowledge of only a limited part of the problem and/or assignment. For example, an agent may directly know only the values of those variables it is responsible for assigning; the values assigned to other variables must be communicated to it by other agents. Moreover, an agent may not even be aware of the existence of other variables, particularly in a dynamic constraint problem, unless and until information about their existence is communicated to it.
When considering distributed constraint problems, it is important to also consider the communication graph - i.e., which nodes can directly communicate with each other - because communication is often the limiting factor in how quickly distributed nodes can make decisions: if nodes make decisions faster than the decisions can be communicated to other, affected nodes then incoherent behavior is likely to result. Consequently, it often seems natural to measure time in distributed problems in terms of the number of decision-communication steps.
A common, simplifying assumption is that (the agents for) nodes connected by a constraint can directly communicate. However, for some types of physical resources, a more realistic assumption is that the communication range is limited, which has two variants: (i) any two nodes within a certain distance of each other can communicate; (ii) a node can simultaneously transmit to all nodes within a certain distance (which is a simple model of low-power, short-range radio communication).
It should be noted that a limited communication range means that communication over arbitrary distances does not have a uniform cost (e.g., measured in terms of communication time): the greater the distance between nodes, the more transmission "hops" a message between them will require. This may have consequences for solution techniques, such as those employing spanning trees or hierarchical organization, that tacitly assume a communication cost model that is independent of distance.
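A minimal sketch of variant (i), under invented node positions and range, illustrates both the range-limited communication graph and the distance-dependent cost (measured in hops):

```python
from collections import deque
from itertools import combinations
import math

# Hypothetical node positions (x, y) and a communication range.
positions = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (2, 1), 4: (4, 1)}
comm_range = 1.5

# Variant (i): two nodes can communicate directly if they are within range.
neighbors = {n: set() for n in positions}
for a, b in combinations(positions, 2):
    if math.dist(positions[a], positions[b]) <= comm_range:
        neighbors[a].add(b)
        neighbors[b].add(a)

def hops(src, dst):
    """Minimum number of transmission hops between two nodes (breadth-first search)."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == dst:
            return d
        for nxt in neighbors[node] - seen:
            seen.add(nxt)
            frontier.append((nxt, d + 1))
    return None  # unreachable

print(hops(0, 3))  # distant nodes need more hops, hence higher communication cost
```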
Distributed, real-time constraint optimization problems are of interest to the ANTs program because they reflect many of the major problems associated with distributed resource management (lack of global knowledge, communication latency, resource inter-dependencies, irreducible problems) but are presented in a clean mathematical formulation that, while only occasionally admitting useful analysis, nevertheless provides for simple experiments relating to important properties (like scalability and stability) with well-defined metrics on the outcomes (based on the quality of solutions).
Indeed, a particular form of scheduling (determining periodic schedules for mutually exclusive resources) can be directly formulated as the well-known constraint satisfaction problem of coloring the nodes of a graph: each node is to be assigned a "color" from a fixed set of colors, such that nodes connected by an edge have different colors; the number of colors may be fixed in advance, or may be required to be minimized.
Several ANTs projects investigated distributed constraint optimization, as detailed below.
University of Massachusetts:
Kestrel Institute:
Washington University at St. Louis:
The challenge problem infrastructure includes several, interchangeable software modules for target tracking. Each module is designed to be instantiated for each target that is to be tracked, after which the instantiation, or tracker, is continually fed sets of more-or-less simultaneous radar measurements relating to that target. (Preferably, each set of measurements would contain data from at least three radar units and span no more than two seconds.) The tracking API provides functions for estimating a target's position and velocity at any time, based on a straight-line extrapolation of its track.
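The tracking API itself is not reproduced here; the following is a minimal sketch of what straight-line extrapolation from a fused track estimate might look like. The names (TrackState, estimate_at) are hypothetical, not the actual API.

```python
from dataclasses import dataclass

@dataclass
class TrackState:
    """Most recent fused estimate of a target's state (hypothetical structure)."""
    t: float    # time of the estimate (s)
    x: float    # position (m)
    y: float
    vx: float   # velocity (m/s)
    vy: float

def estimate_at(track: TrackState, t: float):
    """Extrapolate position along a straight line, holding velocity constant."""
    dt = t - track.t
    return (track.x + track.vx * dt, track.y + track.vy * dt), (track.vx, track.vy)

pos, vel = estimate_at(TrackState(t=0.0, x=10.0, y=5.0, vx=1.0, vy=-0.5), t=2.0)
print(pos, vel)   # ((12.0, 4.0), (1.0, -0.5))
```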
Details of the algorithms used for tracking are provided in the references. Here, a brief, high-level description is provided of how radar data is interpreted. Each radar measurement may provide an amplitude, representing the strength of the reflected signal, and a frequency, representing the Doppler shift caused by the target's motion towards or away from the radar.
In order to simplify signal processing, the challenge problem used standardized targets, each consisting of a corner reflector made from a copper disc of a standardized size. Because the targets are essentially interchangeable and corner reflectors provide a more-or-less uniform reflection regardless of the angle of incidence of the radar beam (at least in theory), the reflected signal strength depends only on the position of the target relative to the radar unit. Specifically:
reflected(R, φ) = (K / R²) · e^(−φ² / A²)
where reflected is the strength of the signal received by the radar unit's detector from a target at a distance R from the unit and at an angle φ (in the range −180° to 180°) from the mid-line of the emitter's sector, and where K and A are constants for a given head describing, respectively, the power and beam width of the head's emitter.
For a given measured signal strength, this equation determines a contour of possible target positions, approximating an oval with one end located just behind the radar unit. For example, Figure 2.2.5 shows three contours, corresponding to signal strengths of 10, 100 and 1000, for a radar head positioned at the origin and looking along the positive horizontal axis. Note that the strongest signal corresponds to the smallest contour.
Figure 2.2.5: Contours showing the possible positions of targets for various signal strengths
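A minimal sketch of how such a contour can be traced by inverting the signal-strength model above; the constants K and A are placeholders, not calibrated values for the challenge-problem hardware.

```python
import math

def contour_radius(strength, phi_deg, K=1000.0, A=45.0):
    """Distance R at which a target at bearing phi (degrees from the sector
    mid-line) would produce the given reflected strength, obtained by inverting
    reflected(R, phi) = (K / R**2) * exp(-phi**2 / A**2)."""
    return math.sqrt((K / strength) * math.exp(-phi_deg ** 2 / A ** 2))

# Trace constant-strength contours: the stronger the signal, the smaller the contour.
for s in (10, 100, 1000):
    points = [(round(contour_radius(s, p) * math.cos(math.radians(p)), 2),
               round(contour_radius(s, p) * math.sin(math.radians(p)), 2))
              for p in range(-90, 91, 30)]
    print(s, points)
```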
In order to reduce an estimate of a target's position to a single point, in general three such contours must be intersected - see Figure 2.2.6, in which the contours have been approximated using ovals. However, all physical sensors are subject to noise, which typically means that the contours do not intersect in a single point, so a best estimate of the position must generally be determined. (For the particular hardware used for the challenge problem, measurement noise can arise from dark current in the detector, fluctuations in the emitter, interference from environmental electro-magnetic sources and deviations of the physical targets from the theoretical model of a corner reflector.)
Figure 2.2.6: Three intersecting contours fix a target's position
Moreover, to reduce the effects of noise, it is generally desirable to combine estimates derived from new measurements with estimates derived by extrapolating the target's track. Also, each frequency measurement gives an estimate of the target's radial speed (i.e., the component of the target's velocity along a vector from the radar unit to the target) which can also be combined into an overall position estimate.
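A rough sketch of the best-estimate step, using a brute-force grid search over candidate positions against the signal-strength model above. The sensor layout, readings, and constants are made up, and a real tracker would also fuse the Doppler-derived radial speeds and the extrapolated track rather than strengths alone.

```python
import math

def predicted_strength(target, sensor, K=1000.0, A=45.0):
    """Expected reflected strength at a sensor (x, y, look direction in degrees)
    for a candidate target position, using the same model as above."""
    (sx, sy, heading), (tx, ty) = sensor, target
    R = max(math.hypot(tx - sx, ty - sy), 1e-6)
    phi = (math.degrees(math.atan2(ty - sy, tx - sx)) - heading + 180) % 360 - 180
    return (K / R ** 2) * math.exp(-phi ** 2 / A ** 2)

def best_estimate(measurements, step=0.1):
    """Candidate position minimising the squared strength residuals (grid search)."""
    best, best_err = None, float("inf")
    for i in range(1, 200):
        for j in range(-100, 100):
            cand = (i * step, j * step)
            err = sum((predicted_strength(cand, s) - m) ** 2 for s, m in measurements)
            if err < best_err:
                best, best_err = cand, err
    return best

# Three sensors (x, y, look direction) and noisy strength readings (all invented).
readings = [((0.0, 0.0, 0.0), 95.0), ((0.0, 5.0, -45.0), 110.0), ((5.0, -2.0, 135.0), 80.0)]
print(best_estimate(readings))
```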
University of South Carolina:
Washington University at St. Louis:
This section on negotiation is closely linked to the work described in section 3.2 on distributed optimization and distributed constraint satisfaction. This is especially true if you take the perspective that negotiation among agents is, in reality, an implementation of a distributed search. This perspective is especially valid in the situation where agents are cooperative, i.e., the goal of the search is to maximize some global utility function. It is this cooperative view of agent interaction that is the primary focus of the negotiation techniques described in this section. A wide variety of approaches were explored in this program, many of which adapt to the communication bandwidth available for the negotiation and to the amount of time available for the negotiation to complete.
The University of Kansas team developed two approaches to negotiation. One approach was based on the use of case-based reasoning (CBR) technology, where the cases were learned as a result of repeated experimentation in the EW challenge problem's real-time multisensor target tracking environment. In this work, agents attempt to optimize the use of their own consumable resources while adhering to the global goal, i.e., accurate and effective multisensor target tracking. Agents negotiate based on different strategies which are selected and instantiated based on previously stored cases that take into account their resources, including system-level ones such as CPU allocation. The other approach studied by this team involved a coalition formation model for a cooperative multiagent system in which an agent forms sub-optimal coalitions based on its local (incomplete) view of the noisy, dynamic, and uncertain world, and its need to respond to events within time constraints. Their model had two stages: 1) when an agent detects an event in the world, it first compiles a list of coalition candidates that it thinks would be useful (coalition initialization), and then 2) the agent negotiates with the candidates (coalition finalization). A negotiation in this case is an exchange of information and knowledge for constraint satisfaction until both parties agree on a deal or one opts out. Each successful negotiation adds a new member to the agent's final coalition.
The University of Southern California/ISI team also explored a number of approaches to the distributed resource allocation problems inherent in the EW challenge problem. One approach showed how a class of distributed resource allocation problems could be mapped into a distributed constraint satisfaction problem, and developed analytical results about the complexity of the constraint satisfaction problem depending on the characteristics of the distributed resource allocation problem. They also developed a more general approach to solving resource allocation problems by mapping them into a distributed constraint optimization (DCOP) formalism. They developed a new polynomial-space algorithm for DCOP, called Adopt, which is able to model a large class of collaboration problems in multi-agent systems where a solution within given quality parameters must be found. Existing methods for DCOP are not able to provide theoretical guarantees on global solution quality while operating both efficiently and asynchronously. Adopt is guaranteed to find an optimal solution, or a solution within a user-specified distance from the optimal, while allowing agents to execute asynchronously and in parallel. Adopt obtains these properties via a distributed search algorithm with several novel characteristics, including the ability for each agent to make local decisions based on currently available information and without necessarily having global certainty. Theoretical analysis shows that Adopt provides provable quality guarantees, while experimental results show that Adopt is significantly more efficient than synchronous methods. The speedups are shown to be partly due to the novel search strategy employed and partly due to the asynchrony of the algorithm. The ISI team also explored an alternative approach involving a probabilistic representation of resources and tasks. This representation was used to deal with uncertainty and dynamics of the environment. In this approach, local reasoning was used to deal with delays in the distributed resource allocation algorithm.
The University of Massachusetts-Amherst team looked at the use of negotiation not only for distributed resource allocation in the EW challenge problem, but also more generally in terms of distributed scheduling problems. For the EW problem they developed a cooperative negotiation protocol for the distributed resource allocation problem while conforming to soft real-time constraints in a dynamic environment. In this protocol, called SPAM, two central principles were used that allow it to operate in constantly changing conditions. First, they framed the allocation problem as an optimization problem, similar to a Partial Constraint Satisfaction Problem (PCSP), and used relaxation techniques to derive conflict (constraint violation) free solutions. Second, by using overlapping mediated negotiations to conduct the search, they were able to prune large parts of the search space by using a form of arc-consistency. This allows the protocol to quickly identify situations where the problem is over-constrained and to identify the appropriate fix for the over-constrained problem. From the global perspective, the protocol has a hill-climbing behavior and, because it was designed to work in dynamic environments, is an approximate one. Their work on distributed scheduling attacked the problem of how to order negotiations in an agent when there were multiple negotiations required by the agent in order to acquire the resources needed for it to complete its activities successfully. The UMass team presented a technique for ordering the negotiations based on the use of a partial-order schedule and a measure of the schedule, called flexibility, which enables an agent to reason explicitly about the interactions among multiple negotiation issues.
The SRI team has explored a class of task allocation mechanisms that are incremental and can be tuned to the computational resource limitations of agents. Their focus was on distributed task and resource allocation problems involving coalitions of cooperative agents that must negotiate among themselves on the distribution of tasks. Their emphasis was on the design of mechanisms with desirable real-time and dynamic properties. Their work was in the following primary areas: 1) the design of time-bounded commitment networks that are extensions of task-auctions and contract nets and that support a notion of reciprocal commitment; 2) anytime algorithms for combinatorial task allocation that take into account both positive and negative task interactions, 3) organizational frameworks for efficient task allocation in highly dynamic domains involving hundreds of agents, and 4) logical tools for analyzing dynamic emergent properties of agent societies. An example of their work is an anytime algorithm they developed for adapting a task allocation negotiation to a dynamically changing environment in which both the set of tasks and the set of available resources can change during the negotiation. Task allocations are proposed by a mediator to a cooperative team of agents that then bid in the context of possible contributions from other agents. When a new task arrives, progress on an ongoing negotiation need not be discarded. Unlike auction or contract net protocols, agents can augment their bids with information that compactly captures important elements of their local state in the form of summaries of potential positive and negative interactions with other tasks. A mediator can then make intelligent use of such information from prior interactions to guide the search for a revised allocation within a new context.
The Honeywell Research Labs team showed how negotiation among agents for task allocation could be implemented where there are real-time performance guarantees for critical missions (as represented by a UAV application domain) that require cooperation. In particular, they developed methods for teams of MASA-CIRCA agents to build coordinated plans that include explicit runtime communications to support distributed real-time reactivity to the environment. These teams build plans in which different agents use their unique capabilities to guarantee that the team will respond in a coordinated fashion to mission-critical events. By reasoning explicitly about different agent roles, the agents can identify what communications must be exchanged in different situations. And, by reasoning explicitly about domain deadlines and communication time, the agents can build reactive plans that provide end-to-end performance guarantees spanning multiagent teams. The MASA-CIRCA agents communicate and negotiate at all levels of the architecture to coordinate their activities. They also studied how agents can cooperate to revise their plans as they attempt to ensure that they do not over-utilize their local resource capacities. An agent in a multiagent environment should, in principle, be prepared for all environmental events as well as all events that could conceivably be caused by other agents' actions. The resource requirements to execute such omnipotent plans are usually overwhelming, however. Thus, an agent must decide which tasks to perform and which to ignore in the multiagent context. Their strategy is to have agents selectively communicate relevant details of their plans so that each gets a sufficiently accurate view of the events others might cause. Reducing uncertainties about the world trajectory improves the agents' resource allocation decisions and decreases their resource consumptions.
The Altarum team approached the issue of negotiation from a more theoretical perspective based on an information-theoretic model. This led them to consider negotiation in the context of other processes that can transfer information among agents, such as those mediated by the environment. They illustrated these concepts using the Minority Game, an abstract model of resource allocation that achieves coordination without negotiation in the classical sense. This work illustrates two important lessons: 1) indirect (environmentally mediated) negotiation can achieve agent coordination even without conventional negotiation; 2) the information flow between agents and the environment is likely to affect the dynamics of systems even when they use more conventional negotiation. This team also explored from a more formal perspective the distinction among different forms of agent interactions such as "coherent," "collaborative," "cooperative," "competitive," or "coordinated." They argue that all of these are specializations of the more foundational category of correlation, which can be measured by the joint information of a system. They also propose congruence as a category orthogonal to the others, reflecting the degree to which correlation and its specializations satisfy user requirements. They then explored the degree to which lack of correlation can arise purposefully, and showed the need to use formal stochasticity in cases where such lack of correlation is truly necessary (such as in stochastic search).
An important focus of the ANTs program was showing that complex and adaptive resource allocation strategies could be implemented in soft real-time through cooperative behavior among agents. By soft real time we mean situations where real-time constraints on the completion time for resource allocation decisions can be satisfied for the most part, but there is no guarantee that they will be satisfied 100% of the time. This is appropriate for environments where missing a deadline is not catastrophic, or where situations that cannot be handled in real time are extremely rare and, given the available resources, there is no way to guarantee that the system can effectively handle all contingencies. A number of groups approached this problem by building their resource allocation strategies in the context of an agent architecture that has soft real-time properties. Other approaches exploited an organizational structure to reduce the computational burden inherent in resource allocation decisions by limiting the scope and range of these decisions.
One approach was the SRTA soft real-time agent control architecture developed by the University of Massachusetts - Amherst team. The SRTA architecture has a sophisticated control engine with relatively low overhead that provides several key features which enable simplified planning involving dynamic goals to meet real-time constraints. They include: 1) the ability to quickly generate plans and schedules for goals that are appropriate for the available resources and applicable constraints, such as deadlines and earliest start times; 2) the ability to merge new goals with existing ones and multiplex their solution schedules; and 3) the ability to efficiently handle deviations in expected plan behavior that arise out of variations in resource usage patterns and unexpected action characteristics. This architecture uses a domain-independent hierarchical task/plan network to represent alternative ways of satisfying a goal with different levels of resource usage, utility, and uncertainty. It exploits a variety of mechanisms including meta-level costing, partial-order scheduling, and plan caching to achieve real-time levels of performance for the EW sensor network challenge problem.
Another approach to real-time agent control, which is more formally oriented than the SRTA architecture, is the MASA-CIRCA architecture developed by the Honeywell team. The application domain in which this work was evaluated was the UAV environment involving a small number of vehicles. More generally this architecture can be applied to developing autonomous, flexible control systems for mission-critical applications. These applications require hybrid real-time control systems, capable of effectively managing both discrete and continuous controllable parameters to maintain system safety and achieve system goals. Using the MASA-CIRCA architecture for adaptive real-time control systems, these controllers are synthesized automatically and dynamically, online, while the platform is operating. Unlike many other AI planning systems, these automatically generated control plans have strong temporal semantics and provide safety guarantees, ensuring that the controlled system will avoid all forms of mission-critical failure. One of the important directions that this work has taken was to develop meta-level control components in the architecture to manage the time involved in constructing and refining plans. This time must be carefully controlled because it is time not available for domain activities.
Another body of work involves the use of organizational structuring as a tool for dealing with issues of scale in large agent societies. In such societies, the resource allocation decision problem is, from a real-time computational perspective, infeasible if the scope and range of these allocation decisions are not in some way limited. This work by the UMass group has been done in the context of the EW sensor network challenge problem. They show how partitioning of the environment through an organization's structure can lead to better locality of resource allocation decisions and more constrained communication. Details are also provided on the creation and maintenance processes associated with the different organizational structures being used. This also involves issues of how the organization can be initially created, or what is referred to as organizational self-design. In this case, negotiation protocols are presented for iteratively creating the initial organization given an unknown placement of sensor nodes in the environment, and evolving the organization as nodes die and new nodes enter the system.
A related effort is the pheromone learning technique demonstrated by the Altarum team. This technique is inspired by Takens' theorem in nonlinear dynamical systems. This theorem asserts that a system's trajectory in its state space can, under appropriate circumstances, be recovered from an observable that is not itself a state variable. In particular, information locally visible to a single agent might capture enough of the system's global state to enable the agent to make local decisions that advance the system's global objectives, without the need for access to global information per se. This insight can be used to adjust the organizational structure of a system dynamically. Using local observables that reflect global conditions, individual agents decide whether to modulate their bidding or even drop out of competition entirely, thus avoiding thrashing behavior, and shifting agent behavior from exploration to exploitation as the system approaches a deadline.
The main focus of the original ANT program was to develop novel approaches to negotiation in the broad domain of resource allocation. Such algorithms are susceptible to nonintuitive emergent behavior that may impact the ability of a system to fulfill its mission, and a subset of the contractors studied these issues. This body of work is built around two complementary perspectives: complexity and dynamics. Both perspectives rely on mapping the resource allocation problem to simpler, abstract models that can be studied using analytic and simulation mechanisms. This section summarizes these models, then discusses the complexity and dynamics perspectives.
The main approaches taken to studying emergent complexity and dynamics are formal analysis and simulation. A system designed to solve a problem in a specific domain does not lend itself naturally to either approach. The details of the domain can complicate the analysis and slow down simulations, while making it difficult to distinguish fundamental characteristics of the negotiation approach from artifacts of the domain. Thus most of the complexity and dynamics work is based on abstract models that capture key aspects of distributed resource allocation while abstracting away the domain details. This section summarizes some of these models.
Boolean Satisfaction (SAT).--A prototypical example of complexity is the constraint satisfaction problem, exemplified by Equation 3.6.1.1. A particular problem is formulated as a conjunction of disjunctive clauses. Each disjunctive clause has N Boolean variables, optionally negated, and the resulting problem is called an instance of N-SAT.
(ℓ1,1 ∨ ℓ1,2 ∨ … ∨ ℓ1,N) ∧ (ℓ2,1 ∨ … ∨ ℓ2,N) ∧ … ∧ (ℓM,1 ∨ … ∨ ℓM,N), where each literal ℓj,k is a variable or its negation (Equation 3.6.1.1)
It is customary to characterize the behavior of such systems by generating random populations of problems, each constrained only by the number of different variables and the number of disjunctive clauses.
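For concreteness, a small sketch (not code from any ANTs project) of this customary setup: generate random k-SAT instances at a given clause/variable ratio and measure the fraction that are satisfiable. The brute-force check is only viable for toy sizes, and the parameter values are arbitrary.

```python
import itertools
import random

def random_ksat(num_vars, num_clauses, k=3):
    """Random k-SAT instance: each clause draws k distinct variables, each
    negated with probability 1/2. Literal v means x_v; -v means (not x_v)."""
    return [[v if random.random() < 0.5 else -v
             for v in random.sample(range(1, num_vars + 1), k)]
            for _ in range(num_clauses)]

def satisfiable(num_vars, clauses):
    """Brute-force satisfiability check by enumerating all assignments."""
    for bits in itertools.product([False, True], repeat=num_vars):
        assignment = dict(enumerate(bits, start=1))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# Empirical fraction of satisfiable instances at one clause/variable ratio.
n, ratio, trials = 12, 4.3, 50
sat = sum(satisfiable(n, random_ksat(n, int(ratio * n))) for _ in range(trials))
print(f"ratio {ratio}: {sat}/{trials} instances satisfiable")
```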
Pseudo-Boolean Encoding.--An important variation on the Boolean SAT encoding is the pseudo-Boolean encoding, made up of inequalities over expressions whose variables are real-valued in the range [0, 1]. The dominant reasoning mechanism is not resolution (as in SAT), but cutting planes established by linear combinations and rounding. This representation has several benefits over the Boolean SAT encoding. It can express counting (for example, to account for how many resources have been used to address a need). It is exponentially more concise than SAT: the shortest resolution proof of the pigeon-hole problem is exponential, but a pseudo-Boolean representation can solve the problem in polynomial time. A major research thrust is determining whether this representation can be exploited as well as the traditional SAT approach.
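As an illustration of the counting advantage (a textbook-style example, not drawn from the CIRL work): an "at most two of these four resources" condition is a single pseudo-Boolean inequality, whereas a clausal encoding needs one clause per three-element subset.

```latex
% Written with 0/1 variables for concreteness.
x_1 + x_2 + x_3 + x_4 \le 2, \qquad x_i \in \{0, 1\}
% Equivalent clausal (SAT) encoding: one clause for every 3-element subset, e.g.
% (\lnot x_1 \lor \lnot x_2 \lor \lnot x_3) \land (\lnot x_1 \lor \lnot x_2 \lor \lnot x_4) \land \dots
```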
Resource-Requirement Tables.--A straightforward way to represent a resource allocation problem is as a matrix whose columns represent available resources, whose rows represent available requirements, and whose cells are 1 if the resource is eligible to satisfy the requirement and 0 otherwise. Such a structure can be extended in a number of ways. For example,
Graph Coloring.--Graph coloring is a useful abstraction of many resource allocation problems. Each node in the graph represents a task, each color represents a resource, and an edge between two nodes indicates that a single resource cannot service them simultaneously. Under this model, resource allocation becomes a matter of coloring the graph to minimize the proportion of adjacent nodes that have the same color.
Minority Game.--The Minority Game (MG) is an abstract model of resource allocation that has been extensively discussed in the literature of statistical physics. Consider an odd number, N, of agents (representing tasks) and two resources labeled 0 and 1. At each time step, each agent chooses one of the resources. A resource is overloaded and unable to function if chosen by more than half of the agents. So each agent assigned to the less-used ("minority") resource at a time step receives a point, while each agent on the overloaded resource gets nothing. The best system-wide outcome is a balanced resource load, where the majority and minority populations differ by only one, so a reasonable metric of performance is the standard deviation of the resource population (where a lower standard deviation is preferred). The agents make their decisions on the basis of two kinds of information: the (public) time series G of minority resource identifiers (a string of 0's and 1's), and a set of (agent-specific private) strategies. A strategy is a map from all 2m possible m-step tails of G to a particular resource. Table 3.6.1.1 illustrates a possible strategy for m=3. For instance, if resource 0 were the minority resource on the last three steps, this strategy would recommend choosing resource 0 on the current step. An agent learns a cumulative performance rating for each of its strategies. Following each round of decisions, the agent updates the rating for each of its strategies by comparing that strategy's latest prediction with the current minority group. Then, in choosing its next move, it uses the strategy with the highest performance rating. The Altarum team, which used this game extensively in its work, has extended it in a number of ways to make it more representative of real resource allocation problems, as described in this document.
Table 3.6.1.1: An m=3 Strategy
| m-suffix of G | Choice |
|---|---|
| 000 | 0 |
| 001 | 1 |
| 010 | 1 |
| 011 | 0 |
| 100 | 0 |
| 101 | 1 |
| 110 | 0 |
| 111 | 1 |
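A minimal simulation sketch of the basic game just described (not the Altarum extensions); the number of agents, memory length, and other parameters are arbitrary.

```python
import itertools
import random
import statistics

M, N, S, STEPS = 3, 101, 2, 500    # memory, agents (odd), strategies per agent, rounds

def random_strategy():
    """Map every possible m-step tail of the minority history to a resource (0 or 1)."""
    return {hist: random.randint(0, 1)
            for hist in itertools.product((0, 1), repeat=M)}

agents = [[{"map": random_strategy(), "score": 0} for _ in range(S)]
          for _ in range(N)]
history = tuple(random.randint(0, 1) for _ in range(M))
load_on_1 = []                      # how many agents chose resource 1 each round

for _ in range(STEPS):
    choices = [max(strats, key=lambda s: s["score"])["map"][history]
               for strats in agents]
    ones = sum(choices)
    minority = 1 if ones < N - ones else 0      # N is odd, so no ties
    load_on_1.append(ones)
    for strats in agents:           # rate every strategy against the actual outcome
        for s in strats:
            if s["map"][history] == minority:
                s["score"] += 1
    history = history[1:] + (minority,)

# Lower standard deviation of the load = better-balanced resource use.
print("std dev of load on resource 1:", round(statistics.pstdev(load_on_1), 2))
```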
This simple model is similar at many points to ANT systems:
Rate equations.--If one assumes that the time required to perform a portion of a task should equal the fraction of resources allocated to the task times the amount of time the work took, one can develop a differential equation for the rate of change of the fraction of the task remaining to be done that permits studying overall system response under different allocation rules. This approach is used by the dynamics team at Utah State University.
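One plausible way to write such a rate equation, using our own notation and assumptions rather than the Utah State team's exact formulation:

```latex
% f_i(t): fraction of task i remaining; r_i(t): fraction of the resource pool
% allocated to task i; T_i: time the task would take with the full pool.
\frac{df_i}{dt} \;=\; -\,\frac{r_i(t)}{T_i}, \qquad \sum_i r_i(t) \le 1 .
% An allocation rule is then a recipe for choosing the r_i(t) at each instant.
```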
The complexity perspective focuses on how the computational tractability of a problem varies with the structure of the problem. Basic complexity theory recognizes that different problems can be classified on the basis of the computational effort required to solve their hardest instances (Garey and Johnson 1979). However, not all instances of a problem are of equal difficulty, and often the instances of a problem that are encountered in practice are tractable even if the problem as a whole belongs to an intractable class such as NP-complete. In general, many resource allocation problems are intractable, but the research summarized in this section seeks to understand how one can distinguish tractable instances, and provide feedback to systems that will enable them to avoid intractable regions.
As an example, consider Boolean satisfaction. When the number of disjunctive clauses is small compared with the number of variables, almost any assignment of {T,F} to the variables will satisfy the overall expression. Conversely, when the number of clauses is large compared with the number of variables, the system is likely to be overconstrained, and the probability of finding an assignment to the variables that satisfies the expression is vanishingly small. The transition between these extremes is not linear, but follows an "S"-shape. Figure 3.6.2.1, from Mitchell et al. 1992, shows the transition for 3-SAT.

Figure 3.6.2.1: Probability of Solution of 3-SAT as function of Clause/Variable Ratio
Interestingly, the effort required to determine whether or not a particular instance is satisfiable varies across the range of the clause/variable ratio, as shown in Figure 3.6.2.2, again for 3-SAT. The effort peaks at a ratio near 4.3.

Figure 3.6.2.2: Difficulty of Solution of 3-SAT as function of Clause/Variable Ratio
The work reported in this section focuses on such transitions.
Bugacov et al. show how a problem encoding that respects a problem's modular constraint structure can render an otherwise intractable problem solvable.
D. W. Etherington and A. J. Parkes have identified the existence of scaling thresholds other than formal phase transitions. Phase transitions (such as the one that appears at a clause/variable ratio of 4.3 in 3SAT) are independent of the algorithm one uses to solve the problem, and typically divide the problem space into three regions: an easy region where solutions are readily found, a hard region where about 50% of problems have solutions but finding them is very difficult, and a less-hard region where almost all problems are not solvable. The thresholds identified by CIRL occur within the easy region, well below the phase transition, and depend on the particular algorithm used to solve the problem.
W. Zhang has applied phase transition methods to other problem structures: the Max-SAT problem, and the traveling salesman problem.
Garey, M. R. and D. S. Johnson (1979). Computers and Intractability. San Francisco, CA: W. H. Freeman.
Mitchell, D., B. Selman, et al. (1992). Hard and Easy Distributions of SAT Problems. AAAI-92, San Jose, CA.
D. W. Etherington and A. J. Parkes. Expressivity and Utility of Generic Representation Schemes. PDF
W. Zhang. Modeling of UMass' Negotiation Protocol and its Complexity and Scalability. PDF
A. Bugacov, P. Szekely, G. Pike, D. Kim, C. Domshlak, C. Gomes and B. Selman. Multi-Resolution Modelling and Pseudo-Boolean Encoding of the SNAP Scheduling Problem. PDF
D. W. Etherington and A. J. Parkes. Phase Transitions and Algorithmic Thresholds. PDF
W. Zhang, A. Rangan and M. Looks. Backbone Guided Local Search for Maximum Satisfiability. PDF
W. Zhang. Phase Transitions of the Asymmetric Traveling Salesman. PDF
A. Bugacov, D. Kim, C. Gomes and B. Selman. Modularity and Complexity Profiles in Overconstrained Resource Allocation Problems. PDF
The complexity perspective formulates a resource allocation task as a problem and focuses on the computational resources needed to solve it. A complementary perspective, the dynamics perspective, formulates the task as a system of interacting elements and focuses on the relation between the behaviors of the individual elements and the behavior of the system as a whole.
Most projects in the ANT program apply negotiation mechanisms to enable multiple entities to coordinate their resource requirements. When either the individual entities or the interactions are nonlinear, the behavior of the overall system can differ qualitatively from the sum of individual behaviors. This intuition has been captured in the slogan, "A traffic jam is not a big car." Among other differences, it moves in the opposite direction, as shown in Figure 3.6.3.1. (The red car moves from left to right, but the clusters of vehicles that form move from right to left.)

Figure 3.6.3.1: A Traffic Jam is Not a Big Car
Other systems of interacting elements can exhibit analogous emergent behavior, which may include oscillations, thrashing, and congestion. Such phenomena can have serious operational consequences, including delays, increased cost, brittleness of a plan in the face of dynamic change, a high range of variability in outcomes, suboptimal output, and so forth. Because of the potential importance of these effects, the dynamics projects in the ANT program developed a variety of mechanisms to predict, detect, and control the system-level dynamics of negotiating teams. These methods include theoretical analysis, simulation experiments with abstract models, and decision aids for specific ANT projects.
Table 3.6.3.1 summarizes the three teams whose work is reviewed in this section.
Table 3.6.3.1: Dynamics Projects in ANT
| Institution | PIs | Approach |
|---|---|---|
| Altarum Institute, Ann Arbor, MI | Van Parunak, Bob Savit | Abstract resource allocation games; Phase shifts in constraint optimization; Pheromone Learning |
| CIRL, University of Oregon | David Etherington, Andrew Parkes | Pseudo-Boolean constraint reasoning; Coalition Info Exchange |
| Utah State University | Dan Watson, Jim Powell, Todd Moon | Rate Equations; Praxeic Information Theory; Catastrophe Theory |
Utah State explored three approaches to understanding the dynamics of interacting systems: Rate Equations, Praxeic Information Theory, and Catastrophe Theory.
Rate equations are differential equations capturing the rate at which a task is completed. They permit studying overall system response under different rules for the allocation of resources, including the democratic model (dividing resources evenly across the tasks), the opportunistic model (favoring smaller tasks, which can be completed quickly, thus freeing up resources for other tasks), and the crisis model (favoring tasks that are closest to their deadlines).
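The sketch below integrates such a rate equation numerically under the three allocation rules; the equation form (df_i/dt = -r_i/T), task sizes, and deadlines are illustrative assumptions, not the Utah State model itself.

```python
def simulate(remaining, deadlines, rule, dt=0.01, T=1.0):
    """Integrate df_i/dt = -(r_i / T) for the fraction of each task remaining,
    under a given allocation rule, and return each task's completion time."""
    t, remaining = 0.0, list(remaining)
    done = [None] * len(remaining)
    while any(f > 0 for f in remaining):
        shares = rule(remaining, deadlines, t)
        total = sum(shares) or 1.0
        for i, f in enumerate(remaining):
            if f > 0:
                remaining[i] = max(0.0, f - (shares[i] / total) * dt / T)
                if remaining[i] == 0.0:
                    done[i] = t + dt
        t += dt
    return done

def democratic(rem, dl, t):     # even split across unfinished tasks
    return [1.0 if f > 0 else 0.0 for f in rem]

def opportunistic(rem, dl, t):  # favour the smallest unfinished task
    m = min(f for f in rem if f > 0)
    return [1.0 if f == m else 0.0 for f in rem]

def crisis(rem, dl, t):         # favour the unfinished task closest to its deadline
    urgent = min((d for f, d in zip(rem, dl) if f > 0), default=0)
    return [1.0 if f > 0 and d == urgent else 0.0 for f, d in zip(rem, dl)]

tasks, deadlines = [0.9, 0.5, 0.2], [1.0, 0.8, 0.4]   # made-up workload
for rule in (democratic, opportunistic, crisis):
    done = simulate(tasks, deadlines, rule)
    met = sum(d <= dl for d, dl in zip(done, deadlines))
    print(rule.__name__, [round(d, 2) for d in done], "deadlines met:", met)
```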
Praxeic information theory is a decision formalism that defines the set of options for which benefits outweigh the costs, weighted by the agent’s boldness. It combines the formal objectives of optimization with the tractability of heuristics, and supports multi-agent reasoning and negotiation through distinct rejectability, selectability, and boldness measures for different participants.
Catastrophe Theory deals with the structural properties and qualitative nature of parameterized functions. It captures phenomena such as sudden jumps, hysteresis, and divergence of trajectories. An example of its application is the analysis of the time needed to complete a task as a function of both the number of available resources and the coordination overhead. Under suitable assumptions, this function is a classic instance of the cusp catastrophe.
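For reference, the cusp catastrophe in its standard normal form (textbook material; mapping the task-completion time, resource level, and coordination overhead onto the control parameters a and b is the modeling step described above):

```latex
V(x) \;=\; \tfrac{1}{4}x^{4} + \tfrac{1}{2}a\,x^{2} + b\,x ,
\qquad \text{equilibria: } \frac{dV}{dx} = x^{3} + a\,x + b = 0 .
% Sudden jumps and hysteresis occur where the equilibrium surface folds over,
% i.e. on the bifurcation set 4a^{3} + 27b^{2} = 0.
```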
CIRL's work on dynamics focused on two methods for preventing thrashing. Thrashing can occur either internally to an ANT coalition (as the system performs continual, but doomed, reallocation) or externally (because coalitions exchange only point solutions that do not adequately convey the constraints faced by each coalition). These two causes lend themselves to distinct solutions.
Internal thrashing can be addressed by improved reasoning methods. Detecting resource allocation problems requires the ability to count, which is difficult in conventional SAT/CSP methods. CIRL has been exploring pseudo-Boolean representations that do support such reasoning, and has developed PARIS, a systematic pseudo-Boolean solver used in the SNAP demo.
External thrashing can be reduced if coalitions exchange not point solutions, but clusters of solutions that provide enough context to enable the negotiation partner to make reasonable suggestions.
Altarum explored a broad array of techniques for detecting, monitoring, and controlling the dynamics of distributed resource allocation, in both the ECM and logistics domains and in partnership with several other team members, including ISI, Vanderbilt, Kestrel, and the University of Kansas. This ebook includes a summary report describing the scope of our work (but not including the SNAP/MAPLANT demo that applies some of these techniques). Three noteworthy results from this work are the value of abstract resource allocation games for modeling the dynamics of such systems, phase shifts in constraint optimization, and a novel technique for pheromone learning and deadline management. All of these results were supported by an environment for exploring dynamic landscapes that we evolved through the course of the program.
Abstract resource allocation games are simplified models of resource allocation that lend themselves to extensive simulation experiments. Altarum focused on versions of the "minority game," in which a population of agents repeatedly allocate themselves to a limited set of resources. In the basic minority game, the population of agents is one greater than the capacity of a set of two equal-sized resources. A short paper and a longer report describe experiments with a generalization of this game in which capacity and demand vary over a wider range, including both under- and over-capacity configurations. These games are a specific instance of indirect negotiation, a novel approach to multi-agent coordination.
Phase shifts in constraint optimization are an extension of the work described in Section 3.6.2 above. That section described how constraint satisfaction problems exhibit an effort profile that shifts from easy to hard and back to easy as a control variable (such as the ratio of clauses to variables in 3SAT) is varied monotonically. Early experiments suggested that constraint optimization problems exhibited only easy-hard transitions. We were interested, then, to discover instances of easy-hard-easy (rather than merely easy-hard) transitions in constraint optimization problems, a circumstance documented here.
Pheromone learning is a set of mechanisms that enable a local agent to adapt its individual actions to the state of the overall system, without the need for direct access to the global system state. The knowledge acquired by the agent can be both spatial and temporal. Spatial knowledge concerns the relation between the agent's actions and those of other agents, such as discerning when the system has stabilized to the point that further local changes would be counterproductive. Temporal knowledge concerns the proximity of deadlines, which determines whether agents should focus on exploratory or exploitational behavior. The basic mechanisms are explained in this paper. We used some of them in the RAPSIDY solver (Resource Allocation Problem Solver Incorporating Dynamics) we constructed for use in the logistics demo described in Section 4. The effect of these mechanisms can be seen in two demonstrations for which AVI files are included in this eBook. (These files are on the order of 30-40M each.) One demonstration shows the application of pheromone learning to task matrices such as those used in the logistics demo. The other applies them to the distributed graph coloring problem, as discussed in a paper. (The graph coloring demonstration presumes the definitions and concepts described in the paper, which should be read first.)
The achievements of the dynamics research in ANT can be summarized under three heads: an extended suite of representations for studying dynamics in resource allocation, an extended theory of computational phase phenomena, and an extended toolbox for builders of autonomous negotiating systems.
The extended suite of representations includes Pseudo-Boolean systems (CIRL), Rate Equations (Utah), Abstract Resource Allocation Games (Altarum), Graph Coloring Games (Altarum), Catastrophe Theory (Utah), and Subgoal-Constrained Coalition Communication (CIRL), and is discussed further in Section 3.6.1 above.
The extended theory of phase phenomena embodies two contributions. First, it offers a more nuanced description of phase phenomena, distinguishing transitions (which are points of nonanalyticity) from shifts (which are empirical points of discontinuity) and recognizing that phase phenomena can be driven either by problem structure or features of an algorithm (the latter being termed a "threshold") (see Table 3.6.3.2). Second, it corrects the traditional lore about effort profiles in Decision vs. Optimization, showing that this profile can be algorithm-dependent, and that easy-hard-easy profiles can appear in both categories of problem.
Table 3.6.3.2: Categories of Computational Phase Phenomena
| | Algorithm-Independent: Problem Focus | Algorithm-Dependent: Algorithm Focus (CIRL: "Threshold") |
|---|---|---|
| Formal: Nonanalyticity (Altarum: "Transition") | 3SAT | Minority Game |
| Empirical: Discontinuity (Altarum: "Shift") | | Local Search |
The extended toolbox includes CIRL's PARIS Pseudo-Boolean Solver, Altarum's pheromone learning mechanism, and Altarum's APSE tool and associated simulation laboratory for studying dynamical landscapes.
M. R. Garey and D. S. Johnson (1979). Computers and Intractability. San Francisco, CA: W. H. Freeman.
D. Mitchell, B. Selman, et al. (1992). Hard and Easy Distributions of SAT Problems. AAAI-92, San Jose, CA, 459-465.
H. V. D. Parunak, R. Savit, S. A. Brueckner, and J. Sauter (2001). Experiments in Indirect Negotiation. Proceedings of The AAAI Fall 2001 Symposium on Negotiation Methods for Autonomous Cooperative Systems. PDF
S. Brueckner and H. V. D. Parunak (2003). Information-Driven Phase Changes in Multi-Agent Coordination. Proceedings of Autonomous Agents and Multi-Agent Systems (AAMAS 2003), Melbourne, Australia, 950-951. PDF
D. W. Etherington and A. J. Parkes. Exploiting Solution-Space Structure. Technical Report, CIRL. PDF
J. P. D. Watson and T. Moon. Agent-Based Task Completion. Technical Report, Utah State University. PDF