Abstracts of the ISI AI Seminar Series in 2000
Michael Jordan - January 14, 2000
Lotfi A. Zadeh - January 28, 2000
Bart Selman - March 3, 2000
Christian Freksa - March 10, 2000
Thomas Dietterich - March 24, 2000
David Heckerman - April 28, 2000
Stuart Russell - May 10, 2000
Chun-Nan Hsu - June 28, 2000
Wayne Zhang - TBA
Makoto Yokoo - October 23, 2000
Acherio Martinoli - November 21, 2000
Charles Callaway - December 4, 2000
Srdhar Mahadevan - December 13, 2000
Friday, January 14, 2000
Graphical models and variational approximation
Michael Jordan
University of California, Berkeley
Probabilistic models have become increasingly prominent in recent years in artificial intelligence. General inference algorithms have been discovered that apply to a wide class of interesting and useful models known as ``graphical models'' (aka, Bayesian networks and Markov random fields). These algorithms essentially treat probability theory as a combinatorial calculus, and make creative use of graph theory to stave off the inevitable exponential growth in complexity. There is another feature of probability theory, however, which recommends it as a general tool for computational modeling. Probability involves taking averages, and when averaging is present complex models can be probabilistically simple. In this talk, I discuss variational methodology, which aims to leverage the laws of large numbers and laws of large deviations of probability theory as computational tools within a graphical model framework. I will discuss applications of the variational approach to a variety of probabilistic graphical models, including layered networks with logistic or noisy-OR nodes, coupled hidden Markov models, factorial hidden Markov models, hidden Markov decision trees, and hidden Markov models with long-range dependencies.
Friday, January 28, 2000
From Computing with Numbers to Computing with Words to Computation with Perceptions -- A Paradigm Shift
Lotfi A. Zadeh
One of the most deep-seated traditions in science has been and continues to be that of according much more respect to numbers than to words. The essence of this tradition was stated succinctly by Lord Kelvin in 1883: "I often say that when you can measure what you are talking about and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind."
In a world that is changing as rapidly as ours, no tradition can have permanence and no dogma can remain beyond challenge forever. What Lord Kelvin did not foresee is the advent of the computer age and the ballistic ascent in the capability of computers to process huge volumes of information at high speed, low cost and high reliability. Paradoxically, it is this capability that reverses the direction of inequality in the respectability of words and numbers. This is the crux of the paradigm shift that is alluded to in the title of my talk.
It is a truism that the quest for precision has led to brilliant successes. There is so much that we can do today that even Jules Verne could not have predicted. We have cellular phones and the Internet; we have eyeprint identification systems and the GPS; we can clone animals and transplant organs. But alongside the brilliant successes we see many problem-areas where progress has been slow and many problems which cannot be solved by any prolongation of existing theories, methodologies and technologies. A case in point is the problem of automation of driving in city traffic. This is easy for humans and an intractable problem for machines.
Driving in city traffic is an example of the remarkable human capability to perform a wide variety of physical and mental tasks without any measurements and any computations. Everyday examples of such tasks are parking a car, playing golf, deciphering sloppy handwriting and summarizing a story. Underlying this capability is the brain's crucial ability to reason with perceptions -- perceptions of time, distance, speed, force, direction, shape, intent, likelihood, truth and other attributes of physical and mental objects.
In science, it is a long-standing tradition to deal with perceptions by converting them into measurements. But what is becoming increasingly evident is that, to a much greater extent than is generally recognized, conversion of perceptions into measurements is infeasible, unrealistic or counterproductive. With the vast computational power at our command, what is becoming feasible is a countertraditional move from measurements to perceptions. What this implies is a major enlargement of the role of natural languages in scientific theories. This is the essence of the paradigm shift which, in my view, is likely to take place in coming years.
The theory which is put forth in my talk is focused on the development of what is referred to as the computational theory of perceptions (CTP) -- a theory which comprises a conceptual framework and a methodology for computing and reasoning with perceptions. The base for CTP is the methodology of computing with words (CW). In CW, the objects of computation are words and propositions drawn from a natural language. A typical problem in CW is the following. Assume that a function f, Y=f(X), is described in words as: if X is small then Y is small; if X is medium than Y is large; if X is large then Y is small, where small, medium and large are labels of fuzzy sets. The question is: What are the maximum and maximizing values of Y and X respectively?
The point of departure in the computational theory of perceptions is the assumption that perceptions are described as propositions in a natural language, e.g., "Michelle is slim," "it is likely to rain tomorrow," "economy is improving," "it is very unlikely that there will be a significant increase in the price of oil in the near future." In this perspective, natural languages may be viewed as systems for describing perceptions. Furthermore, computing and reasoning with perceptions is reduced to computing and reasoning with words.
To be able to compute with perceptions it is necessary to have a means of representing their meaning in a way that lends itself to computation. Conventional approaches to meaning representation cannot serve this purpose because the intrinsic imprecision of perceptions puts them well beyond the expressive power of predicate logic and related systems. In the computational theory of perceptions, meaning representation is based on what is referred to as constraint-centered semantics of natural languages (CSNL).
A concept which plays a central role in CSNL is that of a generalized constraint. Conventional constraints are crisp and are expressed as X is C, where X is a variable and C is a crisp set. In a generic form, a generalized unconditional constraint is expressed as X isr R, where X is the constrained variable; R is the constraining (fuzzy) relation which is called the generalized value of X; and isr, pronounces as ezar, is a variable copula in which the value of the discrete variable r defines the way in which R constrains X. Among the basic types of constraints are the following: equality constraints (r:=); possibilistic constraints (r:blank); veristic constraints (r:v); probabilistic constraints (r:p); random set constraints (r:rs); usuality constraints (r:u); fuzzy graph constraints (r:fg); and Pawlak set constraints (r:ps).
In constraint-centered semantics, a proposition, p, is viewed as an answer to a question, q, which is implicit in p. The meanings of p and q are represented as generalized constraints, which play the roles of canonical forms of p and q, CF(p) and CF(q), respectively. CF(q) is expressed as: X isr ?R, read as "What is the generalized value of X?" Correspondingly, CF(p) is expressed as: X isr R, read as "The generalized value of X isr R." The process of expressing p and q in their canonical forms plays a central role in constraint-centered semantics and is referred to as explicitation. Explicitation may be viewed as translation of p and q into expressions in GCL -- the Generalized Constraint Language.
In the computational theory of perceptions, representation of meaning is a preliminary to reasoning with perceptions -- a process which starts with a collection of perceptions which constitute the initial data set (IDS) and terminates in a proposition or a collection of propositions which play the role of an answer to a query, that is, the terminal data set (TDS). Canonical forms of propositions in IDS constitute the initial constraint set (ICS). The key part of the reasoning process is goal-directed propagation of generalized constraints from ICS to a terminal constraint set (TCS) which plays the role of the canonical form of TDS. The rules governing generalized constraint propagation in the computational theory of perceptions coincide with the roles of inference in fuzzy logic. The principal generic rules are: conjunctive rule; disjunctive rule; projective rule; surjective rule; inversive rule; compositional rule; and the extension principle. The generic rules are specialized by assigning specific values to the copula variable, r, in X isr R.
The principal aim of the computational theory of perceptions is the development of an automated capability to reason with perception-based information. Existing theories do not have this capability and rely instead on vconversion of perceptions into measurements -- a process which in many cases is infeasible, unrealistic or counterproductive. In this perspective, addition of the machinery of the computational theory of perceptions to existing theories may eventually lead to theories which have a superior capability to deal with real-world problems and make it possible to conceive and design systems with a much higher MIQ (Machine IQ) than those we have today.
Friday, March 3, 2000
Understanding Complexity: Recent Developments and Directions
Bart Selman
Cornell University
Recently, we have made considerable progress towards a better understanding of the nature of hard computational problems. In particular, by exploiting connections between combinatorial search problems and models from statistical physics, we now have methods that enable a much finer-grained characterization of computational complexity than the standard worst-case complexity measures. These findings provide insights into new algorithmic strategies based on randomization and distributed algorithm portfolios. I will survey the recent progress in this area and I will discuss implications for the design of more robust computational systems.
Bio
Bart Selman is an Associate Professor of Computer Science at Cornell University. He previously was a principal scientist at AT&T Bell Laboratories. He holds a Ph.D. and M.Sc. in computer science from the University of Toronto, and a M.Sc. in physics from Delft University of Technology. His research has covered many areas in artificial intelligence and computer science, including tractable inference, knowledge representation, natural language understanding, stochastic search methods, theory approximation, knowledge compilation, planning, default reasoning, and the connections between computer science and physics (phase transition phenomena). He has received four best paper awards at the American and Canadian national artificial intelligence conferences, and at the International Conference on Knowledge Representation. He holds an NSF CAREER Award and is an Alfred P. Sloan Research Fellow.
For more info, see http://www.cs.cornell.edu/home/selman.
Friday, March 10, 2000
Communicating with Autonomous Mobile Robots by Means of Schematic Maps
Christian Freksa
University of Hamburg. Currently ICSI and BISC (UCBerkeley)
freksa@icsi.berkeley.edu
http://www.informatik.uni-hamburg.de/WSV/raumkognition/
Autonomous mobile robots emerge as rather universal experimental tools for exploring percepual knowledge, its relation to knowledge representation, inference drawing about their environments, and the use of these inferences to act in a physical world. As autonomous mobile robots must act in both, the real physical environments and their formal representations, the relation between the ëreal worldí and its static and dynamic representations are of particular interest. Processes of spatial reasoning and navigation can be investigated all the way from the perception of the environment to performing actions in the environment. Also, autonomous mobile robots can be used to test and explore theories of spatial cognition developed in cognitive psychology.
I will report about two projects carried out in the framework of the German Spatial Cognition priority program funded by the Deutsche Forschungsgemeinschaft. In this program, sixteen research teams consisting of computer scientists, cognitive psychologists, geographers, and biologists cooperate on the investigation of spatial cognition - real and abstract, human and machine. In one of the projects (the ëaspect map' project) we investigate integrated spatial and non-spatial forms of representing spatial knowledge (as in geographic maps); in the other project (the 'spatial inference' project) we develop and study methods and algorithms for qualitative spatial reasoning, i.e. calculating spatial relationships without the use of numbers. In a cooperation between both projects, we study the use of schematic maps for robot instruction, combining map representations with qualitative spatial reasoning. We believe that this is an interesting and useful paradigm for studying multimodal human interaction, as maps integrate analogical spatial knowledge and propositional linguistic knowledge.
Friday, March 24, 2000
Sharing and Abstraction in Hierarchical Reinforcement Learning
Thomas Dietterich
tgd@cs.orst.edu
Reinforcement learning (RL) algorithms learn to solve sequential decision-making tasks by interacting with a Markovian environment. Existing methods treat the Markovian environment as a flat search space. While these methods have successfully solved many interesting problems, there are limits to the size of problem that can be successfully handled as a flat state space. In this talk, we will describe one approach to addressing this problem: Hierarchical reinforcement learning using the MAXQ value function decomposition. In this approach, the programmer defines a hierarchy of tasks and subtasks, and the value function (the key data structure in reinforcement learning) is decomposed hierarchically into value functions for each task and subtask. We will describe how the MAXQ decomposition permits the subtasks to ignore aspects of the state space, and hence, the subtasks become re-usable. This speeds search and learning with only minor losses in optimality.
Friday, April 28, 2000
Data Visualization at Microsoft
David Heckerman
With large amounts of data accumulating in databases and data warehouses, there is a growing need for tools that help find interesting patterns from data. One general class of such tools goes by the name "data visualization." In my talk, I will discuss data visualization techniques that have been developed at Microsoft. These tools have been used internally and will likely appear in future data analysis products. The tools are somewhat different from classic methods for data visualization which typically display raw data. The tools I will discuss first abstract the data using statistical models and then present the models to users in a (visual) way that facilitates the discovery of patterns. I will highlight visualization techniques based on two models: dependency networks -- a close cousin of the Bayesian network -- and mixture models. I will illustrate the use of these tools using data from domains including web navigation, ecommerce, TV viewing, and movie preferences.
Wednesday, May 10, 2000
Tracking Many Objects With Many Sensors
Stuart Russell
UC Berkeley
Keeping track of multiple objects over time is a problem that arises in many real-world domains. The problem is often complicated by noisy sensors and unpredictable dynamics. The data association literature describes many algorithms that solve special cases or provide heuristic approaches for various applications. This talk begins by setting up the general problem as an instance of probabilistic inference, and points out its intractability. An earlier method due to Huang and Russell is described and shown not to scale to multiple sensors. It is shown that estimation of intrinsic object properties allows scaling through a decomposition of the global likelihood. The inferenceproblem is solved by a polynomial-time approximation scheme based on Markov chain Monte Carlo simulation. Experiments with a freeway traffic simulation suggest that the method allows accurate estimation of long-range origin/destination information even when the individual links in the sensor chain are highly unreliable. If time permits, I will describe an extension of these ideas to the general problem of inference in first-order probabilistic logics. Joint work with Hanna Pasula, Mike Ostland, Ya'acov Ritov
June 28, 2000
Why Discretization Works For Naive Bayesian Classifiers
Chun-Nan Hsu
Institute of Information Science, Academia Sinica
chunnan@iis.sinica.edu.tw
http://www.iis.sinica.edu.tw/~chunnan
This paper explains why well-known discretization methods, such as entropy-based and ten-bin, work well for naive Bayesian classifiers with continuous variables, regardless of their complexities. These methods usually assume that discretized variables have Dirichlet priors. Since perfect aggregation holds for Dirichlets, we can show that, generally, a wide variety of discretization methods can perform well with insignificant difference. We identify situations where discretization may cause performance degradation and show that they are unlikely to happen for well-known methods. We empirically test our explanation with synthesized and real data sets and obtain confirming results. Our analysis leads to a lazy discretization method that can simplify the training for naive Bayes. This new method can perform as well as well-known methods in our experiment.
TBA
Phase Transitions, Computational Complexity and Flexible Problem Solving
Wayne Zhang
October 23, 2000
Distributed Constraint Satisfaction: Foundation of Cooperation in Multi-Agent Systems
Makoto Yokoo
NTT Commuication Science Laboratories
www.kecl.ntt.co.jp/csl/ccrg/members/yokoo
Abstract:
When there exist multiple agents in a shared environment, there
usually exist constraints among possible actions of these agents. A
distributed constraint satisfaction problem (distributed CSP) is a
problem to find a consistent combination of actions that satisfies
these inter-agent constraints. The research of Constraint Satisfaction
Problems has a long and distinguished history in AI as a general
framework that can formalize various application problems. Similarly, a
distributed CSP is a fundamental problem for achieving the
coordination among agents, and can formalize various application
problems in Multi-agent Systems (e.g., distributed resource allocation
problems, distributed scheduling problems, distributed interpretation
tasks, and multi-agent truth maintenance tasks). This talk will
explain the formal definition, algorithms, and MAS applications of
distributed CSPs.
Brief Personal History:
Dr. Makoto Yokoo is currently a senior research scientist in NTT
Communication Science Laboratories, Kyoto, Japan. He was a visiting
research scientist at the Department of Electrical Engineering and
Computer Science, the University of Michigan, Ann Arbor, from 1990 to
1991. His research interests include multi-agent systems,
search/constraint satisfaction, and mechanism design among agents.
He recently published a book "Distributed Constraint Satisfaction:
Foundation of Cooperation in Multi-Agent Systems" from Springer.
He also co-authored a chapter "Search algorithms for agents" of
the book "Introduction to Distributed Artificial Intelligence", which
is published from MIT Press.
November 21, 2000
Probabilistic Modelling in Collective Autonomous Robotics: From Collective Manipulation and Exploration Case Studies towards an Unified Methodology
Alcherio Martinoli
Microsystems Laboratory, Collective Robotics Group (CORO)
Caltech
Abstract:
In this talk, I will present a set of experiments which differ in the
collective task performed (collective manipulation, collective
exploration)
and in the type of the robotic platform used. The common signatures of
all
the experiments can be summarized as follows: the robots are fully
autonomous, are endowed with minimal individual capabilities, and the
control of the group is fully scalable and distributed (Swarm
Intelligence
approach). In order to systematically study and understand the key
parameters
of the robotic system (robot control, hardware morphology, sensorial
features,
number of robots, and so on) playing a crucial role in the team
performance,
we developed and applied probabilistic modelling methods to several
different
case studies in two classes of collective experiments (manipulation and
exploration). I will present results, advantages, and limitations of the
current methods and the future challenges to be faced for developing an
unified methodology at least for each class of experiment
December 4, 2000
Narrative Prose Generation
Charles Callaway
North
Carolina State University
Given a fictional world, its characters, and a sequence of events, a narrative prose generator should in realtime create full-length short stories in rich natural language. An effective Narrative Prose Generation (NPG) model should exhibit the following characteristics: (1) it should create narrative descriptions that provide scene descriptions, generate character dialogue, and relate events that take place in the fictional world; (2) it should provide methods for generating stories whose linguistic and syntactic complexity is customizable; (3) it should operate effectively in a multilingual environment; and (4) all of these elements should function in realtime.
We will describe the Author NPG architecture, which addresses the design criteria above. The Author architecture has been implemented in StoryBook, a narrative prose generator that produces multi-page narratives. To compose stories, StoryBook takes a story plan consisting of the actors, scenes, props and temporally ordered events as input from a narrative planner. It then uses a sentence planner to create sentential descriptions, revises them, and then employs a unification-based functional systemic grammar to create narratives in the language of choice. StoryBook offers features not found in other text generation systems, including generating prose that is multiple pages in length, and it integrates a number of aspects of text generators that have previously been studied in isolation.
December 13, 2000
Time, Value, and Memory: A Framework for Autonomous Learning and Sequential Decision-Making
Sridhar
Mahadevan
Michigan State University
Autonomous agents, whether biological or synthetic, are embedded in an external environment which they primarily experience and act on sequentially. The sequential nature of perception and behavior raises a fundamental set of challenges in determining optimal courses of action for achieving long-term goals. The consequences of a specific action may not be experienced until many steps later. Furthermore, perceptual constraints may hide some of the information needed to make appropriate decisions unambiguously. Finally, the computational complexity of making optimal decisions may be intractable. This talk describes a general mathematical framework for modeling autonomous learning and sequential decision-making, based on three fundamental building blocks: time, value, and memory. Time refers to the structure of events underlying decisions. Values predict the future, and reflect both long-term goals and uncertainty in perception and action. Memory summarizes the past: it is an essential component of action in perceptually aliased situations.
Recent algorithms we developed will be presented that exploit hierarchy and modularity to represent temporally extended actions, multi-agent task coordination, and memory organization. These algorithms are tested on several case studies which illustrate the interdisciplinary scope of the framework, including selective visual attention, multi-agent manufacturing and scheduling, and spatial navigation.
[AI Seminar | Schedule for Speaker | Upcoming Seminars ]
[ About AI Seminars | Useful Information ]
Maintained by Fanny Mak