Abstracts of the ISI AI Seminar Series in 1997
The concept of the comprehensive computerized patient record was propounded in the 1960s, but such a record still does not exist today. Causes for this failure include the sheer difficulty of capturing data of great complexity without interfering with the health care process. They also, however, include a tradition of idiosyncratic system architectures and implementations that cannot be moved from one institution to another, and an inability to exploit the rapid pace of development of commercial record keeping systems for other industries.
We are implementing a series of electronic medical record systems that demonstrate the ability to exploit the World Wide Web to gain efficiency of implementation, ease of extension and interoperation with other Web-based technologies. The first, shown in 1994, was a results-reporting system that makes the clinical data repository at Boston Children's Hospital accessible via the Web. The second, W3-EMRS, is a multi-institutional system that defines a virtual shared record on top of legacy hospital-specific systems. We demonstrate a use of this approach to share access from the emergency room to information at three Boston-area hospitals. In addition, two other implementations of the same architecture serve to integrate existing systems at recently merged hospitals. Although our initial results are very encouraging, the ultimate difficulties will be semantic rather than architectural or technological: details of the data at different institutions simply do not mean the same things, even when described by the same terms.
In the longer term, we are moving in the direction of life-long active
personal health information systems that center the maintenance of health
information on the individual. I will describe early experiments that we
are conducting in this direction.
Causal models, regardless of how they are acquired or represented,
provide reasoning agents with the powerful capability of predicting
the outcome of an enormous number of implicit actions, too numerous to
be specified explicitly, each involving a perturbation or a
reconfiguration of the agent's environment. Such capability explains
why the acquisition of causal models by humans is accompanied by the
sense of gaining "deep understanding" or "being in control", and how
humans can process sentences in which actions appear as modalities
(e.g., do(p), "increase taxes", "make him laugh") or sentences phrased
counterfactually (e.g., "B would be better if A were different"). I
will describe a simple formalism which permits us to reason with
actions as modalities and to answer counterfactual queries.
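
To make the distinction between an action such as do(p) and a counterfactual query concrete, here is a minimal sketch in Python. It is not the formalism presented in the talk: the two-variable model, its coefficient, and the function names are all hypothetical, and the three-step counterfactual procedure (abduction, action, prediction) is simply one common way of evaluating such queries in a deterministic structural model.

```python
# Hypothetical two-variable structural model (an assumption, not from the talk):
#   A := U_a
#   B := 2 * A + U_b
# U_a and U_b are exogenous background terms.

def model(u_a, u_b, do_a=None):
    """Evaluate the model. If do_a is given, it overrides A's own equation,
    which is how the action do(A = do_a) is represented here."""
    a = u_a if do_a is None else do_a
    b = 2 * a + u_b
    return a, b

def counterfactual_b(observed_a, observed_b, new_a):
    """Answer: what would B have been, had A been new_a, given what we observed?"""
    # 1. Abduction: recover the background terms consistent with the observation.
    u_a = observed_a
    u_b = observed_b - 2 * observed_a
    # 2. Action: replace A's equation with the constant new_a.
    # 3. Prediction: recompute B in the modified model.
    _, b = model(u_a, u_b, do_a=new_a)
    return b

if __name__ == "__main__":
    print("observed world:      ", model(u_a=1, u_b=1))
    print("after do(A = 3):     ", model(u_a=1, u_b=1, do_a=3))
    print("B had A been 2:      ", counterfactual_b(1, 3, 2))
```

The single `model` function answers queries about any intervention on A without each intervention being listed in advance, which is the sense in which a causal model covers actions "too numerous to be specified explicitly".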
In this talk, I will argue that the combination of generative and
case-based planning is a powerful technique to solve complex planning
problems. First, I will briefly review the sources of this approach's
power in a fully-automated planning system. I will
focus on the plan representation requirements to effectively store,
retrieve, and reuse plans based on the underlying plan
rationale. Second, I will introduce how this approach is being
successfully extended to a mixed-initiative planning environment and
discuss the most important issues encountered. In particular, I will
present the techniques developed to capture the user's planning
rationale, to retrieve planning cases of potential relevance to the
user, and to guide the user through the reuse process. I will
illustrate with a demonstration of the work developed as part of a
technology integration experiment with MITRE within the DARPA Planning
Initiative program. Finally, I will discuss several remaining open
questions and corresponding future research directions.
Planning operations for an autonomous agent is a much-studied problem. Most solutions require that the agent's domain be fully known at the time of planning and fixed during execution. For many domains, however, this is not possible. For example, consider a mobile robot equipped with a sensor, instructed to drive to a goal location with no initial map information. One approach to the problem is to produce an initial plan based on all known, believed, assumed, and estimated domain information, and then begin executing the plan. Whenever new information is learned about the domain (e.g., via the sensor), the information is logged and the remaining portion of the plan is recomputed. This approach has been shown to yield excellent average-case performance. The difficulty is primarily computational: replanning is expensive. In this talk, I present a search algorithm called D* (Dynamic A*) that can be used for real-time replanning. The algorithm is functionally equivalent to searching from scratch for each new piece of domain data, except that it is hundreds of times faster. D* uses an incremental graph theory approach to minimize recomputations. The algorithm was demonstrated on an actual mobile robot driving across unknown terrain. The approach was extended to planning for a fleet of vehicles tasked with achieving a coordinated set of goals, and work is underway to generalize the approach to STRIPS-like planning.
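
The plan, sense, and replan loop described above can be sketched as follows. This is not D*: it is the expensive baseline that the abstract says D* is functionally equivalent to, namely a full A* search from scratch whenever the sensor reveals something new. The grid world, the one-cell sensor range, and the names `plan` and `drive` are assumptions made for illustration.

```python
import heapq

def neighbors(cell, rows, cols):
    """4-connected neighbors of a cell that lie inside the grid."""
    r, c = cell
    for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
        if 0 <= nr < rows and 0 <= nc < cols:
            yield (nr, nc)

def plan(known, start, goal):
    """A* from scratch on the robot's current map; known[r][c] is True for known obstacles."""
    rows, cols = len(known), len(known[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan distance
    frontier = [(h(start), 0, start)]
    parent, cost = {start: None}, {start: 0}
    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        if g > cost[cur]:
            continue  # stale queue entry
        for nxt in neighbors(cur, rows, cols):
            if not known[nxt[0]][nxt[1]] and g + 1 < cost.get(nxt, float("inf")):
                cost[nxt], parent[nxt] = g + 1, cur
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
    return None  # goal unreachable on the current map

def drive(true_grid, start, goal):
    """Drive with no initial map, replanning whenever the sensor finds a new obstacle."""
    rows, cols = len(true_grid), len(true_grid[0])
    known = [[False] * cols for _ in range(rows)]  # optimistic: assume free space
    pos, trace, replans = start, [start], 0
    path = plan(known, pos, goal)
    while pos != goal:
        changed = False
        for r, c in neighbors(pos, rows, cols):  # sense adjacent cells only
            if true_grid[r][c] and not known[r][c]:
                known[r][c] = changed = True
        if changed:
            path = plan(known, pos, goal)  # recompute the remaining plan
            replans += 1
        if path is None:
            return None, replans
        path = path[1:]  # advance one step along the current plan
        pos = path[0]
        trace.append(pos)
    return trace, replans

if __name__ == "__main__":
    world = [[0, 0, 0, 0],
             [1, 1, 1, 0],
             [0, 0, 0, 0]]
    trace, replans = drive(world, (0, 0), (2, 0))
    print(trace, replans)
```

Every call to `plan` here repeats work done by earlier calls; an incremental algorithm such as D* is aimed at avoiding that repetition while producing the same plans.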
Developing a KDD solution to a given application problem is a complex
process. On the one hand, an initial problem description is typically only
vaguely defined. Therefore, there is a strong need to support the
refinement of a problem description in a systematic way. For that purpose
we propose to use the notion of task decomposition as known from Knowledge
Engineering as well as the notion of pre-/postconditions to characterize
the functionality of a task. On the other hand, for solving the identified
subtasks one has to associate appropriate techniques, e.g. clustering or
classification algorithms, with them. In order to support this
selection process we propose to characterize the functionality of the
available techniques in a corresponding way. Finally, we think that reusing
successfully applied KDD solutions will provide a means for developing new
high-quality KDD solutions with less effort.
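
As an illustration of characterizing both subtasks and techniques by pre- and postconditions, the following Python sketch matches one against the other. The property names, the example task, and the matching rule are hypothetical; the abstract does not specify how the conditions are represented or compared.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Spec:
    """A subtask or a technique, described by what it needs and what it delivers."""
    name: str
    pre: frozenset   # properties that hold (task) or are required (technique) of the input
    post: frozenset  # properties required (task) or guaranteed (technique) of the output

def applicable(task, technique):
    """Assumed matching rule: the technique requires no more than the task
    provides, and delivers at least what the task must achieve."""
    return technique.pre <= task.pre and task.post <= technique.post

def select(task, techniques):
    """Names of all techniques that fit the given subtask."""
    return [t.name for t in techniques if applicable(task, t)]

# Hypothetical subtask produced by decomposing a vague problem description.
segment_customers = Spec(
    name="segment customers",
    pre=frozenset({"numeric_attributes", "unlabeled"}),
    post=frozenset({"groups"}),
)

# Hypothetical technique library, described in the same vocabulary.
library = [
    Spec("k_means", frozenset({"numeric_attributes"}), frozenset({"groups"})),
    Spec("decision_tree", frozenset({"labeled"}), frozenset({"classifier"})),
]

if __name__ == "__main__":
    print(select(segment_customers, library))
```

Describing tasks and techniques in one shared vocabulary is what makes the selection step mechanical; the hard part, which this sketch does not address, is choosing that vocabulary.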
The Informedia Digital Video Library Project at Carnegie Mellon University is creating large digital libraries of video and audio data available for full content retrieval by integrating natural language understanding, image processing, speech recognition and information retrieval. These digital video libraries allow users to explore multi-media data in depth as well as in breadth. The Informedia system automatically processes and indexes video and audio sources and allows selective retrieval of short video segments based on spoken queries. Interactive queries allow the user to retrieve stories of interest from all the sources that contained segments on a particular topic. Informedia will display representative icons for relevant segments, allowing the user to select interesting video paragraphs for playback.
Speech recognition is a key component, together with language processing, image processing and information retrieval. During the Informedia library creation, speech recognition helps create time-aligned transcripts of spoken words as well as integrate closed-captioned text if available. During library exploration by a user, speech recognition allows the user to query the system by voice, making the interaction simpler, more direct and immediate. Carnegie Mellon's Sphinx-II large vocabulary continuous speech recognition system provides the foundation for this PC-based application.
Natural language processing is needed to segment the data into paragraphs. In addition, natural language processing is used for the creation of summaries used for titles and video "skims", as well as for aspects of information retrieval.
Image processing identifies scene breaks, and creates representative key frames for each scene as well as each video paragraph. In addition, image understanding technologies allow the user to search for similar images in the database.
Information retrieval is used to allow retrieval of all text data, either from text transcripts, speech-recognition generated transcripts, OCR or human annotations.
Informedia allows users to efficiently navigate the complex
information space of video data without time-consuming linear
access constraints. Thus Informedia provides a new
dimension in information access to video, audio and text material.
I will describe LISA, a model of Learning and Inference with Schemas
and Analogies. LISA performs structure mapping as a form of
guided pattern matching on structured, distributed representations.
The resulting mappings are recorded and used to constrain future
mappings. This mechanism is accompanied by a mechanism for
learning new structures in an unsupervised fashion. The result is a
system that performs retrieval from long-term memory, analogical
mapping, inference, and schema induction, all as special cases of the
same basic operations. LISA's operation and account of empirical
findings will be described.
This talk presents an overview of the Brahms simulation program: its
origin, purposes, design, and use. I will emphasize how Brahms differs
from traditional business process modeling tools by representing
activities of groups, and not only tasks of individual workers. Brahms models represent different communication media (e.g., databases,
voicemail, documents), conversations, meetings, circumstantial
interactions of people and technology, social relations, novice-expert
differences, and spatial influences on behavior. I will explain different
ways that Brahms models may be developed and used as tools for work
systems design: for presentation and comparison of points of view, for
what-if analysis, for teaching, and as a workbench for scientists. My
current work on Brahms at NYNEX in the Business Network
Architectures project emphasizes its value for helping social scientists
and software engineers work together.
A conceptual model of a database is a specification of objects, attributes, and their relationships contained in the database. Although understanding such a model is a crucial step in many applications, obtaining it from legacy databases is a challenging task. A given database may have missing schema information (such as keys and foreign keys) and contain noisy data. In this talk, we will present a method to discover a conceptual model from a relational database and represent it in an Entity-Relationship (ER) model. The main ideas are: discovering characteristics of data that are critical for model building, filtering out the irrelevant characteristics caused by noisy data, and generating an ER model by reversing the well-understood process of converting an ER model to a relational database. Our goal is to help users recover and understand the conceptual data model of a large and even badly designed database, so that further data processing tasks (such as data mining or database integration) can be performed. This method is implemented in LDL (Logical Definition Language) and C++, and it has been tested on some man-made databases. We will analyse the results of our experimentation and discuss related work and future research directions.
Xuejun Wang is the recipient of an ISI Graduate Fellowship.
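
One of the data characteristics mentioned in the abstract above, missing keys and foreign keys, can be illustrated with a small Python sketch. This is not the method from the talk (which is implemented in LDL and C++): the uniqueness test, the inclusion ratio, the 0.7 noise threshold, and the sample tables are all assumptions chosen for illustration.

```python
def candidate_keys(rows, columns):
    """Single columns whose values are all present and pairwise distinct."""
    keys = []
    for col in columns:
        values = [row[col] for row in rows]
        if values and None not in values and len(set(values)) == len(values):
            keys.append(col)
    return keys

def inclusion_ratio(child_rows, child_col, parent_rows, parent_col):
    """Fraction of non-null child values that also occur in the parent column."""
    parent_values = {row[parent_col] for row in parent_rows}
    child_values = [row[child_col] for row in child_rows if row[child_col] is not None]
    if not child_values:
        return 0.0
    return sum(v in parent_values for v in child_values) / len(child_values)

def looks_like_foreign_key(child_rows, child_col, parent_rows, parent_col, threshold=0.7):
    """Tolerate some noisy rows: accept if most child values resolve to a parent row.
    The threshold is a hypothetical tuning parameter."""
    return inclusion_ratio(child_rows, child_col, parent_rows, parent_col) >= threshold

# Hypothetical sample data; the last employee row is deliberately noisy.
departments = [{"id": 1, "site": "LA"}, {"id": 2, "site": "LA"}]
employees = [
    {"emp": 10, "dept": 1},
    {"emp": 11, "dept": 2},
    {"emp": 12, "dept": 1},
    {"emp": 13, "dept": 9},  # dangling reference caused by noise
]

if __name__ == "__main__":
    print(candidate_keys(departments, ["id", "site"]))
    print(inclusion_ratio(employees, "dept", departments, "id"))
    print(looks_like_foreign_key(employees, "dept", departments, "id"))
```

A strict inclusion test would reject the `dept` column because of one bad row; allowing a tolerance is one simple way to keep noisy data from hiding a relationship between two tables.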
The help desk industry, like many other industries, is facing increasing challenges in knowledge management. Customer help and sales services in this industry are extremely knowledge-intensive, often provided by people who receive varying levels of training and knowledge update over the lifetime of an organization. These services must provide real-time, interactive problem solving capabilities for their customers. Solutions provided by the service representatives must cater to individual customers' needs. The central issue to be addressed here is one of enterprise knowledge management. At Simon Fraser University, we have been developing a suite of tools, known as CaseAdvisor, for knowledge acquisition, management and reuse using case based reasoning. We pay special attention to knowledge acquisition and update from large and unstructured knowledge sources. To address this problem we have developed algorithms for case base maintenance using information retrieval. In addition, we have designed an interactive problem solving method for reusing the acquired knowledge effectively by taking advantage of the best features in rule-based reasoning, decision tree reasoning, case based reasoning and constraint satisfaction. An important feature of our solution is knowledge compression and validation, designed so that the resulting knowledge base scales up well. These features and tools are offered both on the PC and the Internet. In this talk, I will discuss these problems and explain our solutions. I will highlight some of the benefits of our approaches using examples drawn from realistic help desk applications in which we are involved. I will also present a demonstration of our CaseAdvisor system in action.
NOTE: This talk is part of the Distinguished Lecture Series of the USC Computer Science Department. It will be held at the USC Campus at noon. You can find more details and an abstract here.
Tuesday, October 21, 1997
Software Agents: The Next Generation
Jeff Bradshaw
University of West Florida
Software agents are entities that function continuously and autonomously in a particular environment that is often inhabited by other agents and processes. Ideally, such agents learn from their experiences, communicate and cooperate with people and with other agents, and, as required, move from place to place within private networks and on the public Internet. In this presentation, we will look at the history and future of agent technology, and illustrate some of those trends with reference to current work on the KAoS agent architecture and its applications in aerospace and medicine.
Bio
Jeff received a B.A. in Psychology at the University of Utah and a Ph.D. in Cognitive Science at the University of Washington. Named a Fulbright Senior Research Scholar in 1993, he spent twelve months at the European Institute of Cognitive Sciences and Engineering (EURISCO) in Toulouse, France. He is currently a visiting associate professor at the Institute for Human and Machine Cognition at the University of West Florida and a Senior Principal Scientist at the Research and Technology Division of Boeing Information and Support Services, leading the Intelligent Agent Technology program. He also co-leads a group at the Fred Hutchinson Cancer Research Center that is developing technology to assist with long-term post-transplant care of bone marrow transplant patients. He has edited the books Knowledge Acquisition as a Modeling Activity (with Ken Ford, John Wiley, 1993), and Software Agents (AAAI Press/The MIT Press, 1997).
Monday, November 3, 1997
On Corpus-Based Statistics-Oriented Approaches to Machine Translation and Two-Way Training for Knowledge Acquisition
Keh-Yih Su
National Tsing Hua University, Taiwan
Knowledge acquisition and domain adaptation are the major bottlenecks in real commercialized machine translation systems; they are therefore important topics in developing an operational system. The corpus-based statistics-oriented (CBSO) approach for developing a highly parameterized MT system is thus a promising approach to next-generation MT systems. Furthermore, the traditional one-way approach to acquiring translation knowledge (whether rule-based or statistical) is one major reason for producing target translations that sound too literal to a native speaker. In this presentation, we therefore briefly introduce the corpus-based statistics-oriented approach to machine translation in general, and address a two-way training approach for acquiring various kinds of translation knowledge so that the translation of a source sentence falls within the grammar of the target language, thus preventing the generation of literal translations.
Friday, November 21, 1997
Adaptive Behavior and Learning in Groups of Interacting Autonomous Agents
Maja Mataric
Computer Science Department and the Neuroscience Program
University of Southern California
Our work has focused on developing methodologies for synthesizing and analyzing group behavior and learning in situated agents. The structured bottom-up behavior-based approach we have chosen is motivated by the desire to understand and harness the complex dynamics that result from simple local interactions between agents in distributed systems. The approach utilizes a biologically-inspired notion of basis behaviors as a substrate for control and learning, and removes the abstraction barrier between the individual and collective levels of interaction. This talk will overview the approach and focus on its role in enabling and facilitating learning in distributed systems. We will demonstrate how simple behaviors and communication mechanisms can be applied to effectively decrease locality and credit assignment problems in systems with multiple concurrent learning agents.
Maja Mataric is an assistant professor in the Computer Science Department and the Neuroscience Program at the University of Southern California. She joined USC in September 1997, after two and a half years as an assistant professor in the Computer Science Department and the Volen Center for Complex Systems at Brandeis University. She received a PhD in Computer Science and Artificial Intelligence from MIT in 1994. She has worked at NASA's Jet Propulsion Lab, the Free University of Brussels AI Lab, LEGO Cambridge Research Labs, GTE Research Labs, the Swedish Institute of Computer Science, and ATR. Her Interaction Lab conducts research on the dynamics of interaction in complex adaptive systems including multi-agent systems ranging from a group of 26 mobile robots to economies and ecologies. Her work covers the areas of control and learning in intelligent situated agents, and cognitive neuroscience modeling of visuo-motor skill learning through imitation.
