Yolanda Gil (PI), Awards and Grants
-
An Analytical Framework for Provenance-Rich Social Knowledge Collection.
National Science Foundation (NSF).
Grant number IIS-1117281.
September 2011 - August 2014.
Yolanda Gil (PI).
This project will investigate a new generation of provenance-rich social knowledge collection
systems that will greatly improve the ability of people to create online communities of interest
and share information. The research will transform the state of the art in social content
collection in several important ways. First, social knowledge collection systems will be
augmented to help contributors structure factual content, so that information can be
aggregated to answer reasonably interesting albeit simple factual queries. We will build on
a semantic wiki framework to allow users to create structured factual content as
object-property-value triples. The framework will not assume pre-defined ontologies; instead, we will
develop algorithms that analyze current content and suggest opportunities for structuring
contributions so they can be aggregated to answer simple queries. Second, they will include
detailed provenance records that reflect how the content was created, allowing contributors
to enter alternative viewpoints and enabling consumers to make quality and trust judgments.
The research will include developing algorithms that derive trust metrics from the provenance
records and that allow users to define views on the content based on provenance criteria.
It will create novel approaches to propagate trust across content topics and categories
and complement existing algorithms that propagate trust in social networks. Third, the
systems will proactively guide contributors to invest effort where it is most needed,
by developing novel algorithms to detect knowledge gaps and by allowing users to define
queries that will be used to drive further contributions.
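The idea of aggregating object-property-value triples to answer simple factual queries, while keeping provenance for alternative viewpoints, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `TripleStore` class, its method names, and the placeholder objects and contributors are hypothetical and are not part of the project's semantic wiki framework.

```python
from collections import defaultdict


class TripleStore:
    """Toy store of (object, property, value) triples, each tagged with its contributor."""

    def __init__(self):
        self._triples = []

    def add(self, obj, prop, value, contributor):
        # The contributor field is a stand-in for a fuller provenance record.
        self._triples.append((obj, prop, value, contributor))

    def query(self, prop, predicate):
        """Return the sorted objects that have some value for `prop` satisfying `predicate`."""
        return sorted({o for (o, p, v, _) in self._triples if p == prop and predicate(v)})

    def viewpoints(self, obj, prop):
        """Group the alternative values asserted for one object/property by contributor."""
        by_value = defaultdict(list)
        for o, p, v, who in self._triples:
            if o == obj and p == prop:
                by_value[v].append(who)
        return dict(by_value)


# Hypothetical content: the objects, values, and contributors are placeholders.
store = TripleStore()
store.add("lake_a", "max_depth_m", 501, "alice")
store.add("lake_a", "max_depth_m", 505, "bob")
store.add("lake_b", "max_depth_m", 594, "alice")
store.add("lake_c", "max_depth_m", 48, "carol")

# A simple aggregate query: which objects have a depth above 500?
deep = store.query("max_depth_m", lambda v: v > 500)
print(deep)
print(store.viewpoints("lake_a", "max_depth_m"))
```

Because each triple keeps its contributor, conflicting values can coexist as alternative viewpoints rather than overwriting each other.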
-
Discovery Informatics.
National Science Foundation (NSF).
Grant number IIS-1151951.
September 2011 - August 2012.
Yolanda Gil (PI).
In order to address the ambitious research agenda put forward by many science disciplines,
many challenges must be addressed in the areas of information sciences, intelligent systems,
and human-computer interaction. Data modeling and integration still require large investments
of scientist time and effort. The scientific literature grows so quickly in many areas that
it becomes unmanageable for scientists. Many aspects of the scientific discovery process are
often largely manual and could be automated, improved, or made more efficient. Better
interfaces for collaboration, visualization, and understanding would significantly improve
scientific practice. The goal of this project is to produce a report outlining the
opportunities that scientific discoveries present to information sciences and intelligent
systems as a new area of research called discovery informatics.
-
Workflow-Net: Cybersecurity through Nimble Task Allocation:
Workflow Reasoning for Mission-Centered Network Models.
Air Force Office of Scientific Research (AFOSR).
Grant number FA9550-11-1-0104.
June 2011 - March 2015.
Yolanda Gil (PI).
Traditional cybersecurity has focused on techniques to analyze and eliminate vulnerabilities
in a network, often in response to actual security breaches of previously unknown weaknesses.
Recognizing that in practice network operations can never be fully secure, a major focus of
recent research is on intrusions that are assumed to be on-going in the network by one or
more malicious parties. In this new view on cybersecurity, a key desired capability is to
be able to accomplish a mission even while the network is compromised and subject to
deception. However, traditional network models lack a representation of the mission and
of how network resources are utilized to accomplish various aspects of the mission.
In this project, we will investigate a new approach to develop a general framework for
representing models of mission goals and tasks, and to exploit those models to make a
mission more robust to deception operations co-occurring in the network. These
mission-centered network models (MCNMs) will build on and extend current two-layered
(logical/physical) network models by integrating a new layer of task-level representations
of the mission into those models. In this new task-oriented layer, a mission can be
characterized as a set of goals, each accomplished by a set of interdependent tasks
that place requirements on the network resources. The system can then dynamically
control the mappings of those tasks onto network resources using a variety of algorithms
that take into account which resources are currently compromised. As a result, a mission
can be protected from ongoing intrusion and deception activities by dynamically reallocating
resources as they become compromised and by examining provenance records of task outcomes to
determine their reliance on compromised resources. MCNMs can be used to determine which
resources are critical for any given mission, to prioritize the use of uncompromised resources,
to accomplish mission tasks and estimate trust in their outcomes when resources are compromised, and to
determine the practical impact on the mission of deception activities. MCNMs will enable a
new approach to cybersecurity in network-based operations.
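The task-oriented layer described above, in which tasks place requirements on network resources and are remapped as resources become compromised, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Resource` and `Task` classes, the single-capability requirement, and the greedy `allocate` function are hypothetical simplifications, not the project's algorithms.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    capabilities: set
    compromised: bool = False


@dataclass
class Task:
    name: str
    requires: str  # assumption: each task needs exactly one capability


def allocate(tasks, resources):
    """Map each task to a resource, preferring uncompromised ones.

    Returns {task name: (resource name, trusted)}; `trusted` is False when
    only a compromised resource can serve the task. Tasks that no resource
    can serve map to None.
    """
    mapping = {}
    for task in tasks:
        candidates = [r for r in resources if task.requires in r.capabilities]
        # Stable sort: uncompromised (False) resources come before compromised (True).
        candidates.sort(key=lambda r: r.compromised)
        if candidates:
            chosen = candidates[0]
            mapping[task.name] = (chosen.name, not chosen.compromised)
        else:
            mapping[task.name] = None
    return mapping


# Hypothetical mission with two tasks and three resources.
resources = [
    Resource("server_a", {"storage", "compute"}),
    Resource("server_b", {"compute"}),
    Resource("link_1", {"transport"}),
]
tasks = [Task("collect", "transport"), Task("analyze", "compute")]

before = allocate(tasks, resources)

# server_a is found to be compromised: "analyze" moves to server_b.
resources[0].compromised = True
after = allocate(tasks, resources)

# link_1 is the only transport resource, so "collect" stays but is flagged untrusted.
resources[2].compromised = True
degraded = allocate(tasks, resources)
print(before, after, degraded, sep="\n")
```

The `trusted` flag is a crude stand-in for the provenance-based trust estimates mentioned in the abstract; a resource with no uncompromised alternative, such as `link_1` here, is what the text calls critical for the mission.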
-
W-SHARING: Towards Shared Repositories of Computational Workflows
National Science Foundation (NSF).
Grant number IIS-0948429.
September 2009 - August 2011.
Yolanda Gil (PI).
Scientific computing has entered a new era of scale and sharing with
the arrival of cyberinfrastructure for computational experimentation.
A key emerging concept is scientific workflows, which provide a
declarative representation of scientific applications as complex
compositions of software components and the dataflow among them.
Workflow systems manage their execution in distributed resources,
track provenance of analysis products, and enable rapid
reproducibility of results. In current cyberinfrastructure, there are
well understood mechanisms for sharing data, instruments, and
computing resources. This is not the case for sharing workflows,
though there is an emerging movement for sharing analysis processes in
the scientific community. In this grant, we are investigating
computational mechanisms for sharing workflows as a key missing
element of cyberinfrastructure for scientific research. We are
exploring three major research topics. First, we are eliciting new
requirements that workflow sharing poses over current techniques to
share software tools and libraries. Second, we want to understand how
shared workflow catalogs should be designed. Existing shared data
catalogs are a successful model, but software artifacts require
different representations and access functions. Finally, we are
studying what sharing paradigms might be appropriate for scientific
communities, exploring environments ranging from traditional
server-based architectures to wikis to Web 2.0 social sites.
-
PedWorkflow: Workflows for Assessing Student Learning
National Science Foundation (NSF).
Grant number IIS-0917328.
September 2009 - August 2011.
Jihie Kim (PI), Gisele Ragusa (co-PI), Erin Shaw (co-PI), and Yolanda Gil (co-PI).
As on-line learning becomes more popular and is increasingly
integrated in engineering courses, instructors become overwhelmed
with the amount of information that they have to process.
For example, discussion boards support collaborative interaction and
reflective problem solving, but instructors need to monitor the student
discussions in order to address questions and make corrections,
as well as to grade student participation.
The goal of this project is to create a novel workflow environment
to support efficient assessment of student learning through the design
and composition of assessment workflows. The workflows will support
data analysis and will be re-usable across curricula and instructors.
-
Designing Scientific Software One Workflow at a Time
National Science Foundation (NSF).
Grant number CCF-0725332.
October 2007 - September 2011.
Ewa Deelman (PI) and Yolanda Gil (co-PI).
Much of science today relies on software to make new discoveries.
This software embodies scientific analyses that are frequently
composed of several application components and created collaboratively
by different researchers. Computational workflows have recently emerged as
a paradigm to manage these large-scale and large-scope scientific analyses.
Workflows represent computations that are often executed in geographically
distributed settings, their interdependencies, their requirements and their
data products. The design of these workflows is at the core of today's
scientific discovery processes and must be treated as scientific products
in their own right. The focus of this research is to develop the foundations
for a science of design of scientific processes embodied in the new artifact
that is the computational workflow. The work will integrate best practices
and lessons learned in existing workflow applications, and extend them in
order to define and formalize design principles of computational workflows.
This work will result in a fundamentally new approach to designing workflows
that will greatly improve the scientific software design methodology by
defining and formalizing design principles, and by familiarizing the
scientific community with these effective workflow design processes.
-
Plato: Phased-Learning through Analyzing Teaching and Observation
DARPA Bootstrapped Learning (BL) program.
Grant number HR0011-07-C-0060, subcontract to SRI International.
August 2007 - July 2011.
ISI co-PIs: Paul Cohen and Yolanda Gil.
The goal of this project is to develop an electronic student
that can learn from a teacher using different methods of natural instruction.
We will contribute the strategies to learn from being told by the teacher
a broad range of generalities about
process knowledge. These general descriptions will be tested by the learner with examples
and practice of those processes. We will use Interdependency Models to relate the individual
teacher instructions, check the consistency with the student's prior knowledge,
and detect gaps in the stated instruction that could be filled through practice.
-
Windward: Scalable Knowledge Discovery Through Grid Workflows
Air Force Research Laboratory (AFRL).
Grant number FA8750-06-C-0210.
September 2006 - December 2008.
Yolanda Gil (PI). ISI co-PIs: Paul Cohen and Ewa Deelman.
Distributed workflows are emerging as a key technology to conduct large-scale
and large-scope scientific applications in earthquake science, physics,
astronomy, and many other sciences. In this new project, we will investigate
the use of workflow technologies for Artificial Intelligence applications with
a particular focus on data analysis and knowledge discovery tasks. Based on
the data to be analyzed, an initial workflow template is formed by selecting
from a library of known-to-work compositions of general-purpose machine
learning algorithms. The workflow template is specialized through knowledge-based
selection and configuration of algorithms. Finally, the workflow is mapped to
available resources and restructured to improve execution time. Data analysis
and knowledge discovery applications will benefit from the automation, scale,
and distributed data and resource integration supported by distributed workflow
systems. We will also conduct new research in important aspects of workflow
systems. To what extent can we represent complex algorithms and their subtle
differences so that they can be automatically selected and configured to
satisfy the stated application requirements? Can we develop learning
techniques that improve the performance of the workflow system by exploiting
an episodic memory of prior workflow executions? What mechanisms will be
needed to support autonomous and robust execution of concurrent workflows
over continuously changing data?
-
NSF Workshop on Challenges of Scientific Workflows
National Science Foundation (NSF).
Grant number IIS-0629361.
May 2006 - October 2007.
Yolanda Gil (PI) and Ewa Deelman (co-PI).
In recent years, workflows have emerged as a paradigm for conducting large-scale
scientific analyses. The structure of a workflow specifies what analysis
routines need to be executed, the data flow amongst them, and relevant
execution details. Workflows provide a systematic way to capture scientific
methodology and provide provenance information for their results. Robust and
flexible workflow creation, mapping, and execution are largely open research
problems. Under this project, Ewa Deelman and Yolanda Gil chaired an
invitation-only workshop on "Challenges of Scientific Workflows" at the
National Science Foundation. The aim of this workshop was to bring
together IT researchers and practitioners working on a variety of aspects
of workflow management as well as domain scientists that use workflows for
day-to-day data analysis and simulation. The National Science Foundation
expects a final report with recommendations to the community regarding the
challenges of scientific workflows and their role in cyberinfrastructure
planning for 21st century science and engineering research and education.
-
C4ML: Metareasoning for Integrated Learning
DARPA Integrated Learning (IL) program.
Grant number FA8650-06-C-7606, subcontract to BBN Technologies.
May 2006 - July 2008.
ISI co-PIs: Paul Cohen and Yolanda Gil.
In this project we will develop a learning metareasoner to coordinate
the activities of many learners in an integrated system that learns
procedural knowledge from user demonstrations and past knowledge.
A learning metareasoner is a problem solver that has explicit representations
of its current learning state and learning goals, and has metareasoning methods to
accomplish those goals. The learning metareasoner will assess its progress
based on four criteria: capability, confidence, coverage, and competence (C4).
-
Intelligent Optimization of Parallel and Distributed Applications
National Science Foundation (NSF).
Grant number CSR-0615412.
August 2006 - September 2009.
Principal Investigators: Mary Hall (PI), Kristina Lerman (co-PI),
Ewa Deelman (co-PI), Aiichiro Nakano (co-PI), Joel Saltz (co-PI).
ISI co-PI: Yolanda Gil.
This project will develop a domain-specific programming system supporting
petascale application optimization of molecular dynamics simulation, in which
applications will be viewed as workflows consisting of composable components to
be mapped to a diversity of machine resources. The application components will
be viewed as dynamically adaptive algorithms for which there exists a set of
variants and parameters that can be chosen to develop an optimized implementation.
A variant describes a distinct implementation of a code segment, perhaps even a
different algorithm. A parameter is an unbound variable that affects application
performance. By encoding an application in this way, we can capture a large set
of possible application mappings with a very compact representation. Because
the space of mappings is prohibitively large, the system captures and utilizes
domain knowledge from the domain scientists and designers of the compiler,
run-time and performance models to prune most of the possible implementations.
Knowledge representation and machine learning techniques utilize this domain
knowledge and past experience to navigate the search space efficiently.
Incorporating cognitive search techniques and taking advantage of parallel
resources, these alternative implementations are searched automatically by
tools to find a high-quality implementation.
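The compact encoding of variants and parameters, and the pruning of the resulting space of mappings, can be illustrated with a small sketch. All names and numbers below are assumptions chosen for illustration: the variant and parameter names, the pruning rule in `is_plausible`, and the `cost` function are hypothetical stand-ins for real domain knowledge and performance models, and the exhaustive search replaces the knowledge-guided search the project describes.

```python
from itertools import product

# Hypothetical encoding: variants are alternative implementations of a code
# segment, parameters are unbound values that affect performance.
variants = {
    "force_kernel": ["cell_list", "neighbor_list"],
    "integrator": ["verlet"],
}
parameters = {
    "tile_size": [16, 32, 64],
    "threads": [1, 4],
}


def mappings(variants, parameters):
    """Enumerate every combination of variant and parameter choices."""
    names = list(variants) + list(parameters)
    choices = list(variants.values()) + list(parameters.values())
    for combo in product(*choices):
        yield dict(zip(names, combo))


def is_plausible(mapping):
    """Hypothetical domain rule: skip neighbor lists combined with small tiles."""
    return not (mapping["force_kernel"] == "neighbor_list" and mapping["tile_size"] < 32)


def cost(mapping):
    """Made-up performance model; a real system would measure or predict run time."""
    base = 100 if mapping["force_kernel"] == "cell_list" else 80
    return base / mapping["threads"] + abs(mapping["tile_size"] - 32)


all_mappings = list(mappings(variants, parameters))
kept = [m for m in all_mappings if is_plausible(m)]
best = min(kept, key=cost)
print(len(all_mappings), len(kept), best)
```

Two short dictionaries describe twelve candidate implementations here; with realistic numbers of variants and parameters the space grows multiplicatively, which is why pruning with domain knowledge matters.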
-
MathTrust: Mathematical Analysis of Trust and Deception.
Air Force Office of Scientific Research (AFOSR).
Grant number FA9550-06-1-0031.
December 2005 - May 2009.
Yolanda Gil (PI).
Information systems such as the Web often include open information sources that vary widely in quality and may be subject to deception. The information is often of unknown origin, and many of the sources have no prior history that could be used to assess their reputation. This project will investigate how to represent, learn, and characterize the reputation and reliability of sources in an information system that collects individual trust ratings from users and derives their collective consensus trust over time. This project will analyze the factors that affect trust in sources, study how to capture user feedback, and develop algorithms to derive source reputation. Based on this model of trust, a mathematical analysis of source trust and deception will formally relate a number of salient factors that influence trust in information systems.
-
Intelligent Design and Optimization of Parallel and Distributed Applications
National Science Foundation (NSF) Computer Science Research program.
Grant number CNS-0509517.
July 2005 - December 2006.
Mary Hall (PI), Kristina Lerman (co-PI), Ewa Deelman (co-PI), Aiichiro Nakano (co-PI), Joel Saltz (co-PI).
This project will explore automatic mapping of applications to parallel systems consisting of tens of thousands of processors. This project includes expert domain scientists in molecular dynamics simulation and phylogenetics, who have been developing scalable algorithms that can handle large irregular data sets using high-end computing platforms. Many of these algorithms are hand-tuned and optimized for particular target architectures. The goal of the project is to develop expressive representations of optimization parameters, appropriate learning techniques for exploring the combinatoric optimization space, and automated mapping techniques for performance optimization.
-
Towards Cognitive Grids: Knowledge-Rich Grid Services for Autonomous
Workflow Refinement and Robust Execution.
National Science Foundation (NSF) Shared Cyberinfrastructure program.
Grant number SCI-0455361.
December 2004 - November 2006.
Ewa Deelman (PI), Yolanda Gil (co-PI).
This research combines Artificial Intelligence and Distributed Computing
techniques to create knowledge-rich workflow services that can support the
execution of large-scale scientific workflows. The main foundation will be
provided by expressive formal representations of the application workflow
and of the execution environment. These representations will support
resource selection that will enhance application performance, resource
reservation based on anticipated workflow needs, and workflow repair
capabilities in case of failures or in case of new resources coming on line.
-
CALO-KA: Interactive Acquisition of User Advice in a Cognitive Assistant that Learns and Organizes (CALO)
DARPA Personalized Assistant that Learns (PAL) program.
Grant number NBCHD030010, subcontract to SRI International.
May 2003 - May 2008.
USC/ISI co-PIs: Yolanda Gil, Jerry Hobbs, Craig Knoblock.
This work is part of a very large integrated effort involving more than twenty universities and other research institutions throughout the US in order to develop personalized assistants that learn.
The goal of our research is to develop novel
techniques to assist users in specifying new knowledge for an automated assistant
that learns to improve its performance over time.
This requires new
research on acquiring advice through natural language interaction,
operationalizing user advice into procedural and task knowledge,
dialogue management to ask follow up questions about the
implications and potential side effects of the user advice,
generalizing user advice based on past experience,
and expanding the set of terms known to the system when
confronted with an unexpected situation.
This work also involves designing a meta-reasoning architecture
that includes a memory structure to index past experiences,
reasoning about the implications of changes to the system's
current knowledge, anticipating user requests and
potential failures and opportunities, and self-motivated
learning goals to prompt focused knowledge acquisition.
-
Just-In-caSe just-in-Time Information Analysis
Grant number N66001-03-C-8006.
December 2002 - December 2005.
PI: Yolanda Gil.
The goal of this research is to develop a Web-based environment
for information analysis that provides an emerging self-organization
of knowledge
through the use of natural language and machine learning techniques, including
topic detection and similarity-based
clustering.
The system will exploit this emerging
organization to help users work on new topics, debate
alternative hypotheses, and locate trustworthy open web sources.
-
SCEC/IT: An Information Infrastructure for System-Level Earthquake Research.
National Science Foundation (NSF), ITR Large Grant.
Grant number EAR-0122464.
September 2001 - September 2006.
PI: Thomas Jordan.
USC/ISI co-PIs: Carl Kesselman, Yolanda Gil, Hans Chalupsky.
UCSD co-PI: Jean Bernard Minster.
SDSC co-PI: Reagan Moore.
This NSF Large ITR project is a collaboration with members of the Southern California Earthquake Center (SCEC) and funds a variety of information technologies to support earthquake research,
including computational grids, digital libraries, ontologies and knowledge representation, planning, and interactive acquisition.
-
TRELLIS: Capturing and Exploiting Semantic Relationships
for Information and Knowledge Management.
Air Force Office of Scientific Research (AFOSR).
Award Number F49620-00-1-0337.
August 2000 - November 2003.
Yolanda Gil (PI).
TRELLIS is an interactive environment that will allow
users to add their observations,
opinions, and conclusions as they analyze information by making semantic
annotations to documents and other on-line resources. This is in essence
a knowledge acquisition problem, where the
user is adding new knowledge to the system based on their expertise as
they analyze information.
-
TEMPLE: Template Enhancement through Knowledge Acquisition.
DARPA Active Templates (AcT) program.
Award Number F30602-00-2-0513.
April 2000 - April 2003.
Yolanda Gil (PI).
The proposed work will develop an acquisition interface
for planning knowledge that relies on script-based wizards
to guide users in adding planning constraints and preferences.
-
PHOSPHORUS: A Knowledge and Experience-Based Agent
Capabilities Matcher.
DARPA Control of Agent-Based Systems (CoABS) program.
Award Number F30602-97-C-0068.
June 1999 - December 2002.
Yolanda Gil (co-PI) and Robert MacGregor (co-PI).
We are developing Phosphorus, a knowledge-based matcher that accepts
a user's description of a needed service as input and responds with a
ranked list of agents that have the capability to provide that service.
The Phosphorus matcher will exploit subsumption,
goal reformulation, and partial match. The
matcher will also be experience-based, using learning techniques to
improve the utility of its matches over time.
-
KASPER: Knowledge Acquisition for Solving Problems.
DARPA Rapid Knowledge Formation (RKF) program.
Subcontract to SRI, Award Number N66001-00-C-8018.
April 2000 - May 2003.
Yolanda Gil (PI).
This project will develop, in an integration
effort with other
research groups, tools to enable domain experts
to extend knowledge bases by using natural language interfaces,
commonsense reasoning, and analogy-based reasoning.
Our group's contributions
focus on tools to formulate follow-up
questions when users have not provided
sufficient knowledge, and on the acquisition
of problem solving and process knowledge.
The integrated system will be tested on two challenge
problems designed by DARPA. One problem requires the
system to acquire graduate-level knowledge
of biology from a textbook and then answer the questions
at the end of the chapter. The other challenge
problem will require developing
expert-level knowledge-based
techniques for advanced genome annotation and
exploitation for pathogen countermeasures.
-
KAMM: Knowledge Acquisition for Objective Grammars in MasterMind Editor.
Air Force Research Laboratory's Joint Defense Planner (JDP) program.
Award Number F30602-97-C-0118.
April 2000 - September 2000.
Yolanda Gil (co-PI) and Pedro Szekely (co-PI).
This project is developing an editor that allows users to
change and extend an initial grammar of objectives.
This editor is integrated with the MasterMind objectives editor and
includes acquisition wizards developed with EXPECT knowledge acquisition
techniques.
The editor was delivered in October 2000, and is now being integrated
into the Global Command and Control System (GCCS) and is
expected to be delivered to Air Operation Centers around the world
by August 2001.
-
SHERPA: Knowledge Acquisition for Large Knowledge Bases -
Integrating Problem-Solving Methods and Ontologies into Applications.
DARPA High Performance Knowledge Bases (HPKB) program.
Award Number F30602-97-1-0195.
April 1997 - December 2000.
Bill Swartout (co-PI) and Yolanda Gil (co-PI).
This work extends the ISI EXPECT architecture to include
several novel approaches including the derivation and use of knowledge
Interdependency Models, script-based knowledge acquisition, the
integration of natural language techniques in knowledge acquisition
tools, and the use of background knowledge to guide users in adding new
knowledge to a system. We participated in the HPKB annual Challenge
Problems, as well as in the Knowledge Acquisition Critical Component
Experiment held at the Army Battle Command Battle Lab in Ft Leavenworth,
KS, in August 1999. Several Army officers successfully
used EXPECT's knowledge
acquisition tools to extend the knowledge base.
-
INSPECT-II: An Air Campaign Planning Evaluation Aid.
DARPA Joint Forces Air Component Commander (JFACC) program.
Award Number F30602-97-C-0118.
April 1997 - September 2000.
Yolanda Gil (co-PI) and Bill Swartout (co-PI).
We extended the INSPECT air campaign
plan critiquing tool that we had previously developed.
Current funding is supporting technology transition
to the Joint Defense
Planner, which is on the path to becoming an integral part of TBMCS.
INSPECT was originally developed under the
DARPA Rome Laboratory Planning Initiative,
and was demonstrated at the first
US Air Force Expeditionary Force Experiment (EFX-98).
-
ROSETTA: Ontology-Based Agent Communication.
DARPA Information Systems Office (ISO) Technology Integration Experiment program.
Award Number F30602-97-1-0195.
July 1999 - July 2000.
Yolanda Gil (co-PI) and Robert MacGregor (co-PI).
The purpose of DARPA ISO Technology Integration Experiments is to
investigate high-payoff links across DARPA ISO programs.
Rosetta is a prototype message translation system that
supports communication between heterogeneous agents using ontology
merging technology and exploiting ontologies
developed under the HPKB program
to address inter-agent
communication issues of central relevance to CoABS.
-
EXPECT-II: A User-Centered Environment for the Development and
Adaptation of Knowledge-Based Planning Aids.
DARPA / Rome Planning Initiative (ARPI).
Award Number DABT 63-95-C-0059.
May 1995 - May 1999.
Yolanda Gil (co-PI) and Bill Swartout (co-PI).
EXPECT is a user-centered environment to
develop and maintain knowledge bases.
It includes knowledge acquisition tools,
problem solving and reasoning modules, and
a facility to generate natural language paraphrases
of its knowledge. EXPECT was used to develop
plan evaluation and critiquing systems
for logistics planning and for air campaign planning.
We participated in the Fourth Integrated
Feasibility Demonstration (IFD-4) of the Planning
Initiative, held at US Air Force Air Combat Command in June 1996
and in the Multi-Agent Planning, Visualization, and Simulation (MAPViS)
integrated demonstration in 1998.