Machine Learning and Applications

Focusing on fundamental research, including AI robustness, adversarial machine learning, anti-spoofing, domain adaptation and federated learning, and applied research in application areas such as biomedical sciences, biometric authentication, computational social science and cybersecurity. 

People

Wael Abd-Almageed - Jose Luis Ambite - Aram Galstyan - Shrikanth Narayanan - Jay Pujara - Filip Ilievski - Muhao Chen - Xuezhe Ma - Greg Ver Steeg - Ke-Thia Yao - Mohammad Rostami

Projects

  • Commonsense reasoning
    Multi-modal open world grounded learning and inference
  • CORAL
    Combined Representations for Adept Learning (DARPA Learning with Less Labels)
  • QUASAR
    QUantum Assisted Sampling for MAchine LeaRning (DARPA)
  • BATL
    Biometrics Authentication with a Timeless Learner (IARPA)
  • LR2
    Learning Robust Representations (DARPA)
  • AI2AI
    AI Investigating AI (Keston Award)
  • DARPA Cooperative Secure Learning SHELFI
    Secure Heterogeneous Learning Federation with Information-Theoretic Guarantees

Natural Language Processing

Focusing on low-resource machine translation, multilingual representation learning, transfer learning, dialogue, decision-making, question answering, summarization, ontologies, information retrieval, and text decipherment.

People

Jon May - Muhao Chen - Xuezhe Ma - Xiang Ren - Ulf Hermjakob - Elizabeth Boschee - Marjorie Freedman - Jerry Hobbs - Scott Miller - Nanyun (Violet) Peng 

Projects

  • CLEAR
    Cross-Lingual Event and Argument Retrieval (IARPA BETTER)
  • CORAL
    Combined Representations for Adept Learning (DARPA Learning with Less Labels)
  • ELICIT
    A System for Extracting and Organizing Causal Information (DARPA Causal Exploration)
  • ELISA
    Exploiting Language Information for Situational Awareness (DARPA LORELEI)
  • EvidxExtraction
    Evidence Extraction Systems for the Molecular Interaction Literature (NIH R01)
  • LESTAT
    Learning Event Schema Temporally and Transmodally (DARPA KAIROS)
  • MICS
    Machine Intelligence from Common Sense (DARPA MCS)
  • SARAL
    Summarization and Domain-Adaptive Retrieval Across Languages (IARPA MATERIAL)

Knowledge Graphs

Using AI and machine learning techniques to construct and exploit large-scale knowledge bases and to induce taxonomies from data. Notable applications include probabilistic models for scientific reproducibility, incorporating extractions from scientific articles and scientific networks of citation and reference, and business knowledge graphs characterizing innovation and competition using web data and regulatory filings.

People

Jay Pujara - Filip Ilievski - Ke-Thia Yao - Muhao Chen - Craig Knoblock - Jose Luis Ambite

Projects


Scientific Data Analysis and Discovery

Using interactive knowledge capture, intelligent user interfaces, semantic workflows, provenance, and collaboration; large-scale data integration and analysis of biomedical data (including sensor, environmental, neuroimaging, clinical and genetic data) and (paleo)climate data.

People

Yolanda Gil - Deborah Khider - Jose Luis Ambite

Projects

  • WINGS
    A semantic workflow system that assists scientists with the design of computational experiments
  • MINT
    Integrating scientific models
  • DISK
    Automating the discovery of scientific models
  • LinkedEarth
    Paleoclimate data and analysis

    • autoTS
      Automating time series analysis
    • PreSto
      Paleoclimate Reconstruction Storehouse
  • Scientific/geoscience paper of the future
    Encouraging scientists to publish papers with the associated products of their research
  • P4ML
    A phased performance-based pipeline planner for automated machine learning
  • ASSET
    A sketching project to accelerate scientific workflows
  • OntoSoft
    A software metadata registry to describe scientific software in a user-friendly manner
  • Organic data science
    Resolving science processes through an open framework that facilitates participation
  • OPMW-Prov
    Tracking the provenance of scientific experiments and their executions
  • NHGRI
    National Human Genome Research Institute

    • PAGE
      Population Architecture using Genomics and Epidemiology Coordinating Center
  • NeuroBridge
    Automating the discovery of scientific models
  • NIMH
    National Institute of Mental Health

    • NRGR
      Repository and Genomics Resource

Multi-modal Understanding

Including image and video understanding for deepfake detection, visual misinformation identification, identifying manipulated scientific literature and multimedia analysis, face recognition, biometric anti-spoofing, and robust AI; table understanding to automate exploitation of millions of tables on the web focusing on automatic layout detection, semantic modeling, table retrieval, table summarization, entity linking, and fact-checking.

People

Wael Abd-Almageed - Jay Pujara - Muhao Chen

Projects

  • Table Understanding
    Understanding the structure and semantics of tables (DARPA)
  • DiSPARITY
    Digital, Semantic and Physical Analysis of media integRITY (DARPA)

Common Sense Representation and Reasoning

Using cognitively-inspired computational paradigms for evaluating commonsense AI (including those based on large-scale language models) to create and solve new challenge tasks based on logical axioms and numeracy; human-centric dialog agents that maximize metrics of human utility alongside algorithmic utility in task-focused dialogs; game-theoretic simulators for poker, Monopoly, and wargames that enable refinement and evaluation of theories of novelty for general AI agents.

People

Mayank Kejriwal - Jay Pujara - Filip Ilievski - Muhao Chen - Xiang Ren

Projects

  • Commonsense reasoning
    Multi-modal open world grounded learning and inference
  • CSKG
    The commonsense knowledge graph

Computational Social Science

With emphasis on structure detection and pattern matching in unusual complex systems with hidden information (e.g., human trafficking, dark money networks); large-scale, contextualized social media analysis (e.g., in the context of natural disasters) including analysis involving non-verbal tokens such as emojis; computational social science methods for quantifying socio-demographically segmented impacts of COVID-19 on wellbeing, technological inequity, and vaccine hesitancy; applied AI in industrial applications, such as e-commerce.

People

Emilio Ferrara - Kristina Lerman - Fred Morstatter - Goran Muric - Keith Burghardt

Projects

  • Polarization in the context of COVID-19
    The pandemic has exacerbated echo chambers, paving the way for the rampant spread of misinformation.
  • Science of growth
    This project explores the impact of growth on various human behaviors, such as how the growth of cities impacts infrastructure, how message boards increase interactions, and how the growth of institutions impacts collaborations.
  • DARPA Influence Campaign Awareness and Sensemaking (INCAS)
    • Early Detection of Influence Indicators with Machine Intelligence (EDIFICE)
      Project to detect influence campaigns in foreign social media
    • Universal Population Segmentation and Characterization Algorithms for OnLine Environments (UPSCALE)
      Project to analyze influence campaigns in foreign social media
  • VENICE
    VErifyiNg Implicit Cultural modEls: Infers cultural causal relationships, stores them in a queryable knowledge graph, and verifies the relationships via real-world interviews with people in a foreign country.

AI Robustness and Safety

Detecting and addressing performance disparities, enhancing robustness against adversarial inputs, measuring representation accuracy, analyzing information vectors and quality, predictive evaluation, and distributed assessment methodologies.

People

Fred Morstatter - Kristina Lerman - Jay Pujara - Keith Burghardt

Projects

  • Evaluating AI System Robustness
    Many widely-used AI tools contain embedded representational skews. These projects focus on quantifying and assessing the downstream impacts of these calibration issues, including investigation of systematic failures in named entity recognition systems when processing different identifiers.
  • Improving Data Representativeness
    Enhancing system robustness can be approached through model-agnostic methods, enabling the application of state-of-the-art models to appropriately balanced datasets. This project explores optimal methodologies for various data categories.
  • Security Vulnerabilities in Calibration
    Strategic inputs can deliberately affect AI system behavior and output distributions. We investigate the effectiveness of such techniques in manipulating system robustness and develop safeguards to protect AI systems from these vulnerabilities.