Machine Learning and Applications

Focusing on fundamental research, including AI robustness, adversarial machine learning, anti-spoofing, domain adaptation and federated learning, and on applied research in areas such as biomedical sciences, biometric authentication, computational social science and cybersecurity.

People

Wael Abd-Almageed - Jose Luis Ambite - Aram Galstyan - Shrikanth Narayanan - Jay Pujara - Filip Ilievski - Muhao Chen - Xuezhe Ma - Greg Ver Steeg - Ke-Thia Yao - Mohammad Rostami - Mohamed Hussein - Leonidas Spinoulas - Hengameh M. Dastjerdi

Projects

  • Commonsense reasoning
    Multi-modal open world grounded learning and inference
  • CORAL
    Combined Representations for Adept Learning (DARPA Learning with Less Labels)
  • QUASAR
    QUantum Assisted Sampling for MAchine LeaRning (DARPA)
  • BATL
    Biometrics Authentication with a Timeless Learner (IARPA)
  • LR2
    Learning Robust Representations (DARPA)
  • AI2AI
    AI Investigating AI (Keston Award)
  • SHELFI
    Secure Heterogeneous Learning Federation with Information-Theoretic Guarantees (DARPA Cooperative Secure Learning)

Natural Language Processing

Focusing on low-resource machine translation, multilingual representation learning, transfer learning, dialogue, decision-making, question answering, summarization, ontologies, information retrieval, and text decipherment.

People

Jon May - Muhao Chen - Xuezhe Ma - Xiang Ren - Ulf Hermjakob - Elizabeth Boschee - Marjorie Freedman - Jerry Hobbs - Scott Miller - Nanyun (Violet) Peng - Ralph Weischedel

Projects

  • CLEAR
    Cross-Lingual Event and Argument Retrieval (IARPA BETTER)
  • CORAL
    Combined Representations for Adept Learning (DARPA Learning with Less Labels)
  • ELICIT
    A System for Extracting and Organizing Causal Information (DARPA Causal Exploration)
  • ELISA
    Exploiting Language Information for Situational Awareness (DARPA LORELEI)
  • EvidxExtraction
    Evidence Extraction Systems for the Molecular Interaction Literature (NIH R01)
  • LESTAT
    Learning Event Schema Temporally and Transmodally (DARPA KAIROS)
  • MICS
    Machine Intelligence from Common Sense (DARPA MCS)
  • SARAL
    Summarization and Domain-Adaptive Retrieval Across Languages (IARPA MATERIAL)

Knowledge Graphs

Using AI and machine learning techniques to construct and exploit large-scale knowledge bases and to induce taxonomies from data. Notable applications include probabilistic models for scientific reproducibility, incorporating extractions from scientific articles and scientific networks of citation and reference, and business knowledge graphs characterizing innovation and competition using web data and regulatory filings.

People

Jay Pujara - Pedro Szekely - Filip Ilievski - Ke-Thia Yao - Muhao Chen - Hans Chalupsky - Craig Knoblock - Jose Luis Ambite

Projects

Scientific Data Analysis and Discovery

Using interactive knowledge capture, intelligent user interfaces, semantic workflows, provenance, and collaboration; large-scale data integration and analysis of biomedical data (including sensor, environmental, neuroimaging, clinical and genetic data) and (paleo)climate data.

People

Yolanda Gil - Deborah Khider - Jose Luis Ambite

Projects

  • WINGS
    A semantic workflow system that assists scientists with the design of computational experiments
  • MINT
    Integrating scientific models
  • DISK
    Automating the discovery of scientific models
  • LinkedEarth
    Paleoclimate data and analysis
    • autoTS
      Automating time series analysis
    • PreSto
      Paleoclimate Reconstruction Storehouse
  • Scientific/geoscience paper of the future
    Encouraging scientists to publish papers with the associated products of their research
  • P4ML
    A phased performance-based pipeline planner for automated machine learning
  • ASSET
    A sketching project to accelerate scientific workflows
  • OntoSoft
    A software metadata registry to describe scientific software in a user-friendly manner
  • Organic data science
    Resolving science processes through an open framework that facilitates participation
  • OPMW-Prov
    Tracking the provenance of scientific experiments and their executions
  • NHGRI
    National Human Genome Research Institute
    • PAGE
      Population Architecture using Genomics and Epidemiology Coordinating Center
  • NeuroBridge
    Automating the discovery of scientific models
  • NIMH
    National Institute of Mental Health
    • NRGR
      Repository and Genomics Resource

Multi-modal Understanding

Including image and video understanding for deepfake detection, visual misinformation identification, identifying manipulated scientific literature and multimedia analysis, face recognition, biometric anti-spoofing, and robust AI; table understanding to automate exploitation of millions of tables on the web focusing on automatic layout detection, semantic modeling, table retrieval, table summarization, entity linking, and fact-checking.

People

Wael Abd-Almageed - Jay Pujara - Pedro Szekely - Muhao Chen - Mohamed Hussein

Projects

  • Table Understanding
    Understanding the structure and semantics of tables (DARPA)
  • DiSPARITY
    Digital, Semantic and Physical Analysis of media integRITY (DARPA)

Common Sense Representation and Reasoning

Using cognitively-inspired computational paradigms for evaluating commonsense AI (including those based on large-scale language models) to create and solve new challenge tasks based on logical axioms and numeracy; human-centric dialog agents that maximize metrics of human utility alongside algorithmic utility in task-focused dialogs; game-theoretic simulators for poker, Monopoly, and wargames that enable refinement and evaluation of theories of novelty for general AI agents.

People

Mayank Kejriwal - Jay Pujara - Filip Ilievski - Muhao Chen - Xiang Ren

Projects

  • Commonsense reasoning
    Multi-modal open world grounded learning and inference
  • CSKG
    The commonsense knowledge graph

Computational Social Science

With emphasis on structure detection and pattern matching in unusual complex systems with hidden information (e.g., human trafficking, dark money networks); large-scale, contextualized social media analysis (e.g., in the context of natural disasters) including analysis involving non-verbal tokens such as emojis; computational social science methods for quantifying socio-demographically segmented impacts of COVID-19 on wellbeing, technological inequity, and vaccine hesitancy; applied AI in industrial applications, such as e-commerce.

People

Emilio Ferrara - Kristina Lerman - Fred Morstatter - Goran Muric - Keith Burghardt

Projects

  • Polarization in the context of COVID-19
    The pandemic has exacerbated echo chambers, paving the way for the rampant spread of misinformation.
  • Science of growth
    This project explores the impact of growth on various human behaviors, such as how the growth of cities impacts infrastructure, how the growth of message boards increases interactions, or how the growth of institutions impacts collaborations.
  • DARPA Influence Campaign Awareness and Sensemaking (INCAS)
    • Early Detection of Influence Indicators with Machine Intelligence (EDIFICE)
      Project to detect influence campaigns in foreign social media
    • Universal Population Segmentation and Characterization Algorithms for OnLine Environments (UPSCALE)
      Project to analyze influence campaigns in foreign social media
  • VENICE
    VErifyiNg Implicit Cultural modEls: infer cultural causal relationships, store them in a queryable knowledge graph, and verify them via real-world interviews with people in a foreign country.

AI Fairness

Detecting and mitigating bias, robustness against adversarial attacks, identifying cultural values, polarization and misinformation, forecasting, and crowdsourcing. Notable case studies include a study of gender bias in 19th-century English literature using natural language processing (NLP) methods and a study of how state-of-the-art named entity recognition approaches systematically fail to identify female names.

People

Fred Morstatter - Kristina Lerman - Jay Pujara - Keith Burghardt

Projects

  • Auditing fairness of AI systems
    Many off-the-shelf AI tools have societal biases embedded within them. These projects focus on identifying and measuring the impact of these biases.
  • Improving fairness in data
    Model-agnostic approaches to improving fairness allow the best models to be applied to the data of interest. This project explores which methods work best for various types of data.
  • Adversarial Attacks on Fairness
    Adversarial attacks attempt to intentionally skew the impact of a machine learning system. We investigate the efficacy of such attacks in the context of fairness.