Computer Vision/Image Processing

Face Recognition

VISTA leads the way in face recognition, combining algorithms from computer graphics, deep learning and computer vision to develop technology with huge implications for security and commerce.

Sponsored by IARPA’s Janus program, we are developing face recognition systems designed to teach computers how to recognize people “in the wild,” through photos and videos captured in uncontrolled settings.

Using powerful algorithms and software to process large amounts of data, our face recognition researchers are “teaching” computers to identify a person despite variations in lighting, pose, age and facial expression.

Key aspects of our approach include deep learning, neural networks, transfer learning to improve in-domain performance, locality-sensitive hashing and a novel representation called SQSH.
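To illustrate one of the listed techniques, here is a minimal sketch of locality-sensitive hashing using random hyperplanes, a common textbook variant. It is not VISTA's implementation: the function names, the 4-dimensional toy vectors and the bucket layout are all assumptions made for illustration, and the SQSH representation is not shown because the page does not describe it.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(vector, hyperplanes):
    """Hash a feature vector to a bit tuple: one bit per hyperplane side."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def build_index(gallery, hyperplanes):
    """Group gallery identities (hypothetical name -> vector dict) by signature."""
    index = {}
    for name, vector in gallery.items():
        index.setdefault(lsh_signature(vector, hyperplanes), []).append(name)
    return index


def candidates(query, index, hyperplanes):
    """Return gallery names that fall in the same bucket as the query."""
    return index.get(lsh_signature(query, hyperplanes), [])


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=16, dim=4, seed=1)
    # Toy stand-ins for face feature vectors (assumed, not real data).
    gallery = {"person_a": [0.3, -1.2, 0.5, 2.0],
               "person_b": [-0.3, 1.2, -0.5, -2.0]}
    index = build_index(gallery, planes)
    # A rescaled copy of person_a's vector points in the same direction,
    # so it lands in the same bucket.
    print(candidates([0.6, -2.4, 1.0, 4.0], index, planes))
```

Vectors that point in similar directions tend to share a signature, so a query is compared only against the identities in its bucket rather than the whole gallery. Practical systems typically use several hash tables to reduce missed matches.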


Biometrics

In computer security, biometrics refers to the automatic authentication of individuals using measurable human characteristics, such as fingerprints, faces and irises. Biometric authentication is used as a form of identification and access control at facilities across the commercial and government spectrum.

The BATL (Biometric Authentication with a Timeless Learner) project aims to create systems and algorithms that are resilient to presentation attacks (spoofing) on biometric authentication systems, including face, iris and fingerprint systems. Partners include TREX Enterprises, Idiap Research Institute, TU Darmstadt and Northrop Grumman.

Media Forensics

Forensic image processing involves the computer-based restoration and enhancement of surveillance imagery to assess its truthfulness or integrity.

Sponsored by DARPA’s MediFor program, ISI’s DiSPARITY project aims to characterize signs of manipulation in images and video. To achieve this, the team considers important indicators such as pixel-level attributes, the physics of the scene, and the semantics and genealogy of the image or video asset.

Challenges include the wide variety of image-capturing devices, increasing sophistication of manipulation tools and techniques (including rapid advances in computer graphics technology such as Photoshop), the emergence of generative adversarial networks (GANs), and the sheer volume of analyzable data.

Optical Character Recognition

Optical Character Recognition (OCR) is a key technology for scanning books, signs, documents and other real-world texts into digital form for historical purposes, policy purposes (e.g., census documents), and enterprise intelligence and efficiency.

Leveraging recent neural network advances in the fields of computer vision and speech recognition, VISTA researchers are developing a new OCR system from scratch. The goal is to shift from statistical hidden Markov models (HMMs) to a more efficient neural network-based system.  

The team’s combination of convolutional neural networks (CNNs) and long short-term memory (LSTM) recurrent networks demonstrated top performance in a pilot United States Census program, achieving 79 percent accuracy on last-name recognition for handwritten names from the 1990 census, and on the challenging MADCAT Arabic handwriting recognition dataset. Both efforts are described in papers presented at the International Conference on Document Analysis and Recognition (ICDAR) in 2017.

VISTA researchers have also developed a novel text detection algorithm that models the text detection problem as a three-class problem rather than as a binary classification problem. By expanding current capabilities, the team aims to create systems capable of recognizing complex document layouts.
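As a rough illustration of how a three-class formulation differs from a binary one, the sketch below converts a binary text mask into per-pixel labels with a third class. The page does not name the three classes, so the choice of background, text and border here is an assumption, as are the function name and the rule that a border pixel is any non-text pixel touching a text pixel.

```python
# Assumed label values; the source does not specify the three classes.
BACKGROUND, TEXT, BORDER = 0, 1, 2


def three_class_labels(text_mask):
    """Turn a binary text mask (rows of 0/1) into three-class pixel labels.

    A non-text pixel with at least one text pixel in its 8-neighbourhood
    is labelled BORDER; every other non-text pixel stays BACKGROUND.
    """
    rows = len(text_mask)
    cols = len(text_mask[0]) if rows else 0
    labels = [[BACKGROUND] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if text_mask[r][c]:
                labels[r][c] = TEXT
                continue
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and text_mask[rr][cc]:
                        labels[r][c] = BORDER
    return labels


if __name__ == "__main__":
    mask = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    for row in three_class_labels(mask):
        print(row)
```

Under this assumed labelling, a classifier trained on the three-class targets has an explicit signal for where one text region ends, which a plain text/non-text mask does not provide when regions sit close together.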