Informatics Systems Research


ISRD develops systems that address the unique challenges of managing Big Data in eScience. Researchers need lightweight, user-friendly tools to capture, manage, and curate data throughout the scientific discovery process.

To that end, ISRD has developed the Discovery Environment for Relational Information and Versioned Assets (DERIVA) platform. DERIVA is an ecosystem of open-source tools that together provide a Digital Asset Management System (DAMS) tailored to the needs of biomedical researchers. DERIVA supports:

  • Acquisition and characterization of diverse scientific assets
  • Model-driven organization and discovery of assets
  • Storage and retrieval of eScience data assets
  • Aggregation and exchange of data collections
  • Rights management/access control

via the following technologies:

  • Multi-tenant relational data service for domain models (ERMREST) with a client library (ERMREST JS),
  • Object storage service for assets (HATRAC),
  • Suite of adaptive user interface applications (CHAISE),
  • Suite of utilities for ingest and export of assets and metadata (IObox),
  • Asset aggregation package format (BDBag), and
  • Shared authentication layer (WebAuthN)


ERMREST (rhymes with "earn rest") is a general relational data storage service for web-based, data-oriented collaboration. It allows general entity-relationship modeling of data resources, which are manipulated through RESTful access methods. A client library, ERMREST JS, is also available.
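To make "RESTful access methods" concrete, here is a minimal sketch of how a client might build the URL for an entity query. The host, catalog number, schema, table, and column names are all made up for illustration, and the helper `entity_url` is hypothetical. The path layout (`/ermrest/catalog/<id>/entity/<schema>:<table>/<column>=<value>`) is an assumption about a typical deployment and should be checked against the service documentation.

```python
from urllib.parse import quote


def entity_url(base, catalog, schema, table, filters=None):
    """Build an entity-query URL (illustrative helper, not part of any library)."""
    # Percent-encode every name so reserved characters cannot alter the path.
    url = f"{base.rstrip('/')}/ermrest/catalog/{quote(str(catalog), safe='')}/entity/"
    url += f"{quote(schema, safe='')}:{quote(table, safe='')}"
    # Each filter is appended as one path segment of the form column=value.
    for column, value in (filters or {}).items():
        url += f"/{quote(column, safe='')}={quote(str(value), safe='')}"
    return url


# Hypothetical host, schema, table, and column names.
url = entity_url("https://example.org", 1, "lab", "specimen",
                 {"species": "Mus musculus"})
print(url)
```

An HTTP GET on such a URL would be expected to return the matching rows; the sketch stops at URL construction so that it runs without a server.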



ERMREST JS is the client library for ERMREST.

Git Repo:


Hatrac (pronounced "hat rack") is a simple object storage service for web-based, data-oriented collaboration. It presents a simple HTTP RESTful service model with hierarchical data naming, access control suitable for collaboration, trivial support for browser-based applications, referential stability for immutable data, atomic binding of names to data, and consistent use of distributed data.
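The ideas of hierarchical naming, immutable data with stable references, and atomic binding of names to data can be illustrated with a small in-memory model. This is a toy sketch, not Hatrac's implementation: the class `ToyObjectStore`, the `name:version` reference form, and the content-derived version identifiers are all assumptions made for the example.

```python
import hashlib
import threading


class ToyObjectStore:
    """In-memory model: each name maps to an append-only list of versions."""

    def __init__(self):
        self._versions = {}            # name -> list of (version_id, bytes)
        self._lock = threading.Lock()

    def put(self, name, content):
        # Derive the version id from the content so a reference never
        # changes meaning once it has been handed out.
        version_id = hashlib.sha256(content).hexdigest()[:12]
        with self._lock:               # bind the name to the new data atomically
            self._versions.setdefault(name, []).append((version_id, bytes(content)))
        return f"{name}:{version_id}"

    def get(self, reference):
        # "name" resolves to the latest version; "name:version" to a fixed one.
        name, _, version_id = reference.partition(":")
        with self._lock:
            versions = list(self._versions[name])
        if not version_id:
            return versions[-1][1]
        for vid, content in versions:
            if vid == version_id:
                return content
        raise KeyError(reference)


store = ToyObjectStore()
v1 = store.put("/lab/images/scan1.tiff", b"first upload")
v2 = store.put("/lab/images/scan1.tiff", b"corrected upload")
print(store.get(v1))                        # the old reference still resolves
print(store.get("/lab/images/scan1.tiff"))  # the bare name gives the latest data
```

A versioned reference keeps returning the same bytes after later uploads, which is what makes it safe to cite in metadata records.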

Git Repo:


CHAISE is a model-driven web interface (more formally, a user agent) for data discovery, analysis, visualization, editing, sharing, and collaboration over tabular data (more specifically, relational data) served as Web resources by the ERMREST service. CHAISE dynamically renders relational data resources by combining a small set of baseline assumptions, its rendering heuristics, and user preferences in order to support common user interactions with the data. CHAISE is developed in JavaScript, HTML, and CSS and runs in most modern Web browsers. It is the front-end component of a suite of tools that includes ERMREST, HATRAC, and IObox.

Git Repo:


IObox is a collection of Extract, Transform, Load (ETL) utilities for ERMREST and HATRAC. Extract: connects to a data source (or reads from a file) and generates a data "bag". Transform: takes a "bag", runs transformations, and outputs an updated "bag". Load: takes a "bag" and loads it into a data sink.
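The three stages can be sketched as plain functions that pass a "bag" from one to the next. This is only an illustration of the extract, transform, load pattern: the function names, the CSV input, and the use of a list of dictionaries as the "bag" are assumptions for the example and do not reflect IObox's actual interfaces.

```python
import csv
import io


def extract(source_text):
    """Read CSV text from a source and return a "bag" (here, a list of row dicts)."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(bag):
    """Return an updated bag with trimmed values and lower-cased species names."""
    cleaned = []
    for row in bag:
        new_row = {key: value.strip() for key, value in row.items()}
        new_row["species"] = new_row["species"].lower()
        cleaned.append(new_row)
    return cleaned


def load(bag, sink):
    """Append every row of the bag to a data sink and return the row count."""
    sink.extend(bag)
    return len(bag)


# Hypothetical input file contents and an in-memory list standing in for the sink.
source = "id,species\n1, Mus Musculus \n2,Danio rerio\n"
sink = []
loaded = load(transform(extract(source)), sink)
print(loaded, sink)
```

Because each stage only consumes and produces a bag, stages can be swapped or chained without the others needing to change.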

Git Repo:


Used with IObox (above), BDBag is a specification for asset aggregation packages, or "bags": simple yet powerful mechanisms for specifying, sharing, and managing complex, distributed, large datasets. A robust method for describing data collections (BDBags) is combined with data descriptions (Research Objects) and simple, persistent "minimal identifiers" (Minids) to create a powerful ecosystem of tools and services for big data analysis and sharing.
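As a rough illustration of what a "bag" looks like on disk, the sketch below writes a minimal package in the style of the BagIt format that BDBag builds on: a `data/` payload directory, a `bagit.txt` declaration, and a checksum manifest. It uses only the standard library rather than the BDBag tooling; the helpers `make_minimal_bag` and `verify_bag` are hypothetical, and a real BDBag carries additional metadata that is omitted here.

```python
import hashlib
import tempfile
from pathlib import Path


def make_minimal_bag(bag_dir, files):
    """Write payload files under data/ plus bagit.txt and a SHA-256 manifest."""
    bag_dir = Path(bag_dir)
    (bag_dir / "data").mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for name, content in sorted(files.items()):
        (bag_dir / "data" / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8")
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8")
    return bag_dir


def verify_bag(bag_dir):
    """Return True if every payload file still matches its manifest checksum."""
    bag_dir = Path(bag_dir)
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, relative_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / relative_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = make_minimal_bag(Path(tmp) / "example_bag", {"readings.csv": b"id,value\n1,42\n"})
    print(sorted(p.name for p in bag.iterdir()))
    print(verify_bag(bag))
```

The manifest is what lets a recipient confirm that a shared collection arrived complete and unmodified.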

Git Repo:


WebAuthN is a compact, modular authentication provider framework written to support Python-based, RESTful Web services and is used by ERMREST and HATRAC. It allows deployment-time configuration of several alternative identity and attribute provider modules to establish client security contexts for Web requests by talking to a local or remote provider.
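Deployment-time selection of provider modules can be pictured as a registry keyed by a name taken from configuration. The sketch below is a generic illustration of that pattern only: the registry, the provider classes, the configuration keys, and the shape of the returned security context are all hypothetical and are not WebAuthN's actual API.

```python
PROVIDERS = {}


def register(name):
    """Class decorator that records a provider class under a configuration name."""
    def decorator(cls):
        PROVIDERS[name] = cls
        return cls
    return decorator


@register("static")
class StaticProvider:
    """Hypothetical provider that looks up clients in a fixed token table."""

    def __init__(self, users):
        self.users = users             # token -> {"id": ..., "groups": [...]}

    def security_context(self, headers):
        user = self.users.get(headers.get("Authorization", ""))
        if user is None:
            return {"client": None, "attributes": []}
        return {"client": user["id"], "attributes": list(user["groups"])}


@register("anonymous")
class AnonymousProvider:
    """Hypothetical provider that treats every request as unauthenticated."""

    def security_context(self, headers):
        return {"client": None, "attributes": []}


def build_provider(config):
    # The deployment configuration names the provider and supplies its options.
    return PROVIDERS[config["provider"]](**config.get("options", {}))


# Hypothetical deployment configuration.
config = {"provider": "static",
          "options": {"users": {"token-abc": {"id": "alice", "groups": ["curators"]}}}}
provider = build_provider(config)
context = provider.security_context({"Authorization": "token-abc"})
print(context)
```

Switching a deployment to a different provider then means changing the configuration rather than the service code that consumes the security context.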

Git Repo: