ISRD develops systems that address the unique challenges of managing Big Data in eScience. Researchers need lightweight, user-friendly tools to capture, manage, and curate data throughout the scientific discovery process.
To that end, ISRD has developed the Discovery Environment for Relational Information and Versioned Assets (DERIVA) platform. DERIVA is an ecosystem of open source tools that together provide a Digital Asset Management System (DAMS) that addresses the needs of biomedical researchers. DERIVA supports:
- Acquisition and characterization of diverse scientific assets
- Model-driven organization and discovery of assets
- Storage and retrieval of eScience data assets
- Aggregation and exchange of data collections
- Rights management/access control
via the following technologies:
- Multi-tenant relational data service for domain models (ERMREST) with a client library (ERMREST JS),
- Object storage service for assets (HATRAC),
- Suite of adaptive user interface applications (CHAISE),
- Suite of utilities for ingest and export of assets and metadata (IObox),
- Asset aggregation package format (BDBag), and
- Shared authentication layer (WebAuthN).
ERMREST (rhymes with "earn rest") is a general relational data storage service for web-based, data-oriented collaboration. It supports general entity-relationship modeling of data resources, which are manipulated through RESTful access methods. A client library, ERMREST JS, is also available.
ERMREST Git Repo: https://github.com/informatics-isi-edu/ermrest
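As a rough illustration of what RESTful access to entity-relationship data can look like, the sketch below builds a URL that addresses the rows of one table, optionally narrowed by column filters. The helper `entity_url`, the host `example.org`, and the schema, table, and column names are all hypothetical, and the exact path layout is an assumption; the repository above is the authority on the actual interface.

```python
from urllib.parse import quote


def entity_url(base, catalog, schema, table, filters=None):
    """Build a URL addressing rows of one table as entities.

    Assumed layout (not confirmed by this page):
        <base>/catalog/<id>/entity/<schema>:<table>/<column>=<value>/...
    """
    url = "{}/catalog/{}/entity/{}:{}".format(
        base.rstrip("/"),
        quote(str(catalog), safe=""),
        quote(schema, safe=""),
        quote(table, safe=""),
    )
    # Each filter becomes one more path element; values are percent-encoded.
    for column, value in (filters or {}).items():
        url += "/{}={}".format(quote(column, safe=""), quote(str(value), safe=""))
    return url


# Hypothetical catalog, schema, table, and column names.
print(entity_url("https://example.org/ermrest", 1, "lab", "specimen",
                 {"species": "Mus musculus"}))
# prints https://example.org/ermrest/catalog/1/entity/lab:specimen/species=Mus%20musculus
```

A client would then issue ordinary HTTP requests (GET to read, POST or PUT to write) against such a URL.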
ERMREST JS is the client library for ERMREST.
Hatrac (pronounced "hat rack") is a simple object storage service for web-based, data-oriented collaboration. It presents a simple HTTP RESTful service model with hierarchical data naming, access control suitable for collaboration, trivial support for browser-based applications, referential stability for immutable data, atomic binding of names to data and consistent use of distributed data.
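To make the ideas of hierarchical naming, immutable data, and atomic name binding more concrete, here is a toy in-memory model. It is not Hatrac's implementation or API; the class `ToyObjectStore`, its methods, and the use of a truncated SHA-256 digest as a version identifier are assumptions made purely for illustration.

```python
import hashlib


class ToyObjectStore:
    """In-memory stand-in for an object store with immutable versions."""

    def __init__(self):
        self._versions = {}  # (name, version) -> bytes, never overwritten
        self._current = {}   # name -> version currently bound to the name

    def put(self, name, data):
        """Store data under a hierarchical name and return its version id."""
        if not name.startswith("/"):
            raise ValueError("names are hierarchical paths starting with '/'")
        version = hashlib.sha256(data).hexdigest()[:12]
        # Stored content is immutable: an existing version is left untouched.
        self._versions.setdefault((name, version), bytes(data))
        # Rebinding the name is a single assignment, so readers see either
        # the old version or the new one, never a partial write.
        self._current[name] = version
        return version

    def get(self, name, version=None):
        """Fetch the current version, or a specific older one."""
        return self._versions[(name, version or self._current[name])]

    def list_names(self, prefix):
        """List object names under a namespace prefix."""
        root = prefix.rstrip("/") + "/"
        return sorted(n for n in self._current if n.startswith(root))


store = ToyObjectStore()
name = "/lab/images/scan-001.tiff"  # hypothetical object name
v1 = store.put(name, b"first upload")
v2 = store.put(name, b"corrected upload")
print(store.get(name))        # the name now resolves to the newer data
print(store.get(name, v1))    # the older version stays retrievable
```

References that include a version identifier keep resolving to the same bytes after the name is rebound, which is the sense of referential stability for immutable data.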
IObox is a collection of Extract, Transform, Load (ETL) utilities for ERMREST and Hatrac. Extract: Connects to a data source (or reads from a file) and generates a data "bag". Transform: Takes a "bag", runs transformations, and outputs an updated "bag". Load: Takes a "bag" and loads it into a data sink.
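The three stages above can be sketched as a small pipeline. This is not IObox code: the functions `extract`, `transform`, and `load` are hypothetical, the "bag" is reduced to a plain dictionary of rows, and the sink is an ordinary list standing in for a real data service.

```python
import csv
import io


def extract(csv_text):
    """Extract: read rows from a CSV source and wrap them in a 'bag'."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {"rows": rows}


def transform(bag):
    """Transform: normalize column names and trim whitespace, returning a new bag."""
    cleaned = [
        {key.strip().lower(): value.strip() for key, value in row.items()}
        for row in bag["rows"]
    ]
    return {"rows": cleaned}


def load(bag, sink):
    """Load: append the bag's rows to a sink and report how many were loaded."""
    sink.extend(bag["rows"])
    return len(bag["rows"])


source = "Name, Species\nm1 , mouse\nz7, zebrafish\n"  # hypothetical input
sink = []
count = load(transform(extract(source)), sink)
print(count, sink)
```

Because each stage consumes and produces the same bag shape, stages can be chained, replaced, or run separately without the others needing to change.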
Used with IObox (above), BDBag is a specification for asset aggregation packages, or “bags”: simple yet powerful mechanisms for specifying, sharing, and managing complex, distributed, large datasets. The approach combines a simple and robust method for describing data collections (BDBags), data descriptions (Research Objects), and simple persistent "minimal identifiers" (Minids) to create a powerful ecosystem of tools and services for big data analysis and sharing.
BDBag Git Repo: https://github.com/ini-bdds/bdbag
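BDBag builds on the BagIt packaging convention, in which a bag is a directory holding a payload folder plus checksum manifests. The sketch below assembles and verifies a heavily simplified bag of that shape using only the Python standard library. It does not use the bdbag tools and omits most of what a real BDBag carries (metadata, remote file references, Research Object descriptions); the helper names `make_toy_bag` and `verify_toy_bag` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def make_toy_bag(bag_dir, files):
    """Write payload files under data/ and record their SHA-256 checksums."""
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True, exist_ok=True)
    manifest = []
    for name, payload in sorted(files.items()):
        (data / name).write_bytes(payload)
        manifest.append("{}  data/{}".format(hashlib.sha256(payload).hexdigest(), name))
    # Declaration file identifying the directory as a bag.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n"
    )
    (bag / "manifest-sha256.txt").write_text("\n".join(manifest) + "\n")
    return bag


def verify_toy_bag(bag_dir):
    """Return True if every payload file still matches its manifest checksum."""
    bag = Path(bag_dir)
    for line in (bag / "manifest-sha256.txt").read_text().splitlines():
        digest, rel_path = line.split(None, 1)
        if hashlib.sha256((bag / rel_path).read_bytes()).hexdigest() != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = make_toy_bag(Path(tmp) / "my-bag", {"results.csv": b"id,value\n1,42\n"})
    print(verify_toy_bag(bag))  # checksums match right after creation
```

The manifest is what lets a recipient confirm that a transferred collection is complete and unmodified before analysis begins.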
WebAuthN is a compact, modular authentication provider framework written to support Python-based, RESTful Web services and is used by ERMREST and HATRAC. It allows deployment-time configuration of several alternative identity and attribute provider modules to establish client security contexts for Web requests by talking to a local or remote provider.