Publications

Democratizing data science through data science training

Abstract

The biomedical sciences have experienced an explosion of data which promises to overwhelm many current practitioners. Without easy access to data science training resources, biomedical researchers may find themselves unable to wrangle their own datasets. In 2014, to address the challenges posed by such a data onslaught, the National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative. To this end, the BD2K Training Coordinating Center (TCC; bigdatau.org) was funded to facilitate both in-person and online learning, and open up the concepts of data science to the widest possible audience. Here, we describe the activities of the BD2K TCC and its focus on the construction of the Educational Resource Discovery Index (ERuDIte), which identifies, collects, describes, and organizes online data science materials from BD2K awardees, open online courses, and videos from scientific …

Date
September 10, 2025
Authors
John Darrell Van Horn, Lily Fierro, Jeana Kamdar, Jonathan Gordon, Crystal Stewart, Avnish Bhattrai, Sumiko Abe, Xiaoxiao Lei, Caroline O’Driscoll, Aakanchha Sinha, Priyambada Jain, Gully Burns, Kristina Lerman, José Luis Ambite
Journal
Pacific Symposium on Biocomputing
Volume
23
Pages
292