SciKnowMine Release Workshop - Bridging BioNLP and Biocuration
Biological Natural Language Processing ('BioNLP') holds great promise to support and accelerate biocuration (organizing published biomedical knowledge into online resources such as databases) but has not yet generated viable open technology for use within the community. This is an area of active research and is the subject of shared evaluations such as 'BioCreative 4'. As the closing meeting of an NSF-funded infrastructure project (called 'SciKnowMine', #0849977), we intend to (A) present an implementation of a system for document triage that we are currently deploying to the Mouse Genome Informatics (MGI) system, and (B) present and develop a strategic plan for open-source, community-driven tools that bridge the gap between curators committed to improving the quality of their informatics resources and computer science specialists developing novel NLP technology.
We illustrate this concept in Figure 1 below and seek to define and provide viable points of interaction between the two communities that can facilitate and support biocuration in active database systems.
Figure 1: Schema for the development of bridge technology between biocurators and NLP researchers.
The workshop will run from USC's Information Sciences Institute and we will present talks from key stakeholders in this activity: (A) the developers of the SciKnowMine system; (B) NLP computer scientists developing novel applications based on text from articles curated by MGI; and (C) representatives of a number of biomedical databases, who will describe their requirements, along with other stakeholders (such as publishers and the organizers of the BioCreative evaluation). We will present a half-day of presentations followed by panel discussions to plan and develop open-source strategies for developing biocuration systems based on the SciKnowMine platform.
Date: Monday, August 19th, 2013
Location: 'Bayview room', Marina del Rey Marriott, 4100 Admiralty Way, Marina del Rey, CA 90292.
Contact and Organizer: Gully Burns ([email protected])
Complete Video Coverage of the Workshop
The program consisted mainly of 25-minute presentations, each followed by 5 minutes for questions. We finished the day with an informative group discussion focused on setting up collaborative projects and next steps.
Full PDF download of the program + abstracts
All videos are shared on USC/ISI's YouTube Channel
Meeting Attendees
Attendees consisted of NLP specialists, infrastructure creators, biocurators and students.
- Gully Burns (ISI, Biomedical Knowledge Engineering Group)
- Ellen Riloff (U Utah, Riloff Group)
- Maximilian Häussler (UCSC, Genome Browser)
- Kevin Bretonnel Cohen (U Colorado, Hunter Group)
- James Gung (U Colorado Boulder, CLEAR Group)
- Robin Haw (Ontario Institute for Cancer Research, Reactome Group)
- Donghui Li (Stanford University, TAIR Group)
- Paul Lloyd (Stanford University, Cherry Group)
- Cecilia Arighi (University of Delaware, Protein Information Resource)
- Jack Gardiner (University of Arizona & Iowa State University, MaizeDB)
- William Gunn (Mendeley)
- Janan Eppig (Jackson Labs, Mouse Genome Informatics)
- Jim Kadin (Jackson Labs, Mouse Genome Informatics)
- Harold Drabkin (Jackson Labs, Mouse Genome Informatics)
- Judy Blake (Jackson Labs, Mouse Genome Informatics)
- Weisong Liu (Medical College of Wisconsin, Rat Genome Database)
- Jim Hu (Texas A&M University, Hu Group)