Invited Lecture: INCF 2013 - Using experimental design to design neuroinformatics data structures
Workshop: SciKnowMine 2013 - Bridging BioNLP and Biocuration
Biological Natural Language Processing ('BioNLP') holds great promise to support and accelerate biocuration (organizing published biomedical knowledge into online resources such as databases) but has not yet generated viable open technology for use within the community. This is an area of active research and is the subject of shared evaluations such as 'BioCreative 4'. As the closing meeting of an NSF-funded infrastructure project (called 'SciKnowMine', #0849977), we held a workshop to (A) present an implementation of a system for document triage that we are currently deploying to the Mouse Genome Informatics (MGI) system, and (B) present and develop a strategic plan for open-source, community-driven tools that bridge between curators committed to improving the quality of their informatics resources and computer science specialists developing novel NLP technology. The meeting was well attended by experts from both communities. In keeping with this blog's vision of examining the issues inherent in developing scientific breakthroughs by explicitly describing the paradigms that different disciplines inhabit, the workshop was designed around the theme of finding connecting points between these two interdependent paradigms.
"Introducing paradigms as a viable structural guide for biomedical knowledge engineering"
Following Thomas Kuhn's seminal 1962 book, in which he introduced the notion of scientific paradigms, we here describe a computational methodology that leverages that concept in a concrete formulation. I describe this approach partly as a methodology for framing and scoping the knowledge representation and analysis work necessary to build tools that serve a specific community. However, this approach also has technical implications that are relevant to semantic web representations, the use of workflows and reasoning, and the way that we derive content from existing scientific artefacts. We will explore this viewpoint in the context of a well-defined domain problem (biomarker studies of neurodegenerative diseases) with the strategic intent of developing a practical, scoped view of biomarker data that could serve as the basis of corollary work within AI computer science groups.
"Organizing the world’s scientific knowledge to make it universally accessible and powerful: Building the Breakthrough Machine"
Video Link: http://www.youtube.com/watch?v=DU5HRck4bn4
Not all information is created equal. Accurate, innovative scientific knowledge generally has an enormous impact on humanity. It is the source of our ability to make predictions about our environment. It is the source of new technology (with all its attendant consequences, both positive and negative). It is also a continuous source of wonder and fascination. In general, the value and power of scientific knowledge are not reflected in the scale and structure of the information infrastructure used to house, store and share this knowledge. Many scientists use spreadsheets as their most sophisticated data management tool and publish their data only as PDF files in the literature. In this high-level talk, we describe a powerful new knowledge engineering framework for describing scientific observations within a broader strategic model of the scientific process. We describe general open-source tools for scientists to model and manage their data in an attempt to accelerate discovery. Using examples focussed on a high-value challenge problem, finding a cure for Parkinson's Disease, we present a high-level strategic approach that is in keeping with Google's vision and values and that could also provide a viable new research direction that would benefit from Google's massively scalable technology. Ultimately, we present an informatics research initiative for the 21st century: 'Building a Breakthrough Machine'.