Artificial Intelligence

OntoGene & SASEBio: biomedical text mining research at UZH

Friday, February 14, 2014, 11:00am - 12:00pm PDT
11th fl Large CR (rm 1135)
AI Seminar
Fabio Rinaldi, MSc, PhD - University of Zurich



There are vast amounts of knowledge encoded in the scientific literature which could be made more easily accessible and useful to a broader range of users through the application of more effective software tools. Text mining is a new discipline which seeks to provide ways to find, extract and manipulate the knowledge which still remains to a large extent hidden in the literature.

Text mining tools can already extract some specific types of information very effectively, but they are not yet advanced enough for their results to be used without verification by domain experts. One very promising area of application for text mining technologies is therefore the process of database curation.

Efficient retrieval of key information derived from experimental results and published in the scientific literature is of fundamental importance in biology. To help biologists, and in some cases medical practitioners, find such information in the enormous quantity of published articles, several public and private institutions fund the construction and maintenance of specialized databases, whose role is to collect specific knowledge items and provide them in an easily accessible format. There are several dozen such databases, each specializing in a particular domain of the life sciences [1].

In this talk I will describe text mining activities conducted by my research group at the University of Zurich (OntoGene). The OntoGene group is supported by the Swiss National Science Foundation (project SASEBio: Semi-Automated Semantic Enrichment of the Biomedical Literature) and by Roche Pharmaceuticals. The SASEBio project focuses in particular on applications of text mining technologies to the process of biomedical database curation.

The OntoGene team has participated in several competitive evaluations of biomedical text mining technologies, obtaining strong results in all of them. Some of these results will be discussed in the talk. Additionally, I will present ODIN (OntoGene Document Inspector), an interactive tool which allows database curators to use the results of the OntoGene text mining system in their curation tasks.

[1] Xosé M. Fernández-Suárez, Daniel J. Rigden, and Michael Y. Galperin. The 2014 Nucleic Acids Research Database Issue and an updated NAR online Molecular Biology Database Collection. Nucleic Acids Research, 42(D1):D1-D6, 2014.

Short Bio

Fabio Rinaldi is the leader of the OntoGene research group at the University of Zurich and the principal investigator of the SASEBio project. He holds an MSc in Computer Science (University of Udine, Italy) and a PhD in Computational Linguistics (University of Zurich, Switzerland). He is the author of more than 100 scientific publications (including 19 journal papers) on topics such as ontologies, text mining, text classification, document and knowledge management, language resources, and terminology.
