Publications
NiW: Converting Notebooks into Workflows to Capture Dataflow and Provenance.
Abstract
Interactive notebooks are increasingly popular among scientists to expose computational methods and share their results. However, it is often challenging to track their dataflow, and therefore the provenance of their results. This paper presents an approach to convert notebooks into scientific workflows that explicitly capture the dataflow across software components and facilitate tracking the provenance of new results. In our approach, users first write notebooks according to a set of guidelines that we have designed, and then use an automated tool to generate workflow descriptions from the modified notebooks. Our approach is implemented in NiW (Notebooks into Workflows), and we demonstrate its use by generating workflows from third-party notebooks. The resulting workflow descriptions have explicit dataflow, which facilitates tracking the provenance of new results, comparison of workflows, and sub-workflow mining. Our guidelines can also be used to improve the understandability of notebooks by making the dataflow more explicit.
- Date
- 2017
- Authors
- Lucas AMC Carvalho, Regina Wang, Yolanda Gil, Daniel Garijo
- Conference
- K-CAP Workshops
- Pages
- 12-16