Seminars and Events
Texera: An Open-Source System for Cloud-Based Collaborative Data Science and AI/ML Using Workflows
Event Details
Speaker: Chen Li, UC Irvine
Location: Virtual via Zoom
Join Zoom Meeting
https://usc.zoom.us/j/95166882238?pwd=id334Bxxz7ZULMFpYWuHEppmFKlfUd.1
Meeting ID: 951 6688 2238
Passcode: 2025
Since 2016, our team at UC Irvine has been developing the open-source Texera system (texera.io), with the goal of providing a cloud-based platform for collaborative data science, AI, and ML. It allows users with various backgrounds, including those with limited coding skills, domain scientists, and ML experts, to conduct AI-centric data science with a collaboration experience similar to Google Docs. After eight years of development, the system has a rich set of features, such as shared editing, shared execution, version control, commenting, debugging, user-defined functions in multiple languages (e.g., Python, R, Java), and support for state-of-the-art AI/ML techniques. Its parallel backend engine enables scalable computation on large datasets using computing clusters. It allows bioinformaticians to elastically request resources from AWS to form a cluster for running computationally intensive jobs. It also supports community-based sharing of resources, including datasets and workflows. In this talk, we will give an overview of the system and focus on the research challenges encountered during its development and our solutions. We will also show use cases in both the education and scientific communities.