Contextual Bandits in a Collaborative Environment

Friday, February 9, 2018, 3:00 pm - 4:00 pm PST
Conf. Rm #1135
This event is open to the public.
NL Seminar
Hongning Wang (University of Virginia)

Abstract: Contextual bandit algorithms provide principled online learning solutions to find optimal trade-offs between exploration and exploitation with companion side-information. They have been extensively used in various important practical scenarios, such as display advertising and content recommendation. A common practice is to estimate the unknown bandit parameters of each user independently. Unfortunately, this ignores dependencies among users and thus leads to suboptimal solutions, especially in applications with strong social components.

In this talk, I will introduce our newly developed collaborative contextual bandit algorithm, in which the adjacency graph of users is leveraged to share context and payoffs among neighboring users during online updating. We rigorously prove an improved upper regret bound for the proposed collaborative bandit algorithm compared to conventional independent bandit algorithms. More importantly, we also prove that the user dependency relation only needs to be time-invariant for a sublinear upper regret bound to remain achievable in such an algorithm. This enables online estimation of user dependencies. Extensive experiments on synthetic data and three large-scale real-world datasets verified the improvement of our proposed algorithm over several state-of-the-art contextual bandit algorithms. In addition, I will also cover our recent progress in online matrix factorization, optimizing long-term user engagement, and bandit learning in a non-stationary environment.
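
To make the idea of sharing context and payoffs across a user graph concrete, here is a minimal Python sketch. It is not the algorithm presented in the talk. It is a simplified LinUCB-style learner, written under our own assumptions, in which each observation also updates the ridge-regression statistics of the acting user's neighbors, scaled by a graph weight. The class name GraphSharedLinUCB, the weight matrix layout, and the parameters alpha and ridge are all hypothetical choices made for illustration.

```python
import numpy as np


class GraphSharedLinUCB:
    """Illustrative sketch only: one ridge-regression model per user, where
    each observation is also credited to the user's graph neighbors."""

    def __init__(self, weights, dim, alpha=0.5, ridge=1.0):
        # weights[i, j]: how strongly feedback from user i counts for user j
        # (hypothetical convention; a zero entry means no sharing).
        self.W = np.asarray(weights, dtype=float)
        n_users = self.W.shape[0]
        self.alpha = alpha  # exploration strength (assumed value)
        # Per-user statistics: A = ridge * I + sum of w * x x^T, b = sum of w * r * x
        self.A = np.stack([ridge * np.eye(dim) for _ in range(n_users)])
        self.b = np.zeros((n_users, dim))

    def select(self, user, arms):
        """Return the index of the arm with the highest upper confidence bound."""
        arms = np.asarray(arms, dtype=float)
        A_inv = np.linalg.inv(self.A[user])
        mean = arms @ (A_inv @ self.b[user])
        # Confidence width sqrt(x^T A^{-1} x), computed for every arm at once.
        width = np.sqrt(np.einsum("kd,de,ke->k", arms, A_inv, arms))
        return int(np.argmax(mean + self.alpha * width))

    def update(self, user, x, reward):
        """Share the observed (context, payoff) pair with weighted neighbors."""
        x = np.asarray(x, dtype=float)
        for j in np.flatnonzero(self.W[user]):
            w = self.W[user, j]
            self.A[j] += w * np.outer(x, x)
            self.b[j] += w * reward * x
```

In this sketch, passing an identity matrix as the weights recovers the conventional setting in which every user is learned independently, while off-diagonal weights let a user's model benefit from feedback collected on neighboring users.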

Bio: Dr. Hongning Wang is an Assistant Professor in the Department of Computer Science at the University of Virginia. He received his Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign in 2014. His research generally lies at the intersection of machine learning, data mining, and information retrieval, with a special focus on computational user intent modeling. His work has generated over 40 research papers in top venues in the data mining and information retrieval areas. He is a recipient of a 2016 National Science Foundation CAREER Award and a 2014 Yahoo Academic Career Enhancement Award.
