Learning from Biased Data

Friday, May 24, 2019, 11:00 am - 12:00 pm PDT
10th floor conference room (1016)
This event is open to the public.
AI Seminar
Kristina Lerman, USC-ISI

Social data is often heterogeneous: it is generated by diverse subgroups, each with its own traits and behaviors. Correlations among these traits, the outcomes of interest, and the way the data is collected can bias the data and confound analysis. One consequence of bias is Simpson’s paradox, in which a population-level trend in aggregate data disappears or reverses when the same data is disaggregated by its underlying subgroups. As a result, models learned on biased data may fail to generalize to new populations or may fail on specific subgroups. Despite these challenges, I argue that data biases are a source of information, not just an issue that requires correction, because their presence reveals important behavioral differences within the population. When correctly identified, biases can be used as a tool for data-driven discovery. I describe recent computational efforts to address this problem, including an algorithm that systematically disaggregates data to identify similarly behaving subgroups. I illustrate with examples that show how accounting for biases in data can lead to new insights into social behavior, including recognizing the role of bounded rationality in online information diffusion and identifying the impact of cognitive depletion on performance.
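
For readers unfamiliar with Simpson’s paradox, the reversal described above can be reproduced with a few lines of Python. This is only a toy sketch and is not taken from the talk: the two groups "A" and "B", their data points, and the helper function `slope` are all invented for illustration. It fits an ordinary least-squares slope to the pooled data and then to each group separately.

```python
def slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    return cov / var


# Synthetic, hand-made data (an assumption for illustration only).
# Within each group, y falls as x rises, but group B sits higher on both axes.
groups = {
    "A": [(1, 8), (2, 7), (3, 6), (4, 5)],
    "B": [(6, 13), (7, 12), (8, 11), (9, 10)],
}

# Aggregate view: ignore group membership and pool every point.
pooled = [point for points in groups.values() for point in points]
aggregate_slope = slope(pooled)

# Disaggregated view: fit each group on its own.
group_slopes = {name: slope(points) for name, points in groups.items()}

print(f"aggregate slope: {aggregate_slope:+.2f}")
for name, value in group_slopes.items():
    print(f"group {name} slope: {value:+.2f}")
```

In this toy data the pooled trend is positive while the trend inside each group is negative, because group membership is correlated with both variables. Real social data is rarely labeled with the relevant subgroups in advance, which is why identifying them from the data is the harder problem.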


Kristina Lerman is a Principal Scientist at the University of Southern California Information Sciences Institute and holds a joint appointment as a Research Associate Professor in the USC Computer Science Department. Trained as a physicist, she now applies network analysis and machine learning to problems in computational social science, including crowdsourcing and the analysis of social networks and social media. Her recent work on modeling and understanding cognitive biases in social networks has been covered by the Washington Post, the Wall Street Journal, and MIT Technology Review.

