Publications

Probabilistic Visitor Stitching on Cross-Device Web Logs

Abstract

Personalization -- the customization of experiences, interfaces, and content to individual users -- has catalyzed user growth and engagement for many web services. A critical prerequisite to personalization is establishing user identity. However, the variety of devices from which users access web services, including mobile phones, appliances, and smart watches, in both anonymous and logged-in sessions, poses a significant obstacle to user identification. The resulting entity resolution task of establishing user identity across devices and sessions is commonly referred to as "visitor stitching." We introduce a general, probabilistic approach to visitor stitching using features and attributes commonly contained in web logs. Using web logs from two real-world corporate websites, we motivate the need for probabilistic models by quantifying the difficulties posed by noise, ambiguity, and missing information in deployment …

Date
March 4, 2026
Authors
Sungchul Kim, Nikhil Kini, Jay Pujara, Eunyee Koh, Lise Getoor
Conference
Proceedings of the 26th International Conference on World Wide Web
Pages
1581-1589