Seminars and Events

ISI Natural Language Seminar

Computational Models of Language Change from Diachronic Text

Event Details

Natural languages change over time. Modeling language change can help uncover the latent social factors that modulate it and, in particular, answer key social science questions such as who talks to whom, who leads, and who follows. In this talk, I’ll present our work over the years that uses timestamped (also called diachronic) text to link social influence or leadership with language change, combining methods from computational linguistics, machine learning, and network science. First, I show that network influence exerted through strong ties leads to higher adoption of non-standard terms on Twitter’s communication network. Next, I propose a method to identify documents at the forefront of semantic change and show that such documents are more influential, as measured by the citations they receive, in two domains: a collection of legal documents and a set of scientific abstracts. Finally, I introduce a method to induce a semantic leadership network among 19th-century abolitionist newspapers, which helps quantitatively highlight the important role played by Black and women editors in the abolitionist movement. Together, these studies demonstrate the variety of domains in which the study of language change is relevant and how computational modeling can help determine the latent influence and leadership relations between sources of interest.

Speaker Bio

Sandeep Soni is a PhD candidate at the Georgia Institute of Technology. His research interests are in computational social science and digital humanities, with an emphasis on using text as data and on computational linguistics methods. His PhD thesis focuses on developing methods that use language change to systematically infer latent influence relationships from language data. He is currently on the job market, looking for postdoc and permanent positions.