Understanding content popularity and detection of malicious behavior in online social media

Friday, November 3, 2017, 10:30 am - 11:30 am PST
6th floor large conference room (note this will be a remote interview)
This event is open to the public.
AI Seminar - Interview talk
Suman Kalyan Maity

In this talk, I will focus on two important aspects of online social media: (i) understanding the content popularity of social media entities such as hashtags on Twitter, topics on Quora, and tags on StackOverflow, and (ii) understanding the characteristics of malicious behavior in online social networks (OSNs). The emergence and adoption of social media entities is an interesting phenomenon. I will focus in particular on the emergence and adoption of hashtag compounding (two hashtags merged to form a new hashtag). For example, the e-commerce company Amazon used #AmazonPrimeDay to promote a discounted sale of its products. The hashtag is a compound of #Amazon and #PrimeDay, while the individual hashtag #PrimeDay was also popular. So there is a trade-off between using a hashtag compound and using its uncompounded constituents. Hashtag compounds also serve communicative intents, as in political campaign hashtags (#PresidentTrump = #President + #Trump). Hashtag compounding also happens spontaneously. Such hashtags are generally conversational or personal in theme, like #TheBestFeelingInARelationship (#TheBestFeeling + #InARelationship), #ThrowbackThursday (#Throwback + #Thursday), and #ComeOnNowDontLie (#ComeOnNow + #DontLie). While some of these compounds gain a high frequency of usage over time (even higher than their individual constituents), many are soon lost to oblivion. We investigate the reasons behind these observations and propose a prediction model that can identify, with 77.07% accuracy, whether a pair of hashtags compounding in the near future will become popular. This technique has strong implications for trending hashtag recommendation, since newly formed hashtag compounds can be recommended early, even before the compounding has taken place. Next, I will discuss the detection of malicious content, focusing on spam detection and on rumor spread and propagation on Twitter.
In this context, we develop a semi-supervised embedding framework based on biased random walks that can efficiently detect spammer nodes in the network. Further, for rumor detection, we design a deep learning framework using LSTM and CNN architectures that leverages both the content of the posts and the propagation tree of retweets and replies, and achieves a high accuracy of ~90%, outperforming existing methods.

Suman Kalyan Maity is currently an IBM Ph.D. Fellow at the Dept. of Computer Science and Engineering, Indian Institute of Technology (IIT) Kharagpur, India. Prior to that, he was a Microsoft Research India Ph.D. Fellow from 2014 to 2016. He received his Master's in Computer Science and Engineering from IIT Kharagpur, India, in 2014 with a thesis on "Aspects of Opinion Formation in Social Networks". His current research interests lie at the intersection of computational social science, NLP, and machine learning.

