SportsBuzzer: Detecting Events at Real Time in Twitter using Incremental Clustering
Abstract

In the recent past, Twitter users have come to be regarded as social sensors who can report events, and Twitter has been widely used to detect social and physical events such as earthquakes and traffic jams. Real-time event detection in Twitter is the process of detecting events from the live tweet stream as soon as they happen. Real-time event detection from sports tweets, such as those about Cricket, is an interesting yet complex problem, because an event detection system needs to collect live sports tweets and rapidly detect key events, such as a boundary or a catch, while the game is ongoing. In this paper, a novel framework is proposed for detecting key events in real time from live tweets in the Cricket sports domain. Feature vectors of live tweets are created using the TF-IDF representation, and tweet clusters are discovered using Locality Sensitive Hashing (LSH); the post rate of each cluster is then computed from its volume of tweets. If the post rate is above a predefined threshold, a key event is recognized from that cluster using our domain-specific event lexicon for Cricket. The predefined threshold helps to filter out small spikes in tweet volume. The proposed real-time event detection algorithm is extensively evaluated on live tweets from 2017 IPL T20 Cricket using the ROC evaluation measure. The experimental results show that the LSH approach detects sports events with a true positive rate of nearly 90% and a false positive rate of around 10%. The results also demonstrate the influence of different parameters on the accuracy of event detection.
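The pipeline summarized above (TF-IDF vectors, LSH clustering, post-rate thresholding, lexicon lookup) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The random-hyperplane (sign-of-projection) LSH scheme, the smoothed IDF formula, the single fixed time window, the tiny `EVENT_LEXICON`, and all function and parameter names are assumptions made for illustration only.

```python
import hashlib
import math
from collections import Counter, defaultdict

# Hypothetical mini-lexicon; the paper's actual Cricket lexicon is not shown here.
EVENT_LEXICON = {"six": "boundary", "four": "boundary", "boundary": "boundary",
                 "catch": "catch", "caught": "catch", "wicket": "wicket"}


def tfidf_vectors(tweets):
    """Return one sparse TF-IDF dict (term -> weight) per tweet."""
    docs = [t.lower().split() for t in tweets]
    df = Counter(term for doc in docs for term in set(doc))
    n = len(docs)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1
        vectors.append({t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors


def _component(bit, term):
    """Pseudo-random +1/-1 hyperplane component, stable across runs."""
    digest = hashlib.md5(f"{bit}:{term}".encode("utf-8")).digest()
    return 1.0 if digest[0] & 1 else -1.0


def lsh_signature(vector, n_bits=8):
    """Sign-of-projection signature over n_bits pseudo-random hyperplanes."""
    bits = []
    for bit in range(n_bits):
        projection = sum(w * _component(bit, term) for term, w in vector.items())
        bits.append("1" if projection >= 0 else "0")
    return "".join(bits)


def detect_events(tweets, window_seconds, rate_threshold, n_bits=8):
    """Bucket tweets by LSH signature and label buckets whose post rate is high."""
    buckets = defaultdict(list)
    for tweet, vec in zip(tweets, tfidf_vectors(tweets)):
        buckets[lsh_signature(vec, n_bits)].append(tweet)
    events = []
    for members in buckets.values():
        post_rate = len(members) / window_seconds  # tweets per second
        if post_rate <= rate_threshold:
            continue  # small spike in volume: filtered out
        # Majority vote over lexicon hits decides the event label.
        votes = Counter(EVENT_LEXICON[w] for m in members
                        for w in m.lower().split() if w in EVENT_LEXICON)
        if votes:
            events.append((votes.most_common(1)[0][0], len(members)))
    return events
```

Calling `detect_events(tweets, window_seconds=60, rate_threshold=0.05)` on one window of tweets returns a list of `(event_label, cluster_size)` pairs. Tweets with identical text always share a bucket, while near-duplicates share one only with some probability; a larger `n_bits` makes buckets stricter. A streaming system would also need to update clusters incrementally across windows, which this sketch leaves out.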