A large fraction of internet social media content is found in thousands of specialized communities hosted by news outlets, typically in the form of reader forums or comments on news articles. The users of such a site are said to form a vertical social community (VSC), because they engage deeply with a single media source. While each VSC is tiny compared to broad communities such as Facebook, VSCs are important because they expose how different segments of society feel about various world events. Their content can be a very useful resource for downstream intelligence and predictive analytics. However, current web crawlers cannot effectively access VSCs. Thus their data is invisible to search engines and remains hidden from analytics tools. The goals of this project are to enable effective access to the vertical social communities that coalesce around online news reports, and to mine their comments and debates. This project will provide researchers with tools to collect and analyze data from these communities. The educational component of the project includes training and involving graduate and undergraduate students in research, and incorporating research projects and results into courses.
The researchers will develop algorithms to unearth the content generated in thousands of vertical social communities and make it transparently accessible to data management and analytics tools. The researchers will develop novel deep learning techniques for content detection, and build a novel scalable end-to-end system for real-time access and collective mining of these communities, capable of handling large parallel data streams based on shifting ideas. The specific algorithms will include user population estimation, bootstrap communication patterns for automatic crawling of content, and fine-grained sentiment analysis for intelligence and predictive analytics. Software tools will be made available to researchers in academe and industry. Distribution of free, open-source software for implementing the techniques developed will enhance existing research infrastructure.
Cannot Predict Comment Volume of a News Article before (a few) Users Read It. The International AAAI Conference on Web and Social Media. 2021.
Predicting Personal Opinion on Future Events with Fingerprints. International Conference on Computational Linguistics (COLING'20). Dec. 2020. [acceptance rate: 33.4%]
Claim Verification under Positive Unlabeled Learning. The International Conference on Advances in Social Network Analysis and Mining (ASONAM'20). Dec. 2020.
Birds of a Feather Flock Together: Satirical News Detection via Language Model Differentiation. International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction. 2020.
Stance Prediction for Contemporary Issues: Data and Experiments. The 8th International Workshop on Natural Language Processing for Social Media. July 9-10, 2020.
Pro/Con: Neural Detection of Stance in Argumentative Opinion. International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction. 2019.
On the Dynamics of User Engagement in News Comment Media. WIREs Data Mining and Knowledge Discovery. 2019.