Natural Language Processing for the Privacy of Internet Users

Shomir Wilson
Assistant Professor, Department of Electrical Engineering and Computer Science
University of Cincinnati
SERC 306
Wednesday, December 6, 2017 - 11:00
Although research shows that internet users care about their privacy, they cannot realistically read and understand the privacy policies of every website they visit and every app they use. Closing this gap in online notice and choice is the goal of the Usable Privacy Policy Project, an NSF-funded project to extract salient details from privacy policies and present them to internet users in ways that are responsive to their needs. I will present my ongoing work as the lead for the project's natural language processing and crowdsourcing efforts. Our results show that crowdworkers can answer questions about privacy policies with high accuracy and that automated methods can identify important details in policy texts, such as statements about data collection and users' privacy options. I will then present some vignettes from my research on online social network privacy and entity linking, along with a long-term goal of "user-oriented natural language processing": breaking down the most complex texts that people are obligated to read and automatically finding the details that affect them the most.