List of Papers
These are the papers that we will cover this semester.
To sign up for a presentation, please email me your preferences, indicating the number of each paper.
You can list three papers in order of preference so that we can resolve conflicts faster.
NOTE: Some of you may be interested in topics other than the ones in the list below, e.g., graph-related work. If you prefer to present a paper on a different topic, please look over the papers accepted
at the leading DB conferences in the last two years. Choose up to three papers and rank them in order of preference. We will choose the one that best fits the scope of the class and your interests.
The presentation schedule is here (PDF).
Presentation guidelines are here.
- Spanner: Google's Globally-Distributed Database. Corbett et al. [read it]
Presenter: Ziqi Wan
- Actively Soliciting Feedback for Query Answers in Keyword Search-Based Data Integration. Zhepeng Yan, Nan Zheng, Zachary G. Ives, Partha Pratim Talukdar, Cong Yu. [read it]
Presenter: TBD
- Finding with the Crowd. Anish Das Sarma, Aditya Parameswaran, Hector Garcia-Molina, Alon Halevy. [read it]
Presenter: Noor Albarakati
- Online Ordering of Overlapping Data Sources. Mariam Salloum, Xin Luna Dong, Divesh Srivastava, Vassilis J. Tsotras. [read it]
Presenter: TBD
- Highly Available Transactions: Virtues and Limitations. Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica. [slides]
Presenter: TBD
- GeoScope: Online Detection of Geo-Correlated Information Trends in Social Networks. Ceren Budak, Theodore Georgiou, Divyakant Agrawal, Amr El Abbadi.
[slides]
Presenter: Djordje Gligorijevic
- Gestural Query Specification. Arnab Nandi, Lilong Jiang, Michael Mandel. [read it]
Presenter: Shuang Liang
- From Data Fusion to Knowledge Fusion. Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Kevin Murphy, Shaohua Sun, Wei Zhang.
[read it]
Presenter: Meng Xiangwei
- Incremental Record Linkage. Anja Gruenheid, Xin Luna Dong, Divesh Srivastava. [read it]
Presenter: TBD
- Retrieving Regions of Interest for User Exploration. Xin Cao, Gao Cong, Christian S. Jensen, Man Lung Yiu.
[read it]
Presenter: Eshaa Ajaey Dhall
- Biperpedia: An Ontology for Search Applications. Rahul Gupta, Alon Halevy, Xuezhi Wang, Steven Euijong Whang, Fei Wu.
[read it]
Presenter: Ozkan Kilic
- epiC: an Extensible and Scalable System for Processing Big Data. Dawei Jiang, Gang Chen, Beng Chin Ooi, Kian-Lee Tan, Sai Wu.
[read it]
Presenter: Thomas Mirowski
- Tracking Entities in the Dynamic World: A Fast Algorithm for Matching Temporal Records. Yueh-Hsuan Chiang, AnHai Doan, Jeffrey F. Naughton.
[read it]
Presenter: Qingyuan Liu
- Scalable and Adaptive Online Joins. Mohammed Elseidy, Abdallah Elguindy, Aleksandar Vitorovic, Christoph Koch.
[read it]
Presenter: Spoorthy Kalahasthi
- Exemplar Queries: Give me an Example of What You Need. Davide Mottin, Matteo Lissandrini, Yannis Velegrakis, Themis Palpanas.
[read it]
Presenter: Chen Shen
- CrowdFill: Collecting Structured Data from the Crowd. Hyunjung Park and Jennifer Widom. [read it]
Presenter: Longfei Wu
- Corleone: Hands-Off Crowdsourcing for Entity Matching. Chaitanya Gokhale, Sanjib Das, AnHai Doan, Jeffrey Naughton, Narasimhan Rampalli, Jude Shavlik, Jerry Zhu.
[read it]
Presenter: Shanshan Zhang
- A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data.
Jiannan Wang, Sanjay Krishnan, Michael Franklin, Ken Goldberg, Tim Kraska, Tova Milo.
[read it]
Presenter: YiYuan Zhang
- Fusing Data with Correlations.
Ravali Pochampally, Anish Das Sarma, Xin Dong, Alexandra Meliou, Divesh Srivastava.
[read it]
Presenter: Jun Yang
- Discovering Queries based on Example Tuples.
Yanyan Shen, Kaushik Chakrabarti, Surajit Chaudhuri, Bolin Ding, Lev Novik.
[read it]
Presenter: Dhruv Rajeshkumar Rami
- Schema-free SQL.
Fei Li, Tianyin Pan, H. V. Jagadish.
[read it]
Presenter: TBD
- InsightNotes: Summary-Based Annotation Management in Relational Databases.
Dongqing Xiao and Mohamed Eltabakh.
[read it]
Presenter: Hongxu Zhang
- A Probabilistic Model for Linking Named Entities in Web Text with Heterogeneous Information Networks.
Wei Shen, Jiawei Han, Jianyong Wang.
[read it]
Presenter: Xi Hang Cao
- Modeling Entity Evolution for Temporal Record Matching.
Yueh-Hsuan Chiang, AnHai Doan, Jeffrey Naughton.
[read it]
Presenter: Jelena Stojanovic
- Resolving Conflicts in Heterogeneous Data by Truth Discovery and Source Reliability Estimation.
Qi Li, Yaliang Li, Jing Gao, Bo Zhao, Wei Fan, Jiawei Han.
[read it]
Presenter: Chao Han
- A Temporal Context-Aware Model for User Behavior Modeling in Social Media Systems. Hongzhi Yin, Bin Cui, Ling Chen, Zhiting Hu, Zi Huang.
[read it]
Presenter: Ning Wang