List of Papers

These are the papers that we will cover this semester. To sign up for a presentation, please email me your preferences, identifying each paper by its number. You can list three papers in order of preference so that we can resolve conflicts faster.
NOTE: Some of you may be interested in topics other than the ones in the list below, e.g., graph-related work. If you prefer to present a paper on a different topic, please look over the papers accepted at the leading DB conferences in the last two years. Choose up to 3 papers and rank them in order of preference. We will choose the one that best fits the scope of the class and your interests.
The presentation schedule is here (PDF).
The presentation guidelines are here.

  1. Spanner: Google's Globally-Distributed Database. Corbett et al. [read it]
    Presenter: Ziqi Wan
  2. Actively Soliciting Feedback for Query Answers in Keyword Search-Based Data Integration. Zhepeng Yan, Nan Zheng, Zachary G. Ives, Partha Pratim Talukdar, Cong Yu. [read it]
    Presenter: TBD
  3. Finding with the Crowd. Anish Das Sarma, Aditya Parameswaran, Hector Garcia-Molina, Alon Halevy. [read it]
    Presenter: Noor Albarakati
  4. Online Ordering of Overlapping Data Sources. Mariam Salloum, Xin Luna Dong, Divesh Srivastava, Vassilis J. Tsotras. [read it]
    Presenter: TBD
  5. Highly Available Transactions: Virtues and Limitations. Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica. [slides]
    Presenter: TBD
  6. GeoScope: Online Detection of Geo-Correlated Information Trends in Social Networks. Ceren Budak, Theodore Georgiou, Divyakant Agrawal, Amr El Abbadi. [slides]
    Presenter: Djordje Gligorijevic
  7. Gestural Query Specification. Arnab Nandi, Lilong Jiang, Michael Mandel. [read it]
    Presenter: Shuang Liang
  8. From Data Fusion to Knowledge Fusion. Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Kevin Murphy, Shaohua Sun, Wei Zhang. [read it]
    Presenter: Meng Xiangwei
  9. Incremental Record Linkage. Anja Gruenheid, Xin Luna Dong, Divesh Srivastava. [read it]
    Presenter: TBD
  10. Retrieving Regions of Interest for User Exploration. Xin Cao, Gao Cong, Christian S. Jensen, Man Lung Yiu. [read it]
    Presenter: Eshaa Ajaey Dhall
  11. Biperpedia: An Ontology for Search Applications. Rahul Gupta, Alon Halevy, Xuezhi Wang, Steven Euijong Whang, Fei Wu. [read it]
    Presenter: Ozkan Kilic
  12. epiC: an Extensible and Scalable System for Processing Big Data. Dawei Jiang, Gang Chen, Beng Chin Ooi, Kian-Lee Tan, Sai Wu. [read it]
    Presenter: Thomas Mirowski
  13. Tracking Entities in the Dynamic World: A Fast Algorithm for Matching Temporal Records. Yueh-Hsuan Chiang, AnHai Doan, Jeffrey F. Naughton. [read it]
    Presenter: Qingyuan Liu
  14. Scalable and Adaptive Online Joins. Mohammed Elseidy, Abdallah Elguindy, Aleksandar Vitorovic, Christoph Koch. [read it]
    Presenter: Spoorthy Kalahasthi
  15. Exemplar Queries: Give me an Example of What You Need. Davide Mottin, Matteo Lissandrini, Yannis Velegrakis, Themis Palpanas. [read it]
    Presenter: Chen Shen
  16. CrowdFill: Collecting Structured Data from the Crowd. Hyunjung Park and Jennifer Widom. [read it]
    Presenter: Longfei Wu
  17. Corleone: Hands-Off Crowdsourcing for Entity Matching. Chaitanya Gokhale, Sanjib Das, AnHai Doan, Jeffrey Naughton, Narasimhan Rampalli, Jude Shavlik, Jerry Zhu. [read it]
    Presenter: Shanshan Zhang
  18. A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data. Jiannan Wang, Sanjay Krishnan, Michael Franklin, Ken Goldberg, Tim Kraska, Tova Milo. [read it]
    Presenter: YiYuan Zhang
  19. Fusing Data with Correlations. Ravali Pochampally, Anish Das Sarma, Xin Dong, Alexandra Meliou, Divesh Srivastava. [read it]
    Presenter: Jun Yang
  20. Discovering Queries based on Example Tuples. Yanyan Shen, Kaushik Chakrabarti, Surajit Chaudhuri, Bolin Ding, Lev Novik. [read it]
    Presenter: Dhruv Rajeshkumar Rami
  21. Schema-free SQL. Fei Li, Tianyin Pan, H. V. Jagadish. [read it]
    Presenter: TBD
  22. InsightNotes: Summary-Based Annotation Management in Relational Databases. Dongqing Xiao and Mohamed Eltabakh. [read it]
    Presenter: Hongxu Zhang
  23. A Probabilistic Model for Linking Named Entities in Web Text with Heterogeneous Information Networks. Wei Shen, Jiawei Han, Jianyong Wang. [read it]
    Presenter: Xi Hang Cao
  24. Modeling Entity Evolution for Temporal Record Matching. Yueh-Hsuan Chiang, AnHai Doan, Jeffrey Naughton. [read it]
    Presenter: Jelena Stojanovic
  25. Resolving Conflicts in Heterogeneous Data by Truth Discovery and Source Reliability Estimation. Qi Li, Yaliang Li, Jing Gao, Bo Zhao, Wei Fan, Jiawei Han. [read it]
    Presenter: Chao Han
  26. A Temporal Context-Aware Model for User Behavior Modeling in Social Media Systems. Hongzhi Yin, Bin Cui, Ling Chen, Zhiting Hu, Zi Huang. [read it]
    Presenter: Ning Wang