Reading List

These are the papers we will cover this semester. To sign up for a presentation, email me your preferences, identifying each paper by its number. You can list up to three papers in order of preference, which will help us resolve conflicts faster.
NOTE: Some of you may be interested in topics other than the ones in the list below, e.g., graph-related work. If you prefer to present a paper on a different topic, please look over the papers accepted at the leading DB conferences in the last 2 years. Choose up to 3 papers and rank them in order of preference. We will choose the one that best fits the scope of the class and your interests.
The presentation schedule is here (PDF).
Presentation guidelines are here.

  1. The Case for Learned Index Structures. Kraska et al. [read it]
    Additional Material: Tutorial; Medium article
    Presenter: TBD
  2. Learned Index Benefits: Machine Learning Based Index Performance Estimation. Shi et al. [read it]
    Presenter: TBD
  3. Text-to-SQL: NLP Community
    • Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data. Hazoom et al. [read it]
      Presenter: Parisa Aghazadeh
    • Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation. Elgohary et al. [read it]
      Presenter: Saiyun DONG
  4. Text-to-SQL: DB Community
    • Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning. Gu et al. [read it]
      Presenter: TBD
    • CatSQL: Towards Real World Natural Language to SQL Applications. Fu et al. [read it]
      Presenter: Hardik Sharma
    • MT-Teql: Evaluating and Augmenting Neural NLIDB on Real-world Linguistic and Schema Variations. Ma et al. [read it]
      Presenter: TBD
    • ScienceBenchmark: A Complex Real-World Benchmark for Evaluating Natural Language to SQL Systems. Zhang et al. [read it]
      Presenter: TBD
    • Additional Material, Tutorial. Özcan et al.
      Presenter: TBD
  5. Automatic DBMS Tuning
    • DB-BERT: A Database Tuning Tool that "Reads the Manual". Immanuel Trummer et al. [read it]
      Presenter: Cindy Zastudil
    • DB-GPT: Large Language Model Meets Database. Zhou et al. [read it]
      Presenter: Sayantan Kundu
  6. Query Optimization
    • Learning to Optimize Join Queries With Deep Reinforcement Learning. Krishnan et al. [read it]
      Additional Resource
      Presenter: Andy Gnias
    • Neo: A Learned Query Optimizer. Marcus et al. [read it]
      Presenter: TBD
    • An Inquiry into Machine Learning-based Automatic Configuration Tuning Services on Real-World Database Management Systems. Van Aken et al. [read it]
      Presenter: Fahim Bashar
    • Plan-Structured Deep Neural Network Models for Query Performance Prediction. Ryan Marcus and Olga Papaemmanouil. [read it]
      Presenter: TBD
  7. Other
    • PromptEM: Prompt-tuning for Low-resource Generalized Entity Matching. Wang et al. [read it]
    • How Large Language Models Will Disrupt Data Management. Fernandez et al. [read it]
      Presenter: Thanh Nguyen