Reading List

These are the papers we will cover this semester. To sign up for a presentation, email me your preferences, identifying each paper by its number. You can list three papers in order of preference so that we can resolve conflicts faster.
NOTE: Some of you may be interested in topics other than the ones in the list below, e.g., graph-related work. If you prefer to present a paper on a different topic, please look over the papers accepted at the leading DB conferences in the last 2 years. Choose up to 3 papers and rank them in order of preference. We will choose the one that best fits the scope of the class and your interests.
The presentation schedule is here (PDF).
Presentation guidelines are here.

  1. Query Specific Rank Fusion for Image Retrieval. Shaoting Zhang, Ming Yang, Timothee Cour, Kai Yu, and Dimitris N. Metaxas, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015. [read it]
    Presenter: Heng Fan
  2. Image Retrieval using Scene Graphs. Justin Johnson, Ranjay Krishna, Michael Stark, Li-Jia Li, David A. Shamma, Michael S. Bernstein, Li Fei-Fei. [read it]
    Presenter: Fan Yang
  3. Comparing Twitter and Traditional Media using Topic Models. Wayne Xin Zhao, Jing Jiang, Jianshu Weng, Jing He, Ee-Peng Lim, Hongfei Yan, Xiaoming Li. [read it]
    Presenter: Lihong He
  4. Online multimodal deep similarity learning with application to image retrieval. Pengcheng Wu, Steven C.H. Hoi, Hao Xia, Peilin Zhao, Dayong Wang, and Chunyan Miao. 2013. [read it]
    Presenter: Peng Chu
  5. The role of research leaders on the evolution of scientific communities. Bruno Leite Alves, Fabrício Benevenuto, and Alberto H.F. Laender. 2013. [slides]
    Presenter: TBD
  6. Google's Deep Web crawl. Proc. VLDB Endow. Jayant Madhavan, David Ko, Łucja Kot, Vignesh Ganapathy, Alex Rasmussen, and Alon Halevy. 2008. [slides]
    Presenter: Gaurangkumar Shaileshbhai Patel
  7. Search result diversification in resource selection for federated search. SIGIR 2013. Dzung Hong, Luo Si. [read it]
    Presenter: TBD
  8. Mining Search Engine Query Logs via Suggestion Sampling. Proc. VLDB Endow. Ziv Bar-Yossef and Maxim Gurevich. 2008. [read it]
    Presenter: Abrar Alrumayh
  9. Linking Virtual and Real-World Identities. ISI. Y. Alsarkal, Y. Zhou, N. Zhang. 2016. [read it]
    Presenter: Yagni Urvishbhai Patel
  10. Mining a Search Engine's Corpus Without a Query Pool. CIKM. Mingyang Zhang, Nan Zhang, Gautam Das. 2013. [read it]
    Presenter: TBD
  11. Learning to Query: Focused Web Page Harvesting for Entity Aspects. Y. Fang, V. W. Zheng, and K. C.-C. Chang. In ICDE 2016. [read it]
    Presenter: Rajpreet Kaur Gulati
  12. Towards Social Data Platform: Automatic Topic-focused Monitor for Twitter Stream. R. Li, S. Wang, and K. C.-C. Chang. PVLDB. 2013. [read it]
    Presenter: Mohammad Alqudah
  13. Enabling Entity-Centric Document Filtering by Meta-Feature-based Feature Mapping. M. Zhou and K. C.-C. Chang. In CIKM 2013. [read it]
    Presenter: TBD
  14. Online ordering of overlapping data sources. In VLDB, 2014. Mariam Salloum, Xin Luna Dong, Divesh Srivastava, Vassilis J. Tsotras. [read it]
    Presenter: TBD
  15. Characterizing and selecting fresh data sources. In VLDB, 2014. Theodoros Rekatsinas, Xin Luna Dong, Divesh Srivastava. [read it]
    Presenter: TBD
  16. Inverted Index Compression Using Word-Aligned Binary Codes. Inf. Retr. Vo Ngoc Anh and Alistair Moffat. 2005. [read it]
    Presenter: Alkhansaa Abdulrahman Abuhashim
  17. Searching the workplace web. In WWW. Ronald Fagin, Ravi Kumar, Kevin S. McCurley, Jasmine Novak, D. Sivakumar, John A. Tomlin, and David P. Williamson. 2003. [read it]
    Presenter: TBD
  18. Inferring and using location metadata to personalize web search. In SIGIR. Paul N. Bennett, Filip Radlinski, Ryen W. White, and Emine Yilmaz. 2011. [read it]
    Presenter: Jinzhu Deng
  19. Multi-Stage Math Formula Search: Using Appearance-Based Similarity Metrics at Scale. Richard Zanibbi (Rochester Institute of Technology), Kenny Davila (Rochester Institute of Technology), Andrew Kane (University of Waterloo), Frank WM Tompa (University of Waterloo). 2016. [read it]
    Presenter: TBD
  20. Robust and Collective Entity Disambiguation through Semantic Embeddings. Stefan Zwicklbauer (University of Passau), Christin Seifert (University of Passau), Michael Granitzer (University of Passau). 2016. [read it]
    Presenter: TBD
  21. When does Relevance Mean Usefulness and User Satisfaction in Web Search? Jiaxin Mao, Yiqun Liu, Ke Zhou, Jian-Yun Nie, Jingtao Song, Min Zhang, Shaoping Ma, Jiashen Sun, Hengliang Luo. 2016. [read it]
    Presenter: TBD
  22. Collective Entity Resolution with Multi-Focal Attention. Amir Globerson, Nevena Lazic, Soumen Chakrabarti, Amarnag Subramanya, Michael Ringgaard and Fernando Pereira. [read it]
    Presenter: Aniruddha Maiti
  23. Query Expansion with Locally-Trained Word Embeddings. Fernando Diaz, Bhaskar Mitra and Nick Craswell. 2016. [read it]
    Presenter: Meng Ye
  24. DIADEM: Thousands of Websites to a Single Database. PVLDB. Tim Furche, Georg Gottlob, Giovanni Grasso, Xiaonan Guo, Giorgio Orsi, Christian Schallhart, Cheng Wang. 2014. [read it]
    Presenter: Benjamin North