Semester Long Project
There are two projects. You need to choose one of them for the semester. Choose the project that better fits your programming skills. Both projects will be carried out in phases.
Project 1: Playing with Web data.
In this project, you will be given a large collection of data gathered from the Web, and you will carry out tasks based on the topics covered in class.
Project Phase | Topic | Duration |
Phase 1 | Text Processing | Oct. 19, 2016 (two weeks) |
Phase 2 | Tokenization/Lemmatization/Normalization | Nov. 2, 2016 (two weeks) |
Phase 3 | Term Weighting Schemes | Nov. 17, 2016 (two weeks) |
Phase 4 | Evaluation | |
Project 2: Crawling the Deep Web.
In this project, you will carry out tasks that will help you understand the main differences between the Deep Web and the Surface Web.
Project Phase | Topic | Duration |
Phase 1 | Deep Web Crawling | Oct. 19, 2016 (two weeks) |
Phase 2 | Extract Structured Data | Nov. 2, 2016 (two weeks) |
Phase 3 | Deep Web Crawling for a second source | Nov. 17, 2016 (two weeks) |
Phase 4 | TBA | |