Query Optimization.
In this phase of the project you will execute a number of queries on your databases, report the initial running times, and give the steps to improve the execution time of each query.
Query 1:
Give the total number of authors in DBLP that have at least one of their publication records in Microsoft Academic Search (MAS).
Query 2:
Give the distribution of author count per matched publication records in Microsoft Academic Search (MAS). This query must return a relation with two columns Number of Matched Publications and Author Count. For example, a row of the form (10, 20) in this relation means that 20 authors have 10 of their publications matched against MAS, while a row (5, 100) tells us that 100 authors have 5 of their publication records matched against MAS. Analyze the query with and without the clause "ORDER BY Number of Matched Publications."
Query 3:
Same as Query 2, but include the condition that only authors whose names contain a given substring, say 'Zhang', must be considered.
Query 4:
Same as Query 3, but include the condition that only authors whose NOTE field contain a given substring, say 'University', must be considered.
Deliverables
Give the SQL query for each query.
Include a screen shot per query that shows the execution of the query and includes the execution time.
Include a screen shot with the Execution Plan per query.
Describe the steps you took to optimize each query.
Include screen shots per query show the gained improvement in the execution time (show the results, the new execution time and Execution Plan).
This is your last report. Include the updates from all the previous phases.
Start early!
Collaborate! Compare and discuss your approaches. The end product is expected to be an individual effort!