Phase 4: Repeat Study on the Performance of a DBMS
Requirements
- Implement the following procedure for inserting/updating the abstract field
Iterate over the set of publications in your database.
If a publication has an empty ambstrct field, then add one as follows.
Generate a random number between 500 and 2000.
Create a string of that length (e.g, "aaaa....a").
Update the field abstract.
Make sure that between this and the previous procedure your database will have at least 100K publications with non-empty abstract fields.
- Run the following type of queries on your DB instance.
simple query: SELECT * FROM table
single table, complex conditions: SELECT * FROM table WHERE condition
condition must have at least 5 clauses related with AND, OR, and NOT.
join query and complex conditions: use at least 3 tables.
sorting (ORDER BY): SELECT attr1, attr2, attr3 FROM table ORDER BY some attributes.
aggregate functions: SELECT SUM(), AVG(), MAX(), COUNT() FROM table(s) WHERE some condition GROUP BY attr(s). You should use all of them.
add HAVING to the previous query
Record the following information in a table per query: Execution Time, CPU Utilization, Memory Utilization, Threads Used.
- Execute the queries in the folloing manners.
same query multiple time in a row (say 5 times). Record each time the information about and give the averages.
execute the queries in a random order. Repeat several times, at least 5 times.
Record each time the information about a query. Give the averages.
Deliverables
Record the state of DB Serve before you execute any of the querie.
Give the queries.
Give a detailed tables of results.
Give averages and standard deviations.
Analyze the results. Comment on the observed behaviour for the queries.
Compare this study against your previous one. Explain why the observed differences.
Start early!
Collaborate! Compare and discuss your approaches. The end product is expected to be an individual effort!