Assignment 2

NOTES:
This assignment is meant to give you practice with foundations of database systems: Storage and Indexing, Disks and Files, Tree Indexing, and Hash Indexing.

Problem 1

  1. What is sequential flooding of the buffer pool?
  2. Give an example of sequential flooding of the buffer pool.

Problem 2

Consider the instance of the Sailors relation in the following figure, sorted by NAME. For the purposes of this question, assume that these tuples are stored in a sorted file in the order shown; the first tuple is on page 1, the second tuple is also on page 1; and so on. Each page can store up to three data records; so the fourth tuple is on page 2. For a search key value k, assume the following alternatives for a data entry with the search key value k in an index file:
  1. Alternative 1: A data entry is an actual data record (with search key value k).
  2. Alternative 2: A data entry is a (k, rid) pair, where rid is the record id (address) of a data record with search value k.
  3. Alternative 3: A data entry is a (k, rid-list) pair, where rid-list is a list of records ids of data records with search key value k.
An instance of Sailors.
SIDNAMEAGESALARY
53Andy2510K
54Dave3020K
44David4030K
32Eli4520K
20John5010K
EXPLAIN what the data entries in each of the following indexes contain. If the order of entries is significant, say so and EXPLAIN why. If such an index cannot be constructed, say so and explain why.
  1. An unclustered index on name using Alternative (1).
  2. An unclustered index on name using Alternative (2).
  3. An unclustered index on name using Alternative (3).
  4. A clustered index on name using Alternative (1).
  5. A clustered index on name using Alternative (2).
  6. A clustered index on name using Alternative (3).
  7. An unclustered index on salary using Alternative (1).
  8. An unclustered index on salary using Alternative (2).
  9. An unclustered index on salary using Alternative (3).
  10. A clustered index on salary using Alternative (1).
  11. A clustered index on salary using Alternative (2).
  12. A clustered index on salary using Alternative (3).
  13. Consider the query SELECT * FROM Sailors WHERE name LIKE 'da%'. Which of the above options is the most efficient in executing this query? Explain your choice.
  14. Consider the query SELECT AVG(salary) FROM Sailors. Which of the above options is the most efficient in executing this query? Explain your choice.

Problem 3

Consider a disk with the following specifications: sector size = 1024, 4000 tracks per surface, 100 sectors per track, 10 double-side platters, average seek time of 10 msec, and the disk platters rotate 7,200 rpm (revolutions per minute).
  1. What is the capacity of a track in KBs (an KB = 1024 bytes)?
  2. What is the capacity of each platter surface?
  3. What is the capacity of the disk?
  4. How many cylinders does the disk have?
  5. What is the maximum rotational delay?
  6. If an entire track of data can be transfer per revolution, what is the transfer rate?

Problem 4

Consider a B+-tree of order 2 (i.e., d = 2 and max. keys=4). Insert the following keys in order:
  1. 10, 20, 30, 40, 50, 60, 70, 80, 90. Show the obtained B+-tree. Call this tree R.
  2. Insert the following set of keys in order in R: 15, 25, 35, 45, 55, 65, 75, 85, 95, 100. Call this tree R1. What is the first key added that causes the B+-tree R to grow to height 3? Show the tree R1.
  3. Show the tree R after deleting the following keys in order: 70, 90, and 10