Solutions For HW5-CS 6033 Fall 2024
Q1 → OS-SELECT
Q2 → OS-RANK
Q3 → Successor in OS
Q4 → OS Variant
Q5 → Interval
Q6 → B-Tree
Q7 → Food Report
Q8 → Hospital Room Availability
Q1 → OS-SELECT
(b) OS-SELECT(T.root.right, 5)
Now x is the node with key 41, and i = 5. Since the size of 41’s left subtree is 5, r = 5 + 1 = 6 > 5, so
we make the next recursive call OS-SELECT(x.left, 5).
(c) OS-SELECT(T.root.right.left, 5)
Now x is the node with key 30 and i = 5. Since the size of 30’s left subtree is 1, r = 1 + 1 = 2 < 5, so
we make the next recursive call OS-SELECT(x.right, 5 - 2).
(d) OS-SELECT(T.root.right.left.right, 3)
Now x is the node with key 38 and i = 3. Since the size of 38’s left subtree is 1, r = 1 + 1 = 2 < 3, so
we make the next recursive call OS-SELECT(x.right, 3 - 2).
(e) OS-SELECT(T.root.right.left.right.right, 1)
Now x is the node with key 39 and r = 1 = i, so we return this node. This is the 18th smallest key.
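The recursion traced above can be sketched in a few lines of Python. This is a minimal sketch, not the course's reference code: the Node class, the build helper, and the small key list are hypothetical, and a balanced tree built from a sorted list stands in for the red-black tree used in the question.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        # size = number of nodes in the subtree rooted at this node
        self.size = 1 + size(left) + size(right)

def size(node):
    """Subtree size, treating a missing child as size 0 (like T.nil)."""
    return node.size if node else 0

def build(keys):
    """Build a balanced BST from a sorted list (assumed stand-in for an RB tree)."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build(keys[:mid]), build(keys[mid + 1:]))

def os_select(x, i):
    """Return the node with the i-th smallest key (1-indexed) in x's subtree."""
    r = size(x.left) + 1              # rank of x within its own subtree
    if i == r:
        return x
    if i < r:
        return os_select(x.left, i)   # answer lies in the left subtree
    return os_select(x.right, i - r)  # skip the r nodes that precede the right subtree

keys = [3, 7, 10, 12, 14, 16, 19, 20, 21, 26]   # hypothetical example keys
root = build(keys)
selected = [os_select(root, i).key for i in range(1, len(keys) + 1)]
```

Selecting ranks 1 through n in order reproduces the sorted key list, which is a quick way to sanity-check the size fields.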
Q2 → OS-RANK
We begin with the target node x, whose key is 38, and compute r = x.left.size + 1 = 1 + 1 = 2. Then
we set y to point to the target node and enter the while loop.
a) Node 38 is node 30’s right child, so r = r + "30’s left size" + 1 = 2 + 1 + 1 = 4.
Set y to node 38’s parent (node key = 30) and continue with the next iteration.
b) Node 30 is node 41’s left child, so r is unchanged.
Set y to node 30’s parent (node key = 41) and continue with the next iteration.
c) Node 41 is node 26’s right child, so r = r + "26’s left size" + 1 = 4 + 12 + 1 = 17.
Set y to node 41’s parent (node key = 26).
Node y (node key = 26) is now the root, so the loop ends and we return r. The rank of key 38 is
17.
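The upward walk described above can be sketched as follows. The Node class, build, and find helpers are hypothetical, and the balanced tree built from a sorted list is only an assumed stand-in for the red-black tree in the question.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None
        self.size = 1

def size(node):
    return node.size if node else 0

def build(keys, parent=None):
    """Balanced BST with parent pointers from sorted keys (assumed stand-in for an RB tree)."""
    if not keys:
        return None
    mid = len(keys) // 2
    node = Node(keys[mid])
    node.parent = parent
    node.left = build(keys[:mid], node)
    node.right = build(keys[mid + 1:], node)
    node.size = 1 + size(node.left) + size(node.right)
    return node

def find(node, key):
    """Ordinary BST search, used here only to obtain a node handle."""
    while node and node.key != key:
        node = node.left if key < node.key else node.right
    return node

def os_rank(root, x):
    """Rank of node x (1-indexed position in sorted order) in the whole tree."""
    r = size(x.left) + 1
    y = x
    while y is not root:
        if y is y.parent.right:
            # everything in the parent's left subtree, plus the parent, precedes x
            r += size(y.parent.left) + 1
        y = y.parent
    return r

keys = [10, 14, 17, 21, 26, 30, 38, 39, 41, 47]   # hypothetical example keys
root = build(keys)
ranks = [os_rank(root, find(root, k)) for k in keys]
```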
Q3 → Successor in OS
The i-th successor of node x has rank equal to i plus the rank of node x.
So we first find the rank r_x of node x in O(log n) time using OS-RANK(T, x), and then find the
element with rank r = i + r_x using OS-SELECT(T.root, r). Since both procedures take O(log n)
time, the total running time of OS-SUCCESSOR(T, x, i) is O(log n).
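A minimal sketch of this composition is shown below. The helper names and the example keys are hypothetical, the balanced tree is an assumed stand-in for the red-black tree, and returning None when rank(x) + i exceeds the tree size is an assumption, since the text does not say what should happen in that case.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None
        self.size = 1

def size(node):
    return node.size if node else 0

def build(keys, parent=None):
    """Balanced BST with parent pointers from sorted keys (assumed stand-in for an RB tree)."""
    if not keys:
        return None
    mid = len(keys) // 2
    node = Node(keys[mid])
    node.parent = parent
    node.left = build(keys[:mid], node)
    node.right = build(keys[mid + 1:], node)
    node.size = 1 + size(node.left) + size(node.right)
    return node

def find(node, key):
    while node and node.key != key:
        node = node.left if key < node.key else node.right
    return node

def os_rank(root, x):
    r, y = size(x.left) + 1, x
    while y is not root:
        if y is y.parent.right:
            r += size(y.parent.left) + 1
        y = y.parent
    return r

def os_select(x, i):
    while x is not None:
        r = size(x.left) + 1
        if i == r:
            return x
        if i < r:
            x = x.left
        else:
            i -= r
            x = x.right
    return None

def os_successor(root, x, i):
    """Node whose rank is rank(x) + i, or None if no such node exists (assumed behavior)."""
    target = os_rank(root, x) + i          # O(log n)
    if target > size(root):
        return None
    return os_select(root, target)         # O(log n)

keys = [10, 14, 17, 21, 26, 30, 38, 39, 41, 47]   # hypothetical example keys
root = build(keys)
x = find(root, 17)
third = os_successor(root, x, 3)
```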
Q4 → OS Variant
In this new augmentation, each node x stores its rank within its own subtree (x.sub_rank
= number of nodes in x's left subtree + 1) instead of the size of its subtree (x.size
= x.left.size + x.right.size + 1).
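The run time for insertion remains Θ(log n): on the way down we add 1 to sub_rank at every node
where the search goes left, and each rotation can fix sub_rank in constant time. OS-SELECT also
takes O(log n), since it compares i with x.sub_rank directly and follows a single downward path.
OS-RANK is also O(log n), as we start at a node in the tree and traverse up to the root along a
single path, adding the parent's sub_rank whenever the current node is a right child.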
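A rough sketch of the sub_rank variant follows, under simplifying assumptions: a plain (unbalanced) BST insert is used, so rotations are omitted, and rank is computed top-down by key rather than by walking parent pointers. All names and the example keys are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None
        self.sub_rank = 1          # (nodes in left subtree) + 1

def insert(root, key):
    """Plain BST insert that maintains sub_rank; rebalancing is omitted in this sketch."""
    node = Node(key)
    if root is None:
        return node
    cur = root
    while True:
        if key < cur.key:
            cur.sub_rank += 1      # the new node lands in cur's left subtree
            if cur.left is None:
                cur.left = node
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = node
                return root
            cur = cur.right

def select(x, i):
    """i-th smallest node (1-indexed) using sub_rank only."""
    while x is not None:
        if i == x.sub_rank:
            return x
        if i < x.sub_rank:
            x = x.left
        else:
            i -= x.sub_rank        # skip x and its whole left subtree
            x = x.right
    return None

def rank(root, key):
    """Rank of key, accumulated top-down (a variant of the bottom-up OS-RANK)."""
    r, cur = 0, root
    while cur is not None:
        if key == cur.key:
            return r + cur.sub_rank
        if key < cur.key:
            cur = cur.left
        else:
            r += cur.sub_rank
            cur = cur.right
    return None

keys = [26, 17, 41, 14, 21, 30, 47, 10]   # hypothetical example keys, all distinct
root = None
for k in keys:
    root = insert(root, k)
in_order = [select(root, i).key for i in range(1, len(keys) + 1)]
```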
Which is more efficient: augmenting the data structure by adding at each node the size of
the subtree of which it is the root, or augmenting the data structure by adding at each node
the rank of that node of the subtree of which it is the root?
On comparing the runtimes of the methods in both cases, we see that there is no obvious
improvement in one case over the other. So, both approaches have the same efficiency.
Q5 → Interval
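Our existing interval tree search returns an interval overlapping i, or T.nil if no such interval
exists. To find the overlapping interval with the smallest low endpoint, we modify this algorithm
so that, when an overlapping interval is found, we record it and continue into the left subtree
instead of returning that interval. We move left because the tree is keyed on low endpoints, so
only the left subtree can contain smaller low endpoints than the current overlapping interval.
When the search reaches T.nil, we return the last interval recorded (or T.nil if none was found).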
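The modified search can be sketched as below. This is a sketch under assumptions: closed intervals, a tree keyed on low endpoints with the usual max field, and a hand-built example tree; the class and function names are hypothetical.

```python
class IntervalNode:
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        # max = largest high endpoint in the subtree rooted at this node
        self.max = max(high,
                       left.max if left else high,
                       right.max if right else high)

def overlaps(node, low, high):
    """Closed intervals [node.low, node.high] and [low, high] intersect."""
    return node.low <= high and low <= node.high

def min_low_overlap(root, low, high):
    """Overlapping interval with the smallest low endpoint, or None."""
    best = None
    x = root
    while x is not None:
        if overlaps(x, low, high):
            best = x               # remember it, then look for a smaller low endpoint
            x = x.left
        elif x.left is not None and x.left.max >= low:
            x = x.left             # same rule as the ordinary interval search
        else:
            x = x.right
    return best

# Hand-built example tree, keyed on low endpoints (hypothetical data).
N = IntervalNode
root = N(16, 21,
         N(8, 9, N(5, 8, N(0, 3), N(6, 10)), N(15, 23)),
         N(25, 30, N(17, 19, None, N(19, 20)), N(26, 26)))

a = min_low_overlap(root, 22, 25)
b = min_low_overlap(root, 7, 18)
c = min_low_overlap(root, 11, 14)
```

Each iteration moves one level down, so the loop runs for at most the height of the tree.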
Q6 → B-Tree
Since the B-tree uses “preemptive” splits and the root is in memory, we first need to split the
root (3 writes). Then we need to read 3 nodes while searching down the tree (3 reads). Finally, we
write the leaf into which we inserted our key back to disk (1 write). So there are
3 + 3 + 1 = 7 disk accesses in total.
Q7 → Food Report
We can use an augmented RB-tree ordered by date to solve this problem. For each field
x ∈ {PRODUCED, TOSSED, EATEN}, let Node[x] denote the value of that field at a node. The extra
information we add at each node is the sum of x over that node and both its left and right
subtrees, that is, at each node we store
Node.sum[x] = Node.Left.sum[x] + Node.Right.sum[x] + Node[x],
where x ∈ {PRODUCED, TOSSED, EATEN}.
To report values, we take advantage of the sum fields and add as we go down the tree toward the
left and right boundary dates. Note that d1 and d2 need not be part of the tree. First, starting
from the root, we find the first node, call it Node, whose date is in the range (d1, d2). Since
this is the first such node we meet, all valid nodes are in the subtree of Node, but not all
nodes in the subtree of Node are in this range. We add Node[x] to our total.
Then we follow two paths: one starts at Node.Left and the other starts at Node.Right.
Let's take the left path as an example, where cur is the node we are checking.
At each step we check whether cur.date is in the required range (since cur.date < Node.date ≤ d2,
we only need to compare it with d1).
If cur.date = d1: we can stop searching, because all nodes in cur.Left's subtree are invalid and
all nodes in cur.Right's subtree are valid. We add cur.Right.sum[x] to our total (plus cur[x] if
the boundary date d1 itself counts as in range).
If cur.date > d1: cur and the whole subtree rooted at cur.Right are valid, so we do not need to
descend into cur.Right. We add cur[x] + cur.Right.sum[x] to our total and continue with cur.Left.
If cur.date < d1: cur and the whole subtree rooted at cur.Left are invalid, so we do not need to
check cur.Left. We continue with cur.Right.
The right path is symmetric, comparing cur.date with d2.
Time complexity: the left path runs from Node toward a leaf and does O(1) work per node, so it
takes O(log n). Similarly, the right path takes another O(log n), so overall we have
O(log n) + O(log n) = O(log n) complexity.
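The two-path range sum can be sketched as follows. The sketch makes several assumptions that are not in the original: dates are plain integers, both bounds are inclusive, only one numeric field is summed (the three fields PRODUCED, TOSSED, EATEN would each be handled the same way), and a balanced tree built from sorted records stands in for the RB-tree. All names are hypothetical.

```python
class Node:
    def __init__(self, date, value, left=None, right=None):
        self.date, self.value = date, value
        self.left, self.right = left, right
        self.sum = value + total(left) + total(right)   # sum over the whole subtree

def total(node):
    return node.sum if node else 0

def build(records):
    """Balanced BST from (date, value) pairs sorted by date (assumed stand-in for an RB tree)."""
    if not records:
        return None
    mid = len(records) // 2
    date, value = records[mid]
    return Node(date, value, build(records[:mid]), build(records[mid + 1:]))

def range_sum(root, d1, d2):
    """Sum of value over nodes with d1 <= date <= d2 (inclusive bounds assumed)."""
    split = root
    while split is not None and not (d1 <= split.date <= d2):
        split = split.left if d2 < split.date else split.right
    if split is None:
        return 0
    result = split.value
    cur = split.left                  # left path: every date here is < split.date <= d2
    while cur is not None:
        if cur.date >= d1:
            result += cur.value + total(cur.right)   # cur and its right subtree are valid
            cur = cur.left
        else:
            cur = cur.right           # cur and its left subtree are invalid
    cur = split.right                 # right path: every date here is > split.date >= d1
    while cur is not None:
        if cur.date <= d2:
            result += cur.value + total(cur.left)    # cur and its left subtree are valid
            cur = cur.right
        else:
            cur = cur.left
    return result

records = [(1, 5), (3, 2), (4, 7), (6, 1), (8, 4), (9, 3), (12, 6)]   # hypothetical data
root = build(records)
week = range_sum(root, 3, 9)
```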
Q8 → Hospital Room Availability
This question can be solved using an augmented red-black tree, similar to how we find the rank
of a node by storing the size of the subtree rooted at each node. In this case, we use the room
number as the key and augment every node with the number of rooms available in the subtree
rooted at that node.
INIT(D, A):
    Let D be an Augmented Red Black Tree
    for r in A
        AUGMENTED-RB-TREE-INSERT(D, r)

AUGMENTED-RB-TREE-INSERT(D, r):
    // Search for where the new node should be inserted. If the room is
    // occupied, there is no modification to the search. But if the room is not
    // occupied, modify the search to add 1 to the value of num_rooms_available
    // for all nodes along the path. Insert the new node as the child of an
    // existing node with the value of num_rooms_available as 1 or 0 accordingly.
    RB-TREE-SEARCH-AND-INSERT-MODIFIED(D, r)
    // Rebalance the tree while maintaining num_rooms_available as per the
    // rotations.
    REBALANCE-RB-TREE(D)
In this way, we can create an augmented red black tree where every node has
num_rooms_available in the subtree rooted at that node.
We do only O(1) extra work per visited node to maintain the number of rooms available. So
creating this augmented RB-tree has the same time complexity as creating a normal RB-tree by
repeated insertion: n insertions at O(log n) each, which is O(n log n).
Next we have to find the number of available rooms in a given range. We can count how many
rooms are available among the rooms lower than the lower bound of the range. Then we can
count how many rooms are available among the rooms lower than or equal to the higher bound
of the range. The difference between the two would give the total count of available rooms in
the range.
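The difference-of-two-counts idea can be sketched as follows. The names available_below and count_available are hypothetical, each node also keeps its own free flag for simplicity, and a balanced tree built from a sorted room list is an assumed stand-in for the augmented red-black tree.

```python
class Room:
    def __init__(self, number, free, left=None, right=None):
        self.number, self.free = number, free
        self.left, self.right = left, right
        # number of available rooms in the subtree rooted at this node
        self.num_rooms_available = int(free) + avail(left) + avail(right)

def avail(node):
    """A missing child plays the role of D.NIL with num_rooms_available = 0."""
    return node.num_rooms_available if node else 0

def build(rooms):
    """Balanced BST from (number, free) pairs sorted by number (assumed stand-in for an RB tree)."""
    if not rooms:
        return None
    mid = len(rooms) // 2
    number, free = rooms[mid]
    return Room(number, free, build(rooms[:mid]), build(rooms[mid + 1:]))

def available_below(root, bound, inclusive):
    """Count available rooms with number < bound (or <= bound when inclusive)."""
    count, cur = 0, root
    while cur is not None:
        if cur.number < bound or (inclusive and cur.number == bound):
            count += avail(cur.left) + int(cur.free)   # cur and its left subtree qualify
            cur = cur.right
        else:
            cur = cur.left
    return count

def count_available(root, low, high):
    """Available rooms with low <= number <= high, as a difference of two prefix counts."""
    return available_below(root, high, True) - available_below(root, low, False)

rooms = [(n, n % 3 != 0) for n in range(100, 115)]   # hypothetical data
root = build(rooms)
in_range = count_available(root, 102, 108)
```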
In FIND_LOWER(root, a), we traverse at most the height of the tree in the worst case, i.e.,
O(h), which is O(log n). COUNT_AVAILABLE calls that function twice and is therefore also O(log n).
In the next part, we have to find the first unoccupied room in a given range and mark the room
as occupied. Let us again set D.NIL.num_rooms_available = 0.
We can determine whether a given room is occupied or not by checking if there is a difference
between the number of rooms available at the node and the sum of the rooms available in the
left and right subtree.
The lowest common ancestor of two nodes A and B can be found by walking from A to the root and
storing the nodes on the path in a hash table, then walking from B toward the root and returning
the first node that is present in the hash table.
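A minimal sketch of this hash-table approach, using a Python set as the hash table. The Node class with a parent pointer and the small hand-built tree are hypothetical.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent

def lowest_common_ancestor(a, b):
    """First node on b's path to the root that also lies on a's path to the root."""
    seen = set()
    node = a
    while node is not None:        # record every ancestor of a (including a itself)
        seen.add(node)
        node = node.parent
    node = b
    while node is not None:        # walk up from b until we hit a recorded node
        if node in seen:
            return node
        node = node.parent
    return None                    # only reached if a and b are in different trees

# Hand-built example (hypothetical keys): 20 is the root, 10 and 30 its children.
root = Node(20)
n10, n30 = Node(10, root), Node(30, root)
n5, n15 = Node(5, n10), Node(15, n10)
```

Both walks are bounded by the height of the tree, so on a balanced tree this takes O(log n) expected time with O(log n) extra space.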
The smallest available room in the subtree rooted at node x can be found as follows: if the left
subtree of the node has rooms available, go left; otherwise, return the node itself if it is
available, or else go right.
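This descent can be sketched as below, together with the occupancy test based on the difference between a node's count and its children's counts. The names are hypothetical, and a balanced tree built from a sorted list is an assumed stand-in for the augmented red-black tree.

```python
class Room:
    def __init__(self, number, free, left=None, right=None):
        self.number = number
        self.left, self.right = left, right
        # only the subtree count is stored; availability is derived from it
        self.num_rooms_available = int(free) + avail(left) + avail(right)

def avail(node):
    """A missing child plays the role of D.NIL with num_rooms_available = 0."""
    return node.num_rooms_available if node else 0

def build(rooms):
    if not rooms:
        return None
    mid = len(rooms) // 2
    number, free = rooms[mid]
    return Room(number, free, build(rooms[:mid]), build(rooms[mid + 1:]))

def is_free(x):
    """x is available iff its count exceeds the sum of its children's counts."""
    return x.num_rooms_available - avail(x.left) - avail(x.right) == 1

def smallest_available(x):
    """Lowest-numbered available room in x's subtree, or None if there is none."""
    while x is not None and x.num_rooms_available > 0:
        if avail(x.left) > 0:
            x = x.left             # a smaller available room exists on the left
        elif is_free(x):
            return x
        else:
            x = x.right            # the only remaining candidates are on the right
    return None

# Hypothetical data: rooms 103, 105, and 107 are available.
rooms = [(101, False), (102, False), (103, True), (104, False),
         (105, True), (106, False), (107, True)]
root = build(rooms)
first = smallest_available(root)
full = build([(201, False), (202, False)])
```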
We first find the lowest common ancestor of the nodes l and h. We then check the left subtree of
the ancestor, then the ancestor itself, and then its right subtree to find the first available
room in the range. The time complexity of ADMIT(l, h) is O(log n), as the number of nodes
traversed is proportional to the height of the tree.
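One way ADMIT could look is sketched below. Note that it swaps the explicit lowest-common-ancestor step for a single pruned descent from the root, which reaches the same splitting node implicitly; it also assumes inclusive bounds, stores a free flag at each node, and uses a balanced tree built from a sorted list in place of the red-black tree. All names are hypothetical.

```python
class Room:
    def __init__(self, number, free, left=None, right=None):
        self.number, self.free = number, free
        self.left, self.right = left, right
        self.num_rooms_available = int(free) + avail(left) + avail(right)

def avail(node):
    return node.num_rooms_available if node else 0

def build(rooms):
    if not rooms:
        return None
    mid = len(rooms) // 2
    number, free = rooms[mid]
    return Room(number, free, build(rooms[:mid]), build(rooms[mid + 1:]))

def first_available(node, low, high):
    """Lowest-numbered available room with low <= number <= high, or None."""
    if node is None or node.num_rooms_available == 0:
        return None                         # prune subtrees with no available rooms
    if node.number > low:                   # left subtree may hold rooms >= low
        found = first_available(node.left, low, high)
        if found is not None:
            return found
    if low <= node.number <= high and node.free:
        return node
    if node.number < high:                  # right subtree may hold rooms <= high
        return first_available(node.right, low, high)
    return None

def admit(root, low, high):
    """Mark the first available room in [low, high] as occupied; return its number or None."""
    room = first_available(root, low, high)
    if room is None:
        return None
    room.free = False
    cur = root
    while cur is not room:                  # fix the counts on the root-to-room path
        cur.num_rooms_available -= 1
        cur = cur.left if room.number < cur.number else cur.right
    room.num_rooms_available -= 1
    return room.number

# Hypothetical data: rooms 103, 105, and 107 are available.
rooms = [(101, False), (102, False), (103, True), (104, False),
         (105, True), (106, False), (107, True)]
root = build(rooms)
admitted = [admit(root, 104, 107) for _ in range(3)]
```

Subtrees that lie entirely outside the range are never entered, and subtrees with a count of 0 are rejected in constant time, so the search stays close to the two boundary paths.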