Numerical Based On Indexing: Problem 1.2
Problem 1.1
Consider a disk with a sector size of 512 bytes, 1,000 tracks per surface, 100 sectors per track, 5
double-sided platters and a block size of 2,048 bytes. Suppose that the average seek time is 5 msec,
the average rotational delay is 5 msec, and the transfer rate is 100 MB per second. Suppose that a file
containing 1,000,000 records of 100 bytes each is to be stored on such a disk and that no record is
allowed to span two blocks.
a) How many records fit onto a block?
floor(2048/100) = 20. Since no record may span two blocks, at most 20 records fit in a block (48 bytes per block stay unused).
b) How many blocks are required to store the entire file? If the file is arranged sequentially
(according to the ‘next block concept’) on disk, how many cylinders are needed?
1,000,000/20=50,000 blocks are required to store the entire file.
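A block occupies 2048/512 = 4 sectors, so a track has 100/4 = 25 blocks. With 5 double-sided
platters there are 10 surfaces, so a cylinder has 25*10=250 blocks. Therefore, we need
50,000/250=200 cylinders to store the file sequentially.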
c) How many records of 100 bytes each can be stored using this disk?
The disk has 1000 cylinders with 250 blocks each, i.e. it has 250,000 blocks. A block contains
20 records. Thus, the disk can store 5,000,000 records.
d) If blocks of the file are stored on disk according to the next block concept, with the first block
on block 1 of track 1, what is the number of the block stored on block 1 of track 1 on the next
disk surface?
There are 25 blocks in each track, so file blocks 1 to 25 fill track 1 of the first surface. Block 26 is therefore stored on block 1 of track 1 on the next disk surface.
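The arithmetic in parts a) to d) can be checked with a short Python sketch. The variable names are illustrative and not part of the original problem; all numbers come from the problem statement.

```python
# Disk parameters from the problem statement (names are illustrative).
SECTOR_BYTES = 512
SECTORS_PER_TRACK = 100
TRACKS_PER_SURFACE = 1000      # equals the number of cylinders
SURFACES = 5 * 2               # 5 double-sided platters
BLOCK_BYTES = 2048
RECORD_BYTES = 100
NUM_RECORDS = 1_000_000

# a) Records may not span blocks, so round down.
records_per_block = BLOCK_BYTES // RECORD_BYTES

# b) Blocks for the file (round up), then cylinders for a sequential layout.
file_blocks = -(-NUM_RECORDS // records_per_block)        # ceiling division
blocks_per_track = SECTORS_PER_TRACK * SECTOR_BYTES // BLOCK_BYTES
blocks_per_cylinder = blocks_per_track * SURFACES
cylinders_needed = -(-file_blocks // blocks_per_cylinder)

# c) Total capacity of the disk, in records.
disk_blocks = TRACKS_PER_SURFACE * blocks_per_cylinder
disk_records = disk_blocks * records_per_block

# d) Blocks 1..25 fill the first track, so the next surface starts here.
first_block_next_surface = blocks_per_track + 1

print(records_per_block, file_blocks, cylinders_needed,
      disk_records, first_block_next_surface)
```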
f) What is the time required to read the file in random order? Note that in order to read a record,
the block containing the record has to be fetched from disk.
In random access, the read of every block requires an average seek time of 5 msec, an average
rotational delay of 5 msec and a transfer time of 2,048 bytes / (100 MB per sec) ~ 0.02 msec,
i.e. about 10.02 msec per block. Therefore, the time to read the entire file is
50,000*10.02 msec = 501 sec, or roughly 500 sec.
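As a rough check of the random-access estimate, the following Python sketch recomputes the per-block cost and the total. It follows the solution's assumption that each of the 50,000 blocks is fetched once, and it assumes 1 MB = 10^6 bytes; the variable names are illustrative.

```python
SEEK_MS = 5.0                 # average seek time
ROTATIONAL_MS = 5.0           # average rotational delay
BLOCK_BYTES = 2048
TRANSFER_BYTES_PER_SEC = 100_000_000   # assumption: 1 MB = 10**6 bytes
FILE_BLOCKS = 50_000          # from part b)

# Time to transfer one block, converted from seconds to milliseconds.
transfer_ms = BLOCK_BYTES / TRANSFER_BYTES_PER_SEC * 1000

# Every random block read pays seek + rotational delay + transfer.
per_block_ms = SEEK_MS + ROTATIONAL_MS + transfer_ms
total_sec = FILE_BLOCKS * per_block_ms / 1000

print(f"transfer: {transfer_ms:.3f} ms, per block: {per_block_ms:.3f} ms, "
      f"total: {total_sec:.0f} s")
```

The seek and rotational delays dominate: the transfer itself contributes only about 0.2% of the per-block cost.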
Problem 1.2
Suppose that a file has r = 100,000 STUDENT records with the following fields:
NAME (30 bytes), SSN (9 bytes), ADDRESS (40 bytes), PHONE (9 bytes), BIRTHDATE (8 bytes),
SEX (1 byte), MAJORDEPTCODE (4 bytes), MINORDEPTCODE (4 bytes), CLASSCODE (4 bytes,
integer), and DEGREEPROGRAM (3 bytes).
The average record size R = R fixed + R variable = 106 + 18.7 = 124.7 bytes
The total number of bytes needed for the whole file is r * R = 100,000 * 124.7 = 12,470,000 bytes.
Problem 1.3
Consider a disk with block size B = 512 bytes. A block pointer is P = 6 bytes long, and a record
pointer is PR = 7 bytes long. A file has r = 30,000 EMPLOYEE records of fixed length. Each record
has the following fields: Name (30 bytes),Ssn (9 bytes), Department_code (9 bytes), Address (40
bytes), Phone (10 bytes), Birth_date (8 bytes), Sex (1 byte), Job_code (4 bytes), and Salary (4 bytes,
real number). An additional byte is used as a deletion marker.
b. Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned
organization.
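Record size R = (30 + 9 + 9 + 40 + 10 + 8 + 1 + 4 + 4) + 1 = 116 bytes, including the deletion marker
Blocking factor bfr = floor (B/R) = floor (512/116) = 4 records per block
Number of blocks needed for file b = ceiling(r/bfr) = ceiling (30000/4) = 7500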
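A small Python sketch of this step, using the field sizes exactly as listed in the problem statement. Summing them and adding the deletion marker gives R = 116 bytes; the blocking factor is 4 for any R from 103 to 128 bytes, so the block count does not depend on a one-byte difference in R. The dictionary and variable names are illustrative.

```python
import math

# Field sizes in bytes, as listed in the problem statement.
FIELD_BYTES = {
    "Name": 30, "Ssn": 9, "Department_code": 9, "Address": 40,
    "Phone": 10, "Birth_date": 8, "Sex": 1, "Job_code": 4, "Salary": 4,
}
DELETION_MARKER_BYTES = 1

B = 512        # block size in bytes
r = 30_000     # number of EMPLOYEE records

R = sum(FIELD_BYTES.values()) + DELETION_MARKER_BYTES
bfr = B // R               # unspanned: whole records only, so round down
b = math.ceil(r / bfr)     # the last block may be partially filled

print(R, bfr, b)
```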
c. Suppose that the file is ordered by the key field Ssn and we want to construct a primary
index on Ssn. Calculate
(i) The index blocking factor bfri (which is also the index fan-out fo)
Index record size Ri = (V SSN + P) = (9 + 6) = 15 bytes
Index blocking factor bfri = fo = floor (B/Ri) = floor (512/15) = 34
(ii) The number of first-level index entries and the number of first-level index blocks
Number of first-level index entries r1 = number of file blocks b = 7500 entries
Number of first-level index blocks b1 = ceiling (r1 / bfri) = ceiling (7500/34) = 221 blocks
(v) The number of block accesses needed to search for and retrieve a record from the file—
given its Ssn value—using the primary index
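Number of index levels x = 3 (221 first-level blocks, ceiling (221/34) = 7 second-level blocks, and 1 top-level block)
Number of block accesses to search for a record = x + 1 = 3 + 1 = 4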
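The number of levels x follows from dividing by the fan-out repeatedly until a single block remains. A minimal Python sketch of that calculation for the primary index; the helper index_levels is a hypothetical name introduced here for illustration.

```python
import math

def index_levels(first_level_entries, fan_out):
    """Return the number of blocks at each index level, first level first."""
    blocks = math.ceil(first_level_entries / fan_out)
    levels = [blocks]
    # Each higher level holds one entry per block of the level below it.
    while blocks > 1:
        blocks = math.ceil(blocks / fan_out)
        levels.append(blocks)
    return levels

B, V_SSN, P = 512, 9, 6
fo = B // (V_SSN + P)              # index fan-out
r1 = 7500                          # one entry per file block (primary index)

levels = index_levels(r1, fo)
x = len(levels)                    # number of index levels
accesses = x + 1                   # one block per level, plus the data block

print(fo, levels, x, accesses)
```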
d. Suppose that the file is not ordered by the key field Ssn and we want to construct a
secondary index on Ssn. Repeat the previous exercise (part c) for the secondary index and
compare with the primary index.
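The index record size (15 bytes), the fan-out (34) and the number of block accesses (x + 1 = 4)
are the same as in part c. The difference is that a secondary index on an unordered file needs
one entry per record rather than one per block:
Number of first-level index entries r1 = number of file records r = 30000 entries
Number of first-level index blocks b1 = ceiling (r1 / bfri) = ceiling (30000/34) = 883 blocks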
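For the comparison asked for in part d, the sketch below puts the two indexes side by side. It assumes the usual dense organization for a secondary index on a key field (one entry per record, since the file is not ordered on Ssn); the helper level_blocks is a hypothetical name introduced here.

```python
import math

def level_blocks(first_level_entries, fan_out):
    """Blocks per index level, first level first, up to a single top block."""
    blocks = math.ceil(first_level_entries / fan_out)
    levels = [blocks]
    while blocks > 1:
        blocks = math.ceil(blocks / fan_out)
        levels.append(blocks)
    return levels

fo = 512 // (9 + 6)                # same fan-out for both indexes
r, b = 30_000, 7500                # file records and file blocks

primary = level_blocks(b, fo)      # sparse: one entry per file block
secondary = level_blocks(r, fo)    # dense (assumed): one entry per record

print("primary  ", primary, "accesses:", len(primary) + 1)
print("secondary", secondary, "accesses:", len(secondary) + 1)
```

Both indexes end up with three levels, so a lookup costs 4 block accesses either way; the secondary index simply occupies about four times as many index blocks.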
e. Suppose that the file is ordered by the nonkey field Department_code and we want to
construct a clustering index on Department_code that uses block anchors (every new value of
Department_code starts at the beginning of a new block). Assume there are 1,000 distinct
values of Department_code and that the EMPLOYEE records are evenly distributed among
these values. Calculate
(i) the index blocking factor bfri (which is also the index fan-out fo)
Index record size Ri = (V DEPARTMENTCODE + P) = (9 + 6) = 15 bytes
Index blocking factor bfri = (fan-out) fo = floor (B/Ri) = floor (512/15) = 34 index records per
block
(ii) The number of first-level index entries and the number of first-level index blocks
No. of first-level index entries r1 = no. of distinct DEPARTMENTCODE values= 1000 entries
Number of first-level index blocks b1 = ceiling (r1 / bfri) = ceiling (1000/34) = 30 blocks
(v) The number of block accesses needed to search for and retrieve all records in the file that
have a specific Department_code value, using the clustering index (assume that multiple blocks
in a cluster are contiguous).
The index has x = 2 levels (30 first-level blocks and 1 top-level block).
Number of block accesses to search for the first block in the cluster of blocks = x + 1 = 2 + 1 = 3
Each Department_code value has 30000/1000 = 30 records, which are clustered in
ceiling (30/bfr) = ceiling (30/4) = 8 blocks.
Hence, total block accesses needed on average to retrieve all the records with a given
DEPARTMENTCODE = x + 8 = 2 + 8 = 10 block accesses.
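The clustering-index figures can be reproduced with the following Python sketch. It uses the values given in part e and bfr = 4 from part b; the variable names are illustrative.

```python
import math

B, V_DEPT, P = 512, 9, 6
r = 30_000                 # EMPLOYEE records
distinct_values = 1000     # distinct Department_code values
bfr = 4                    # data records per block, from part b

fo = B // (V_DEPT + P)                     # index fan-out
r1 = distinct_values                       # one entry per distinct value
b1 = math.ceil(r1 / fo)                    # first-level index blocks
b2 = math.ceil(b1 / fo)                    # second (top) level index blocks
x = 2 if b2 == 1 else None                 # two levels once the top is one block

records_per_value = r // distinct_values   # records are evenly distributed
cluster_blocks = math.ceil(records_per_value / bfr)

# Traverse the x index levels, then read the contiguous blocks of the cluster.
total_accesses = x + cluster_blocks

print(fo, b1, b2, x, records_per_value, cluster_blocks, total_accesses)
```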